[Martin Taylor 960422 1111]
Bill Powers (960420.0300 MDT)
The night-owl calls again, I see...
Words in general underspecify perceptual experience, not so? I'm still
considering a demo in which one person verbally guides another to create
some sort of result on a computer screen; the guide can't see the
screen. I think this demo could show very dramatically what it is about
experience that words always leave out.
It's not exactly what you describe, but I did something rather like it as
part of the same sleep-deprivation study for which we have so much tracking
data. One person was required to describe to another a route on a map.
The map consisted of a sheet of A3 (or 11 x 17) paper on which were
depicted several named landmarks. Each participant had a map, and neither
could see the other participant or the other's map. The two maps differed
in controlled ways: most landmarks were identical and identically placed,
but some landmarks might be named differently on the two maps, or a
landmark might be missing or duplicated on one or other of the maps. The
"giver's" map had a route marked on it that wound its way from a "Start"
that was marked identically on both maps to an "end" marked only on the
giver's map. The "follower's" job was to mark as closely as possible the
same route on his/her map.
Over the course of the sleep deprivation study, 216 separate dialogues
were recorded. All of them have been transcribed, and are available both
as digitized audio and as transcriptions, together with displays of the
giver's and resulting follower's maps, on a set of 12 CD-ROMs available from
the Linguistic Data Consortium at a nominal price (I think $200 US, but
that's from faulty memory).
I had intended (still do) to try analyzing these dialogues using Layered
Protocol Theory, but since both the actual voice and resulting pictures
are available, anyone can use them for any kind of analysis. I know some
people want to look at changes in voice quality with sleep deprivation and
with the various drugs. Other people want to look at "riskiness" (I'd say
control gain) in tolerating possible error, by either giver or follower
or both. And they could be used to look at how the actual visual impression
of the follower's map agrees with the giver's map when both think they are
satisfied (which should be something like Bill's suggestion).
The official announcement of the availability of the corpus follows.
Martin
============================
DCIEM SLEEP DEPRIVATION STUDY MAP TASK CORPUS
Defence and Civil Institute of Environmental Medicine
North York, Ontario, Canada
Human Communication Research Centre
University of Edinburgh & University of Glasgow, UK
under the aegis of
NATO DRG Panel 3 Research Study Group 10 (Automatic Speech Processing)
Corpus Copyright 1995 DCIEM; Distributed by HCRC & LDC
Pre-mastering by Speech Data Services Ltd, Great Malvern, UK.
The DCIEM Sleep Deprivation Map Task Corpus is the product of a
collaboration between HCRC and DCIEM. Like its predecessor, the HCRC Map
Task Corpus, the DCIEM Corpus is a large scale balanced elicitation
experiment for spontaneous speech. It consists of 216 unscripted
task-oriented dialogues produced by 35 normal Canadian adults
participating in a sleep-deprivation study. Of the 216 dialogues,
approximately 60 dialogues were recorded as control material before
sleep-deprivation began, 138 during sleep deprivation, and 18 after
recovery sleep. During the 60-hour work-filled sleepless period which
began on the second day of recording, subjects were assigned to drug
treatment groups (amphetamines, Modafinil, placebo) on a double-blind
regime.
The map task dialogues were direct analogues of those contained in part
of the HCRC Map Task Corpus and the Chiba Map Task Corpus. Pairs of
speakers worked with slightly different schematic maps of imaginary
locations, collaborating to reproduce on one of the maps the route
preprinted on the other. Neither could see the other's map.
Participants knew that their maps differed but not where or how. No
restrictions were placed on what subjects could say. Different pairs of
maps used over the course of the study presented both a range of
phonological material in the form of labels on landmarks and a balanced
set of differences between maps.
All dialogues took place in studio-like conditions and were recorded on
DAT with a separate channel and close-talking microphone for each
participant. Subjects worked either with the same partner throughout the
week-long study or with 2 different partners, each of whom also had 2
partners. All subjects served both as givers and as receivers of
instructions.
The resulting 216 dialogues include over 175,000 word tokens representing
approximately 1,900 different words. For each dialogue, the Corpus
includes a digitized speech file, an orthographic transcription including
the HCRC Map Task Corpus sgml-type annotations and time-stamps at the
onset of each contribution, scanned model and copied maps, a NIST header
file, and a TEI-entry point. All 12 CD-ROMs of the Corpus also contain
explanatory information and a detailed account of the experimental design
and the speaker characteristics. Part 1 of the Corpus includes all
materials for 54 dialogues which comprise a balanced study of drug
treatment and sleep deprivation. Part 2 contains the rest of the
materials.
The materials have been designed to be easily accessible to users with
different equipment and a variety of needs. All the text files should be
readable and printable via most systems which can be connected to a
CD-ROM reader. The maps are intended for printing via PostScript(TM)
printers, and the speech files are provided with human-readable standard
headers, enabling them to be played by a wide range of environments for
processing sampled speech. Local and public domain software for
accessing the speech on a variety of platforms is also included.
The DCIEM Sleep Deprivation Study Map Task Corpus carries no warranty of
any kind. For more information, consult
The old www.cogsci.ed.ac.uk server
or maptask@cogsci.ed.ac.uk
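
For anyone who wants to sanity-check the figures quoted in the announcement before ordering the discs, here is a minimal sketch in Python. It uses only the numbers stated above (216 dialogues; roughly 60 control, 138 sleep-deprived, 18 recovery; over 175,000 word tokens; about 1,900 different words). The variable and function names are my own and are not part of the corpus or its software, and since the announcement gives the control and vocabulary figures as approximate, the derived values are rough too.

```python
# Figures quoted in the announcement. Names are hypothetical, chosen
# for this illustration only; nothing here comes from the corpus itself.
DIALOGUES_TOTAL = 216
DIALOGUES_BY_PHASE = {
    "control": 60,             # "approximately 60" in the announcement
    "sleep_deprivation": 138,
    "recovery": 18,
}
WORD_TOKENS = 175_000          # "over 175,000", so treat as a lower bound
WORD_TYPES = 1_900             # "approximately 1,900 different words"


def phase_counts_consistent(by_phase, total):
    """Return True if the per-phase dialogue counts add up to the total."""
    return sum(by_phase.values()) == total


def mean_tokens_per_dialogue(tokens, dialogues):
    """Average number of word tokens per dialogue."""
    return tokens / dialogues


def type_token_ratio(types, tokens):
    """Distinct words divided by running words, over the whole corpus."""
    return types / tokens


if __name__ == "__main__":
    print("phase counts sum to total:",
          phase_counts_consistent(DIALOGUES_BY_PHASE, DIALOGUES_TOTAL))
    print("mean tokens per dialogue: %.0f"
          % mean_tokens_per_dialogue(WORD_TOKENS, DIALOGUES_TOTAL))
    print("type/token ratio: %.4f"
          % type_token_ratio(WORD_TYPES, WORD_TOKENS))
```

The three phase counts do add up to 216, and the token figure works out to something over 800 words per dialogue on average, which gives a rough idea of how long a typical map task conversation runs.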