[From Bruce Nevin (2002.22.25 18:02 EDT)]
Rick Marken (2002.10.07.0900)–
While it’s true that a speaker is the most reliable source of the sound
waves that are heard as “rose” I believe it is also possible
that all kinds of physical processes (leaves rustling, brooks babbling,
etc.) could combine to produce the sound “rose”.
For our hypothetical speaker of Japanese the identical sound of the
babbling brook or rustling leaves would be the sound of the word
los meaning “woodtick” (or “wobble”). Where
then is the word rose? Or for that matter the word los? Do
you not see a difficulty with a naive realist position about words?
(“Naive” here is a technical term, not a pejorative.)
I think the way that reality maps into the
perception of words is surprising[ly] well understood, as demonstrated by
the remarkable success of word recognition systems. Of course, those
systems are not perfect; they are speaker dependent and reach only 97%
accuracy on the trained speaker.
These systems do not just rely on the sounds in the speech signal. They
rely upon a categorization of sounds as produced by the particular
speaker for certain words, plus a lexicon mapping their notion of the
phonemes of English onto candidate spellings. That is, they work from a
representation of the sounds produced by one speaker, mapped onto letters
and constrained by a list of possibilities. Not quite so simple.
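A toy sketch of the point, not a description of any real recognizer: the per-segment scores, the two lexicons, and the function name recognize are all invented for illustration. It shows the same acoustic evidence coming out as a different word depending only on the list of possibilities that constrains it.

```python
# Invented acoustic evidence: for each segment, how well it matches each phoneme.
SEGMENTS = [
    {"r": 0.5, "l": 0.5},  # an ambiguous liquid
    {"o": 0.9},
    {"z": 0.6, "s": 0.4},
]

# Two hypothetical lexicons mapping words onto phoneme strings.
ENGLISH = {"rose": ("r", "o", "z"), "lobe": ("l", "o", "b")}
OTHER = {"los": ("l", "o", "s")}


def recognize(segments, lexicon):
    """Return the lexicon word whose phonemes best fit the segment scores."""
    best_word, best_score = None, float("-inf")
    for word, phones in lexicon.items():
        if len(phones) != len(segments):
            continue  # only compare words of matching length in this toy
        score = sum(seg.get(p, 0.0) for seg, p in zip(segments, phones))
        if score > best_score:
            best_word, best_score = word, score
    return best_word


print(recognize(SEGMENTS, ENGLISH))  # rose
print(recognize(SEGMENTS, OTHER))    # los
```

Nothing in SEGMENTS changes between the two calls; only the lexicon does.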
In your discussion of phonetics you say
“An oral stop consonant such as p, t, k, b, d, g is signaled by
silence in the speech signal”. You are saying here that
perception is a function of aspects of the acoustic speech signal; in the
case of stops it’s a silent period.
A silence may also ‘signal’ the end (either end) of an utterance, or a
pause. It can be part of a stop consonant only if followed and/or
preceded by 50ms formant transitions such as those that I described. But
the identical sound (silence plus formant transitions) may be any of
several phonemes. One example is that which Bill quoted from the chapter
I sent him. Another: we perceive a d in rider and a t in
writer but the difference (in many dialects) resides not in the
vicinity of the (brief) silence of the medial flap, or in the formant
transitions, but in the length of the preceding vowel. Where is the
phoneme /d/ or the phoneme /t/? Do you not see a difficulty with a naive
realist position about phonemes? Such a position was most cogently
advanced by Bernard Bloch in papers published in 1948 and 1952. He argued
that the only data for defining phonemes must be the sounds of speech,
and it must be possible to define phonemic contrasts on the basis of
contrasted occurrences of these sounds (their ‘distribution’). This
position was effectively demolished by Noam Chomsky in various papers
encapsulated in 1964 in his book Current Issues in Linguistic
Theory.
The fundamental data for defining the phonemes, as shown by Harris in the
1940s (and as Chomsky agreed, in footnotes at least), are language users’
perceptions that some pairs of utterances are repetitions, and others are
not. Utterances can be segmented into sequential segments and/or
concurrent features, and the differences between the utterances (those
that are not repetitions) can be located in particular segments or
features. By various operations of substitution (and testing again with
native speakers) and shifting of phonetic detail from one segment to
another (as in the vowel length of writer/rider turning out to
distinguish one consonant from another, rather than introducing a
distinction between vowels) one can arrive at a representation for a
least set of ‘phonemes’. These are not real things found directly in the
sounds of speech. They are representations for the distinctions between
utterances that native speakers of the language make in the first place.
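The procedure can be sketched roughly as follows. This is only an illustration under assumptions of my own: the vowel durations, the 150 ms boundary, the 30 ms threshold, and all of the names (TOKENS, JUDGED_SAME, and the functions) are invented. The one thing taken as given is the set of speaker judgements; the rest is bookkeeping built on top of it.

```python
from itertools import combinations

# Invented phonetic records: vowel duration and the medial consonant as pronounced.
TOKENS = {
    "writer_1": {"vowel_ms": 120, "medial": "flap"},
    "writer_2": {"vowel_ms": 125, "medial": "flap"},
    "rider_1": {"vowel_ms": 180, "medial": "flap"},
    "rider_2": {"vowel_ms": 175, "medial": "flap"},
}

# Stand-in for the native speaker: the pairs judged to be repetitions.
JUDGED_SAME = {
    frozenset(("writer_1", "writer_2")),
    frozenset(("rider_1", "rider_2")),
}


def is_repetition(a, b):
    """The primary datum: did the speaker judge b a repetition of a?"""
    return frozenset((a, b)) in JUDGED_SAME


def contrasting_pairs(tokens):
    """All pairs of tokens that the speaker did not judge to be repetitions."""
    return [(a, b) for a, b in combinations(sorted(tokens), 2)
            if not is_repetition(a, b)]


def locate_difference(a, b, min_ms=30):
    """List the phonetic features in which two contrasting tokens differ."""
    diffs = []
    if abs(TOKENS[a]["vowel_ms"] - TOKENS[b]["vowel_ms"]) >= min_ms:
        diffs.append("vowel_ms")
    if TOKENS[a]["medial"] != TOKENS[b]["medial"]:
        diffs.append("medial")
    return diffs


def consonant_analysis(name, boundary_ms=150):
    """One write-up: shift the vowel-length detail onto the consonant."""
    return "t" if TOKENS[name]["vowel_ms"] < boundary_ms else "d"


def vowel_analysis(name, boundary_ms=150):
    """Another write-up of the same contrast: leave it on the vowel."""
    return "short" if TOKENS[name]["vowel_ms"] < boundary_ms else "long"


for a, b in contrasting_pairs(TOKENS):
    print(a, b, locate_difference(a, b))
```

Both consonant_analysis and vowel_analysis keep the two words apart equally well. The judgements in JUDGED_SAME are what they have to answer to, not anything in the durations themselves.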
It has long been known that there are always alternative ways of
representing these distinctions as “the phonemes” of a given
language (Yuen-Ren Chao (1934), “The non-uniqueness of phonemic
solutions of phonetic systems”, Bulletin of the Institute of
History and Philology, Academia Sinica, Vol. IV, Part 4, 363-97;
Repr. in Martin Joos (1957), Readings in Linguistics). This was a
paradoxical problem for those linguists who assumed that phonemic
contrasts could be found “objectively” in the sounds of a
language. They cannot. They can only be found in the judgements of native
speakers as to which utterances are repetitions and which are not.
Words (and phonemes locating the contrasts between them) exist only as
controlled perceptions. Sounds or writings are means of controlling those
perceptions, but absent the controlling native speaker they are only
noises or meaningless strings of marks like chgi’wa:lujan’u’asjuy, not
words. The noises and marks have, we presume, some reality apart
from our perception of them; the words which language users may control
by means of them subsist only as controlled perceptions. They have no
other reality.
/Bruce Nevin
At 09:02 AM 10/7/2002 -0500, Richard Marken wrote: