contrasts as phonemic elements

bnhpct · June 22, 1993, 5:27pm

[From: Bruce Nevin (Tue 930622 13:16:16 EDT)]

Bill Powers (930621.1600 MDT) --

Bruce Nevin (Mon 930621 14:44:33 EDT) --

Bill:

>> Your comments on the grape and the elephant seem to assume
>>that the grape and the elephant are distinguished because they
>>REALLY ARE different, whereas phonemes are distinguished
>>because they only SEEM different.

Bruce:
>Nope. There are no phonemes there to be distinguished.
>Phonemes are apparently present only because a perception of
>contrast is controlled.

Try again; we'll get there yet. If you control and perceive
contrast, then what you perceive is contrast. You say, "Wow,
there's some contrast." You may decide that it's too much, and do
something to make it less, or that it's too small, and do
something to make it greater. But when you're done, you still
have only what you started with: contrast.

And that's what phonemes are. They are phonemic distinctions.

You can't make the contrast between phonemes too much. There are
physiological limits to how far you can extend the active articulators,
there are acoustic limits on the effect of some articulatory changes, and
(most importantly) there are limits on combinatorial possibilities for
contrastable articulations that have contrastive acoustic effects, and
for contrastive acoustic effects that result from contrastable articulations.

The contrast might be made less for various reasons. You might not
need to distinguish t from d for recognition of the words in "it edited it",
for example. The contrast might be increased, for example if the listener
didn't recognize the intended words the first time, or in noisy conditions.
An increase in contrastiveness is traditionally called "fortis" articulation
for consonants, and a decrease in contrastiveness is called "lenis"
articulation. This pair of terms does not apply to varying contrastiveness
of vowels, though the terms "peripheral" vs. "centralized" often apply
(referring to distribution in the "vowel triangle" in the acoustic space
defined by making F1 and F2 the axes of a Cartesian graph). However,
making contrast less is generally called "lenition" and increasing it
is sometimes called by the parallel term, "fortition".

If I say "This one is butting out" with the alveolar flap [D] that is
ambiguous between t and d, and you say "where are the buds" (and are
not just making a pun), my clarification is "I said buTTing out, not
buDDing out." Even if I omit the second phrase, it is implicit and
understood. I would never say "I said buTTing out, not bu[D]ing out"
(where [D] reverts to the lenis articulation that confused you), and
you would never understand that as the implicit juxtaposition were I
to omit saying the second phrase. Clearly, what is controlled here is
the contrast and not the individual sound segment or gesture-combination
segment.

The notion that there must be something like the letter t or the letter d
there, occupying a sequential position among other such elements making
up the word "butting" or "budding", is an artifact of our alphabetic way
of representing speech by letters. Visual perceptions do not align
well with the combination of acoustic and articulatory perceptions
that constitute speech. It works because the graphical objects that
are letters are taken as representing the contrasts. Ignoring the
problems of English orthography (which arise in large part because
the language has changed since the writing was standardized), t in
the string "butting" is a representation of the contrast between
"butting" and "budding", "bunning", "bussing", "buzzing", etc.; or,
more compactly stated, between the sound/gesture features that
differentiate those words from one another and the corresponding
features (perceptions) in "butting".

It works, too, because there is precedent in how language is
acquired. In a pre-speech infant, there are sound/gesture combinations
that are just part of its "play," producing behavioral outputs and
taking in their perceptual consequences. At first, the infant can
produce little more than things we might write "ma", "ba", and "pa"
(or m@, b@, p@ with some indistinct, centralized sound for @) and
"nga," "ga," "ka," because the larynx has not descended and there is
little room for movement of the tongue in the mouth independent of the
mandible. (Simians never get much past this; that's an important
reason they don't have speech.) But these are just accidental
co-occurrences of gestures with the lips, the tongue (undifferentiated),
the velum, and the larynx, in various combinations. Later, as
the larynx descends and there is more space for the tongue to move,
other possible combinations appear. During all this time, the
parents and other adults are recognizing syllables and words much
as we recognize horses and bunnies in the shapes of clouds. "She
said `mama,' I know she did!" Sooner or later, the child comes to
recognize some of the adult productions as repetitions of her productions,
and begins to produce utterances as deliberate repetitions of adult
words, with a growing sense of their significance.

However, there are still no phonemes. Every word is an event with no
systematic relation to other words. At some point there comes control
of the contrasts between similar words, and control of the contrasts
themselves--

larynx open (voiceless) vs. larynx at the critical degree
of closure for voicing

velum up vs. velum down (nasalization)

wide opening vs. mid opening vs. critical closure (for turbulence) vs.
complete closure for oral articulators (tongue and lips)

tongue tip vs. tongue blade vs. tongue back (vs. root in many languages)

teeth vs. alveolar ridge as passive articulator with tongue (e.g.
s-sh contrast in English, part of w-v contrast)

It is not the specific gestures that the child controls--she's been doing
that all along--it's the contrasts between them, the significant
differentiating of them along orthogonal dimensions for contrast.
(The specific parameters and values I have listed are roughly those
used at Haskins Laboratories.) The sound/gesture combinations that the
child has been controlling all along in babbling play now become
exploited as ways of representing the contrasts between words. Whereas
before there was a continuum of sound/gesture possibilities, now a
polarization is imposed.
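
As a rough illustration of contrast along orthogonal dimensions, here is a
minimal Python sketch. The feature table and the contrast() helper are my
own hypothetical simplification of the dimensions listed above, not a
description of the Haskins model or of Harris's procedure. The point is only
that the t of "butting" can be characterized by how it differs from the
medial consonant of "budding", "bunning", "bussing", and "buzzing", rather
than by any property it has in isolation.

```python
# Hypothetical, simplified feature values for a few English consonants.
# Dimensions loosely follow the list above: larynx, velum, degree of
# constriction, and active articulator. This is an illustrative assumption.
FEATURES = {
    "p": {"larynx": "open",   "velum": "up",   "constriction": "closure",  "articulator": "lips"},
    "b": {"larynx": "voiced", "velum": "up",   "constriction": "closure",  "articulator": "lips"},
    "m": {"larynx": "voiced", "velum": "down", "constriction": "closure",  "articulator": "lips"},
    "t": {"larynx": "open",   "velum": "up",   "constriction": "closure",  "articulator": "tip"},
    "d": {"larynx": "voiced", "velum": "up",   "constriction": "closure",  "articulator": "tip"},
    "n": {"larynx": "voiced", "velum": "down", "constriction": "closure",  "articulator": "tip"},
    "s": {"larynx": "open",   "velum": "up",   "constriction": "critical", "articulator": "tip"},
    "z": {"larynx": "voiced", "velum": "up",   "constriction": "critical", "articulator": "tip"},
}


def contrast(a, b):
    """Return the dimensions on which segments a and b differ.

    The result maps each differing dimension to a (value_a, value_b) pair.
    An empty dict means the two segments do not contrast in this table.
    """
    fa, fb = FEATURES[a], FEATURES[b]
    return {dim: (fa[dim], fb[dim]) for dim in fa if fa[dim] != fb[dim]}


if __name__ == "__main__":
    # "butting" against "budding", "bunning", "bussing", "buzzing"
    for other in ["d", "n", "s", "z"]:
        print("t vs.", other, "->", contrast("t", other))
```

In this toy table, t and d differ only on the larynx dimension, while t and n
differ on both larynx and velum. A lenis flap [D] would correspond to leaving
the larynx value unspecified, which is exactly what removes the
butting/budding contrast.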

You could argue that the child controls "for" an ideal sound/gesture
for each phoneme-segment, which just happens to contrast with each
other sound/gesture phoneme-segment in the language, and that
control is imperfect because gain varies and because control of
nearby segments disturbs control of the current segment ("coarticulation
effects"). This leaves the ready recognition of divergent pronunciations,
and the systematic variation of one's own dialect according to social
context, still to be explained and somewhat puzzling. If the child is
controlling contrast, you get that explanation for free. Changes from
one dialect to another are systematic, preserving all or most of the
contrasts (sufficient to enable word recognition, or else by definition
the dialects would have diverged into distinct languages).

Incidentally, Chomsky and others never understood this aspect of
Harris's work. The attack on "taxonomic phonemics" by Chomsky and
Halle in the early 1960s applies to many other linguists' work, but
not to Harris's. For the others, contrast was to be defined by
analysis of "the data of pronunciation." For Harris, contrast was
given as an observational primitive by the Test. This makes a very
great difference.

Bruce
bn@bbn.com