Replies to Bill Powers and Bruce Nevin.
++++++++++++++++++++++++++++
[From Bill Powers (930528.0930 MDT)]
Tom Bourbon (930528...)
Tom, you're doing a great job of sticking to fundamentals in your
discussions with Bruce Nevin. I particularly like your comment that saying
"no" in a particular way again and again simply reflects the existence of
environmental (and inner) constraints.
To this I would like to add that people do NOT say "no" the same way
every time. It only sounds as if they do because we perceive the sound
with a "no" recognizer.
+++++++++++++++++++++++++++++++++
TB (now):
Bill, I couldn't have been doing a very good job if you had to add that
comment. That was part of what I wanted to say. I discussed variations
in lower level reference signals, and in articulatory actions, and in
acoustic waveforms, while a person supposedly repeats the same speech
sounds. But I did not get to the point that the "no" that matters is
created in the perceiver's sensory functions, and those functions create
"no" out of anything that is "close enough." Perceptual functions create
identical perceptions out of dissimilar inputs, a fact that reduces the
burden on the output side of the system. The system need not produce
exact duplicates of its outputs; they only need to be "close enough" for
the resulting perceptual signal to match the reference signal. If a control
system is to produce a close correspondence between perceptual and
reference signals, the quality of the comparator is more critical than that
of the output function.
+++++++++++++++++++++++++++++++++
Bill:
Over quite a range of variation in phoneme shapes, loudness, pitch,
inflection, duration, spatial direction, and background noise the "no"
signal remains significantly larger than any other word-signal. All that's
required for you to hear yourself saying "no" is that all the contributing
factors (including your own speech efforts) leave the result somewhere
in the region that the "no" perceiver responds to in the same way, and
that does not excite any other perceiver unduly. The listener, using similar
but not identical equipment, is subject to similar requirements: not that
a canonical "no" be perceived, but that what is heard not be so far from
the tuning of the "no" perceiver that the perceiver fails to respond with
a significant amount of signal. ...
+++++++++++++++++++++++++++++++++
TB (now):
Right. There is no need to postulate Platonic "forms" lurking in the
background. From critics of PCT, to would-be collaborators, many people
who begin to think more deeply about the theory worry that it *does*
require ideal forms, in the guise of identical reference signals for speech
sounds that somehow occur in all people who use those sounds. Their
belief that the sounds must be identical, and consequently that the
reference signals must be identical, leads them to conclude that only
something like an innate ideal could satisfy those requirements. That is
an idea they will not accept. Their fears are unnecessary. If we think of
people as living control systems, we realize it only matters that each
person satisfy his or her own (perhaps wildly idiosyncratic) higher level
reference signals that are being served by the production and perception
of speech -- speech sounds are not the intended end. All a person need
do to achieve that satisfactory state is produce an input quantity that is
"close enough" -- within the window or the tolerances of the perceptual
function (in the example here, that would be the "'no' detector").
++++++++++++++++++++++++++
Bill:
Suppose that in your (Tom's) three-line experiment you tell the
participants to produce "a triangle." There is clearly a large range of
configurations that would be perceived (classified) as a triangle, so the
final figure that results still has many degrees of freedom at the
configuration level. However, at the level in question, many different
configurations would lead to the same description: "a triangle." So the
participants could agree that they are indeed maintaining a triangle even
though an observer would find that disturbances of shape are not
particularly well resisted. In fact, because the shape is not specified or
agreed upon at the configuration level, both participants would have to
relax control at that level in order to avoid conflict.
+++++++++++++++++++++++++++
TB (now):
Yes, indeed. I have asked people to do that, or to find as many ways as
possible to arrange the lines other than "three in a row." The results
include upright and inverted "triangles" (of various and variable
configurations); a ramp or steps down to the right, or to the left; and a
number of variable patterns, in which the pairs created temporally varying
relationships or sequences. (Some folks are pretty creative at this "simple
tracking task.") The cooperative task we have recently discussed on the
net involves two people, known as "dyads," in the social literature where
we cannot publish -- I think of it as a duet for two hands on two handles.
A year ago, just before I left the university where I had a nice supply of
willing participants, I was debugging a new program. I asked groups of
four (quartets, using four handles) to "make a shape on the screen."
(Each handle affects X or Y for one corner; two corners are thus affected
by the four people. That arrangement limits the shapes to squares and
rectangles.) Each quartet came to agreement and created a shape that
was defended against programmed random disturbances, in just the
manner described by Bill. When I added a mouse, so that a fifth person
could affect the shape, the quartet still succeeded.
I began those studies (which I had forgotten about in the confusion of my
recent moves) to see if I could still use simple independent PCT models
to represent the four people, or perhaps to replace one or more people.
(I could.) But these studies also bear on the point Bill is making
concerning Bruce's posts on speech: however each person might
conceive the intended shape, many different configurations on the screen
are called by the name of the intended shape and are defended against
disturbances. As goes the hand, so goes the tongue.
Bill clarified that idea in his next lines.
++++++++++++++++++++++++++
Bill:
When an American speaker of English hears a Spanish "no" and then an
English "no," the actual phonemes involved are different enough that a
sound spectrograph or a linguist can easily perceive two different words.
However, the non-linguist English speaker (and probably the Spanish
speaker too) simply hears "no" in each case: the same perceptual signal
results. Speakers of the two languages are satisfying different sets of
constraints on many perceptions involved in speech, not just the sound
of a particular word; these other constraints lead to the use or omission
of the diphthong part of "o". The differences in pronunciation make some
sort of difference to the two speakers, but it is not a difference in what
word is perceived. When the English suitor says "Yes, yes, yes," and the
Spanish object of affection says "No, no, no," the English suitor has no
difficulty hearing the "no," whether or not he chooses to hear the
meaning.
++++++++++++++++++++++
TB (now):
Well put, sir. There is nothing I would add.
+++++++++++++++++++++++++++++
[From Bruce Nevin (Fri 930528 14:59:35)]
After a comment to Bill Powers, Bruce:
(Tom Bourbon (930528.0810)) --
I try to understand the implication (sometimes
the claim) of great significance in the fact that changes in configurations
of the speech apparatus lead to changes in the sounds that are produced
and perceived. Every time, I come back to the thought that those
configurations vary any way necessary to produce perceptions (requested
by a higher level) of consequences other than the configurations and
sounds themselves. I still cannot see why the facts of configuration-
sound associations are different from, for example, the facts of various
configuration-movement associations we would observe were we to as
carefully monitor the activities of various motor units and joints in hands
and arms during arm waving, stick wiggling, gesturing and typing.
Bruce said:
There is another point that I have been trying to communicate. Whatever
the means may be by which we control the perception of contrast
between non-repetitions--by perception of acoustic invariants, by
imagined (reconstructed) perception of intended gestures, or whatever--it
is essential that we effect that control in a recognizable way. This is
where we get the business about norms being preset in all participants.
++++++++++++++++++++++++++++++
TB (now):
Bruce, when I try to understand what you mean by this, I come to the
same conclusion you quoted above. Perhaps it would help were you to
tell me what you mean when you say, " ... it is essential that we effect
that control in a recognizable way. This is where we get the business
about norms being preset in all participants." Do you mean that we must
effect the control by using our articulators in the same way, which we all
then recognize? If someone tells me, "Don't look down; your fly is
unzipped," is it necessary for me to recognize how the person produced
the sounds, in order to get the point of the message? I need your help,
if I am to see anything here other than one person *using* speech sounds
to inform another, who immediately "gets the point" and, with face
glowing red, tries to be inconspicuous while zipping his fly. To me, it
seems the speaker is not "producing speech" and the listener is not
"perceiving speech," not as the end for either of them. Instead, the
actions and perceptions of speech are means to ends. How does that
entail norms for controlling sounds in a recognizable way? I am lost on
this point.
++++++++++++++++++++
Bruce:
If I say "no," I am controlling contrasts such as those between the
following pairs of words:
no gnaw
no knee
no gnu
no doe
no toe
no mow
no bow
no row
no ...
I am not controlling a particular configuration for the word "no," rather,
I am controlling the distinctness of my pronunciation from the
pronunciation of any other possible word.
+++++++++++++++++++++++++++
Tom (now):
My situation grows more desperate. I get the feeling I must be a believer
in a flat-earth who is trapped at the twentieth reunion of Christopher
Columbus' crew -- lost in the wrong view of things. Where you say you
are *controlling* contrasts between pairs of words, I see you
*producing* contrasts that you use to achieve another end. In my state
of ignorance (I am not joking or exaggerating on that, as you probably
concluded long ago) the words you list are different perceptions
associated with other different perceptions. If your intention is to use
words to identify each thing as different from another, you *must* use
a different token for each thing, and I must perceive them as different.
With those rather natural and understandable constraints working on both
sides of the interaction, if either of us is to control his perceptions of
"getting my idea across," the words *must* be different, but *how* they
differ is not in any ultimate sense important.
If your intention is to list pairs in which the two elements differ, then you
will control perceptions of difference for their own sake, but that is a
rather uncommon thing for people to do. In this post, and the previous
one to which I replied, the only instance of you "controlling difference for
its own sake" is in the list above. Wherever else I look, I see you *using*
differences to control other perceptions. *Trying* to control other
perceptions; obviously it is not working very well with me -- I am still
lost.
++++++++++++++++++++++++++++
Bruce said:
Now, if you are controlling for contrasts between many of the same
words as I, then you recognize the contrast between "no" and
"naw/gnaw" regardless of the particular way I pronounce these words.
My "accent" may be quite different, but if you hear me say "toe" and
"saw" and "rhododendron" and various other words, even though the
acoustic consequences are different for me and for you, you can
reconstruct the contrast between intended gestures and intended words.
+++++++++++++++++++++++++++
Tom (now):
Do I perceive only those perceptions I control?
Will I recognize the contrasts *regardless* of the particular way you
pronounce the words? Differences in "accent" seem too small to justify
a claim that I will recognize in spite of *any* way you pronounce. In the
end, if you intend for me to let you know I perceived the difference, you
will adjust your pronunciation any way necessary to achieve that end;
otherwise, you will go on controlling for a particular way of pronouncing,
whether or not I catch the distinction.
+++++++++++++++++++++++++++
Bruce says:
It is this--the fundamental fact of repetition, as distinct from imitation --
that sets control of language apart from control in which one is not
obligated to advertise one's intentions in a reliable way to others.
++++++++++++++++++++++
Tom (now):
I don't know what to make of this remark. On the surface, I have the
impression it is intended to convey something of real import, but I do not
know what that might be. Again, Bruce, I say that in earnest, not as a
taunt or a rejection. I do not understand what you mean by those words.
Repeating words and imitating words are two different ways of using
words. Repeating hand movements and imitating hand movements are
two different ways of using hand movements. But if, before I speak or
move, I am required to advertise my intentions in a reliable way to others,
I do not know where to begin.
+++++++++++++++++++++++++
Bruce says:
How do you model "repeating X" rather than "imitating X"? Sounds like
X is a category. How do you model "person B repeating the X that
person A produced" as distinct from B imitating A's behavioral outputs?
Is it not the case that both A and B must have a prior agreement as to
what behavioral outputs constitute an X? That is, what behavioral
outputs advertise the intention to produce an X?
++++++++++++++++++++++++
Tom (now):
At last, earth beneath my feet! If, for "X," you allow me to substitute
"movement of a cursor up and down on the screen in a triangular
waveform," I do it like this.
A. For Imitating,
There must be something to imitate. Let that which is to be imitated be
the momentary position of a target, the position of which is driven by a
computer program step and which, as a function of time, is the triangular
waveform shown below. Now:
  /\  /\  /\
 /  \/  \/  \, and so on.
Cursor := Handle + Disturbance (here, let Disturbance = zero);
Handle := Handle - k * ("perceived position of cursor re target" minus
"intended position of cursor re target")
This is a standard, undisturbed pursuit tracking task, which PCT models
with great accuracy. The participant can make the cursor accurately track
the target, which is to say, the positions of the cursor imitate those of the
target (incidentally, so, too, does the waveform of the participant's
hand movements -- the person's actions).
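A minimal sketch of the Imitating step above, in Python. The gain k = 0.5, the period and amplitude of the triangular waveform, and the names triangle_wave and run_imitating are assumptions made for illustration; they are not values from the actual tracking programs. The loop applies the two model steps in the order given, with Disturbance set to zero.

```python
def triangle_wave(n_steps, period=100, amplitude=10.0):
    """Triangular target waveform (period and amplitude are assumed values)."""
    half = period / 2
    wave = []
    for t in range(n_steps):
        phase = t % period
        frac = phase / half if phase < half else 2 - phase / half
        wave.append(amplitude * (2 * frac - 1))
    return wave


def run_imitating(target, k=0.5):
    """Pursuit tracking with Disturbance = 0, following the two model steps."""
    handle = 0.0
    cursor_trace, handle_trace = [], []
    for target_pos in target:
        disturbance = 0.0
        cursor = handle + disturbance
        perceived = cursor - target_pos  # perceived position of cursor re target
        intended = 0.0                   # intended position: cursor on target
        handle = handle - k * (perceived - intended)
        cursor_trace.append(cursor)
        handle_trace.append(handle)
    return cursor_trace, handle_trace


target = triangle_wave(400)
cursor_trace, handle_trace = run_imitating(target)

# Skip the first 20 steps, where the cursor is still catching up from rest.
worst_cursor = max(abs(c - t) for c, t in zip(cursor_trace[20:], target[20:]))
worst_handle = max(abs(h - t) for h, t in zip(handle_trace[20:], target[20:]))
print(f"largest cursor-target gap: {worst_cursor:.2f}")
print(f"largest handle-target gap: {worst_handle:.2f}")
```

With these assumed values both the cursor and the handle stay within a small fraction of the waveform's amplitude of the target, which is the sense in which both "imitate" it when nothing disturbs the cursor.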
B. For Repeating:
For Repeating, there must be something to imitate. Let that be the
person's remembered positions of the cursor as a function of time during
the Repeating task. (The target is not on the screen.) Now the model step
for the person changes, because the source of the reference signal is
different, but the cursor is still determined as it was before.
Cursor := Handle + Disturbance (again, let Disturbance = zero);
Handle := Handle - k * ("perceived position of cursor at time x" minus
"remembered position of cursor at time x")
where the reference position at any moment is the appropriate time-
indexed amplitude on the person's remembered (imagined, intended)
waveform. The person can do this easily.
For either Repeating or Imitating, the same actions will occur. So, too,
will the same positions of the cursor as a function of time.
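The equivalence of the two model steps can be checked with a short Python sketch. It assumes that the remembered waveform is the triangular waveform itself, and that k = 0.5; the function names (triangle, run) and all numeric values are hypothetical. Only the source of the reference differs between the two runs.

```python
def triangle(t, period=100, amplitude=10.0):
    """Time-indexed amplitude of the triangular waveform (assumed shape)."""
    frac = (t % period) / (period / 2)
    if frac > 1:
        frac = 2 - frac
    return amplitude * (2 * frac - 1)


def run(perceive, reference, n_steps=400, k=0.5):
    """Same loop for both tasks; only the perception/reference pair differs."""
    handle, cursors, handles = 0.0, [], []
    for t in range(n_steps):
        cursor = handle + 0.0  # Disturbance = 0
        handle = handle - k * (perceive(cursor, t) - reference(t))
        cursors.append(cursor)
        handles.append(handle)
    return cursors, handles


# Imitating: perceive the cursor relative to a visible target; intend zero offset.
imit_cursor, imit_handle = run(lambda c, t: c - triangle(t), lambda t: 0.0)

# Repeating: no target on screen; the reference is the remembered waveform.
remembered = [triangle(t) for t in range(400)]
rep_cursor, rep_handle = run(lambda c, t: c, lambda t: remembered[t])

same_actions = all(abs(a - b) < 1e-9 for a, b in zip(imit_handle, rep_handle))
same_cursor = all(abs(a - b) < 1e-9 for a, b in zip(imit_cursor, rep_cursor))
print("same handle movements:", same_actions)
print("same cursor positions:", same_cursor)
```

If the memory were instead a recording of the cursor from an earlier Imitating run, the two traces would agree closely rather than exactly, because that recording already lags the target slightly.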
Now let us imagine a new Imitating condition, with a new participant (A),
in which yet another person (B), not a program step, determines the
position of the target at each moment. Let the Disturbance remain zero.
Assume the target-controlling person creates the triangular path shown
above. (I stay with that because I am not good at drawing random paths in
ASCII.) In the Imitating task, which must necessarily occur first, the
result would be exactly the same as before. Given that the cursor remained
close to the position of the target throughout the run, we could say -- .
What *could* we say? Could we say that the participant imitated the actions
of the target-controller? That would be true only if the dynamics of the
devices by which the two people controlled their respective marks on
the screen were identical.
(The person *might* make his or her handle movements match those of
the target-controller. More on that later.)
Assume identical control devices. If a random disturbance had acted on
the target, so that in order for the target-controlling person to create the
triangular path of the target the person was required to move the hand
to oppose the disturbance, then the target-controlling person and the
participant would have created the same patterns of movement for the
cursor, by disparate actions of their hands (their manual articulators),
and I can say: If the triangular pattern is "X", then it is not the case
that both B (the target-controlling person) and A (the participant) must
have a prior agreement as to what behavioral outputs constitute an X.
Neither is it the case that any behavioral outputs advertise the intention
to produce an X.
It is still possible that the participant had been making her or his handle
movements match memories of those by the target-controlling person during
the Imitating task. That would still be a perceptual tracking task, but
one that might come a little closer to the contention of the theory that
says when people converse, they duplicate ideas or hypotheses about one
another's actions. We can test for whether the participant is controlling
actions, or a perceived consequence of the actions.
Let the participant once again use the remembered triangular waveform of the
cursor during the undisturbed Imitation task as the time-indexed reference
signal for cursor position. (As the reference signal, we could also use
memories of the handle positions of the target-controlling person, which
were identical to the positions of the target, which were nearly identical to
the positions of the cursor.) Let the Disturbance be non-zero, and a
random function of time. Now the participant produces the same triangular
waveform for the cursor, but by actions that never occurred before --
the participant's actions must negate the effects of the disturbance *and*
produce the remembered waveform. The model step that duplicates this
result is identical -- absolutely identical -- to the one used before for
the Repeating task.
The actions vary and oppose the disturbance; consequently, the
perceptual consequences of the actions remain as intended by the
participant. The participant imitates and repeats perceptions, not actions.
The PCT model does the same. That is how I would model imitating and
repeating.
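The test described above can be sketched in Python as follows. The sketch makes several assumptions: k = 0.5, a triangular waveform of amplitude 10, and a smooth sum of two sinusoids standing in for the random disturbance, so that the run is reproducible. All names are hypothetical. The model step is the Repeating step, unchanged.

```python
import math


def triangle(t, period=100, amplitude=10.0):
    """Remembered waveform: time-indexed reference for cursor position."""
    frac = (t % period) / (period / 2)
    if frac > 1:
        frac = 2 - frac
    return amplitude * (2 * frac - 1)


def disturbance(t):
    """Smooth stand-in for a random disturbance (an assumption of this sketch)."""
    return (4.0 * math.sin(2 * math.pi * t / 137)
            + 2.0 * math.sin(2 * math.pi * t / 61 + 1.0))


def run_repeating(n_steps=400, k=0.5, disturbed=True):
    """Repeating step: Cursor := Handle + Disturbance; Handle tracks memory."""
    handle, cursors, handles = 0.0, [], []
    for t in range(n_steps):
        d = disturbance(t) if disturbed else 0.0
        cursor = handle + d
        handle = handle - k * (cursor - triangle(t))
        cursors.append(cursor)
        handles.append(handle)
    return cursors, handles


def worst_gap(trace, start=20):
    """Largest distance between a trace and the remembered waveform."""
    return max(abs(x - triangle(t)) for t, x in enumerate(trace) if t >= start)


cursors, handles = run_repeating(disturbed=True)
_, calm_handles = run_repeating(disturbed=False)

cursor_gap = worst_gap(cursors)        # perception: stays near the waveform
handle_gap = worst_gap(handles)        # action: departs from the waveform
calm_handle_gap = worst_gap(calm_handles)
print(f"cursor vs. remembered waveform: {cursor_gap:.2f}")
print(f"handle vs. remembered waveform: {handle_gap:.2f}")
print(f"handle, undisturbed run:        {calm_handle_gap:.2f}")
```

Under these assumptions the cursor stays close to the remembered waveform while the handle departs from it by several units, because the handle must also cancel the disturbance. In the undisturbed run the handle itself stays close to the waveform, so an observer watching only that run could mistake control of the cursor for control of the action.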
As goes the hand, so goes the tongue.
+++++++++++++++++++++++++++++++
What a lousy time to go away for a week, but that is exactly what I am
about to do. There will be no mail, no television, no electricity, no water
-- and no csg-l. The only item I will miss is the csg-l. When I return, I will
look for a reply in the megabytes of mail that will arrive during that week.
Until (one week) later,
Tom Bourbon
Tom Bourbon
Department of Neurosurgery
University of Texas Houston Medical School Phone: 713-792-5760
6431 Fannin, Suite 7.138 Fax: 713-794-5084
Houston, TX 77030 USA tbourbon@heart.med.uth.tmc.edu