[From Bruce Nevin (980330. EST)]
Bill Powers (980329.1505 MST)--
Bill Powers (980328.1243 MST)--
I find two things that I am controlling here, and I am working to avoid conflict.
1. I resist a response to something that I didn't say, so I try to re-say it until it is understood.
2. I want to get closer to the truth of the matter, so I modify my concepts according to insights and suggestions that come up in our dialog.
I think you're trying to make a distinction where there is no distinction.
Arcs and rings become cultural when you measure their radii. In Arabia,
they would be much smaller than they would be in America.
Also in Latin America, but this follows from a much simpler observation: the distance that is appropriate as interpersonal space is in all respects smaller for them than for us, as seen in how far apart people have to be in order to carry on a conversation (I'm sure you've read Hall's entertaining and insightful books on this).
In some cultures they might not occur at all.
Either these people would have no common interests or their need for separation from one another would be so great that we would consider them agoraphobic. The people in one of Asimov's robot stories might qualify, who lived in complete isolation and made contact only by telecommunication devices. I doubt that there are any actual instances, and anyway all you would need would be something of sufficient common interest that control for proximity to it would overcome their control for distance from one another. A people reduced to a population of two would qualify, since it takes three to form an arc, but that is hardly a cultural matter, even by your reckoning.
My point is that emergent phenomena emerge
from the specific variables people perceive and control, and from the means
that are used for control.
I understand that, and I agree with it.
The point you seem to be making is that people do not control for making
arcs and rings; they are side-effects of the actual control process.
<in your second message>
You may be correct about the arcs and rings, although
there is no reason to suppose that people can't notice this phenomenon and
produce it on purpose.
Of course they can. I can say from experience that they sometimes do. But not normally, and not as a convention that holds in one culture but not at all in another. The forming of rings and arcs is not something by which people differentiate themselves from "those other people" over there.
In
language, the control is specifically for the forms and usages of language.
But is anybody controlling for language to come to any specific form? Is
the slow drift of meanings directed? At one level, we can say yes, people
do try to talk like other people and to be understood by them. But as
mis-hearings and misinterpretations occur, we see changes in language that
go unresisted. There is no restoring effort; the drifts just go on
indefinitely. So they are also side-effects of whatever is actually
controlled about language.
You are not disagreeing with me. To show that, I'll recast what you've said as affirmations that I might have said:
In language, the control is specifically for the forms and usages of language. But nobody is controlling for language to come to any specific form. The slow drift of meanings is not directed. At one level, we can say yes, people do try to talk like other people and to be understood by them. But as mis-hearings and misinterpretations occur, we see changes in language that go unresisted. There is no restoring effort; the drifts just go on indefinitely. They are side-effects of whatever is actually controlled about language.
You seem to be basing a distinction here on whether more than one person at
a time can perceive the same (or a similar) controlled variable. From my
point of view there is no difference: ALL perceptions are private. There
are no actually public perceptions. When we speak in a certain way to
achieve an audible result, we use our own private perceptual functions to
perceive either our own speech or that of others.
An individual
1. perceives the talk of others
2. makes generalizations about what they are doing
3. learns what it feels like to perform consistently with those generalizations
4. controls perceptions of those feelings
5. observes the performance that results from that control
6. compares own performance with the generalizations about that of others
7. adjusts the references for what it feels like to perform "normally" (continuing 3)
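The seven steps above can be sketched as a simple adaptive loop. This is a minimal illustration, not a model of the actual hierarchy: the use of a mean as the "generalization", the scalar performance variable, the noise level, and the learning rate are all assumptions made up for the sketch.

```python
import random

random.seed(0)  # deterministic for illustration

def learn_norm(observed, trials=200, rate=0.1):
    # Steps 1-2: perceive others' performances and generalize about them
    # (here the "generalization" is simply their mean)
    norm = sum(observed) / len(observed)
    # Step 3: an initial felt reference, not yet calibrated to the norm
    felt_reference = random.uniform(0.0, 1.0)
    for _ in range(trials):
        # Steps 4-5: control to the felt reference; performance is noisy
        own_output = felt_reference + random.gauss(0.0, 0.02)
        # Step 6: compare own performance with the generalization about others
        error = norm - own_output
        # Step 7: adjust the reference for what performing "normally" feels like
        felt_reference += rate * error
    return felt_reference

ref = learn_norm([0.48, 0.52, 0.50, 0.49])
```

The point of the sketch is step (6): the comparison is between two private perceptions, one's own output and a generalization about others', yet the loop converges on the shared norm.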
Step (6) is a relationship perception comparing perceptions of the behavioral outputs of others (or generalizations about them) with perceptions of one's own behavioral outputs. These three sets of perceptions are all private, of course. But as far as the perceiver is naively concerned those other people (and their actions) are out there in the environment. Furthermore, one's own actions are out there too, commensurate for comparison in the relationship perception, and, more importantly, "published" as it were in a form that those other people can and do recognize. That is the only sense in which something may be "public" in a universe of perceptions that ultimately are private. The distinction is between perceptions that I can have of myself but not of anyone else (and no one else is privy to mine); and perceptions that I have of others and of myself and that (it appears) others can have of me. Maybe there is a fallacy in identifying one's perceptions of one's own behavioral outputs as being commensurate with one's perceptions of other people's behavioral outputs (my saying of the word "hello" is the same word as your saying of "hello"), but if so it is a fallacy we all naively make, and must do so to live normally. Autism is an alternative.
Accept the distinction without seeing it as a naive realist commitment to "public" perceptions being "really" different in kind. We know better. But we're describing what people do when they don't care, which is most of the time.
You have called this additional process "imitation". Imitation and
repetition are not the same.
I can see that there is a difference in that imitation requires imagining
how our own actions must look or sound from someone else's point of view,
whereas repetition involves only choosing a past perception as a reference
signal and making a present perception match it. Is that the difference you
have in mind?
Yes, but the remembered perception is not a particular past perception, it is a generalization about past perceptions. The difference between imitation and repetition is in step (2) above. Repetition involves another token of the same type, another instance of the same generalization. The child who says "unpark the car" is not producing the word "unpark" from memory, and her rendition of "park" involves phonemes that are generalized across her entire vocabulary, not particularized to the pronunciation of that word as a remembered perception.
The means for repetition are the reference values for controlling private
perceptions (e.g. pressure on the tongue tip) that result in behavioral
outputs that are public and conventional, that is, they are intended and
expected to be perceived in a certain way by others (e.g. the phoneme /d/
in English).
But why leave out the equally valid observation that they are intended to
be perceived (heard) by ourselves in the same way? It seems to me that
you're betting everything on kinesthetic control being the primary aspect
of speech that matters, with auditory control being relegated to an
auxiliary role. Why not say that we use our own kinesthetic control systems
(private) as our means of controlling our own auditory experiences (also
private), and that we use control of our own auditory experiences as a
means of controlling what we privately _imagine_ others to be hearing? That
seems simple and straightforward to me ...
OK, but as far as they are naively concerned, they are not imagining.
[...]
Concurrent parallel control of audition
could result in conflict, whereas control of audition by means of
controlling articulation would not. You are arbitrarily leaving out the
fact that at the same time we feel our articulations, we hear (or at least
imagine hearing) the audible result. The primary criterion for
communication can't be that the articulations feel right; it must be that
the auditory result sounds right, because that is all that others receive
from us. We will alter our articulations to make the sound come out right,
which shows that control of the sound is at a higher level than control of
the articulations. And we will not alter the sound we seek to produce as a
way of making the articulations feel right, which shows the same thing.
There are two empirical claims that I've made:
1. We can do without auditory feedback for a considerable time without any subjective experience of uncertainty or period of accommodation or reduction in quality of control. (You demand "sonagrams" to verify this last clause.)
2. By the time we hear, it's too late to correct the current phoneme. We can correct our speech for heard error only by repetition (N.B. as distinct from imitation).
It will be interesting to see what you find out about the experiments in
which the phonemes are disturbed.
Looks like I'll have to write a letter and/or telephone Houde. No answer yet.
********************************************************************
* *
* PROCLAMATION: *
* *
* I do not claim that a culture, society, custom, convention, etc. *
* is "an agency outside of all individuals that somehow actively *
* influences the individual." A culture is not a control system. *
* A language is not a control system. *
* *
********************************************************************
I was so busy with the "outside agency" aspect of this proclamation that I
overlooked the meaning of the last two sentences. I'm sure you don't mean
either of them. Language and culture are products of control systems INSIDE
individuals: namely, the hierarchy of control systems. Neither could exist
without that hierarchy.
I mean precisely that a culture or a language is not a *control system* with perceptual inputs, reference values for perceptions, comparators, and effectors. The "proclamation" says nothing about how language and culture are related to control of perceptual input by individuals who enact and use them.
Language and culture are *products* of control processes. But no one individual possesses the whole of a language or of a culture. To borrow a metaphor from Hilary Putnam, there are two sorts of tools in the world: there are tools like a hammer or a screwdriver which can be used by one person; and there are tools like a steamship which require the cooperative activity of a number of persons to use. Accepting the tool metaphor for the moment, words, sentences, a language are tools of the latter sort.
Consider the immigrant to the United States from one of the more marginal European countries. She has lived here most of her life, working as a domestic. She never learned to speak English very well; she can no longer remember how to speak Estonian or whatever was the language of her childhood. It is upsetting to her to realize that she has virtually no language at all.
Consider the remnant speaker of a moribund language. There are a few other old people who remember how to say things using the language, but they use English on the rare occasions that they see one another. They can say things to a visitor engaged in the exacting art of salvage linguistics, describe things in the language, tell stories, sing songs, but because they no longer converse and they no longer use the language to accomplish things, there is nothing for young people to witness and get the hang of doing.
What is seemingly overlooked here is that language (or culture) must be
learned by a particular brain as a result of active processes in that
brain. It is not injected from outside.
How to participate in language and in culture must be learned by particular individuals as a result of active control processes within them and looping through their environments, which include one another. For an analogy, the knowledge of how to do contra-dancing is learned by individuals, but the contra-dancing is not inside any one of the people who have learned it.
You have compared the two levels of ring-formation, the control of proximity, and the higher level setting the reference for proximity, to the two levels of pronunciation. There is time to readjust the reference level for proximity in the course of recognizing a potential collision. The comparison might work better for culture-specific manners of making a communicative gesture (for example, pointing with the chin vs. with a finger, or saying "no" by (a) rotating the head on a vertical axis, left and right, (b) briefly elevating the chin while rolling the eyes upward and producing an apical click (tsk), (c) rotating the head on a horizontal axis projected through the nose, left and right), since the time scale is slower. The higher level of control for head movements is, I guess, imagining what the shifting visual field and proprioceptive sensations must be for the people from whom the child learns, and reproducing those perceptions as imagined. There is no counterpart of auditory feedback (children do not consult mirrors or observe their shadows to learn these communicative competencies).
In speech, I propose, there is a similar process going on. A higher system
that wants to produce the sound of "L" acts by setting a positive reference
level for (among other things) the kinesthetic system that controls the
pressure of the tongue against the roof of the mouth (just where depending
on the language one learns). Each auditory phoneme-control system adjusts a
set of kinesthetic/proprioceptive reference signals in this way. At
still-higher levels, a syllable-control system sets consecutive patterns of
reference signals for phoneme-control systems, and a word-control system
sets consecutive syllable reference signals. That's the general idea,
anyway, whether or not these are the right levels.
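The cascade described here, an auditory loop whose output is the reference signal for a kinesthetic loop, can be sketched numerically. Everything in the sketch is an illustrative assumption: the gains, the linear "mouth" mapping pressure to sound, and the single scalar variable per level stand in for what would really be whole sets of signals.

```python
def speak(steps=100):
    sound_goal = 1.0   # higher-level reference: the sound of "L"
    pressure = 0.0     # environmental variable: tongue pressure on the palate
    kin_ref = 0.0      # reference the auditory loop hands down
    for _ in range(steps):
        sound = 0.5 * pressure                  # assumed mouth: pressure -> heard sound
        kin_ref += 0.2 * (sound_goal - sound)   # auditory loop adjusts the kinesthetic reference
        pressure += 0.5 * (kin_ref - pressure)  # kinesthetic loop acts to match its reference
    return sound, pressure

sound, pressure = speak()
```

Note that the auditory loop never acts on the tongue directly; it only resets what the kinesthetic loop is asked to feel, which is the sense in which control of the sound is at a higher level than control of the articulations.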
Yes. That's linguistics.
We can recognize many different voices
and accents as saying "the same thing" even though the articulations vary
greatly from one speaker to the next. The only real test would be to see
whether a linguist could tell whether another person was speaking with or
without masking of auditory feedback, when the linguist did not know what
was intended to be heard (this rules out your test in which you tried to
recognize your own recorded speech). The actual words might be perfectly
recognizable (at a higher, auditory, level), but the articulations might
prove to be quite different, as different as they are when different people
speak.
Sorry, this seems to me a bit incoherent.
I was listening for differences of the sort that distinguish one dialect or accent from another. "Saying the same thing," i.e. not slipping to another phoneme or word, was not the issue. It was a long text that I read, and an unfamiliar one. You couldn't in fairness say that I missed hearing differences in pronunciation on account of knowing what words were intended: the text was still an unfamiliar one, and in any case word identification was not the issue; dialect-level shifts of pronunciation were the issue. Lacking auditory feedback and having to rely on tactile cues only, I expected that my tactile references would drift by an accumulation of error during the time of the exercise. To my surprise, this did not happen. Remember, I was not then trying to prove the point that I am now defending; I expected the opposite to happen. I was not listening for "a perfectly good L"; I was listening for a shift from the way I normally say it: it would still be "a perfectly good L," but sounding perhaps like "L" in a different dialect.
If there was a scatter of articulations around average and mean positions, but the averages and mean positions did not shift through time, why, that is simply normal pronunciation. Sound spectrograms would indicate a change not if there was scatter (there always is), but if the envelope of the scatter shifted through time.
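That criterion, a shift of the envelope rather than the mere existence of scatter, is easy to state operationally. Here is a minimal sketch with made-up numbers: compare the centre of early and late measurements against the overall spread.

```python
from statistics import mean, stdev

def drift_detected(measurements, threshold=1.0):
    """Drift means the centre of the scatter shifts over time,
    not that scatter exists (it always does)."""
    half = len(measurements) // 2
    early, late = measurements[:half], measurements[half:]
    spread = stdev(measurements)  # overall scatter, pooled
    # A shift of the envelope: the late centre moves by more than the spread
    return abs(mean(late) - mean(early)) > threshold * spread

# Hypothetical articulation measurements (e.g. normalized tongue position)
steady  = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 0.51, 0.49]  # scatter only
drifted = [0.50, 0.52, 0.49, 0.51, 0.70, 0.72, 0.69, 0.71]  # envelope shifts
```

On the steady series the function reports no drift despite the scatter; on the drifted series it does, which is the spectrogram comparison in miniature.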
This would allow us to propose that there are two ways of speaking, one
through control of felt higher-order patterns, and the other through an
intermediate auditory control system that, in turn, sets the articulatory
patterns.
I'm a bit lost here. Sounds like you mean three levels?
felt higher-order patterns--you mean syllables, words, constructions?
intermediate auditory control system
articulatory patterns
···
This would be similar to the case of loss of proprioception in
limb control. In fact, even animals can learn to control their limbs
without proprioceptive or kinesthetic feedback. Since the animals can still
reach out and touch a target that they can see, the experimenters conclude
that this kinesthetic/proprioceptive feedback is unnecessary. What they
overlook, of course, is that the _quality_ of the reaching and touching
actions has been radically reduced. In the normal animal, there is
proprioceptive control that produces vastly better results than the purely
visual control in the experimental preparation. But if one is attending
only to the qualitative or categorical result, no difference can be seen.
The target is "touched" in either case.
As a parallel, it is possible that when auditory feedback is masked, the
kinesthetic patterns of speech change greatly, but that when the resulting
speech (recorded) is heard by the auditory systems, no significant
differences are heard. For example, the tongue might press five times as
hard against a somewhat different place on the roof of the mouth, yet a
"perfectly good L" would be heard. Judging only from what is heard, the
observer would guess that the articulation has not changed at all, and thus
that the auditory feedback is unnecessary, or at best a side-effect.
My impression is that a sonic spectrograph can show changes in articulation
much too fine to be heard. This is shown by the range of formant ratios
that the spectrograph can reveal but which are heard as the same phoneme.
If this impression is correct, then my argument is strengthened: what
_sounds_ like the same articulation could actually result from very
different articulations. The best way to judge whether articulation control
is primary would then be to insert disturbances of the heard speech and
see, using a sonograph, whether the articulations immediately change.
I guess this brings us back to the same issue: what are the facts?