A Test of "Collective Control" Theory

Actually, you did have that problem for one of the examples in your book. The data presented in “The social motivation of a sound change” (Labov 1963) crucially depend upon control of perceptions as high as the system concept of self-image. You limited your model to the lowest-level data that he presented. You modeled a convergence of numerical values, analogous to the ‘flocking’ of boids or the rings and arcs of the crowd demo. Labov handed you numerical data related to control at the relationship level. The data about higher levels were not handed to you as quantitative values, so you ignored them.

You may not have noticed that in doing this you exemplified one way of quantifying higher-level perceptions, even though in this case it was applied to perceptions at the relationship level. But the ‘centralization index’ that you took as data is not a measure of an output quantity Qo. It is not even a measure of individual behavior.

The over-all degree of centralization for each speaker is expressed by the mean of the numerical values of the grades of each instance listed on the chart. Thus on Figure 4, the centralization index for /ai/ (CI /ai/), is 0.75, and the index for /au/ (CI /au/), is 0.39. We can then find the mean CI for any group of persons by averaging the CI for the members of the group. ([Labov1963:291](http://languagelog.ldc.upenn.edu/myl/Labov1963.pdf))
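The arithmetic Labov describes is plain averaging, which is easy to make concrete. A minimal sketch, with hypothetical grade lists (the real tokens and their grades are on Labov's charts, not reproduced here; the /ai/ grades below are invented so that the mean reproduces the quoted 0.75):

```python
def centralization_index(grades):
    """Mean of the centralization grades assigned to each token."""
    return sum(grades) / len(grades)

def group_ci(speaker_cis):
    """Mean CI across the members of a group of speakers."""
    return sum(speaker_cis) / len(speaker_cis)

# Hypothetical grades for one speaker's /ai/ tokens
ai_grades = [1, 0, 2, 1, 0, 1, 0, 1]
print(centralization_index(ai_grades))        # 0.75
print(round(group_ci([0.75, 0.39]), 2))       # 0.57
```

Note what the code makes plain: the CI is an aggregate over tokens and then over speakers, not a measure of any individual output quantity Qo.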

So by accepting Labov’s centralization data you were using statistical methods much as Brian D’Agostino used statistical methods to identify correlated reference values at the principle level controlled as inputs to a self-perception system concept.

(See the first addendum to this post, below, for some of the issues with the ‘centralization index’ measure.)

One such way is exemplified above; another is described below.

I’ve given other examples which you’ve rejected outright because you require me to give data first, but the methodological problem we’re talking about here is how to get quantified data about these higher levels. One way is statistical measures like those used by D’Agostino and (above) by Labov (and therefore by you). Another way is a form of Turing test; more on that below.

Here is a discussion of where we were 28 years ago in February of 1994:

“As the model grows,” he said (and the model has grown since then) “to encompass more of what is observed and experienced” at higher levels.

I am asking you to participate rather than resist. Your model of the lowest level of Labov’s data is a good start. Keep going. For anyone who understands what Labov did, it’s kind of an embarrassment until it’s clear that you reported only the first steps toward a model.

OK, on to a different example.

The methodological problem we’re talking about here is how to get quantified data for higher levels of control so that a model of this type can be built.

You’re the experimental psychologist. You’ve written as an authority on the methodology of psychology. You have skills and experience in obtaining quantitative measures of the observable aspects of behavior. I should think you would relish the challenge of extending the ‘Little Man’ program to run two instances at once and, instead of one hand tracking a cursor, have each hand control the location of the other’s hand. Leave out for now the problems of programming a hand that can sense and grasp configurations, such as the configuration of another hand; simply bringing the two hands into (some representation of) contact is enough for now.
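A minimal sketch of that two-instance setup, collapsed to one dimension: each agent controls its perception of the gap between its own hand and the other’s, with reference zero (contact). The gains and the simple proportional update are my assumptions, not anything taken from the Little Man code:

```python
def step(x_a, x_b, gain=0.1):
    """One control step for two agents, each acting to close its own
    perceived gap to the other's hand (reference gap = 0)."""
    gap_a = x_b - x_a          # A's perception: B's hand relative to A's
    gap_b = x_a - x_b          # B's perception, reciprocally
    x_a += gain * gap_a        # each output reduces that agent's own error
    x_b += gain * gap_b
    return x_a, x_b

x_a, x_b = 0.0, 10.0           # hands start 10 units apart
for _ in range(100):
    x_a, x_b = step(x_a, x_b)
print(abs(x_a - x_b) < 0.01)   # True: the hands meet
```

Neither agent opposes the other; the gap decays geometrically and they meet in the middle, with no conflict.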

What higher-level perceptions are controlled by two autonomous agents bringing their hands in contact and producing the appearance of a hand-shake?

How do you compare model performance to living performance? You don’t need quantitative data about how people move their hands in a handshake to validate the performance of a simulation; a kind of Turing test by the observer is sufficient: you know it when you see it. Trial and error until it looks right.

Then inquire, and submit to public inquiry here, whether or not the posited higher-level CVs and their references are plausible. Inquire whether control of other higher-level variables might have the same observed appearance of a handshake. We all have memories of diverse experiences of shaking hands with another person.

Quantitative I/O and d measures are straightforward for motor control, but the higher we go in the hierarchy, the more difficult such quantities notoriously are to obtain (or to pretend to obtain, statistically or otherwise) for perceptual input, for the output affecting that input, and for disturbances affecting that input. I am proposing that something like a Turing test is an appropriate methodological solution to this problem. Until you, as the author of textbooks on methodology, can propose how we can get quantified data at higher levels of control, this is the only recourse that I see. Until you or someone else extends our methodology so that we can produce such quantified data, and satisfy the criteria that you demand before you recognize observations as legitimate phenomena, our PCT methodology at higher levels is to create a model that we think will pass a ‘Turing test’ and then see whether it does.

Would you consider the achievement of a handshake cooperative? It is controlled by two participants (and by observers, who may be controlling perceptions of its higher-level significance). So it is a collectively controlled perception. There is no conflict; it is achieved by mutual avoidance of conflict. As the grip is achieved a handshake may be made conflictual, e.g. to control a perception of a dominance/subservience relationship or a perception that a demand for such a relationship has been communicated. Mutual avoidance of conflict communicates an intention to avoid conflict when controlling in a common environment (cf. Bateson “A theory of play and fantasy”).

It sometimes happens that you end up modeling something different from the phenomenon that you intended to model. Your ‘Marken effect’ is an example.

The methodological requirement that you lay on me is that one should start out with a phenomenon, with the model coming later. In this case, the phenomenon emerged unexpectedly from the performance of a model of controlling with a slight disturbance. You wanted it to model an ‘inanimate’ disturbance, such as wind blowing, but you used a low-gain controller to generate the disturbance.

Since the disturbance was in fact generated by a ‘computer generated control system’ or “CGCS”, which is not an ‘inanimate’ source of disturbance, to make sure the results were valid you recorded the outputs of the CGCS in a table and ran the experiment again with the canned data as the successive values of the disturbance variable. Unexpectedly, the effect went away.

When the CGCS was controlling in very slight conflict with the main controller (either you or a model control system replicating your tracking behavior), all was well, but when the very same numerical data were stored in a table and then input from the table in a subsequent run, the stabilizing effect did not appear.

Bill suggested that you include a value for a transport lag in your model of your performance.

Without the lag the stabilization of control is a consequence of two control systems in conflict, one with low gain. Bill described this case as follows:

In general, the “dynamics of the output” of a control system mirror the ‘dynamics’ of a disturbance. When the source of disturbance to the subject’s control is another control system, then because the outputs of the subject are reciprocally a disturbance to the disturbing control system the “dynamics of the output” of the disturbing control system mirror the “dynamics of the output” of the subject. Bill’s phrase “as much as” is not numerically true (because of the difference in gain), but it is true as a kind of synonym of “in the same way as”.

This is true of any conflict that does not go into runaway escalation. In this case, the low gain of the disturbing controller prevents runaway escalation of the conflict. The low-gain controller is unable to control completely, but does have some effect.

It follows that varying the disturbance according to values stored in a table would result in different outputs from the subject (or from the model of the subject), because the reciprocal interaction is not present.
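That reasoning can be checked in a toy simulation, under assumptions that are entirely mine (a one-dimensional controlled variable, integrating controllers, a random-walk environmental drift, arbitrary gains): in the first run a live low-gain controller generates the disturbance in slight conflict with the main controller; in the second run, the recorded disturbance values are replayed from a table:

```python
import random

def simulate(walk, d_seq=None, km=0.4, kd=0.2, rd=0.2):
    """One tracking run. qi = main output + disturbance + environmental drift.
    If d_seq is None, a live low-gain controller (reference rd) generates the
    disturbance; otherwise its recorded values are replayed from the table."""
    o = d = drift = 0.0
    d_out, sq = [], 0.0
    for t, w in enumerate(walk):
        drift += w                    # slowly drifting environment
        qi = o + d + drift            # the controlled variable
        sq += qi * qi
        o += km * (0.0 - qi)          # main controller, reference 0
        if d_seq is None:
            d += kd * (rd - qi)       # live low-gain controller, slight conflict
        else:
            d = d_seq[t]              # canned disturbance from the table
        d_out.append(d)
    return d_out, (sq / len(walk)) ** 0.5

rng = random.Random(0)
walk1 = [rng.gauss(0, 1) for _ in range(4000)]
walk2 = [rng.gauss(0, 1) for _ in range(4000)]

recorded, rms_live = simulate(walk1)              # coupled run
_, rms_replay = simulate(walk2, d_seq=recorded)   # same values, replayed
print(rms_live < rms_replay)   # True: the stabilization disappears on replay
```

Control is measurably worse with the canned values, because they no longer respond to the current state of the controlled variable.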

It appears that this is the (or a) mechanism by which collective control increases stability of control.

I wonder if it is essential that the disturbing control system “tried to keep the controlled variable constant”. I bet it would not matter if it had a varying reference value.

The same stabilizing effect should result if the higher-gain system controls from time to time or on a schedule and the disturbing system controls more frequently or (mostly) at other times. Likewise if there is a population of low-gain controllers which control in the aggregate with greater frequency than the higher-gain system, even if they control at different values, or if they control different aspects of a complex higher-level variable.

The stabilization should result even if the low-gain controllers do not all control the same aspect of a complex perceptual variable which the high-gain controller is controlling as a whole.
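A rough check of this conjecture, extending the same toy assumptions to a population of low-gain controllers whose reference values are scattered rather than shared (the gains, references, and drifting disturbance are all my inventions):

```python
import random

def rms_error(n_helpers, steps=4000, km=0.5, kd=0.05, seed=0):
    """Tracking error of a high-gain controller (reference 0) acting on qi,
    alongside n_helpers low-gain controllers with scattered references."""
    rng = random.Random(seed)
    refs = [rng.uniform(-0.5, 0.5) for _ in range(n_helpers)]
    o, ds, drift, sq = 0.0, [0.0] * n_helpers, 0.0, 0.0
    for _ in range(steps):
        drift += rng.gauss(0, 1)          # slowly drifting disturbance
        qi = o + sum(ds) + drift          # the shared controlled variable
        sq += qi * qi
        o += km * (0.0 - qi)              # high-gain controller
        ds = [d + kd * (r - qi) for d, r in zip(ds, refs)]  # low-gain population
    return (sq / steps) ** 0.5

print(rms_error(0) > rms_error(10))  # True: scattered helpers still tighten control
```

In this sketch the population's aggregate action behaves like added loop gain, even though no two members share a reference value.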

The relevance to collective control is evident.

Earlier, I quoted three things that you had written and said that they were examples of your claim that conflict is a prerequisite to collective control. You denied that. You were correct about one of the three:

In haste and late at night I didn’t cut and paste this in a different location as I had intended. My intention was to agree with you about the complexity of what is perceived, and to point out that different participants in collective control may stabilize some but not all the physical variables of which the observer’s perception is a function. Indeed, what one agent controls may only intersect the set of physical correlates that another agent’s controlling affects, or the set of lower-level perceptual inputs of one agent’s perception of a collectively controlled variable may only intersect another agent’s corresponding lower-level perceptions. They may be controlling different higher-level perceptions which happen to have perceptual inputs in common.

Dave crosses at the crosswalk to get to the bookstore; Alice crosses the other way on her way to the college campus; Jane’s mother driving down Main street gets stopped by a policeman for failure to stop at the crosswalk; Jane’s father, irate, books a separate visit to Northampton to contest the ticket in court, presents photos showing that the crosswalk lines have been worn to near invisibility; the judge agrees and dismisses the ticket; Ben in the DPW has the repainting of crosswalks and bike lanes on his schedule during the college break; he gets a call from the judge’s secretary telling him he’d better get the crosswalks done sooner. The painted lines on the street are collectively controlled.

Quantify that variable.

Addenda follow.


Centralization index as a datum:

In the case of ‘The social motivation of a sound change’, you took the ‘centralization index’ quantity to be a measure of the central position of the tongue within the oral cavity. Bill Labov did not measure the height at which speakers held their tongues. He measured the height of the second cluster of undamped harmonics in sound spectrograms of audio recordings, sampled at points where he (as a native user of English) recognized that the speaker was producing a word containing one of the diphthongs under investigation. Such a cluster of undamped, resonance-reinforced harmonics in speech sounds is called a formant.

The perceptual input function for the second vowel of the diphthong recognizes relationships between formants across the audible harmonic spectrum, or perhaps a configuration, if that is a signal that the auditory system generates (I am not certain which), and their relationships to the formants of the preceding a vowel. As I am sure you recall, the formants themselves are bands of harmonics with greater amplitude separated by damped regions of the audio spectrum; the frequency of a harmonic at the visually estimated center of the formant is taken to represent the frequency of the formant. These ‘center frequencies’ vary from speaker to speaker and from one utterance to another by a given speaker; the ranges of the formant values for any one vowel intersect those for adjacent vowels, and those for the centralized vowel sounds of English especially intersect with one another and with their neighbors. (Examples of centralized vowels: cup, butter, cigarette, random, etc.) What is constant is the configuration or relationships, whatever the absolute pitches.
Various kinds of investigations have ascertained that the height of the first formant (graphed on a log or Mel scale; the Mel scale relates physically measured frequency to perceived pitch, see the addendum at the end of this post) correlates with how open or closed the narrowest aperture in the oral cavity is (high = open), and the height of the second formant correlates with how far front or back that narrowest opening is (high = front). These variables define the auditory space, and the correlated articulatory space, within which vowels can be produced and perceived. The ‘apical’ or ‘quantal’ vowels at the extremes of the auditory space and of the correlated articulatory space so defined are

  • i (seed) F1 at its lowest, F2 at its highest, tongue any higher produces zh.
  • u (food) F1 at its lowest, F2 at its lowest, tongue any higher produces gh.
  • a (baah!) F1 at its highest, F2 at a midpoint, opening the oral cavity wider, if you can, makes no difference; variation in F2 produces a front-back range for ‘ah’-like vowels, making the picture of the acoustic space mapped onto the articulatory space a vowel quadrilateral rather than a vowel triangle.

A ‘centralized’ vowel is somewhere in a rather ill-defined central region of the acoustic space and articulatory space relative to these three apices, or two apices and the “ah” open vowel boundary.

Values from some speech synthesis work:

  • ə F1 399 Hz, F2 1438 Hz (a centralized vowel)
  • a F1 708 Hz, F2 1517 Hz
  • ɑ F1 703 Hz, F2 1074 Hz

Bill Labov measured formant frequencies at the peak of the first formant during the transition from a to u and from a to i, for two productions, as shown in Bill’s Figure 2, below

[image: Labov’s Figure 2, spectrograms of two productions of *ride*]

The first (on the left) has a more open first vowel in the diphthong of ride, and the second (on the right) has a more centralized vowel. These are not from two speakers representative of the two populations; “a North Tisbury fisherman” produced both within a single sentence (p. 290, an example of stress as a phonetic condition for centralization). Nevertheless, generalization is possible:

“Despite the differences in vowel placement, these seven speakers utilize the same dimension to produce the effect of centralized or open vowels: widely separated formants for centralized vowels, adjacent formants for open vowels.”
Labov (1963:288)

This “dimension” involving the relationship or configuration of two formants is an example of the collectively controlled perceptual variables that constitute language. The difference between the configuration (or relationship) on the left and that on the right is the collectively controlled variable to be modeled at this lowest level of a model of the phenomenon of “a socially motivated sound change”.
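As a toy illustration of that dimension: a hypothetical perceptual input function measuring F2−F1 separation on a Mel scale, applied to the synthesis values quoted above (the O’Shaughnessy Mel formula is my assumption; Labov estimated formants from spectrograms by eye):

```python
import math

def hz_to_mel(f):
    # O'Shaughnessy approximation of the Mel scale (an assumption here)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def formant_separation(f1_hz, f2_hz):
    """Toy perceptual input: F2 - F1 separation in Mel.
    Per Labov, wide separation ~ centralized; adjacent formants ~ open."""
    return hz_to_mel(f2_hz) - hz_to_mel(f1_hz)

print(round(formant_separation(399, 1438)))  # ə, centralized: ~750 Mel, wide
print(round(formant_separation(703, 1074)))  # ɑ, open/back: ~264 Mel, narrow
```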


Mel scale:

The responses of human listeners to even “simple” nonspeech stimuli like sinusoidal signals is not simple. Psychoacoustic “scaling” experiments show that judgements of the relative pitch of two sinusoids are not equivalent to their arithmetic frequency ratio (Beranek, 1949; Fant, 1973; Nearey, 1976, 1978). In other words, if you let a human listener hear a sinusoid whose frequency is 1000 Hz and then let him adjust the control of a frequency generator until he hears a sound that has twice the pitch of the 1000 Hz signal, he will not set the control to 2000 Hz. He will instead select a sinusoid whose frequency is about 3100 Hz. Judgement of relative perceived pitch can be related to the physical measure of frequency by the use of a “Mel” conversion scale. […] [T]he perceptual ratio between two frequencies depends on the absolute magnitude of the frequencies. […] A sinusoid whose frequency is 1000 Hz thus has a Mel value of 1000 Mel and is twice the pitch of a sinusoid having a pitch of 500 Mel. The frequency of a sinusoid having a pitch of 500 Mel is 400 Hz. A sound whose pitch is 3000 Mel will have twice the perceived pitch of 1500 Mel but the frequency ratio of these two sounds is 9000/2000. The Mel scale is of particular value in regard to some of the acoustic relations that structure the phonetic theory of vowels…
(Lieberman & Blumstein 1988:154; a discussion of categorical perception begins on the next page).
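For reference, the widely used O’Shaughnessy curve fit to the Mel scale reproduces the quoted figures at least approximately (it is a later approximation, not necessarily the exact scale Lieberman & Blumstein tabulate):

```python
import math

def hz_to_mel(f):
    """O'Shaughnessy (1987) approximation of the Mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of the approximation above."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

print(round(hz_to_mel(1000)))  # 1000 Mel, by construction of the fit
print(round(mel_to_hz(500)))   # ~391 Hz (the quote's "about 400 Hz")
print(round(hz_to_mel(3100)))  # ~1907 Mel (roughly double 1000 Mel, per the quote)
```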