This topic begins with a description of how to test for the controlled variable, as specified by Bill Powers and Phil Runkel. The purpose of recapitulating it there is to provide a convenient reference for people interested in doing PCT research, especially in naturalistic research (Phil’s particular interest).
Your demo shows that the program can identify which avatar the user is moving. The user has no direct insight into what the computer is doing—it could be just reading off user input. But as the user assumes good faith, this is a demonstration that it is possible to identify which of several like variables is controlled. This might encourage further investigation and learning, but it does not show or teach how to do it. It is not analogous to a human experimenter for several reasons, one of which you stated:
The rest speaks for itself. I scarcely recognize what I wrote, snippets ripped out, forced into dog collars, dragged from their contexts, and leashed to rhetorical posts to have stones thrown at them. Readers should refer to the first post of this topic to understand how to apply the Test in their research.
Here is a diagram intended to help think through what the Experimenter and the Subject are controlling when the Experimenter has succeeded in identifying the perceptual variable that the Subject is controlling. Control is represented as a single loop, omitting the complexity of lower levels (unless the CV is an Intensity perception). Higher levels setting the reference Rs in the subject and Re in the Experimenter are also not shown but can readily be imagined. Systems setting Re are especially essential for understanding how, in the TCV, the Experimenter introduces and varies the disturbance D.
Elsewhere, I have said that the disturbance (usually represented by d, but D in this diagram) is a perceptual variable controlled by the Experimenter. This is false with respect to this TCV loop, because the output quantity of a control loop, Q.o, is not what is controlled in that loop. But the Experimenter has other control loops (not shown), among them those which measure and record the experimental variables, and in respect to those systems the Experimenter’s Q.o = D is a controlled perceptual variable.
There is no contradiction here. What is measured as Q.o is not the aggregate of behavioral output, but only that aspect which affects the state of the controlled input Q.i. Other effects, if any, are extraneous side effects (not shown).
The Environment function may be thought of as a quantitative representation of the relevant (affected and perceived) aspects of whatever is Really Real in the observed time and space of the experiment. For experiments with motor control, laws of physics are assumed, but it is a radical simplification to reduce those mathematical formulae (F=MA, etc.) to a couple of ad hoc constants Kd, Kf, the products KdD and KfQo, and their sum KdD + KfQo.
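To make that reduced environment function concrete, here is a minimal numerical sketch of the single loop as described, with q.i = Kd·D + Kf·Q.o. The constants, the gain, the time step, and the shape of the disturbance are all illustrative assumptions, not values from the diagram:

```python
# Minimal sketch of the single loop described above; Kd, Kf, the gain, the time
# step, and the disturbance waveform are illustrative assumptions.
import math

Kd, Kf = 1.0, 1.0        # ad hoc environment constants
gain, dt = 50.0, 0.01    # output gain and integration step (assumed)
r = 0.0                  # Subject's reference for the controlled input
Qo = 0.0                 # output quantity
q_i = 0.0

for step in range(1300):
    D = math.sin(step * dt * 2 * math.pi * 0.2)  # Experimenter varies the disturbance
    q_i = Kd * D + Kf * Qo                       # environment function
    p = q_i                                      # perceptual input function (identity)
    e = r - p                                    # comparator
    Qo += gain * e * dt                          # integrating output function

# With good control, q_i stays near r and Qo approximates -(Kd/Kf) * D:
# the measured output mirrors the disturbance.
print(round(q_i, 3), round(Qo, 3), round(-Kd / Kf * D, 3))
```

When control is good, the printed q.i stays near the reference while Q.o and -(Kd/Kf)·D nearly coincide, which is what the Experimenter’s measuring and recording systems would pick up.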
For higher-level perceptual variables such quantification becomes quite indirectly related to the physical effects of behavioral outputs, and the Environment function represents each person’s ‘projection’ of perceptions as though they were realities in the environment. In the Roberts et al. experiments with self image, words like “You’re a liar!” certainly do have quantifiable physical attributes, but varying those attributes slightly does not make them less or more a disturbance. There are problems of experimental design here which have not been addressed.
BN: This topic begins with a description of how to test for the controlled variable, as specified by Bill Powers and Phil Runkel. The purpose of recapitulating it there is to provide a convenient reference for people interested in doing PCT research, especially in naturalistic research (Phil’s particular interest).
BN: Your demo shows that the program can identify which avatar the user is moving.
RM: Yes, it does that using the Test to determine which avatar is being moved intentionally.
BN: The user has no direct insight into what the computer is doing—it could be just reading off user input.
RM: Any user can learn how the Mind Reading program works. But whether or not they know how it works, it works – it successfully reads their mind, determining which avatar is currently being moved around on purpose – as long as the user is controlling the intentionally moved avatar.
BN: But as the user assumes good faith, this is a demonstration that it is possible to identify which of several like variables is controlled.
RM: Actually, the quality of the user’s faith doesn’t affect the possibility of identifying which variable (position of an avatar) is being controlled. The Test that is carried out in the demo works as long as the user is controlling the position of one of the avatars. It’s all about detection of control.
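For readers curious how such detection can work in principle (this is a sketch of the logic, not a claim about how the Mind Reading demo is actually coded), assume each avatar’s position is the user’s mouse output plus that avatar’s own independent disturbance. Then the mouse ends up mirroring only the disturbance of the avatar the user is controlling:

```python
# Sketch of one way to detect which of several like variables is controlled
# (not necessarily how the Mind Reading demo itself is implemented).
# Assumption: each avatar's position = user's mouse output + its own disturbance.
import statistics

def correlation(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def detect_controlled(mouse, disturbances):
    """Index of the disturbance the mouse most strongly opposes: that avatar's
    position is the one being stabilized, i.e. the controlled variable."""
    return min(range(len(disturbances)),
               key=lambda k: correlation(mouse, disturbances[k]))
```

On that assumption, `detect_controlled(mouse_samples, [d1, d2, d3])` returns the index of the avatar whose position is being stabilized, because only that disturbance is systematically opposed by the user’s output.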
RM: Thanks for all this information about what is involved in doing the Test. But I do find your discussion a bit abstract. Since this level of understanding must be based on a considerable amount of experience with Testing for Controlled Variables I wonder if you could give some nice, concrete examples of how you did the Test.
No, not observer’s point of view. Rather it is analyst/theorist’s point of view.
An observer, say an anthropologist, could say that ancient Egyptians were controlling for sun rise from the east every future morning by building a pyramid as a tomb for their pharaoh so that he could help the sun god to fight against the powers of darkness during the night. Or that the Aztecs were controlling for sun rise every morning by offering human sacrifices to the gods. That is a point of view of an attentive observer. Instead, a more scientifically minded analyst or theorist would say that they just – deeply but erroneously – think and believe that they are controlling for the sun rise, but in truth, they cannot be controlling for the sun rise because the sun rise is not something that humans could be controlling.
I agree that someone can want and try to control for sun rise and that someone can also think and believe that they are controlling it, but that does not mean that they really control.
If the environmental feedback path is missing, then the loop is not closed, and it cannot be a case of perceptual control.
The full control loop must contain both p-control (of the perceptual variable) and e-control (of the corresponding environmental variable).
Every CS block diagram is abstract in the sense that you are invoking.
It’s OK, we all get forgetful.
The Test is typically performed in linguistics by repeating an experimental utterance while making a substitution for some identified part of it. An early exposition:
http://www.iapct.org/festschrift/nevin.html
A more recent exposition begins on p. 378 of my chapter in LCS IV.
On pp. 383-386 of that chapter is a diagram for a PCT account of experimental work done by Katseff et al. in the phonology lab at UC Berkeley. I did not perform the test in that instance, not having the specialized hardware on loan from the Otolaryngology Department at USF as they did, but I did propose it on CSGnet in 1991-1992.
I have provided you data on populations of people controlling self-concept by differentiating their pronunciations of words from the way “those other people” pronounce them. While such interactions are not instances of the Test, they provide naturalistic observational data in which consistent ‘correction’ of disturbing pronunciations demonstrates control of certain thereby identified variables of speech (which may be reported quantitatively in terms of the center frequencies of vowel formants and are experienced subjectively as different ‘vowel qualities’). Bill Labov provided references for much more such data, and the associated literature is quite large.
RM: This is an edited version of the reply I sent earlier:
BN: The Test is typically performed in linguistics by repeating an experimental utterance while making a substitution for some identified part of it. An early exposition:
RM: I would appreciate it if you would explain how you carried out these examples of testing for controlled variables.
BN: On pp. 383-386 of that chapter is a diagram for a PCT account of experimental work done by Katseff et al. in the phonology lab at UC Berkeley. I did not perform the test in that instance, not having the specialized hardware on loan from the Otolaryngology Department at USF as they did, but I did propose it on CSGnet in 1991-1992.
RM: The Katseff et al study is very close to being a good example of a test for the controlled variable. What’s missing is any effort to test an alternative definition of the controlled variable. The fact that the subject’s output didn’t return the formant to the undisturbed value suggests that the formant frequency is not the controlled variable. If the vowel nevertheless sounded OK, then that’s evidence that they were controlling the desired perception just fine; it just wasn’t the absolute formant frequency that they were controlling for.
RM: But Katseff et al’s method of doing the Test contradicts your model of how the Test is done. Your model has E in conflict with S over the state of the hypothetical controlled variable. In the Katseff et al study there was no conflict; the computer (E) did not increase its disturbance to the formant when S varied her output to compensate for it. The disturbance (which was a change in the feedback connection between S’s output and input) was applied completely passively.
BN: I have provided you data on populations of people controlling self-concept by differentiating their pronunciations of words from the way “those other people” pronounce them.
RM: That was a great study but I didn’t see it as a Test of “self concept”. Indeed, I didn’t see anything in that paper that looked like the Test. Rather, I saw some excellent data collection. I created a simple model that predicted this data by assuming that individuals are controlling for imitating the pronunciation of those with whom they interact. So the model implicitly tests to see if one controlled variable in this situation is pronouncing diphthongs the way others pronounce them. The model worked pretty well so it is evidence that people do control for this variable. This was a model-based version of the Test for the Controlled Variable. Inasmuch as this was the case, my approach to testing for the controlled variable is nothing like the way you say it must be done – with E and S in conflict. There really should be no conflict between E and S in any properly done version of the TCV.
Discussion of linguistics and the variables that constitute languages is off topic here. You can reply to the topic I have placed in the Science/Language subcategory. You might want to reconsider your conclusion that the Katseff et al. work shows that formants are not controlled.
What’s relevant to the present topic is that while testing for controlled variables one must be alert to the possibility of hidden or non-obvious conflict as the reason for the appearance that a variable is not controlled. As Katseff & Houde put it (Lab. Phon. 11):
These results suggest that both acoustic and sensorimotor feedback are part of one’s lexical expectation. Because auditory feedback is altered while motor feedback is not, feedback from these two sources can conflict. For small shifts in auditory feedback, the amount of potential conflict is small and the normal motor feedback does not affect compensation. But for large shifts in auditory feedback, the amount of conflict is large. Abnormal acoustic feedback pushes the articulatory system to compensate, and normal motor feedback pushes the articulatory system to remain in its current configuration, damping the compensatory response.
On another note, I agree that by varying the disturbance we can demonstrate that control outputs are quantitatively equal but of opposite sign (or vector). Neither Bill nor Phil mentions this as essential to the definition of the Test, but it is important. You are incorrect to say, however, that Katseff et al. did not vary the disturbance. After drawing a parallel to work on reaching with and without prismatic lenses, they say (op cit.):
In speech adaptation, subjects wear a headset. They speak into the microphone and hear their speech played back to them through the earphones. The auditory version of the task used here involves four stages: baseline, ramp, plateau, and adaptation. Subjects repeat a single word, in this case, ‘head’, over a large number of trials. In “reaching” sensorimotor adaptation experiments, subjects initially see the object on the table in its true position. In speech adaptation experiments, subjects hear their voices unaltered during the baseline stage. During each trial in the ramp stage, auditory feedback is altered a small amount until it reaches a maximum value. Feedback alteration is held at that maximum value during the plateau stage. In this experiment, there are 5 sets of ramps and plateaus, after which feedback drops suddenly back to normal for the adaptation stage.
… subjects generally change their speech to oppose the auditory feedback change. For example, when F1 in auditory feedback is raised, making their /ɛ/ sound more like an /a/, subjects compensate by speaking with a lower F1; the vowels they produce sound more like /ɪ/. Similar experiments show that subjects will compensate for alterations in F0, F1, and F2 feedback, indicating that all three of these formants are important to a speaker’s representation of the target utterance… .
(I don’t know why the publication has /a/ as in “haha” where it should have /æ/ as in “had”. To a phonetician or phonologist it’s an obvious typographical error.)
RM: OK, I’ll go over there when I get the chance. And I didn’t say that the Katseff et al. work shows that formants are not controlled. I said it “suggests that formant frequency is not the controlled variable”. Other possibilities include the value of the disturbed formant frequency relative to the frequencies of the other relevant formants.
BN: What’s relevant to the present topic is that while testing for controlled variables one must be alert to the possibility of hidden or non-obvious conflict as the reason for the appearance that a variable is not controlled.
RM: I think the idea that the observed failure to compensate for the disturbance results from a conflict between auditory and motor feedback simply reflects Katseff et al’s lack of understanding of how hierarchical control works. There can be no conflict involving different types of variables…
BN: On another note, I agree that by varying the disturbance we can demonstrate that control outputs are quantitatively equal but of opposite sign (or vector). Neither Bill nor Phil mentions this as essential to the definition of the Test, but it is important.
RM: They don’t mention it because the only variables you really need to monitor in the Test are the disturbance and controlled variable. The output variable is often difficult to specify let alone measure.
BN: You are incorrect to say, however, that Katseff et al. did not vary the disturbance.
RM: No, I said that the disturbance in their study was a change in the feedback connection between S’s output and input. This was a fine way to introduce a disturbance to the hypothetical controlled variable (formant frequency). I was just noting that it wasn’t equivalent to what is called the “disturbance variable”, d, in PCT. In PCT, d is a variable that has an independent effect on the hypothetical controlled variable – “independent” in the sense that the controller has nothing to do with its value. So a PCT type disturbance in the Katseff et al study might have been an externally generated noise waveform added to the sound produced by the speaker.
RM: In the Katseff et al study the disturbance was a kind of “smart filter” inserted between speaker output (what was said) and input (what was heard). Again, there was nothing wrong with this approach to disturbing the hypothetical controlled variable. Indeed, it was a very clever study and I should have put it in my methods book. But the problem was that the results were not clear because the speaker was not able to completely resist the disturbance (the change in the feedback function); the speaker was apparently never able to vary their output in a way that allowed them (and the tester) to hear the word they were meant to say. This could have been because the disturbance was too great or because there was some other variable being controlled. But the study was certainly a good start at the Test.
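To illustrate the distinction being drawn here, a minimal sketch (the gains, step counts, and size of the shift are invented) contrasting an additive PCT-style disturbance with a change in the feedback connection of the kind used in the formant-shifting setup:

```python
# Sketch contrasting (a) a PCT-type additive disturbance d with (b) a change in
# the feedback function, as when the heard formant is shifted. All values invented.

def run(steps, feedback, d_fn, r=0.0, gain=0.5):
    Qo, q_i = 0.0, 0.0
    for t in range(steps):
        q_i = feedback(Qo) + d_fn(t)   # controlled input quantity
        Qo += gain * (r - q_i)         # integrating output opposes the error
    return round(q_i, 3), round(Qo, 3)

# (a) additive disturbance appears at step 50; feedback connection unchanged
print(run(200, feedback=lambda o: o, d_fn=lambda t: 1.0 if t >= 50 else 0.0))
# (b) no separate d; the feedback connection itself is shifted by a constant
print(run(200, feedback=lambda o: o + 1.0, d_fn=lambda t: 0.0))
```

Both runs end with the controlled input near its reference and the output offset by about the same amount; what differs is where the perturbation enters the loop, not whether the loop resists it.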
The “it” that they don’t mention is (continuously) varying the disturbance. (Step 3 in Phil’s statement is “Apply various amounts and directions of disturbance directly to the variable.”) I was imagining what perception you were controlling by your emphasis on the importance of varying the disturbance when you said the following:
I was imagining that the reason you emphasized a need to vary the disturbance was to gather data for a quantitative demonstration (output opposed to disturbance). But of course, as I see now, you were inveighing against my statement in other posts that, in the Test, the experimenter is in conflict with the subject. You were saying that if the computer were in conflict with the subject it would increase the disturbance diametrically to the subject’s output so as to maintain the CV (formant frequency) at the computer’s reference value.
I agree that the computer is not a control system in conflict with the subject. The computer altered the formant frequency produced by the subject in the course of the speech signal passing from microphone input to headphone output. Your point seems to be that this was a disturbance in the manner that a crosswind is a disturbance to the driver of a car. The wind is not a control system controlling the relationship of the car to the road, so the disturbance is not a conflict. This is true, unless there is some control system using some extension of its control outputs to determine the influence of the wind on the car. There is a control system using the computer as an extension of her motor control outputs to determine the influence of the disturbance on the formant frequency heard by the subject.
Much as the car is not a control system but its steering linkage, etc. extend the motor control outputs of the driver, who is a living control system, the computer extends the control output of the experimenter. By means of the computer, the experimenter controls disturbances to the relationships of formants as heard by the subject. The computer is not a control system, but the experimenter employing the computer as an instrumental extension of her motor output functions is a control system. By using the computer to introduce measured disturbances the experimenter is in conflict with the subject, a conflict which is tempered by higher level systems which are controlling perceptions of testing to identify and verify controlled variables.
I believe that your unstated assumption is that when two control systems have conflicting reference values for the same variable, the output of each is a disturbance to the other, and each increases its output until one or both reach maximum output capacity. As discussed many times elsewhere, this is true unless, as in a living control system, a higher-level control loop intervenes in some way. In the Test, higher levels of control constrain the extent to which the experimenter’s output d conflicts with the subject’s control because at those higher levels the experimenter is using the disturbance as means to verify what variable the subject is controlling, and an explicit condition of the Test is that the disturbance should be “gentle” and should not overwhelm the subject’s control. In a prior post, earlier in this topic, I have presented a partial block diagram that includes both subject and experimenter to illustrate this interdependency. Here it is again:
Bingo. But to control a relationship R between A and B you must control A as an input to R and you must control B as an input to R. Formants are controlled as means of controlling the relationship between formants which is heard as a given vowel. (There are also temporal differences, see the discussion on pp. 14-15 of Katseff’s dissertation. Page references in this post are to that work.) Note that subjects met disturbances to one formant by changing more than one formant (p. 47).
The relationship is called phonemic contrast. Consider an array of relationship comparators, one for each of the phonemically distinct vowels that one must control in order to speak and understand English. Because of differences in vocal tracts (especially length), dialect, social register (e.g. formality), stressed and unstressed syllables and phrases, and other kinds of variation, there is overlap. To visualize this, consult figs. 2.4 and 2.5 in Katseff’s dissertation. Presumably by innervation from different parts of the cochlea, current input about frequency concentrations during the highest-amplitude parts of utterances (the vowels) goes to comparators for vowels which are adjacent in acoustic space, as exemplified by hid, head, and had, and also hayed, heard, and the hud of “Hudson”. Control of contrast in heard speech is relative to other vowels heard from the same speaker. The experimental situation is simpler because the subject knows what her own speech sounds like and is accustomed to controlling it, but the general case of social variation is consistent with numerous demonstrations that the subject perceives “the same” vowels over considerable ranges of variation, with the boundaries intersecting and discrimination very context dependent. If that seems too abstract to make sense, hang on, it will get more specific shortly.
Katseff (pp. 19f) reviews some of the numerous studies of what speakers do when disturbances of various kinds are introduced, demonstrating that speakers control both auditory and ‘somatosensory’ perceptual inputs concurrently. (One must disregard disturbances to our PCT sensibilities when the authors of these studies use prevalent vocabulary such as ‘plan’, ‘reaction’, and ‘compensation’, reflecting ignorance of principles of control.) An example:
When a paddle is briefly applied to a speaker’s upper lip during the production of a sequence like [aba], the most straightforward response would be to increase the force pulling the lower lip up, opposing the force from the lip paddle. Speakers do not do this. Instead, they lower their upper lip, maintaining the bilabial closure [references elided]. There are two ways to account for compensation by the upper lip. One explanation is that subjects are trying to maintain the acoustics of the intended syllable, and the best way to make the /b/ in /aba/ sound like a stop is to lower the upper lip. Another explanation is that subjects have somatosensory expectations for intended syllables; that is, they know what those syllables are supposed to feel like. In the case of /aba/, they know that their jaws should feel open during the vowels and that their lips should come in contact during the /b/. Maintaining somatosensory information about this syllable would likewise require compensation for the lip paddle in /aba/.
In some experiments the shape of the oral cavity was deformed by a kind of bladder against the palate which in random trials was either flat or inflated, with changes smooth or abrupt, with or without masking of auditory feedback.
The fact that the reaction times were very short, and that substantial compensation occurred with masked feedback, suggests that somatosensory feedback initiates initial adjustments to articulation in the presence of the inflatable palate. That intelligibility improves after the first token suggests that auditory feedback is also used to adjust articulatory plans.
The references for tactile and kinesthetic perceptions in the oral cavity are adjusted so as to improve the sound that is produced.
The categorial nature of phonemic contrast is evident:
For small feedback shifts, the light gray shape (formants heard as a result of the feedback shift) falls entirely within the dashed shape (the subject’s baseline range), indicating that the vowels that the subject heard were all within his baseline region and that compensation was complete. As the amount of feedback shift increases (the dark solid shape), compensation is less and less complete. (p. 54)
[…]
for large feedback shifts, the re-synthesized vowel fell outside of the subject’s natural vowel region. There is some evidence that different brain regions are recruited to deal with feedback perturbations that fall outside of the intended vowel region. (p. 57)
However,
Because the relationship between articulation and acoustics is nonlinear, subjects may be compensating more or less steadily than they appear to be from their vowel formants alone.
You objected that subjects didn’t hear the disturbance.
But the vowel didn’t sound OK to the subject, as evidenced by their resistance to the disturbance. I suppose you’re getting this idea from e.g. p. 43:
Post-session interviews indicated that subjects did not notice either formant shifts or delays.
As you know, awareness is not necessary for control, and control outputs reduce the ‘noticeability’ of the effects of disturbances.
Your point that subjects were not controlling the absolute frequency of the disturbed formant (assuming that’s your meaning) is better illustrated when disturbance to one formant was resisted by changing two formants so as to approximate the target relationship of formants. But we already know that speakers do not control absolute formant frequencies, because we know they (we) control phonemic contrasts when the formants are at very different frequencies, e.g. shifted higher or lower by variable length of the vocal tract, for regional and social-class dialects, and so on. Speaking with helium in the lungs is a fun example.
By “different types of variables” I assume you mean different levels. What is involved here is different sensory modalities providing input to control of phonemic contrast. The auditory perceptions and the tactile/kinesthetic perceptions are at the same level; they are just in different sensory modalities. Phonemic contrast keeps words and syllables that are partially alike distinct from each other so that the hearer knows which the speaker intends. (If other context makes that obvious, gain on pronunciation is lower and auditory ambiguity results.) Perceptual inputs for controlling phonemic contrast are in two sensory modalities: kinesthetic/tactile perceptions involved in the articulation of speech, and auditory perceptions of the sounds thereby produced. We are concerned here only with these two. (The McGurk effect and other evidence show that visual input from the speaker’s face is also important, but that is not a variable in consideration here.)
As noted above, the range of inputs that are perceived as one vowel overlaps the range of inputs that are perceived as an adjacent vowel. This is so both for the sounds and for the ‘feel’ of articulation. In the diagram repeated below, auditory sound is labeled S and tactile/kinesthetic articulation input is labeled A. It should be obvious that our ability to control the sounds of speech in real time is very limited, because once a syllable has become audible it is impossible not to have already produced it. We correct a mispronunciation by saying the intended utterance again. The tactile and kinesthetic perceptions are controlled in real time. The auditory perceptions are controlled by setting references for those articulatory perceptions over time.
To resist disturbance to the sound, the subject changes references for articulation. When disturbance is relatively small, resistance to it (“compensation”) is complete. The articulatory perceptions move farther into the intersection of two ranges but are still close enough to ‘canonical’ values so that the intended vowel is perceived.
There are three adjacent vowels, call them V1, V2, and V3. The subject’s intended vowel is V2 and V1 is the target toward which the disturbance is moving the sound perception S. As the disturbance is increased, so that the sound is more like V1, and the articulation is changed so as to maintain the perception of the sound of V2, at some point the articulatory perceptions A are too far out of range to be perceived as the intended vowel V2 and begin to be perceived as the opposite adjacent vowel V3. (Compare the mathematical notion of cusp catastrophe which non-control theorists invoke to explain shifts in categorial perception.) Perceiving V3 increases error output in the system which has a reference (from the word-perception system at the top of the diagram) for perceiving V2.
Control of articulatory perceptions A is the means of controlling the sound perceptions S. Each accepts input that departs from the canonical value for V1 into the intersection of its range with the range of values for V2. What is measured is the formant frequency which is not completely “compensated”. What is not measured are the positions of the articulators which also are not completely “compensated” to where they would be for producing the sound V1, absent the disturbance. This is one of the ways in which the experimental data are incomplete.
Articulatory perception of the ‘feel’ of pronouncing V3 creates error in control of the intended perception of the sound of V2. By means of the disturbance, the experimenter is in conflict with the subject’s control of the word which contains V2. When the disturbance is small, the subject maintains good control. As the disturbance increases, the subject’s control is compromised in each sensory modality, sound and ‘feel’. So long as the perception S and the perception A are both within their respective ranges (as one modality extends into the intersection with V3 and the other into the intersection with V1) the subject does not notice.
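A toy numerical illustration of this V1/V2/V3 account may help. The ranges, gains, and disturbance values are invented for the illustration (they are not Katseff’s data): the sound S is modeled as articulation A plus the formant shift D, the subject adjusts A to keep S at the V2 reference, and as D grows the articulation drifts into territory that, undisturbed, would be heard as V3:

```python
# Toy illustration of the V1/V2/V3 account above (ranges and numbers invented).
# S = sound as heard = articulation A plus the experimenter's formant shift D.
# The subject adjusts A to keep S within the V2 range; as D grows, A is pushed
# toward the articulatory range of V3 even though S still sounds like V2.

RANGES = {"V1": (-3.0, -1.0), "V2": (-1.5, 1.5), "V3": (1.0, 3.0)}  # overlapping

def category(x):
    """Vowel whose range centre is closest to x (a crude categorial perception)."""
    return min(RANGES, key=lambda v: abs(x - sum(RANGES[v]) / 2))

A = 0.0                                    # articulation, initially canonical for V2
for D in [0.0, -0.5, -1.0, -1.5, -2.0]:    # disturbance pushes S toward V1
    for _ in range(50):                    # subject adjusts articulation references
        S = A + D
        A += 0.3 * (0.0 - S)               # keep the sound at the V2 reference (0.0)
    print(D, "sound heard as", category(A + D), "| articulation feels like", category(A))
```

The printout shows the sound staying categorized as V2 while the ‘feel’ of the articulation crosses over toward V3, which is the conflict between modalities described above.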
RM: This was quite a long post, Bruce. I’m just replying to what I see as the substance of the two main points in your post: 1) that the Test involves conflict and 2) that the Katseff et al study was a good example of using the Test in linguistic research.
RM: Katseff et al’s method of doing the Test contradicts your model of how the Test is done. Your model has E in conflict with S over the state of the hypothetical controlled variable. In the Katseff et al study there was no conflict; the computer (E) did not increase its disturbance to the formant when S varied her output to compensate for it. The disturbance (which was a change in the feedback connection between S’s output and input) was applied completely passively.
BN: I believe that your unstated assumption is that when two control systems have conflicting reference values for the same variable, the output of each is a disturbance to the other, and each increases its output until one or both reach maximum output capacity. As discussed many times elsewhere, this is true unless, as in a living control system, a higher-level control loop intervenes in some way. …In a prior post, earlier in this topic, I have presented a partial block diagram that includes both subject and experimenter to illustrate this interdependency. Here it is again:
RM: The higher level system of which you speak is the system carrying out the Test. It is the system that is controlling the program that constitutes doing the Test. When you do the Test you are controlling for seeing whether disturbances, however applied, have the expected effect on the hypothesized controlled variable. If they do, then the variable is not controlled; if they don’t, then the variable is likely to be controlled but continued Testing is necessary.
RM: This Test should not involve creating a conflict like the one you show in the diagram above. This is because the higher level system doing the Test would not be able to reliably tell whether the hypothetical controlled variable, Q.i, was actually controlled. If Q.i were not a controlled variable then E would have no difficulty bringing Q.i to the reference state, r.E, and the higher level could correctly conclude that Q.i was not under control. But this would also be the case if Q.i were under control and S’s reference for it, r.S were equal to r.E. In this case E would also have no difficulty bringing Q.i to E’s reference for it and the higher order system would again conclude that S is not controlling Q.i when, in fact, it is.
RM: I think it’s best to see E as controlling, not for the same variable that S is controlling for but, rather, for detecting an effect of a disturbance that should have an effect on the hypothetical controlled variable if that variable is not under control. If such an effect is perceived (by the higher level system doing the test) it is evidence that the hypothetical controlled variable is not under control; if such an effect is not perceived – or the effect is much smaller than expected – then it is evidence that you are hot on the trail of a controlled variable.
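One way to make that decision rule concrete (an illustrative measure, not a procedure taken from any published study) is to compare the variation the disturbance would have produced in the hypothesized controlled variable, had there been no control, with the variation actually observed:

```python
# Illustrative measure for the Test: how much of the disturbance's expected
# effect on the hypothesized CV actually shows up in the observations.
import statistics

def disturbance_effect_ratio(cv_observed, cv_expected_if_uncontrolled):
    """Near 1.0: the disturbance has its full expected effect (no control).
    Near 0.0: the expected effect is largely absent (likely a controlled variable)."""
    return (statistics.stdev(cv_observed) /
            statistics.stdev(cv_expected_if_uncontrolled))
```

This is roughly in the spirit of the stability measures used in PCT tracking studies: a ratio near zero says you are hot on the trail of a controlled variable, while a ratio near one says the hypothesis failed.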
RM: I didn’t say that the Katseff et al. work shows that formants are not controlled. I said it “suggests that formant frequency is not the controlled variable”. Other possibilities include the value of the disturbed formant frequency relative to the frequencies of the other relevant formants.
BN: Bingo. But to control a relationship R between A and B you must control A as an input to R and you must control B as an input to R. Formants are controlled as means of controlling the relationship between formants which is heard as a given vowel.
RM: But they didn’t Test to see if this is what was going on. That’s my only complaint about the Katseff et al research. They didn’t do a complete version of the Test; they didn’t identify a controlled variable.
BN: The relationship is called phonemic contrast.
RM: The idea that the S’s were controlling “phonemic contrasts” is a conclusion based on the observations of conventional acoustical phoneticians who didn’t understand speech as the control of input. It is not based on the results of the Katseff et al research.
BN: Katseff (pp. 19f) reviews some of the numerous studies of what speakers do when disturbances of various kinds are introduced, demonstrating that speakers control both auditory and ‘somatosensory’ perceptual inputs concurrently.
RM: Yes, concurrently but certainly not in parallel. The somatosensory (actually, proprioceptive is probably a better term) variables that, when controlled, result in the articulatory configurations of the vocal tract are the output functions that result in the acoustic sounds that we hear as speech. Variations in articulation are the means by which speech sounds are controlled.
RM: If the vowel nevertheless sounded OK, then that’s evidence that they were controlling the desired perception just fine; it just wasn’t the absolute formant frequency that they were controlling for.
BN: But the vowel didn’t sound OK to the subject, as evidenced by their resistance to the disturbance. I suppose you’re getting this idea from e.g. p. 43: Post-session interviews indicated that subjects did not notice either formant shifts or delays.
RM: I think I got that idea from you. I think you said that, with the partially compensated disturbance, the word wasn’t what they wanted to say. The main thing the people in the experiment were controlling was the word they wanted to say; I think you said they said something between “hid” and “head”. This strongly suggests to me that the Ss were unable to find a way to change their articulation so that they were saying “head”; they just managed to change their articulation so that the word sounded more like “head” than it did without compensation for the disturbance.
BN: Your point that subjects were not controlling the absolute frequency of the disturbed formant (assuming that’s your meaning) is better illustrated when disturbance to one formant was resisted by changing two formants so as to approximate the target relationship of formants. But we already know that speakers do not control absolute formant frequencies, because we know they (we) control phonemic contrasts when the formants are at very different frequencies, e.g. shifted higher or lower by variable length of the vocal tract, for regional and social-class dialects, and so on. Speaking with helium in the lungs is a fun example.
RM: It would have been nice if Katseff et al had demonstrated this using control of input methodology.
RM: I think the idea that the observed failure to compensate for the disturbance results from a conflict between auditory and motor feedback simply reflects Katseff et al’s lack of understanding of how hierarchical control works. There can be no conflict involving different types of variables
BN: By “different types of variables” I assume you mean different levels. What is involved here is different sensory modalities providing input to control of phonemic contrast. The auditory perceptions and the tactile/kinesthetic perceptions are at the same level, they are just in different sensory modalities.
RM: Control of articulatory perceptions is the means of control of auditory perceptions. The idea that speaking involves producing auditory and articulatory perceptions simultaneously (the “motor” theory of speech) was developed by people who thought speech was a generated output. It’s not. It’s a controlled input.
BN: To resist disturbance to the sound, the subject changes references for articulation.
RM: Correct.
BN: There are three adjacent vowels, call them V1, V2, and V3. The subject’s intended vowel is V2 and V1 is the target toward which the disturbance is moving the sound perception S. As the disturbance is increased, so that the sound is more like V1, *and* the articulation is changed so as to maintain the perception of the sound of V2,
RM: Couldn’t have said it better myself, although I would leave out the “and” that I have bolded. That “and” boded ill, as indicated by this incorrect addition:
BN: at some point the articulatory perceptions A are too far out of range to be perceived as the intended vowel V2 and begin to be perceived as the opposite adjacent vowel V3.
RM: This makes no sense. If the articulation has been varying so as to keep the vowel sounding like V2 despite the disturbance, then it is still the articulation that keeps the sound under control, even if that compensatory articulation would have produced V3 had there been no disturbance. The speaker couldn’t care less whether the articulation would have produced V3; if the articulation keeps the controlled variable under control, then it’s the right articulation.
BN: Control of articulatory perceptions A is the means of controlling the sound perceptions S.
RM: Right, and the means are varied as needed to produce the intended output. Variations in articulation are like the mouse movements in a tracking task, with cursor position, like speech, as the controlled result. Variations in the mouse movements depend on variations in the disturbance and/or feedback function; sometimes leftward movement moves the cursor left and sometimes right. Same with articulation; sometimes an articulation that would produce one vowel (V3) will produce another (V2) when the disturbance is taken into account.
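A small sketch of that tracking analogy (the gain, the disturbance, and the sign flip are invented): with the sign of the feedback connection reversed, opposite mouse movements produce the same controlled cursor result.

```python
# Sketch of the tracking analogy (gain, disturbance, and the sign flip invented):
# the same controlled cursor result is produced by opposite mouse movements when
# the feedback connection or the disturbance changes.

def track(feedback_sign, d, steps=300, gain=0.2, target=0.0):
    mouse, cursor = 0.0, 0.0
    for _ in range(steps):
        cursor = feedback_sign * mouse + d                 # feedback plus disturbance
        mouse += gain * (target - cursor) * feedback_sign  # error drives the output
    return round(mouse, 2), round(cursor, 2)

print(track(+1, d=1.0))   # mouse ends near -1: leftward output keeps cursor on target
print(track(-1, d=1.0))   # mouse ends near +1: opposite output, same cursor result
```

The controlled result is the same in both runs; only the output needed to produce it differs, which is the point about articulation above.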
BN: Articulatory perception of the ‘feel’ of pronouncing V3 creates error in control of the intended perception of the sound of V2. By means of the disturbance, the experimenter is in conflict with the subject’s control of the word which contains V2. When the disturbance is small, the subject maintains good control. As the disturbance increases, the subject’s control is compromised in each sensory modality, sound and ‘feel’. So long as the perception S and the perception A are both within their respective ranges (as one modality extends into the intersection with V3 and the other into the intersection with V1) the subject does not notice.
RM: I think this is just the motor theory of speech being dragged into PCT.
Unavoidable delay–IAPCT business to carry forward, wife’s medical issues, refinancing the house, birthday, requests from my indigenous friends for help coining vocabulary that their elders didn’t have, a couple of end-of-month reports to get out which I should be working on instead of this … I’m sure you will complain about the length and respond only to convenient bits, but I haven’t had time to make it shorter and it seems important to keep the record straight.
There is no suggestion in any of this that speech is generated output. It’s called the motor theory of speech perception and as such is not concerned with speech production. Here’s a summary statement:
The three main claims of the theory are the following: (1) Speech processing is special (Liberman & Mattingly, 1989; Mattingly & Liberman, 1988); (2) perceiving speech is perceiving vocal tract gestures (e.g., Liberman & Mattingly, 1985); (3) speech perception involves access to the speech motor system (e.g., Liberman et al., 1967).
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic bulletin & review, 13(3), 361–377. https://doi.org/10.3758/bf03193857
A theory of perception is obviously relevant because perceptions are what is controlled in speech, as in all other purposeful behavior. The closest thing to a motor theory of speech production is the gestural model developed at Haskins Lab (at Yale) for computer synthesis of speech. That is not in play here.
Advocates of the motor theory of speech perception appear to have tried to make the articulatory aspect of speech perception account for everything about speech perception. I think that’s actually a misrepresentation, but that’s why it has been so disregarded within linguistics and phonology, because it seems to disregard the importance of the auditory aspect. It has gained traction mostly outside of linguistics. In some CogSci quarters there may have been efforts to recruit it for an account of generated outputs.
Liberman, Turvey, Galantucci, and the other developers of the motor theory of speech perception used the term ‘motor perceptions’ for the tactile and kinesthetic perceptions in the vocal tract which are controlled to pronounce language-compliant syllables and words. Auditory perceptions are also controlled, and you have agreed with me that the controlled speech variables in the auditory perceptual modality are controlled by means of setting the references for those ‘motor perceptions’ (proprioceptive and tactile perceptions) which produce speech sounds.
To recap, the speech sounds cannot be controlled in real time, because by the time you hear them they have already been produced; once audible they are, as it were, ballistic in the air, and a speaker can no more correct them than an archer can correct an arrow’s flight. The best you can do is repeat the intended utterance and articulate it as intended, perhaps with higher gain the second time. The articulatory references are adjusted over repetitions of the given phoneme and syllable, just as the archer adjusts the references for diverse perceptions so that arrows more reliably hit intended targets.
Any resemblance to the Test is accidental. They are not trying to identify the controlled variables. It has long been known that formants are variables controlled in both the production and recognition of spoken syllables and words. Research over many decades has teased out the perceptual variables which, when their values are disturbed, determine whether one phoneme or another is perceived.
That article focuses on the auditory modality. It has also long been known that articulatory perceptions are also controlled in speech production, and of course the ‘motor theory of speech perception’ affirms their role in perception of phonemic contrasts of syllables and words. The physics (acoustics and physiology) of the relationship between the auditory perceptions and the articulatory perceptions is well understood.
However, people working (understandably) in a CogSci framework have (equally understandably) been unable to integrate the two sensory modalities within a CogSci conception of plans and execution. Katseff wanted to lay some groundwork for such integration (pp. 140ff of her dissertation) while specifically focusing on the puzzle disclosed by prior experimentation with disturbing formants in real time (work she cited), namely, why such disturbances to auditory perception were incompletely ‘compensated’ by changes in articulation.
To introduce disturbances to specific formants without appreciable delay to the feedback of speech sounds in headphones requires specialized computer chips. It would be possible to perform the Test with such equipment, as I proposed doing on CSGnet in 1991-1992, but (as I said) lacking that equipment, I haven’t done it.
However, for an audience of linguists and phonologists, it would be a waste of time to prove that formants and relationships of formants are controlled perceptions. It would only be new information for PCT researchers who don’t believe that anybody else identifies controlled variables by observing their reference values, disturbing them from those values and observing what subjects do to ‘compensate’ the disturbance.
The experimenter purposefully applies a disturbance to the CV. You say that purposefully applying a disturbance to a perceived variable is not control. We disagree.
Repeating Bill’s brief description:
Phonemic contrast is a categorial perception. Bill proposed that a perceptual input function for category perception receives input from a plurality of diverse sources, not all of which need be present for the category to be perceived. For category perception, then, it is not quite so simple. This statement works for a simple case such as the relationship between cursor and target:
BP:
The test is completed by showing that preventing the other system from perceiving the variable destroys control, and that the reason for the small effect of the disturbance is opposition by an action of the controlling system.
For categorial perception, it is necessary to identify all the sources of input. For speech, we have auditory and articulatory perceptions. We have auditory perception through at least two channels, atmospheric and bone conduction. For articulation of vowels we have proprioception of the tongue muscle, jaw, and lips, and we have tactile perceptions where the tongue may touch the teeth, gingiva, and palate (or not), and where the corners of the lips approximate. With so many sources of input, various subsets of which suffice to produce the category perception, the project of blocking sensory input is scarcely feasible. And to attempt to pinpoint *the* controlled variable among them is a fool's errand.
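To make the point about plural inputs concrete, here is a rough sketch of such a category input function (the sources, weights, and threshold are invented for the illustration; Bill’s proposal does not specify them):

```python
# Rough sketch of a category input function with plural sources, any sufficient
# subset of which yields the category perception. Sources, weights, and the
# threshold are invented for the illustration.

def category_pif(inputs, weights, threshold=1.0):
    """inputs: source name -> signal strength (0.0 when blocked or absent)."""
    return sum(weights[name] * value for name, value in inputs.items()) >= threshold

weights = {"auditory_air": 0.6, "auditory_bone": 0.3,
           "proprioceptive": 0.5, "tactile": 0.4}

# The category is still perceived with the airborne auditory channel blocked:
print(category_pif({"auditory_air": 0.0, "auditory_bone": 1.0,
                    "proprioceptive": 1.0, "tactile": 1.0}, weights))   # True
```

Since many different subsets of inputs can drive the function over threshold, blocking any single source need not destroy the category perception, which is why the ‘prevent the perception’ step of the Test is so hard to apply here.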
BP:
We apply disturbances to something in the environment, with an expectation of how the disturbance would change it if there were no control.
Is that expectation a reference value in some control loop? If not, what is an expectation? And specifically, what is the PCT account of this particular expectation?
Bill suggested that an expectation is a reference level. Here’s that quote again:
My understanding is that an expectation of a perception is our experience of a reference level for that perception. Whether we experience it as an expectation, a fear, or a purpose (among perhaps other possibilities) depends upon additional context that we could talk about in another place. And as you have pointed out, we can have a reference level (presumably a reference signal) for a perception without immediate means of controlling that perception.
When we perceive what we call consternation we may be seeing outputs of systems attempting to regain control, like the twisting of limbs and body that we see in a free-falling animal. Those seemingly futile agitations may actually reorient the body so as to land in a safer manner, and of course cats at least do this to land on their feet.
But setting that aside and staying on topic, this suggests a form of the Test that could identify reference values for a perceptual variable even when for some reason the subject does not act to resist a disturbance to it. It could be lack of output function. It could be lack of sufficient output capacity, as recognized by a higher system which therefore in some way ‘turns off’ the reference signal that would call for that output. It could be that a higher-level system does so because of perceiving some other contingency. Modeling how a living control system refrains from doing something should certainly be considered in another place.
The subjects were able to hear the word they intended to say. The sound of the vowel was in the intersection of the acceptable ranges for one pair of vowels including the intended one, and the ‘feel’ of pronouncing it was in the intersection of the acceptable ranges for the other pair of vowels including the intended one. Perceptual control was compromised in both modalities, in opposite directions, with the two control systems in effect splitting the difference between them.
You say the results were not clear. Katseff’s results were clear. She already knew that the speaker would not completely resist the disturbance. This was a prior finding in other labs, which she was replicating. So don’t think that those were her results. Her result, her conclusion, was that the reason for incomplete ‘compensation’ was indeed, as you say, “because there was some other variable [or rather, variables] being controlled” at the same time as the relationships of formants. She referred to the relationship between the auditory and the articulatory as “interference”. This relationship can be seen by going up to higher levels of control. That’s what the diagram is for.
Some off-topic material is in the Science|Language category. There’s other off-topic stuff here that I should have put there as well, but like I said, I would have if I’d had more time.
A nice natural example of the TCV is in Science, 377, Aug 22 2022, pp 764-768. The researchers were interested in the flight patterns of a large moth, colloquially called the Death’s head hawk moth. This moth migrates by night, but where and how it goes as it travels south into the Alps has been a mystery. These moths are large enough to carry a transmitter that allowed them to be tracked by a small plane overhead, and that’s what the researchers did with 14 of these moths. They tracked each moth, and though they analyzed each track individually and reported some of them, their statistical approach treated the moths as a group.
For the TCV, the individual tracks are important. Across the stretch from the release point to where a moth might wind up the next morning, the winds are far from continuously in a constant direction or speed, and yet each moth maintained a closely constant direction and ground speed throughout the night. The individual tracks they display show slightly wiggly straight lines from the release point to the morning resting place, as the moths compensated for being blown in different directions at different times. They were clearly controlling for something that correlated well with the ground bearing, and not for anything they directly sensed from the air flow.
It seems as clear a demonstration of the TCV as one is likely to find outside the experimental laboratory.
For part of the TCV. We still don’t know what perception it is controlling or how it perceives it, only that it is controlling something.
Did they not mention geomagnetism as an hypothesis? Given the extensive evidence for this in other animals, I’d be surprised if they didn’t.
I wonder if the transmitter load could also include apparatus that somehow ‘jams’ the geomagnetic input (ideally on and off as controlled by the observers). That would be a more complete TCV.
The TCV shows that the moths control a perception of ground direction (with quite different individual reference values, which I find interesting in itself).
For the purposes of the TCV, we don’t need to know what perceptions at lower levels are controlled in the supporting hierarchy, though indeed in this case the researchers were interested in what senses were used by the moth to allow it to perceive ground direction of flight. Irrelevant to the Test, however, just as the nerve structure of the eye is irrelevant to a test for the controlled variable in a laboratory tracking study.
The same would apply to a TCV for, say, the political support for a party. Is a voter controlling for a perception of the quality of a candidate or for the party the candidate claims to represent? You wouldn’t need to know what media the voter uses to decide, or the mechanics of how they cast their vote. You could disturb by changing the candidate and seeing the likelihood of the voter changing preference (easier in a multi-party system than in the USA).
When control of any perception not directly facing the external environment is tested, the sensory-motor processes are irrelevant except for the part of the test that asks whether the perception proposed requires unavailable source data. In this case, the researchers proposed a possibility, magnetic sensing, and I suppose a complete Test would check whether the moths actually do sense magnetic field direction relative to their bodies. I haven’t read far enough to find out whether this was, or has elsewhere been, checked.
I think this is a great example of what I describe in Chapter 3 of The Study of Living Control Systems as a first step in trying to identify the variable(s) being controlled by an organism. The fact that the moth’s path is straight – or at least much straighter than expected due to the known but unobserved disturbances to that path by variable wind forces – suggests that the moth is controlling some variable that allows it to do this. Some reasonable hypotheses, perhaps tested in the research, are orientation relative to the position of the sun or to the earth’s magnetic field.
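Since this is such a clean natural case, a minimal simulation may help show the logic. The airspeed, wind profile, and gain below are invented, not taken from the paper; the point is only that a moth controlling its perceived ground bearing keeps a nearly straight track despite a crosswind whose strength varies along the route:

```python
# Sketch of the hypothesis under discussion (airspeed, wind profile, and gain
# invented): a moth controlling its perceived ground bearing holds a nearly
# straight track despite a crosswind of varying strength.
import math

heading, x, y = 0.0, 0.0, 0.0            # heading in radians, position in arbitrary units
ref_bearing, airspeed, gain, dt = 0.0, 3.0, 0.8, 1.0

for step in range(200):
    wind = 1.0 + 0.5 * math.sin(step / 20)            # crosswind varying along the route
    vx = airspeed * math.cos(heading)
    vy = airspeed * math.sin(heading) + wind          # wind pushes across the track
    ground_bearing = math.atan2(vy, vx)               # hypothesized controlled perception
    heading += gain * (ref_bearing - ground_bearing)  # steer to oppose the drift
    x, y = x + vx * dt, y + vy * dt

# A fixed heading would drift roughly 10-25 degrees off course with this wind;
# the controlling moth's final bearing stays close to the reference.
print(round(math.degrees(math.atan2(y, x)), 1), "degrees off the reference track")
```

The straight track by itself only shows that some correlate of ground bearing is being controlled; which sensory input supports that perception (magnetic, visual, or otherwise) is the separate question discussed above.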
This seems to be another excellent example of “conventional” research, such as that done by McBeath et al (1995), that can be the basis for more detailed tests for the controlled variable(s).
Interesting. Yes, as Fred noted in 2016, B:CP does not define ‘controlled variable’, it defines ‘controlled quantity’, but as Rick has often emphasized perceptions at all but the very lowest level correspond to a set of variables {v1 … vn} which in principle are amenable to measurement, disturbance, and experimental concealment (though often not so amenable in practice). Complicating this, at higher levels input to the PIF for the controlled perception may include signals from memory that are not presently derived from environmental input.
That part of the Test would seem to require tracing all the way to intensities of {v1 … vn}. Maybe all that is needed is a sufficient but not exhaustive demonstration, sufficient to assure that the higher-level p does not depend upon anything imperceptible. Could easily get kind of murky. Perceptible (has input functions) but unperceived (not in current input of subject, whether or not in current perceptual input of the observer) seems more difficult.
Of course it is conventional research, inasmuch as all the analyses done by the researchers are on group data and use measures such as the standard error of inter-individual measurements. That they presented data for individual flights (night flights, by the way, so sun angle cannot be a supporting controlled perception) is what allows us to say at least that the moths controlled for flight direction and for flight ground speed. Both of those were maintained no matter in which direction and at what strength relative to the ground the wind was blowing. By the way, the individual moths flew at different altitudes relative to the ground, but I don’t know whether they controlled for that or just, as they say, went with the flow vertically.
If, by definition, the TCV necessarily requires the determination of all the controlled and uncontrolled perceptions contributing to the one being under test, then I doubt whether any reasonably high-level perception can ever be determined to be under the “subject’s” control.
What really needs to be determined is whether the “subject” can perceive the environmental correlate of the putative controlled variable, and can act to change it. The problem is that this is a circular question, in that the evidence is in the fact that when the putative controlled variable is disturbed (as the moths’ direction would be by a side-gust, or as the car’s position would be in the often used “keeping the car in its lane” example), the perceptual error is corrected. What is substituted in most descriptions of the TCV, including Runkel’s one quoted by Bruce at the start of this thread, is a hypothesis about some supporting atenfel and/or input data and process.
I don’t want to get into a technical discussion of the TCV, at least not here and now. I simply wanted to provide a reference for a natural case in which it is clear that some environmental variable (the moth’s flight direction) is being controlled, and therefore a perceptual correlate of that variable is being controlled.