Social control systems; H(S); evolution

[From Bill Powers (930613.2200 MDT)]

Bob Clark (930613.2145 EDT) --

When "functioning" is abnormal we have assorted problems, some
minor, some major. Sometimes conflict is avoided, sometimes
it is ignored, sometimes it is "repressed" or "suppressed" (I
don't like these terms myself, but they are used).

I don't like such terms, either. They sound like explanations but
actually explain nothing: they define a cause in terms of its
effects.

You seem distressed with the situation:

In all social hierarchies of which I know, the use or credible
threat of physical force is the main means of maintaining the
structure. This is a fundamental design defect; it creates
conflict automatically.

Yes, Bill, of course. I don't like these effects either. But
neither you nor I created these social hierarchies.

What does not liking them, or who created them, have to do with
anything? A design defect is, in this case, a feature of the
social organization that continually threatens to destroy it, and
almost inevitably will. Not what you call a steady-state

They are created by people cooperating in an attempt to achieve
common goals. Whatever else?

I guess that as an engineer, I am pained when people cooperate in
an attempt to achieve common goals using means that will almost
certainly, in the end, frustrate the attempt.

You're invoking your own criterion of effectiveness, not the
generally accepted one.

Dictionary: "effectiveness, n. derived from effective, adj.
1. adequate to accomplish a purpose; producing the intended or
expected result."

Yes, that's the meaning I assumed. What most people mean by
"effective" is doing whatever it takes to get the job done right
now. Hitler was greatly admired by many Americans in the 1930s as
an effective leader. Perhaps the distinction I'm trying to make
is between short-term and long-term effectiveness, as you
suggest. I don't see many signs of long-range effectiveness as a
goal in government or organizations: more like, how much money
can we make by next week?

Speaking literally, social CONTROL systems don't actually
exist, and never have.

This seems rather odd to me. Do you mean they "don't exist"
BECAUSE of their ultimate "reliance on coercion?" For the one
"being coerced" it surely seems pretty real -- no matter how
unjustified, unwise, or whatever.

See Tom Bourbon's diagram of how a "social control system" has to
be implemented. I simply mean that the only control systems in a
social organization are the individuals in it. If coercion is
applied, it is applied by somebody to somebody. People often try
to use the concept of social rules and social demands as a way of
avoiding responsibility for the things they do with their own
arms, legs, and voices. They like to pretend that they are
governed by something outside themselves; in that way they can do
what they really want to do, but when problems or objections
arise they can pass the buck. "I'm only following orders."

Regarding the "reality" of social control systems, I think it
is pertinent to ask "where" they are. Their physical existence
is sometimes very tenuous, often being expressed in the form of
marks on pieces of paper -- or equivalent.

I am thinking very literally about this. Marks on pieces of paper
can do nothing at all but be marks on pieces of paper. They are
not control systems: they have no perceptions, no comparators, no
outputs, no goals. They can be used by human beings in the course
of human control behaviors, but by themselves they can do
nothing. I think we will make the most progress in understanding
social systems if we deal with them the same way we model
individual behavior: with models that we propose to be
literally, physically, correct.

The only important place where they exist is in the minds of
the affected individuals, especially the participating

Yes, exactly.

"Real" or not, I think we can learn a great deal about how
people think about and control their environments by studying
their Social Control Systems.

That's the way social systems have traditionally been studied: in
terms of metaphors. As long as we have a new theory of human
nature that is based on a model intended to be literally true,
why not try that same philosophy with social systems? If we keep
insisting that each individual is responsible for all that
individual's actions, maybe some day people will start to believe
it.

Of course when you say "their" social control systems, you
probably mean each person's conception of a social control system
-- so my proposal agrees with yours.


Allen Randall (930611.1700) --

Bill, I have a real problem with this whole log(D/r) thing. I'm
not sure where you got this definition, but I do not think it
can be right.

I got it from Martin Taylor. D is the range of the disturbing
variable d, and r is the resolution with which the disturbance is
measured (or perceived open-loop). This allows for D/r different
values of d, with an information content of log(D/r). I'm just
following orders. I don't agree with them, but I'm just trying to
do what they seem to say.

If two signals have wildly different scalings of amplitude, but
are otherwise identical, then I can write a very short program
to convert one to the other, no matter how long the signals are
(they differ, after all, by only a simple scaling).

Suppose you have the relationship X(t) = Y(t)/10. In your
computer program, Y(t) will be some input waveform, and X(t) a
similar output waveform with one tenth the amplitude. No matter
how you write the program, when you compute X from Y you will get
a waveform with less relative resolution than there was in Y. If
you multiply X by 10, you will not get Y back; you will get a
coarser representation of the original waveform, with about 1/10
of the original resolution. Whenever you scale down a variable
having a finite resolution, significant digits are lost forever;
you can't get them back unless you're measuring the scaled-down
variable with smaller resolution units than you're using to
measure the full-sized variable. That's all I was trying to say.
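
A minimal sketch of the scaling argument above. The sample values and the least unit of 1 are hypothetical, chosen only for illustration; both Y and X are assumed to be stored at the same resolution.

```python
def quantize(value, resolution):
    """Round value to the nearest multiple of resolution."""
    return round(value / resolution) * resolution

resolution = 1                       # assumed least unit for both Y and X
y = [123, 127, 131, 134]             # hypothetical samples of Y(t)

# X(t) = Y(t)/10, stored at the same resolution as Y
x = [quantize(v / 10, resolution) for v in y]

# Attempt to recover Y by multiplying X by 10
y_back = [v * 10 for v in x]

print(x)       # [12, 13, 13, 13]
print(y_back)  # [120, 130, 130, 130]
```

Four distinct input values come back as two: the digits dropped in the scaled-down copy cannot be recovered by scaling up again.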

To clarify this issue, so I understand exactly what your
understanding is, tell me how to compute log(D/r) for the
following sequence S:

.0 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .2 .7 .7
.7 .7 .7 .7 .7

Before I could do that, you would have to tell me the size of r.
There isn't any inherent "resolution" in an arbitrary sequence of
numbers like the one above. What they mean depends on the
physical situation they are taken to represent. If this is a
series of measurements of some physical variable, then we have to
talk about the measuring device's resolution. If the least
significant unit of a measurement is 0.001, then D/r for the
above series is 800. If the least unit is 0.01, it's 80. The
upper limit on r would be 0.1, because if it were any greater you
couldn't distinguish both odd and even numbers of exact tenths.
But the fact that the series happens to contain only exact tenths
of a unit doesn't imply that a measurement is only accurate to
0.1. The above could be an unlikely-seeming sequence to obtain
with a device that can measure to 0.001, but probability-wise
it's no more unlikely than the sequence .001 .533 .533 etc. A
specific waveform described by the sequence with three-digit
resolution would also be described, at lower resolution, by
numbers accurate to 0.1.
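
The dependence of log(D/r) on the assumed resolution can be sketched as follows. Two assumptions are made here that are not in the text: the logarithm is taken to base 2, and D is set to 0.8, the value implied by the figures 800 and 80 above (the observed maximum minus minimum of the sequence is 0.7).

```python
import math

def distinguishable_values(d_range, resolution):
    """Number of values of d that can be told apart: D/r."""
    return round(d_range / resolution)

def information_bits(d_range, resolution):
    """log2(D/r); base 2 is an assumption, the text leaves the base open."""
    return math.log2(distinguishable_values(d_range, resolution))

D = 0.8  # assumed range of the disturbing variable
for r in (0.001, 0.01, 0.1):
    n = distinguishable_values(D, r)
    print(r, n, round(information_bits(D, r), 2))
```

The same sequence of numbers yields 800, 80, or 8 distinguishable values depending only on what is assumed about the measuring device.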

You say

You assume because something is hard for you to understand,
given your own background and prior knowledge, that it is
therefore wrong or misguided - that it is being made complex
for the pure love of complexity. Judge not, lest ye be judged.

Fair enough, but I think there's something more than this going
on.

My problem is that I can't find any link between the
manipulations you talk about and any PHENOMENON. You seem to be
setting up arbitrary hypothetical examples and then applying
mysterious calculations to them for no reason I can figure out
(except that it can be done).

I think you express my problem very well when you say

The problem with computing H(S) as given above is that there is
really no such thing as H(S). Remember that *something* is
always assumed. So we really compute H(S) with respect to some
language, or model, or probability distribution, L: H(S|L).

This seems to explain something. When I've asked for instructions
about how to compute information-related quantities for a given
set of experimental data, the only answer I've received so far is
"Well, that depends on what you assume." There seems to be no
rationale for making any particular assumption that would lead to
a definite prediction. It seems that no matter how you conceive
of the situation to define a probability, there's always another
equally plausible way to conceive it, leading to a completely
different value of probability, H(S), or whatever. How do you
decide what is the most plausible way to set it up for a control
system model of a specific example of behavior? So far I'm
drawing a complete blank on that. And, apparently, so are you.

In a control-system model, we make assumptions about the form of
the model, and within those assumptions we calculate the values
of the parameters that fit the model to data from experimental
runs. So we have a criterion for comparing different assumptions:
how well the best-fit model having the assumed organization
predicts the data. If we assume a simple linear system with an
integrator in the output, we find a certain best degree of fit.
If we add an input delay to the model, we get a measurably better
fit, for every subject. So this tells us that the model with the
delay is better than the model without it. We already know that
many other models are ruled out because they don't behave in any
way resembling the real behavior -- a positive feedback model, or
an on-off model, for example.
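
The comparison of assumptions by best fit can be sketched as below. This is only an illustration, not the actual tracking model or data: the "subject" run is synthetic (generated by a delayed model with hypothetical parameters), and the function names, sample interval, disturbance, and grid search are all assumptions.

```python
import math

DT = 1 / 60  # assumed sample interval in seconds

def simulate(disturbance, gain, delay):
    """Compensatory tracking: integrating output, input delay in samples."""
    output = 0.0
    cursor_history = []
    outputs = []
    for t, d in enumerate(disturbance):
        cursor_history.append(output + d)        # cursor = output + disturbance
        perceived = cursor_history[t - delay] if t >= delay else 0.0
        error = 0.0 - perceived                  # reference signal is zero
        output += gain * error * DT              # integrator in the output
        outputs.append(output)
    return outputs

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def best_fit(data, disturbance, gains, delays):
    """Grid search; returns (rms, gain, delay) for the best-fitting model."""
    return min(
        (rms_error(simulate(disturbance, g, k), data), g, k)
        for g in gains
        for k in delays
    )

# Hypothetical smooth disturbance, 10 seconds long
disturbance = [math.sin(0.7 * i * DT) + 0.5 * math.cos(1.9 * i * DT)
               for i in range(600)]

# Synthetic stand-in for a subject's run (assumed gain and delay)
data = simulate(disturbance, gain=8.0, delay=8)

gains = [0.5 * n for n in range(4, 25)]          # 2.0 to 12.0
fit_without_delay = best_fit(data, disturbance, gains, delays=[0])
fit_with_delay = best_fit(data, disturbance, gains, delays=range(0, 11))

print("no delay:  ", fit_without_delay)
print("with delay:", fit_with_delay)
```

The criterion is the residual error of the best-fitting model under each assumed organization; the organization that leaves the smaller residual is preferred.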

What is it that you can use to decide which assumptions about the
situation yield the most realistic or predictive measures of
H(S)? What's your criterion for deciding that one set of
assumptions is better than another? If the value of H(S) that you
calculate depends on your assumptions, then you have to have a
way to say that one number is a more plausible representation of
the real situation than another, so you can decide that one set
of assumptions is better than another. If you can't say that, all
you have is an all-purpose calculation that can be applied to any
data set and that can yield any number you like today. You're not
constrained to the real universe; you can't tell whether your
results apply in this universe or only in an imaginary one.

In your example of computing H(S), you set up an arbitrary
situation in which one sequence of numbers occurs 1/8 of the
time, and a different sequence the rest of the time. Naturally,
you calculate that there are 3 bits of information. It doesn't
matter what the events are, whether sequences or just occurrences
of a single event. So if you're given the rule, you can calculate
H(S). After all, the way you set up the example completely
defines the probabilities, so it is easy to apply the

But how do you get from the physical situation to the correct
definition of probabilities? If you consider only examples in
which it's already been decided what the probabilities are, it
doesn't seem much of a feat to then proceed to calculate a
measure that's a function of the probabilities. If you happen to
get the rule wrong, you'll still be able to calculate H(S). Maybe
the actual probability isn't 1/8, but 1/7, or even t/8, where t
is the elapsed time. Or maybe it's rand(t)/8. The possibilities
are infinite; you need some way to determine the probabilities
(if that's even the right conception of the physical situation)
before you can do any calculations of H(S). Of course even if the
distribution isn't actually probabilistic but follows some
perfectly regular rule, you can use the measures to form a data
set and then compute H(S). But how will you know that this is an
inappropriate treatment of the data? Your calculations won't tell
you that, will they?
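
A small sketch of how the computed figure follows from whatever probability is assumed. It reads the "3 bits" above as the surprisal -log2(p) of the event with p = 1/8, which is an interpretation; the entropy of the whole two-outcome distribution is shown alongside for comparison. The function names are illustrative.

```python
import math

def surprisal_bits(p):
    """Information in one occurrence of an event with probability p."""
    return -math.log2(p)

def binary_entropy_bits(p):
    """Average information per event when outcomes have probability p and 1 - p."""
    return p * surprisal_bits(p) + (1 - p) * surprisal_bits(1 - p)

# Three assumed probabilities for the same hypothetical event
for label, p in [("1/8", 1 / 8), ("1/7", 1 / 7), ("1/2", 1 / 2)]:
    print(label, round(surprisal_bits(p), 3), round(binary_entropy_bits(p), 3))
```

Each assumed rule produces a definite number, and nothing in the calculation indicates which rule, if any, matches the physical situation.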

You and Martin don't seem to care much for actual experimentation
with real systems. This purely mathematical approach, however, is
full of pitfalls. In a mathematical approach you're setting up
premises (axioms) and then following out their implications. But
without experimentation, you can't tell when you select axioms
having no counterpart in, or even contradicting, experiencable
reality. A conclusion that is provably true in mathematical terms
may not apply to any real systems, because the underlying axioms
may be false in this universe (even though you can accept them as
ground rules for the game). If you're doing the mathematics
simply because you enjoy that sort of intellectual pursuit, fine
by me, have fun. But if you want to apply the mathematics to any
real systems, you have to be selective about the axioms and make
sure they aren't false to fact. The only way to do that is to do
experiments to see what constraints nature imposes. You can't
reason them out.

Bruce Nevin (Wed 930609 07:11:43 EDT) --

Harris's test for contrast/repetition is a form of the Test for
a Controlled Variable. Ask a native speaker to control a
perception of "repetition" or "same word(s)". Produce two
pronunciations (perhaps recorded) that are alike, except that
in the second some portion is replaced by a similar portion of
a different utterance (perhaps by splicing segments of
recordings). The native speaker assents that they are
repetitions, or dissents, saying that they are different words.

Here's my problem. If I present you with a picture of a grape and
a picture of an elephant, you can distinguish between them; the
perceptual input that allows you to perceive the grape does not
respond to the elephant, and vice versa. So I have established
that you have two perceptual functions, one for each picture
(sort of). Am I then justified in saying that you are perceiving
something called "contrast" between the grape and the elephant?
Or is the notion of contrast an interpretation by the observer,

The segments are relevant because they represent the contrasts
between utterances and locate the points of contrast within

But isn't it really that the listener's perceptual functions make
a distinction, rather than that there are objective contrasts in
the sentences?

But because each such alternative system is a representation of
the contrasts between words in the language, they are
interchangeable in respect to our observational primitives, the
contrasts. In each case, substituting one element for another
yields the representation of a different word. Conversely, any
two words that native speakers perceive as different are
represented differently from each other, no matter which system
of representation is used.

When you say "any two words", how does anyone (including the
experimenter) know that they are distinct words? By applying the
same kinds of input functions that the subject applies. The
method you describe is ingenious; it even seems objective. But it
still is defining "segments" in terms of human perceptual
functions, not the other way around. What grates on my tender
sensibilities is speaking of the contrast as if it existed in the
utterance, or pairs of utterances. Maybe this can't be helped.
But it still reminds me of studying stimuli to see what they have
in common, instead of studying perceptual systems to see how they
create variables out of inputs.

Bill, you suggested (930528.1930 MDT) that contrast is a
non-phenomenon. Easier to say that the phoneme is a
non-phenomenon.

Not that it isn't a phenomenon; just that it isn't necessarily
what the subject perceives when stating that one perception is
not another one, or that two perceptions are actually only one,
repeated. If I have two experiences and they actually involve
only one perception, I classify this situation as "same." If
there is more than one perception, I classify it as "different."
But before I can perceive which category of situation it is, I
must know already whether one or two perceptions were involved: I
must know that both experiences came from _this_ perceiver, or
that one came from _this_ and the other from _that_. If this sort
of discrimination hasn't already been made, perceptually, there
is no way to decide on the category "same" or "different."

It's on this basis that I maintain that "contrast" isn't an
explanatory term, but only a descriptive one. We don't perceive
that two things are the same or different because they ARE either
the same or different. We can only make that judgment after the
discrimination has been made at lower levels. If we perceive two
things via two input functions, we conclude that they are
different; if both perceptions come from the same input function
we call them the same.

It seems to me, therefore, that Harris' study does reveal
something about the "tuning" of input functions, but that the use
of the term "contrast" objectifies something that is really a
judgment after the fact. Also, as in the case of phonemes, the
actual degrees of freedom of the segments remain unknown. All we
end up knowing is that when segments from different words are
combined, a person either perceives something new or does not.
What makes the difference remains undetected. The hypothesis that
the differential responses are due to something called "contrast"
isn't very informative.

My citation of Land's experiment wasn't meant to be taken very
literally. I was simply trying to find an example that might
support the concept of perception through contrasts, by proposing
a mechanism for normalizing a set of perceptual signals to make
differences between them more important than their absolute
values. After normalization, one perception is in effect being
judged in relation to others, because the other perceptions have
influenced the average. Recognizing a melody played in different
keys would be a simple example. The relationship of one note to
another is vastly different in two repetitions of the melody
(different frequency ratios) unless you somehow factor out the
change of key. Just an idea.
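
One way to picture the normalization idea, as a sketch only. Notes are represented here as semitone numbers (a log-frequency scale), and the normalization is subtraction of the mean; both choices are assumptions made for illustration, not a claim about how auditory input functions work.

```python
def normalize(notes):
    """Express each note relative to the average of the whole set."""
    mean = sum(notes) / len(notes)
    return [n - mean for n in notes]

melody_in_c = [60, 62, 64, 60]                # hypothetical melody, semitone numbers
melody_in_f = [n + 5 for n in melody_in_c]    # same melody, five semitones higher

print(normalize(melody_in_c))   # [-1.5, 0.5, 2.5, -1.5]
print(normalize(melody_in_f))   # [-1.5, 0.5, 2.5, -1.5]
```

The absolute values differ between the two keys, but after normalization each note is judged relative to the others and the two renditions coincide.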

But it turns out that there are no reliable invariants in the
acoustic signal.

That is an instrumentation and theory-based conclusion.
Obviously, there ARE invariants, because we derive subjective
invariants of perception from the acoustic signal. It's just that
these invariants aren't derived from the signal in any way that
anybody had guessed so far. If you use the wrong input function,
you won't detect the invariants. We have to conclude that
research so far has been constructing invariants by a method that
doesn't yield the same invariants that real human input functions
yield. Maybe the sound spectrograph isn't organized the way
auditory input functions are organized. From what I read, it
seems that self-organizing neural nets CAN find the invariants
that are phonemes. They probably don't look anything like what we
would guess by using an instrument like a sound spectrograph.

When you object to the proposal that toddlers are able to
achieve social agreements, you seem to be limiting the notion
to verbally negotiated agreements.

At some early age, infants are unable to perceive sequences such
as patty-cake. Perhaps they don't perceive or control any kind of
sequence as a sequence, intentionally. However, an observing
adult who takes sequentiality as a given property of the world
can easily see sequences, even controlled sequences, in the
movements the infant habitually makes while exploring the world.
The adult says "But look, first the baby drops the toy, then it
reaches for it, then it cries. That's a perfectly clear and
repeatable sequence!" The only trouble is that the baby knows
nothing of that sequence or any other. That's just how things
happen to work out, usually.

I think it's important to search assiduously for the LOWEST level
at which to interpret behavior. If you can explain what you see
in terms of controlling relationships, don't invoke control of
categories or anything higher. The only reason to use a higher-
level explanation is that something is left unaccounted for
without it.

To seek and achieve an agreement requires being able to perceive
an agreement as such, and to act to create agreement if none
exists. Whether the agreement is stated verbally is irrelevant;
this is an abstract kind of perception which, I think, requires
higher-level systems than a toddler has yet developed. I can see
toddlers controlling relationships, such that when you say
something the toddler says it back, or says something else that
completes the relationship, and I can see an adult interpreting
this to mean that the toddler perceives this as an agreement to
be sought and carried out. The adult sees a real agreement there,
but it's only metaphorical: the toddler is behaving _as if_
seeking an agreement.

To find out whether toddlers actually control for something
called an agreement would require some pretty ingenious
experiments. I don't think they've been done yet.

Second pass:

You propose that the appearance of word-contrast, syllable-
contrast, etc. is a byproduct of phoneme recognition.

That's not my proposal, although I believe that what you say is
true. I do think that words are recognized as functions of sets
of phonemes and phoneme transitions -- although not any functions
that anybody has been able to define yet. But that wasn't my
point.

I'm saying that what is perceived is a syllable or a word, not a
contrast. We might try to explain the fact that an input function
responds specifically to a given word or a given syllable by
looking for a contrast with other words or syllables, but in PCT
terms that would be, or could be, a mistake. Two input functions
respond differently to inputs because they compute different
functions of the inputs. When one input combination occurs, one
function produces a larger perceptual signal than any other
(ideally). The functions do not respond by reporting "I perceive
a contrast." If that were so, they would all respond the same
way. Each input function responds or doesn't respond, or responds
more or less.
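
A toy sketch of input functions that each compute a different function of the same inputs. The weights, the three lower-level signals, and the word labels are all hypothetical; a weighted sum stands in for whatever the real functions are.

```python
def make_input_function(weights):
    """Return a function computing a weighted sum of lower-level signals."""
    def perceive(inputs):
        return sum(w * x for w, x in zip(weights, inputs))
    return perceive

# Hypothetical word-level input functions over three lower-level signals
input_functions = {
    "dog": make_input_function([1.0, 0.0, 0.2]),
    "dig": make_input_function([0.0, 1.0, 0.2]),
}

def perceptual_signals(inputs):
    """Each function responds more or less; none reports a 'contrast'."""
    return {word: f(inputs) for word, f in input_functions.items()}

def strongest(inputs):
    signals = perceptual_signals(inputs)
    return max(signals, key=signals.get)

print(perceptual_signals([0.9, 0.1, 0.5]))
print(strongest([0.9, 0.1, 0.5]))   # dog
```

No pairwise comparison between words is computed anywhere; the difference in response comes entirely from the different functions applied to the inputs.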

Also, when you speak of contrasts, which contrast do you mean?
Every possible discriminable segment differs from all other
discriminable segments, simultaneously. If you have 3000
discriminable syllables or words in the working set, you have
4,498,500 dyadic contrasts in the set. Does each perceptual
function have to search through 2999 contrasts to decide what
syllable or word is being heard? Is a person using a working
vocabulary of 3000 words actually working with 4 million
contrasts?
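
The counts above can be checked directly; the number of dyadic contrasts among n items is n(n-1)/2.

```python
import math

vocabulary_size = 3000

# Number of unordered pairs among 3000 items
dyadic_contrasts = math.comb(vocabulary_size, 2)

# Contrasts involving any one item
contrasts_per_item = vocabulary_size - 1

print(dyadic_contrasts)     # 4498500
print(contrasts_per_item)   # 2999
```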

The problem with perception of anything in terms of contrasts is
really a logical one: this would mean that you couldn't perceive
anything unless something contrasting were simultaneously
present. You couldn't hear "dog" unless someone were
simultaneously saying "dig" or "dug" or "Dag." The fact that you
_can_ find contrasts between such words doesn't mean that any one
of them is recognized because of contrasts. The contrasting words
are not present at the same time. How can you perceive a contrast
between one perception?

I think it quite likely that there are phoneme-recognizers tuned
to respond to the same phonemes that a person recognizes. I also
think that words are probably detected by input functions that
receive many phoneme signals and compute functions of them that
are word-perceptions. The negative results from linguistic
searches for invariants simply show that the wrong invariants
have been tested.

Suppose that the second proposal is true. Then it follows that
one should be able to determine the contrasts between words in
a language by examining the acoustic record of utterances in
the language, or by examining a trace of articulatory movements
used in producing them.

Neither of these conclusions is right unless you add a rider:
unless we use the right function of the acoustic record obtained
from the right instrument, and unless we examine the right
function of measures of articulatory movements. You have to put
these measures through the right perceptual functions before you
can get any indication of "contrasts." Using the wrong perceptual
function (or just the raw data itself) will just yield mush.
There's no reason to think that the weightings given to such
measures, or even the unweighted measures themselves, will prove
relevant to the problem. Why are sound spectrographs used? Not
because there's any reason to think they represent sound the same
way the ear does; they probably don't. They're used primarily
because somebody invented them and nobody invented anything else.
When it was found that you can't obtain phoneme invariants from
those instruments, somebody should have concluded that they were
the wrong instruments, or that their output was being wrongly
transformed, instead of concluding that such invariants don't
exist.

Generations of linguists worked very hard and with great skill
at the problem of representing unambiguously the appearance of
word contrast in various languages. The aim is a set of
graphical cues from which anyone could produce variable
repetitions of words as a native speaker does, rather than
imitations of the original pronunciations of which the
graphical cues are a representation.

Generations of psychologists have worked very hard and with great
skill (or at least persistence) at the problem of showing what
stimuli give rise to what responses. So what? If you're using the
wrong model, all the results mean something other than what you
think they mean, if they mean anything at all. Anyway, who gets
to judge how skillful the linguists were? Other linguists?

I'm not disputing the data. I'm just suggesting that the term
"word contrast" is probably the wrong way to think about them --
like looking at all the valleys and not seeing the hills that
create them (or vice versa). The data might well contain
evidence about human perceptual functions. It would be nice to
see PCT being applied to them, instead of conventional concepts.

I think that if you really want to do PCT and not just use PCT
translations of existing linguistic concepts, you have to start
looking for alternative explanations of the data. I keep trying,
but you're the one with the data.

Martin Taylor (930609 11:10) --

Nice observation about control systems conserving resources.

There's a related evolutionary reason for control being the
organization of choice. Natural selection takes place through
effects of the environment on organisms. Any organism capable of
controlling some of these effects would be less likely to be
selected out than an organism which had no control over them.
Ergo, there is a strong evolutionary bias in favor of creating
control systems.

Well, in a very sketchy and somewhat sleepy way that more or less
catches me up with some of the posts I've been ignoring, so I can
get back to Version 4. As my boss Allan Hynek once said to Dick
Aikens and me when we were in the middle of explaining some
project to him in his office, "Keep talking, I'll be right
back."

Best to all,

Bill P.