Martin to Rick on Shannon

[Martin Taylor 921217 10:40]
(Rick Marken 921216 sometime)

I really can't let Rick's latest non-sequitur go unchallenged:

Bill P.'s
important observation was that perception is one link in a
causal loop in which we are all locked -- a loop that obviates
Shannon's concerns about reliable transmission since, in living
systems, perception is not the "start" of communication (or the
"end", for that matter); it is both the start and the end at the
same time.

Every piece of the loop is constrained by Shannon's
observations/theorems. If (like Rick) you don't understand them,
you are doomed not to understand PCT.

I know Rick has been out of town (L.A.) for a few days so he
probably has not seen your post. It probably will puzzle him as
much as it does me. In the past, both of us wondered how,
specifically, Shannon's ideas, or any of the major concepts from
information theory, would improve any of the quantitative
predictions we make with our simple PCT models.

May I repeat and elaborate on an invitation I made about a year
ago? I will send you one of my programs in which two systems (two
people, the two hands of one person, two models, or a person and a
model) interact and produce controlled relationships. The version
I will send includes a procedure that uses data from a first run to
calculate values of parameters for two PCT models. Then the
program calculates predictions of performance during a second run
under altered conditions. Correlations between the predicted and
actual values of all variables reach or exceed the now-familiar
.99+ and rms error in the predictions is low.
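
For anyone who wants to see the shape of the procedure, here is a minimal
sketch in Python. It is not the actual program: the leaky-integrator loops,
the coarse grid-search fit, and all parameter values are illustrative
stand-ins, and the "run 1" data here are generated rather than recorded
from a person.

import random, math

def disturbance(steps, seed, smooth=0.97, scale=0.05):
    # Slowly varying disturbance: smoothed white noise.
    rng, d, v = random.Random(seed), [], 0.0
    for _ in range(steps):
        v = smooth * v + scale * rng.gauss(0.0, 1.0)
        d.append(v)
    return d

def run(gains, slows, refs, dists, steps):
    # Two coupled leaky-integrator loops.  Loop i perceives the relationship
    # (own handle - other handle) + disturbance_i and keeps it near refs[i].
    x, trace = [0.0, 0.0], ([], [])
    for t in range(steps):
        p = (x[0] - x[1] + dists[0][t], x[1] - x[0] + dists[1][t])
        for i in range(2):
            e = refs[i] - p[i]                          # error signal
            x[i] += (gains[i] * e - x[i]) / slows[i]    # leaky integration
            trace[i].append(x[i])
    return trace

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return num / math.sqrt(sum((u - ma) ** 2 for u in a) *
                           sum((v - mb) ** 2 for v in b))

steps, refs = 3000, [2.0, -2.0]
noise = random.Random(0)

# "Run 1": stand-in for the recorded data, generated with parameters the
# fitting procedure does not know, plus a little observation noise.
d1 = [disturbance(steps, 1), disturbance(steps, 2)]
data1 = [[v + noise.gauss(0.0, 0.01) for v in tr]
         for tr in run([3.0, 2.0], [15.0, 15.0], refs, d1, steps)]

# Fit a gain for each loop and a shared slowing factor by coarse grid search.
best = None
for g0 in (1.0, 2.0, 3.0, 4.0):
    for g1 in (1.0, 2.0, 3.0, 4.0):
        for s in (10.0, 15.0, 20.0):
            m = run([g0, g1], [s, s], refs, d1, steps)
            rms = math.sqrt(sum((a - b) ** 2 for tm, td in zip(m, data1)
                                for a, b in zip(tm, td)) / (2 * steps))
            if best is None or rms < best[0]:
                best = (rms, [g0, g1], [s, s])

# "Run 2": new disturbances.  Predict the handle traces from the fitted
# parameters alone, then compare with what the "real" systems did.
d2 = [disturbance(steps, 3), disturbance(steps, 4)]
data2 = [[v + noise.gauss(0.0, 0.01) for v in tr]
         for tr in run([3.0, 2.0], [15.0, 15.0], refs, d2, steps)]
pred2 = run(best[1], best[2], refs, d2, steps)
print("run-1 fit rms:", round(best[0], 4))
print("run-2 correlations:",
      [round(pearson(p, d), 3) for p, d in zip(pred2, data2)])

With these settings the run-2 correlations should come out near 1, which is
the flavour of the comparison described above.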

What I am about to say is a serious offer. It is not intended to
be malicious or deceptive -- some reviewers of "Models and Their
Worlds" believed Bill and I had bad intentions when we made a
similar offer.

I invite you to add to the PCT model any features of information
theory that you believe must be there. If necessary, use features
from information theory to replace those from PCT. If your changes
improve the predictions by the model, there will be no argument and
no complaint: You will have demonstrated that a person who does
not understand Shannon or information theory does not understand
PCT. Please, invite participation by any of your colleagues who
are interested. All of us are trying to better understand the
phenomenon of control. If a better understanding comes from a
model different from the present version of PCT, that is fine.
Until later,

Tom Bourbon e-mail:
Magnetoencephalography Laboratory TBOURBON@UTMBEACH.BITNET
Division of Neurosurgery, E-17 TBOURBON@BEACH.UTMB.EDU
University of Texas Medical Branch PHONE (409) 763-6325
Galveston, TX 77550 FAX (409) 762-9961 USA

···

From: Tom Bourbon (921218 14:38 CST)
Subject: Re: Rick on (off) Shannon

[Martin Taylor 921218 18:30]
(Tom Bourbon 921218 14:38)

May I repeat and elaborate on an invitation I made about a year
ago? I will send you one of my programs in which two systems (two
people, the two hands of one person, two models, or a person and a
model) interact and produce controlled relationships. ...

I guess I'd better try to describe, as I did a year or so ago, wherein
information theory helps in the understanding of PCT. I didn't succeed
in getting across then, and I'm not sure I'll do any better now. I should
think that the prediction for your proposed system would be no better and
no worse than you would get without it, because you are dealing with a
transparent system of one control level. The understanding you get with
information theory is not at the level of setting the parameters.

If I were to try to develop a model to make predictions in your experiment,
I expect it would look essentially identical to yours, because the key
elements would be the gain and delays in the two interacting loops.

Now consider the interchanges of a week or two ago about planning and
prediction, continued in Bill's post of today to Allan Randall. In those,
the situation is greatly different. The information required from the
lower level for the upper level to maintain control through a hiatus
in sensory acquisition depends greatly on the accuracy of control maintained
at the lower level. Where does that come from, and where does it go?

We come back to the fundamental basis of PCT. Why is it necessary, and is
it sufficient? Let's take two limiting possibilities for how a world might
be. Firstly, consider a predictable world. PCT is not necessary, because
the desired effects can be achieved by executing a prespecified series of
actions. No information need be acquired from the world. From the world's
viewpoint, the organism is to some extent unpredictable, so the organism
supplies information to the world. How much? That depends on the probabilities
of the various plans as "perceived" by the world.

At the other extreme, consider a random world, in which the state at t+delta
is unpredictable from the state at t. PCT is not possible. There is
no set of actions in the world that will change the information at the
sensors.

Now consider a realistic (i.e. chaotic) world. What does that mean? At
time t one looks at the state of the world, and the probabilities of the
various possible states at t+delta are thereby made different from what
they would have been had one not looked at time t. If one makes an action
A at time t, the probability distributions of states at time t+delta are
different from what they would have been if action A had not occurred, and
moreover, that difference is reflected in the probabilities of states of
the sensor systems observing the state of the world. Action A can inform
the sensors. PCT is possible.

In a chaotic world, delta matters. If delta is very small, the probability
distribution of states at t+delta is tightly constrained by the state at t.
If delta is very large, the probability distribution of states at t+delta is
unaffected by the state at t (remember, we are dealing with observations and
subjective probabilities, not frequency distributions--none of this works
with frequentist probabilities; not much of anything works with frequentist
probabilities!). Information is lost as time goes by, at a rate that can
be described, depending on the kinds of observations and the aspect of the
world that is observed.
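
A small numerical illustration of the delta-dependence, with the logistic
map standing in for "the world" (the map, the epsilon, and the cloud sizes
are arbitrary choices, not part of the argument): an observation at time t
pins the state down to a small interval, and the spread of that interval
after delta steps grows until it covers essentially the whole range.

import random

def logistic(x, steps):
    # Fully chaotic logistic map, x_{n+1} = 4 x_n (1 - x_n).
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
    return x

rng = random.Random(0)
eps = 1e-6   # initial uncertainty: the observation pins x0 down to +/- eps

for delta in (1, 5, 10, 15, 20, 25, 30):
    spreads = []
    for _ in range(200):
        x0 = rng.uniform(0.1, 0.9)
        cloud = [logistic(x0 + rng.uniform(-eps, eps), delta) for _ in range(50)]
        spreads.append(max(cloud) - min(cloud))
    mean_spread = sum(spreads) / len(spreads)
    print(f"delta={delta:2d}  mean spread of the initial cloud: {mean_spread:.6f}")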

The central theme of PCT is that a perception in an ECS should be maintained
as close as possible to a reference value. In other words, the information
provided by the perception, given knowledge of the reference, should be as
low as possible. But in the chaotic world, simple observation of the CEV
provides a steady stream of information. The Actions must provide the same
information to the world, so that the perception no longer provides any
more information. Naturally that is impossible in detail, and the error
does not stay uniformly zero. It conveys some of the information inherent
in the chaotic nature of the world, though less than it would if the Actions
did not occur. The Action bandwidth determines the rate at which information
can be supplied to the world; the nature of the physical aspect of the world
being affected, and the delta t between Action and sensing the affected CEV,
determine the information that will be given to the sensors (the unpredicted
disturbances, in other words); and the bandwidth of the sensory systems
determines how much information can be provided through the perceptual
signal. Any one of these parts of the loop can limit the success of
control, as measured by the information contained in the error signal.
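
Here is a rough sketch of that last point: one loop against a slowly
drifting disturbance, with the output "slowing" standing in for a limited
Action bandwidth. The entropy figure is only a Gaussian proxy for the
information left in the error signal, and all the numbers are illustrative.

import random, math

def disturbance(steps, seed, smooth=0.99, scale=0.2):
    rng, d, v = random.Random(seed), [], 0.0
    for _ in range(steps):
        v = smooth * v + scale * rng.gauss(0.0, 1.0)
        d.append(v)
    return d

def error_stats(gain, slow, steps=20000):
    # One loop: perception p = output + disturbance, reference 0.
    # 'slow' limits how quickly the output can change -- a crude stand-in
    # for the Action bandwidth.
    d = disturbance(steps, seed=7)
    o, errs = 0.0, []
    for t in range(steps):
        e = 0.0 - (o + d[t])
        o += (gain * e - o) / slow
        errs.append(e)
    mean = sum(errs) / len(errs)
    var = sum((x - mean) ** 2 for x in errs) / len(errs)
    # Gaussian differential-entropy proxy for the information still carried
    # by the error signal.
    return var, 0.5 * math.log2(2.0 * math.pi * math.e * var)

d = disturbance(20000, seed=7)
print("no control   : disturbance variance", round(sum(x * x for x in d) / len(d), 4))
for slow in (200.0, 50.0, 15.0):   # smaller 'slow' = faster, wider-band output
    var, bits = error_stats(gain=20.0, slow=slow)
    print(f"slow = {slow:5.0f} : error variance {var:8.5f}, entropy proxy {bits:6.2f} bits")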

So far, the matter is straightforward and non-controversial, I think.
Think of a set of orbits diverging in a phase space. The information
given by an initial observation is represented by a small region of
phase space, as compared to the whole space. After a little while, the
set of orbits represented by the initial uncertainty has diverged, so the
uncertainty has increased. Control is to maintain the small size, which
means to supply information to the world.

Things become more interesting when we go up a level in the hierarchy.
Now we have to consider the source of information as being the error
signals of the lower ECSs, given that the higher level has no direct
sensory access to the world, and that all lower ECSs are actually controlling
(both restrictions will be lifted later, especially the latter). Even
though the higher ECSs may well take as sensory input the perceptual
signals of the lower ECSs, nevertheless the information content
(unpredictability) of those perceptual signals is that of the error, since
the higher ECSs have information about their Actions (the references supplied
to the lower ECSs) just as the lower ones have information about their
Actions in the world. The higher ECSs see a more stable world than do
the lower ones, if the world allows control. (Unexpected events provide
moments of high information content, but they can't happen often, or we
are back in the uncontrollable world.)

What does this mean? Firstly, the higher ECSs do not need high speed,
high precision, or both. The lower ECSs can take care of things at
high information rates, leaving to the higher ECSs precisely those things
that the lower ones do not predict--complexities of the world, and specifically
things of a KIND that the lower ECSs do not incorporate in their predictions.
In other words, the information argument does not specify what Bill's eleven
levels are, but it does make it clear why there should BE levels of the
hierarchy that have quite different characteristics in their perceptual
input functions.
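
A toy two-level sketch of that claim (the gains, slowing factors, and the
20-step update interval for the higher ECS are arbitrary): the lower loop
absorbs the fast disturbance, and a higher loop that acts only occasionally,
by adjusting the lower reference, still keeps its slower variable on target.

import math, random

rng = random.Random(3)
steps = 6000
# Fast disturbance on the controlled quantity; slowly drifting target.
d_fast, v = [], 0.0
for _ in range(steps):
    v = 0.9 * v + 0.3 * rng.gauss(0.0, 1.0)
    d_fast.append(v)
target = [3.0 * math.sin(2.0 * math.pi * t / 2000.0) for t in range(steps)]

o1, r1 = 0.0, 0.0
low_err, high_err = [], []
for t in range(steps):
    # Lower ECS, updated every step: keeps q = o1 + d_fast near the
    # reference r1 handed down from above.
    q = o1 + d_fast[t]
    e1 = r1 - q
    o1 += (20.0 * e1 - o1) / 15.0
    low_err.append(e1)
    # Higher ECS, updated only every 20 steps: keeps (q - target) near 0
    # by adjusting r1.  It can afford to be slow and coarse because the
    # lower loop has already absorbed most of the fast disturbance.
    if t % 20 == 0:
        r1 += 0.5 * (target[t] - q)
    high_err.append(q - target[t])

rms = lambda xs: (sum(x * x for x in xs) / len(xs)) ** 0.5
print("rms fast disturbance  :", round(rms(d_fast), 2))
print("lower-level rms error :", round(rms(low_err), 2))
print("higher-level rms error:", round(rms(high_err), 2))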

It is that kind of thing that I refer to as "understanding" PCT, not the
making of predictions for simple linear phenomena. Linear models are
fine when you have found the right ECS connections and have plugged in
model parameters. I am talking about seeing why those models are as they
are. Look, for example, at the attention and alerting discussions, which
come absolutely straight from the Shannon theory. But the results of the
(almost) a priori argument agree with the (despised) results of experiments
in reading that I discussed in our 1983 Psychology of Reading. Had I known
about PCT then, I could have made a much stronger case than I did, but only
because of Shannon. The whole notion of Layered Protocols in intelligent
dialogue depends on Shannon, and demonstrates the impossibility of simple
coding schemes (which some people have claimed as the basis of Shannon
information).

I invite you to add to the PCT model any features of information
theory that you believe must be there. If necessary, use features
from information theory to replace those from PCT.

Have I done that to your satisfaction?

If your changes
improve the predictions by the model, there will be no argument and
no complaint: You will have demonstrated that a person who does
not understand Shannon or information theory does not understand
PCT.

An electricity meter reader does not need to understand the principles
of electromagnetism to get an accurate meter reading. This challenge
is misdirected. If there are places where I think the prediction would
be improved, they are likely to be structural, such as in the division
of attention, monitoring behaviour or some such. What should be improved,
in general, is understanding, not meter reading.

I said that if you don't understand Shannon, you won't understand PCT. I
didn't say you won't be able to use PCT to make predictions.

···

================
Having said all that, I might soften a bit, and take the analogy of the
use of the mathematically ideal observer in psychophysics. The ideal
observer is assumed to take whatever information is in principle available
in the signal, and to use it to determine whether or not some specified
class of event has occurred. Trained psychoacoustic listeners often perform
rather like an ideal observer who is presented with a signal some 3 or 4 dB
weaker than the actual one. We say they are within 3 or 4 dB of ideal.

Now we take the observer and add or eliminate possibilities for getting
information about the event. For example, we may let the observer know
what the waveform of the event would be if the event actually occurred,
by presenting it to the other ear. They now approach the performance of
an ideal observer who knows the waveform. Can they do this if the "cue"
is delayed? It depends on how much delay, and by looking at the performance
over a variety of delays, we can tell something about what information
the real observers are losing. Is it phase, frequency, or amplitude?
Change the cue and do some more variation, and determine what the ideal
observer might be capable of doing if it lacked this or that kind of
information.
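
For the record, the arithmetic behind "3 or 4 dB of ideal" for a signal
known exactly in Gaussian noise (the energy and noise values below are
arbitrary; only the ratios matter):

import math

def dprime_ideal(E, N0):
    # Ideal observer for a signal known exactly in Gaussian noise.
    return math.sqrt(2.0 * E / N0)

E, N0 = 1.0, 0.05          # illustrative signal energy and noise power density
ideal = dprime_ideal(E, N0)

for dB_short in (3.0, 4.0):
    # A listener who performs like the ideal observer given a signal
    # dB_short weaker effectively uses E * 10**(-dB_short/10) of the energy.
    human = dprime_ideal(E * 10 ** (-dB_short / 10.0), N0)
    eta = (human / ideal) ** 2
    print(f"{dB_short:.0f} dB from ideal -> d' {human:.2f} vs {ideal:.2f}, "
          f"efficiency {eta:.2f}")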

By analogy, it might be possible to make ideal controller predictions,
given different kinds of disturbance or sensory prediction aids. An
example comes to mind (not of an ideal controller). A submarine reacts
very slowly to changes in its control surfaces, but in stable water there
are reasonably simple algorithms to determine where it will be if nothing
changes in the control surfaces over the next few minutes (the chaotic world
doesn't provide much information under these circumstances). So it is
possible to make a display that shows a line indicating where the submarine
will go if the steerer does nothing. If that's where it should go, fine.
Otherwise the steerer moves the controls until the line goes where the
submarine should go. But there are currents and so forth (the world provides
information), so the submarine does not go where the line said it would.
However, the line still predicts where it would NOW go if nothing happens,
so the helm can still control that future position to some extent. How
far should the line go? That depends on the information rate of the world.
If the currents are swirling and unpredictable, probably it should not go
very far, but if they are steady, they provide little information, and the
line can compensate.
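
A crude closed-loop sketch of the predictor display, assuming a first-order
"submarine" and a sinusoidal current (none of the dynamics or numbers are
meant to be realistic): the steerer acts on the error between the desired
depth and the predicted depth, and the prediction deliberately ignores the
current, so the world still feeds information into the loop.

import math

def step(z, w, u, dt, current=0.0):
    # Sluggish boat: the vertical rate w approaches the planes' command u slowly.
    k, tau = 1.0, 40.0
    w += (k * u - w) / tau * dt
    z += (w + current) * dt
    return z, w

def predicted(z, w, u, horizon, dt):
    # The display line: where the boat will be if the planes stay where they
    # are and the water stays still (the prediction ignores unknown currents).
    for _ in range(int(horizon / dt)):
        z, w = step(z, w, u, dt)
    return z

dt, horizon = 1.0, 120.0
z, w, u = 0.0, 0.0, 0.0
desired = 50.0
for t in range(1200):
    current = 0.05 * math.sin(2.0 * math.pi * t / 600.0)   # unmodelled current
    # The steerer controls the PREDICTED depth, not the present one:
    # adjust the planes until the line points at the desired depth.
    u += 0.002 * (desired - predicted(z, w, u, horizon, dt))
    z, w = step(z, w, u, dt, current)
print("final depth:", round(z, 1), "  (desired:", desired, ")")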

Too long. I must go home. I hope that this has been helpful.

Martin

Re: [Martin Taylor 921218 18:30]
Replying to (Tom Bourbon 921218 14:38)
On: Martin to Rick on Shannon

Make a simple offer ...!

···

From: Tom Bourbon (921221 10:15 CST)

********************************************
   Portions of Tom's original message to Martin:
May I repeat and elaborate on an invitation I made about a year
ago? I will send you one of my programs in which two systems (two
people, the two hands of one person, two models, or a person and a
model) interact and produce controlled relationships. ...

I invite you to add to the PCT model any features of information
theory that you believe must be there. If necessary, use features
from information theory to replace those from PCT.

If your changes improve the predictions by the model, there will be
no argument and no complaint: You will have demonstrated that a
person who does not understand Shannon or information theory does
not understand PCT.

********************************************
   Martin to Tom, after lengthy discussion:
Have I done that to your satisfaction?

********************************************
   Tom to Martin (in the present):
No.

********************************************
   More from earlier sections of Martin's reply to Tom:
I guess I'd better try to describe, as I did a year or so ago,
wherein information theory helps in the understanding of PCT. I
didn't succeed in getting across then, and I'm not sure I'll do any
better now. I should think that the prediction for your proposed
system would be no better and no worse than you would get without
it, because you are dealing with a transparent system of one
control level. The understanding you get with information theory
is not at the level of setting the parameters.

If I were to try to develop a model to make predictions in your
experiment, I expect it would look essentially identical to yours,
because the key elements would be the gain and delays in the two
interacting loops.

********************************************
   Tom to Martin (in the present):
All you need do in the case of coordination between two systems is
show me how, working back from Shannon's principles, you end up
with two interacting PCT systems, each with the features Bill
postulated for a single system. If the models that emerge from
Shannon's principles are identical to those presently envisioned in
PCT, your point is made. But I expect to sit down in front of a
PC, grasp a control device, and interact with a Shannon-system in
real time, with results at least as good as those when I coordinate
with a PCT model. And when, without warning to my virtual partner,
I alter my intentions in mid-run, I expect the Shannon-system to do
the things another person or a PCT model would do. If the
situation for modeling coordination is all that transparent, the
task of working from first principles and creating the Shannon-
controller should not be terribly difficult.

As for the single control level, that is merely the form in which
I have published on interacting systems: In several other programs
my interacting systems are hierarchical (two levels). In ARM, Bill
and Greg have programmed a much more elaborate coordinated system,
with a hierarchy of PCT loops and with loops in parallel.

********************************************
    Martin continuing his reply to Tom:
An electricity meter reader does not need to understand the
principles of electromagnetism to get an accurate meter reading.
This challenge is misdirected. If there are places where I think
the prediction would be improved, they are likely to be structural,
such as in the division of attention, monitoring behaviour or some
such. What should be improved, in general, is understanding, not
meter reading.

I said that if you don't understand Shannon, you won't understand
PCT. I didn't say you won't be able to use PCT to make
predictions.

********************************************
   Tom to Martin (in the present):
Let me be sure I follow you correctly: Not only do PCT modelers
not understand PCT (your original claim); now modelers are akin to
meter readers and have no need to understand PCT.

********************************************
     More from Martin's reply to Tom (after a long discussion):
It is that kind of thing that I refer to as "understanding" PCT,
not the making of predictions for simple linear phenomena. Linear
models are fine when you have found the right ECS connections and
have plugged in model parameters. I am talking about seeing why
those models are as they are.

********************************************
   Tom to Martin (in the present):
Neither the model of the control system nor the environmental
phenomena with which it interacts need be linear. Bill has
published and posted on introducing nonlinearity into the PCT model
and into the environment. So has Rick. I haven't, but I have
tested the effects of nonlinearities in the coordinated systems:
The models continued to function at the same level of realism. I
will try to put together a post on that topic, in the style of my
post a few days ago on adding disturbances to various signals in
the control system.
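
As one small illustration of the point (a sketch with arbitrary parameters,
not Bill's or Rick's published demonstrations), the same leaky-integrator
loop controls about as well whether the feedback path is linear or strongly
nonlinear:

import random, math

def track(feedback, steps=20000, seed=5):
    # Compensatory tracking: perception = feedback(output) + disturbance,
    # reference 0; the loop itself is identical in both runs.
    rng, v, o, errs = random.Random(seed), 0.0, 0.0, []
    for _ in range(steps):
        v = 0.95 * v + 0.1 * rng.gauss(0.0, 1.0)
        e = 0.0 - (feedback(o) + v)
        o += (10.0 * e - o) / 20.0
        errs.append(e)
    return math.sqrt(sum(x * x for x in errs) / steps)

print("rms error, linear feedback   :", round(track(lambda o: o), 4))
print("rms error, nonlinear feedback:", round(track(lambda o: o + 0.5 * o ** 3), 4))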

In the meanwhile, I wonder why so many people continue to assert
that PCT models are necessarily linear and cannot explain and
predict events when there are nonlinearities in the system or the
environment. Where do these ideas come from? Why won't they go
away? (Dennis Delprato: If you are starting a collection of false
assumptions and assertions about PCT, this certainly is one. We
should compare collections -- mine goes back a few years.)
Everyone who clings to that assumption should read Bill's
"spadework" paper in Psych. Review (1978 -- 14 years ago folks)
where he discussed various blunders in the history of cybernetics.
That is also where he quantitatively demonstrated the ease with
which a PCT model maintains control in the presence of
nonlinearities.

********************************************
    Martin replying to Bill Powers' reply to Martin's reply to Tom:
(TB: What tangled webs we weave ...!)
[Martin Taylor 921218 19:45]
Bill to Tom Bourbon (921218.1438 CST) --

********************************************
   Bill quoting Tom:
In the past, both of us wondered how, specifically, Shannon's
ideas, or any of the major concepts from information theory, would
improve any of the quantitative predictions we make with our simple
PCT models.

********************************************
   Bill replying to Tom:
This is the right question about information theory -- not "does it
apply?" but "what does it add?" The basic problem I see is that
information theory would apply equally well to an S-R model or a
plan-then-execute cognitive model -- there's nothing unique about
control theory as a place to apply it. Information theory says
nothing about closed loops or their properties OTHER THAN what it
has to say about information-carrying capacity of the various
signal paths.

********************************************
   Martin replying to Bill:
You are right, but that "OTHER THAN" is a pretty big place to hide
very important stuff. I had not previously realized that you
wanted me to use Shannon to differentiate between S-R and Plan-
then-execute. I think I did incidentally make that discrimination
in my posting in response to the same posting by Tom. At least I
think I showed how applying Shannon demonstrated that neither S-R
nor Plan-then-execute could be viable. But we knew that already,
so I didn't play it up.

********************************************
   Tom to Martin (in the present):
Perhaps Bill and I both missed it, but I did not see where you used
Shannon to demonstrate that neither S-R nor Plan-then-execute models
would be viable. Your remark leads me to think that you also used
Shannon to demonstrate that a PCT model would be viable. Was that
the case?

********************************************
   More of Bill replying to Martin's reply to Tom:
In all these cases, something has to be added to get a workable
system. And I don't think that this something comes from the
abstract principles involved, however convincingly one can prove
that they apply.

********************************************
   Martin replying to Bill:
Yes, that's absolutely right. Natural laws are no use without
boundary conditions to describe particular situations. But if you
understand the abstract principles, you can make better
[bridges/kettles/radios/control systems].

********************************************
   Tom to Martin (in the present):
Your concluding remark brings us back to where we started. That is
what I asked you to do in my original post: Use Shannon's natural
laws to build a better PCT model of coordination among independent
systems acting in an environment where conditions change in ways
the systems cannot predict. Do that and let me interact with one
of the systems in real time, then I might change my mind.

Until later,

Tom Bourbon e-mail:
Magnetoencephalography Laboratory TBOURBON@UTMBEACH.BITNET
Division of Neurosurgery, E-17 TBOURBON@BEACH.UTMB.EDU
University of Texas Medical Branch PHONE (409) 763-6325
Galveston, TX 77550 FAX (409) 762-9961
USA

[Martin Taylor 921221 15:30]
(Tom Bourbon 921221 10:15)

  Tom to Martin (in the present):
All you need do in the case of coordination between two systems is
show me how, working back from Shannon's principles, you end up
with two interacting PCT systems, each with the features Bill
postulated for a single system. If the models that emerge from
Shannon's principles are identical to those presently envisioned in
PCT, your point is made.

Is that what you want? I had misinterpreted you to mean that you wanted
numerical predictions that came from a different source but were as good
as your predictions. I said that was an inappropriate challenge because
your model would be at least as good, coming as it does from a presumably
correct structure, as would a model of the same structure derived from an
information-theoretic background.

The challenge you now pose is worth trying. And it might help shed some
light on the arguments that were going on a month or two ago between Bill
and Greg about social control.

···

=============

   Martin replying to Bill Powers' reply to Martin's reply to Tom:
(TB: What tangled webs we weave ...!)

It happens when we get a discussion of more than two people. But no-one here
is trying to deceive, as far as I can see. If they are, it's pretty well
done... I much prefer multi-party discussions to discussions by X + Bill P.

  Tom to Martin (in the present):
Perhaps Bill and I both missed it, but I did not see where you used
Shannon to demonstrate that neither S-R nor Plan-then-execute models
would be viable. Your remark leads me to think that you also used
Shannon to demonstrate that a PCT model would be viable. Was that
the case?

Yes. And in trying your challenge, I suspect I will have to go over
that same ground again, so maybe I will be able to make it intelligible.

What I think I will have to do is to write what amounts to a serious
paper rather than a set of impromptu come-backs on-line. I have not
seriously considered the two-person version in light of information
theory hitherto, so it will take a little thought to get it right. I
was dealing only with why PCT was both necessary and (usually) sufficient
for a living organism. ("Usually", because accidents can kill.) Now
we must deal with another controller. I don't know whether this will
change the analysis or not.

I said that if you don't understand Shannon, you won't understand
PCT. I didn't say you won't be able to use PCT to make
predictions.

********************************************
  Tom to Martin (in the present):
Let me be sure I follow you correctly: Not only do PCT modelers
not understand PCT (your original claim); now modelers are akin to
meter readers and have no need to understand PCT.

I should know better than to tease people with serious agendas (can
one pluralize a plural noun?).

But yes, when the meter has been designed and built, the meter reader
doesn't need to know how Maxwell's equations work. When Bill has
designed the PCT structure, it doesn't take genius to fill in the
parameters without understanding the beauty and power of the theory.

I retract my original statement, which was mainly intended to provoke
Rick (he provokes me often enough that I felt entitled). There are
many kinds of understanding, and one approach will suit one person while
being quite obscure to another person. For me, Shannon theory explains
unambiguously why PCT works and why higher levels work more slowly on
average than lower levels, etc., etc. It provides a rationale based on
fundamental facts of nature, rather than on the finding that PCT works, or
that we can see how evolution could have produced control hierarchies.

I prefer (Occam's razor) things that are consequences of other things we
know about nature to things that stand off on their own, needing
new basic foundations. It is in this sense that I say that one needs
Shannon if one is to understand PCT. But there may be other ways of tying
PCT to fundamental natural laws. If so, I would say that one could not
understand PCT without understanding those ties, either.

======================

  Tom to Martin (in the present):
Neither the model of the control system nor the environmental
phenomena with which it interacts need be linear. Bill has
published and posted on introducing nonlinearity into the PCT model
and into the environment. So has Rick. I haven't, but I have
tested the effects of nonlinearities in the coordinated systems:
The models continued to function at the same level of realism.

Mea culpa. As I replied today to Bill, my problem was more one of sloppy
wording than of poor understanding about nonlinearity. You challenged
me with a linear problem, so I was thinking in those terms. As you
can tell from reading that and other postings, I know very well that
nonlinearity is essential in the input (at least of a multilevel
control system) and not very relevant in the output.

In the meanwhile, I wonder why so many people continue to assert
that PCT models are necessarily linear and cannot explain and
predict events when there are nonlinearities in the system or the
environment. Where do these ideas come from? Why won't they go
away?

I can guess. It is because every formal didactic presentation uses linear
equations to describe the behaviour of an ECS. The methods used to
solve the equations and demonstrate control would not work in a non-linear
system. It is a natural step, for someone who is introduced to PCT through
the equations, to believe that PCT works only in linear systems. The
ideas you get first are the hardest to dispel, but demos and the like
can help get rid of this one.
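
For concreteness, the sort of linear ECS relations such presentations use,
which have the closed-form solution a nonlinear system would not (the gain,
reference, and disturbance values here are arbitrary):

# Textbook linear ECS in steady state: p = o + d, e = r - p, o = G * e.
# Linearity lets these be solved algebraically, which is why didactic
# presentations use them:  e = (r - d) / (1 + G),  p = (G*r + d) / (1 + G).
G, r, d = 100.0, 2.0, 5.0
e = (r - d) / (1.0 + G)
p = (G * r + d) / (1.0 + G)
o = G * e
assert abs((o + d) - p) < 1e-9 and abs((r - p) - e) < 1e-9
print(f"error e = {e:.4f}, perception p = {p:.4f} (reference r = {r})")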

I'll think about your challenge.

Martin