information blah blah perception blah blah disturbance blah...

[From Bill Powers (960617.1500 MDT)]

Martin Taylor 960617 15:45 --

What was said in 1992 seems to be mostly a matter of what you now think
were your most important points then, and thus what you select from the
stream of communications to show that you were right all along. One
great difficulty is that your arguments then did not impress me any more
than they do now, yet you cite them as if they settled the matter once
and for all -- and as if I had agreed that they did. Let's just look at
a few of them.

     In fact, [the discussion] arose out of my 921218 analysis of the
     informational basis of PCT:

     "The central theme of PCT is that a perception in an ECS should be
     maintained as close as possible to a reference value. In other
     words, the information provided by the perception, given knowledge
     of the reference, should be as low as possible."

This is not and never was the central theme of PCT as I see it. The
central theme of PCT is that organisms control their perceptions by
acting on the environment. How well they control them depends on the
parameters of the control system. There is no "should" involved.
Organisms control as well as they control, neither better nor worse.

I have no idea what you mean by saying "the information provided by the
perception, given knowledge of the reference". Information and knowledge
are not the same thing, and anyway what is there in a control system
that can evaluate the information in a perception, with or without
"knowledge" of the reference? You're talking gobbledygook.

     Later, dogmatic assertions were made that there is no information
     about the disturbance in the perceptual signal, assertions that we
     proved false, using experimental simulations agreed to be effective
     for the purpose.

They were not adequate for the purpose except in your own mind. You and
Randall agreed they were effective; Rick and I did not. In fact there is
no way to tell what the disturbing variable is from knowledge of the
variables in the control loop (perception, reference, error, output) or
from the forms of the functions in the loop (perceptual, comparison,
output, feedback). The reason is very simple: exactly the same
perturbation of the loop can arise from an infinity of different
disturbing variables acting (singly or together) through an infinity of
different disturbance functions.

In your first demonstration, you employed a step-disturbance acting
through a unity disturbance function. This led to a step-change in the
perceptual signal, which you then assumed represented the true
disturbance. But it did not. The same step-change in the perceptual
signal could have been created by an infinity of different disturbances
acting through different disturbance functions. There is no possibility
that one could work backward from knowledge of the perceptual signal to
deduce the nature of an unknown disturbing variable or variables acting
via unknown functions. There is simply no information (in any sense of
the word) about the disturbing variables in the perceptual signal.

Every now and then you seem to wake up and say "Oh, OF COURSE there is
no information about the CAUSE of a perturbation in the perceptual
signal. How could you ever have thought I would suggest such a silly
thing? Please read what I say and you will not attribute such foolish
ideas to me." And then you turn right back to the same theme and claim,
as above, that there really is information in the perceptual signal
about the disturbance, and that you proved it.

I can't account for this except by guessing that you are shifting
meanings of "disturbance" between one set of statements and the other.
One sense refers to the proximal perturbation of the input to the
perceptual function that results from whatever distal disturbing
variables happen to be acting. The other sense (which I always mean by
"the disturbance") refers to the changes in the distal disturbing
variables themselves. Your statement about information in the perceptual
signal about "the disturbance" cannot apply to the distal disturbing
variable. It applies trivially to the proximal variable, because the
proximal variable is exactly what we mean by a CEV. To say that the
perceptual signal contains information about the state of the CEV is a
tautology, because that relationship defines the nature of the
perceptual input function. As I tried to point out four years ago, if y
is the sum of a, b, c ... d, then there is no way to work backward from
knowledge of y to the state of a, b, c, and so on. You could have
exactly the same value of y arising from an infinity of combinations of
a, b .. d. A control system can control y if it can vary one of the
variables on which y depends. To do so, it does not need to know
anything about the states of the other variables on which y depends.
Nothing. NADA.
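
The additive point can be checked in a few lines of Python. This is only a
sketch: the component values, the names y_of and control_y, the gain and the
step count are all invented for illustration and are not from the original
exchange. It shows three unrelated sets of components giving the same y, and
a loop that holds y at a reference by varying one component while being
given nothing but y.

```python
# Illustrative sketch only; every name and number here is an assumption.

def y_of(components):
    """y is the plain sum of whatever variables contribute to it."""
    return sum(components)

# Three unrelated sets of contributing variables, all summing to the same y.
combos = [
    [3.0, -1.0, 2.0, 0.0],
    [0.5, 0.5, 0.5, 2.5],
    [10.0, -20.0, 7.0, 7.0],
]
ys = [y_of(c) for c in combos]

def control_y(others, reference=0.0, gain=0.5, steps=200):
    """Hold y near `reference` by varying one component, `a`.

    The controller line uses only the current value of y; it never
    looks at the individual entries of `others`.
    """
    a = 0.0
    for _ in range(steps):
        y = a + sum(others)           # the environment forms y
        a += gain * (reference - y)   # the controller acts on the error in y
    return a + sum(others)

final_ys = [control_y(c) for c in combos]
print("y for each combination:", ys)
print("y after control for each combination:", final_ys)
```

Knowing y alone, the three combinations cannot be told apart, yet the loop
brings y to the reference in every case.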

     At least, they were agreed to be effective until the results showed
     the dogma to be false. Then, and only then, were irrelevant
     objections raised.

This somewhat scurrilous allegation rests on our initial difference in
conceiving the conditions of the "challenge." Rick and I were assuming
that you would be given only the state of the perceptual signal. You
then proceeded to use your own assumptions about the forms of all the
functions, including the disturbing function, and the values of all the
variables and signals, including the reference signal, to deduce the
only remaining unknown, the disturbing variable.

Rick sent you some lists of numbers on several occasions, representing
the state of the perceptual signal in a working control model, and
challenged you to deduce the behavior of the disturbing variable from
knowledge of the behavior of the perceptual signal (I see that he is
offering to do this again). If the perceptual signal had contained
information about the disturbance, you should have been able to use that
information to deduce the behavior of the disturbance. Obviously, you
could not do this. Rick's challenge should have been completely
sufficient to show you how we conceived of the challenge in general.
What you did was to permit yourself to use all kinds of knowledge that
Rick and I were ruling out. Our objections were quite relevant to our
understanding of the phrase "information in the perceptual signal about
the disturbance."

You citing you:
     The fact that the fixed functions were the output function and the
     feedback function of the control loop is neither here nor there.
     The fact that they don't vary as a function of the waveform of the
     disturbance is what matters. The only varying item used was the
     perceptual signal.

You citing me:

You forgot to mention the form of the input function, the function
relating the disturbing variable to the controlled variable, and the
setting of the reference signal, all of which you must also know.

You now:
     And could you now, after three years of consideration, tell me
     which of these varies in a manner coordinated with variations in
     the disturbing influence on the CEV? If you can correctly assert
     that any one of these contains information about the fluctuations
     of the disturbance, then and only then can you criticize the
     demonstration experiment and the derived conclusion.

Wait a minute. You're saying that I can't criticize your experiment and
its conclusion if I can't correctly assert that any variable or function
in the control loop but the perceptual signal "varies in a manner
coordinated with variations in the disturbing influence on the CEV." If
I've untangled this set of nested negatives correctly, you're saying
that the perceptual signal _does_ vary in a "coordinated" way (whatever
that means) with the disturbing influence.

But this is exactly what I am trying to tell you is your primary
mistake. The perceptual signal does NOT vary in a way that correlates
with any particular disturbing variable. At one moment there might be a
single disturbing variable acting through a simple linear function; at
the next there might be twelve disturbing variables acting through a set
of functions ranging from square to square root to exponential. The
control system will behave no differently in any case. It simply senses
the controlled variable and acts according to deviations of its
perception from the momentary setting of the reference signal.
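
A rough simulation sketch of this point follows. Everything in it is an
assumption made for illustration: the integrating output function, the gain,
the time step, and the particular waveforms and disturbance functions. One
environment has a single disturbing variable acting through a linear
function; the other has three variables acting through a square, a square
root and a linear function. The waveforms are deliberately chosen so that
the net effect on the controlled variable is the same in both.

```python
import math

def influence_many(t):
    """Three hypothetical disturbing variables, three different functions."""
    b1 = math.sin(0.05 * t)                # acts through a squaring function
    b2 = 1.0 + 0.5 * math.cos(0.02 * t)    # acts through a square root
    b3 = 0.01 * t                          # acts through a linear gain of -0.3
    return b1 ** 2 + math.sqrt(b2) - 0.3 * b3

def influence_single(t):
    """One hypothetical disturbing variable acting through a gain of 2.

    Its waveform is picked so that its net effect equals influence_many.
    """
    a = 0.5 * influence_many(t)
    return 2.0 * a

def run_loop(influence, gain=20.0, dt=0.01, steps=3000, reference=0.0):
    """Integrating control loop; returns the (perception, output) trace."""
    o = 0.0
    trace = []
    for k in range(steps):
        p = o + influence(k * dt)          # perception of the controlled variable
        trace.append((p, o))
        o += gain * (reference - p) * dt   # output integrates the error
    return trace

trace_single = run_loop(influence_single)
trace_many = run_loop(influence_many)
print("traces identical:", trace_single == trace_many)
```

The loop has no term that refers to the number of disturbing variables or
to the form of their functions; only their summed effect enters.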

Furthermore, given complete knowledge of everything in the control loop,
but not of the environment beyond the input quantities themselves, you
could certainly deduce the state of a hypothetical disturbing variable
based on assuming a hypothetical disturbance function. But this would be
a complete fiction; it would not be a "reconstruction" of the true
disturbing variable. Your chances of guessing correctly what the actual
number of disturbances is, and what their individual waveforms are, and
how each one is linked to have an effect on the controlled variable, are
essentially zero. And the control system can't do this, either.

     But (as I said those long years ago as well), is it not absurd to
     ask the control system, which has but a single scalar value for its
     perceptual signal, to _know_ (perceive, understand,...) anything
     other than the value of the CEV. Is it not a red herring to suggest
     that anything in the discussion hinges on this absurdity?

I use "know" in a loose way, to be sure. I say that a system "knows"
about something outside it if there is a variable inside the system that
covaries with the external something. A photocell "knows" about light
intensity, but not about color. In a simple control system, the only
"knowledge" that exists is the perceptual signal. And it is "knowledge"
only in the sense that it represents the value of a function of some set
of input quantities.

Since this is the only knowledge that the system itself has, it is
absurd to say (as you have said) that the system "uses information" that
is "contained in" the perceptual signal. All the control system needs is
the perceptual signal itself. It does not have to perform any operations
to detect or manipulate measures of information. So who is being absurd
here?

     You should be stating that "as the precision of opposition to the
     disturbance increases, so the information about the disturbance
     remaining in the perceptual signal decreases" and then you would
     see it as a perfectly straightforward, self-evident proposition, in
     place of a paradox contrary to reason.

But that is contrary to the idea that the control system uses the
information in the perceptual signal to construct an output that
precisely opposes the effects of the disturbance on the input quantity.
The paradox lies in claiming that control -- the precise opposition to
the effects of an unknown disturbing variable or variables -- relies on
information in the perceptual signal, and also to say that the better
the control, the less information there is in the perceptual signal. In
the limit, according to this way of looking at the system, control would
be perfect if there were NO information in the perceptual signal. But in
that case, what would be the basis for constructing the output?

···

---------------------------------
Well, let's move on.

     Firstly, consider a predictable world. PCT is not necessary,
     because the desired effects can be achieved by executing a
     prespecified series of actions.

I thought this was silly in 1992 and I still do. If the world is
predictable, this does not mean that any organism is capable of
predicting it. Furthermore, as I pointed out back then, even if the
world is predictable, a control system is still the fastest and least
complex way to control it. Suppose the muscles were calibrated perfectly
and the organism somehow could carry out the calculations necessary to
generate the muscle tensions required to produce any position of the
limbs. Yes, in principle one could do an open-loop calculation involving
all the inverse kinematics and dynamics, but at what cost? Probably a
large portion of the brain would have to be devoted to performing this
calculation over and over in real time. But the same result can be
achieved, for all practical purposes, using a few very simple negative
feedback control systems which do only a few elementary calculations. So
even in a perfectly predictable world, the control system is still the
system of choice. To say that the world is predictable is not to say
that it is simple or that a given organism is capable of predicting it.
Your assumption is not tenable. Unfortunately, you insist that it is
correct, and go on from there.

     At the other extreme, consider a random world, in which the state
     at t+delta is unpredictable from the state at t. PCT is not
     possible. There is no set of actions in the world that will change
     the information at the sensors.

There is no information at the sensors. Information, as you have said a
number of times, depends on the nature of the receiver. It does not
exist independently in the environment. If the receiver is monitoring
the mean noise level of the sensor signals, acting at random can raise
or lower that noise level, since random acts imposed on a random world
will add in quadrature to the net effect. Control would still be
possible, if not very useful.
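
The "add in quadrature" remark can be illustrated numerically. This is a
sketch under assumed conditions (independent zero-mean Gaussian fluctuations,
with standard deviations chosen arbitrarily); none of the numbers come from
the original discussion. It compares the measured RMS level of the combined
signal with the quadrature sum of the two separate levels.

```python
import math
import random

random.seed(0)                      # fixed seed so the run is repeatable
N = 100_000
WORLD_SD = 1.0                      # assumed noise level of the random world
ACTION_SD = 0.75                    # assumed noise level of the random acts

world = [random.gauss(0.0, WORLD_SD) for _ in range(N)]
action = [random.gauss(0.0, ACTION_SD) for _ in range(N)]
combined = [w + a for w, a in zip(world, action)]

def rms(xs):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

predicted = math.hypot(WORLD_SD, ACTION_SD)   # sqrt(WORLD_SD**2 + ACTION_SD**2)
measured = rms(combined)
print("world alone:", rms(world))
print("predicted combined level:", predicted)
print("measured combined level:", measured)
```

Changing ACTION_SD moves the combined level up or down, which is the sense
in which a receiver monitoring mean noise level could still be affected by
random acts.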

     Now consider a realistic (i.e. chaotic) world.

Fine. But you are assuming at this point that PCT would not be necessary
in a predictable world, which is false. That vitiates the strength of
this orderly argument. You are equating "predictable" with "simple" or
"understandable." In fact, you are attributing predictableness to the
environment, as if it were a property of the environment and not a
function of the organism's capacities to predict.

     At time t one looks at the state of the world, and the
     probabilities of the various possible states at t+delta are thereby
     made different from what they would have been had you not looked at
     time t. If one makes an action A at time t, the probability
     distributions of states at time t+delta are different from what
     they would have been if action A had not occurred, and moreover,
     that difference is reflected in the probabilities of states of the
     sensor systems observing the state of the world. Action A can
     inform the sensors. PCT is possible.

When you start talking like a quantum physicist you lose me. This whole
way of dealing with phenomena strikes me as awkward and ugly. And
anyway, I don't have to follow your arguments any further, since you
have made a basic mistake in saying that in a predictable world, PCT
would not be necessary.
----------------------------------
     Things become more interesting when we go up a level in the
     hierarchy. Now we have to consider the source of information as
     being the error signals of the lower ECSs, given that the higher
     level has no direct sensory access to the world

Not the error signals: the perceptual signals. These are not the same
thing, even though you try to make them the same:

     Even though the higher ECSs may well take as sensory input the
     perceptual signals of the lower ECSs, nevertheless the information
     content (unpredictability) of those perceptual signals is that of
     the error, since the higher ECSs have information about their
     Actions (the references supplied to the lower ECSs) just as the
     lower ones have information about their Actions in the world.

This is patching up your argument as you go. The error is the difference
between the reference signal and the perceptual signal. If the higher
system is in the imagination mode, it is not receiving the perceptual
signal. If it is in the action mode, it is not receiving a copy of its
own output. When you try to design a system that can operate in both
modes at once, you run into all sorts of problems. But I don't expect
that such niggling details will deflect you.

     (Unexpected events provide moments of high information content, but
     they can't happen often, or we are back in the uncontrollable
     world.)

So you are still assuming that disturbances have to be predictable for
control to work?

     What does this mean? Firstly, the higher ECSs do not need one or
     both of high speed or high precision. The lower ECSs can take care
     of things at high information rates, leaving to the higher ECSs
     precisely those things that are not predicted by them--complexities
     of the world, and specifically things of a KIND that they do not
     incorporate in their predictions. In other words, the information
     argument does not specify what Bill's eleven levels are, but it
     does make it clear why there should BE levels of the hierarchy that
     have quite different characteristics in their perceptual input
     functions.

If information theory could really, out of its own premises, come up
with these predictions, that would be impressive. But it can't because
it didn't. You're solving a problem to which you already know the
answer, and throwing in all the assumptions needed to make your
"prediction" come out right. Those assumptions are not contained in
information theory. What does information theory have to say about
"kinds" of perceptions? Nothing.
---------------------------
Another item

     In your comment, you take it to refer to how a functioning ECS is
     to be designed, and that the perceptual bandwidth should be low.
     If the perceptual bandwidth is low, then the ECS will have
     difficulty matching the perceptual signal to the reference signal,
     and thus the error signal will have high information content.

First, I have never said that the perceptual bandwidth should be low.
It is what it is. And second, if the perceptual bandwidth is low,
the ECS will have an easier time in matching the perceptual signal to
the reference signal, and the error signal, in your parlance, will have
a low information content. Your deduction here is exactly the opposite
of what would happen. Of course if the reference signal varied rapidly,
the error signal would also vary rapidly and contain more information --
but why would a reference signal from a higher, slower system vary more
rapidly than the perceptual signal of a lower, faster system?

     Now it is true that if the perceptual signal has lower bandwidth
     than the reference signal and the same resolution, then the error
     signal will in part be predictable, thus having lower information
     content than would appear on the surface. But I had the
     presumption that we are always dealing with an organism with high
     bandwidth perceptual pathways, so I forgot to insert that caveat.

By your argument, a completely random error signal would have the lowest
predictability of all, and thus contain the most information. But so
what? The control system would not work with a random error signal.

     Well, given last year's experience, I didn't expect my information-
     theory posting to be understood, and I wasn't disappointed in my
     expectation. Is it worth trying some more?

No, it is not. You don't have a clear and rigorous argument that can be
built up from basic principles without any outside assumptions to carry
you across the rough spots. If you knew what you were talking about, you
would be able to explain it clearly.
------------------------------
Lastly:

     The situation is different if we take a full-blooded outside view
     of the action of a CEV. It is from this kind of view that we argue
     that the disturbance provides information that passes through the
     perceptual signal to the output signal. From the outside we can
     see the disturbing variable do whatever it does to affect the CEV,
     and we can see the ECS modifying its output to bring the perceptual
     signal back to its controlled value. From outside we can see the
     reference signal of the ECS changing, and the output changing to
     move the CEV so that the perceptual signal comes to its new
     controlled value. From outside, the arguments about there being no
     information from the disturbance in the perceptual signal lose
     their force.

So from the outside view, it is the information from the disturbance
that passes through the perceptual signal to the output signal, with the
result of modifying the output to bring the perceptual signal back to
its controlled value? This takes us back to the original information-in-
perception argument. If the information in the perception decreases as
the output comes to oppose the effects of the disturbance more
precisely, how can it be the information passing through the perception
to the output that is responsible for the increase in precision? Does
precision improve as the amount of information on which it is based
decreases? What you are saying may make perfect sense to you, but to me
it is nonsense.
--------------------
One more peanut:

     [Allan Randall 930325 12:40] to Rick Marken

     > >Are we also agreed that this disturbance, while defined in this
     > >external point of view, is nonetheless defined in terms of the
     > >CEV, which is defined according to the internal point of view?
     >
     > Say what? Why not just say CEV(t) = d(t) + o(t). If that's what
     > the above sentence means then I agree with it.

     The point is that the disturbance d(t), if separated out from o(t),
     is not a meaningful quantity to the ECS. It is meaningful only to
     the external observer. By drawing an arrow marked d(t) you are
     talking about something the ECS has no direct access to. From the
     perspective of the ECS, only the variation in the CEV matters. It
     cannot separate out its own output from the disturbance. On the
     other hand, this disturbance is defined in terms of the CEV, since
     only things in the world that affect the CEV can be said to be
     disturbance.

It is not the disturbance that is defined in terms of the CEV, but the
effect of the disturbance. As you say, all that matters is the value of
the CEV itself. Words like "meaningful" are just noises. Talking about
the ECS "having access to" something is just a noise. My whole point is
that the ECS does NOT have "access" to the disturbance d(t). Nor does it
have "access" to the form of the function relating d(t) to its effect on
the CEV. Nor is the linking function or the nature and number of d(t)
variables necessarily the same from one moment to the next.
-------------------------------
The basic problem in the "information about disturbance" argument is
that you keep forgetting that a given fluctuation of the CEV can be
produced by many different independent variables in the environment,
acting through many different paths, even from one moment to the next.
All your arguments are based on the (often apparently unconscious)
assumption that there is a _single_ disturbing variable acting through a
_known and invariant_ disturbance function on the CEV. When that
assumption is true, your conclusions follow trivially, but you are
dealing only with a special case set up to MAKE your arguments true. In
general, a control system _however intelligent and complex_ cannot know
what is causing a CEV to vary at any given time. All it can know -- that
is, all that can be represented by its perceptual signal -- is the
current state of the CEV. And that is all that it needs to know.
---------------------------------
     If Signal X matches the disturbance, the perceptual signal must be
     the route from which the mystery function M(r, p) gets the
     information about the disturbance. Right?

     Now let the function M be identical to O(R-P). Signal X will then
     be the negative of the output signal, which is the disturbance.
     The only question here is whether O(error) is a function or a
     magical mystery tourgoodie. I prefer to think we are dealing with
     physical systems, and that O is a function. Therefore, information
     about the disturbance is in the perceptual signal, and moreover, it
     is there in extractable form.

     QED.

See what I mean? This sloppy analysis omits two things: the form of the
function through which even a single disturbance acts on the CEV, and
the number of such functions with disturbing quantities operating
simultaneously. What you have shown is that if you assume a single
disturbance acting through a unity transfer function, you can deduce its
value from knowledge of all other signals and functions in the system.
Big surprise! But you have not shown that there is only one disturbance,
or that the form of the disturbance function is a simple multiplier of
1. You're in such a hurry to get to your triumphant "QED" that you
overlook an elementary omission in setting up your imaginary experiment.
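
The thought experiment and the objection to it can both be run as a small
simulation. This is a sketch under stated assumptions, not the program used
in the original exchange: the integrating output function, the gain, the
time step, the sinusoidal disturbing variable and the names run_loop and
signal_x are all hypothetical. Signal X is computed from the perceptual
signal and the reference alone by replaying the known output function and
negating the result.

```python
import math

GAIN, DT, STEPS, REFERENCE = 50.0, 0.01, 5000, 0.0   # assumed parameters

def run_loop(disturbance_function):
    """Integrating control loop with unity input and feedback functions.

    Returns the disturbing variable, its net influence on the CEV,
    and the perceptual signal.
    """
    o = 0.0
    d_trace, influence_trace, p_trace = [], [], []
    for k in range(STEPS):
        d = math.sin(0.5 * k * DT)             # the single disturbing variable
        influence = disturbance_function(d)    # its effect on the CEV
        p = o + influence
        d_trace.append(d)
        influence_trace.append(influence)
        p_trace.append(p)
        o += GAIN * (REFERENCE - p) * DT       # output function O
    return d_trace, influence_trace, p_trace

def signal_x(p_trace):
    """M(r, p): replay O on (r - p) and return the negative of its output."""
    o = 0.0
    xs = []
    for p in p_trace:
        xs.append(-o)
        o += GAIN * (REFERENCE - p) * DT
    return xs

def mean_abs_diff(xs, ys):
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Case 1: disturbance acts through a unity function.
d1, infl1, p1 = run_loop(lambda d: d)
x1 = signal_x(p1)

# Case 2: same disturbing variable, but acting through a gain of 3.
d2, infl2, p2 = run_loop(lambda d: 3.0 * d)
x2 = signal_x(p2)

print("unity function, X vs disturbing variable:", mean_abs_diff(x1, d1))
print("gain of 3, X vs net influence:", mean_abs_diff(x2, infl2))
print("gain of 3, X vs disturbing variable:", mean_abs_diff(x2, d2))
```

In this sketch Signal X tracks the net influence on the CEV in both cases;
it tracks the disturbing variable itself only when the disturbance function
is assumed to be unity.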

Enough. I'm just not up to following through all these arguments which
are made up on the spur of the moment to meet a particular case and then
forgotten about when the same principle comes up in a different context.
What I am hearing are arguments for the sake of arguing, for the sake of
appearing to win an argument. I've been picking holes in your arguments
for a good four years now, with no discernible effect. I know when I am
trying to alter a controlled variable that is being maintained by a
strong and active system, although I may be somewhat slow to admit that
I can't budge it.

This time I am going to stick to my oft-broken resolution: no more
participation in this line of discussion.
-----------------------------------------------------------------------
Best,

Bill P.

[Martin Taylor 960620 14:40]

Bill Powers (960617.1500 MDT)

I've been picking holes in your arguments
for a good four years now, with no discernible effect.

No you haven't. You've been putting arguments in my mouth about topics on
which I have said nothing, and knocking holes in your own arguments.

Very, very, early on, we tried to make clear that we UNDERSTOOD that no
ECS has any access to the disturbing VARIABLE, and that ALWAYS when we
used the word "disturbance" we were referring to what came to be known
as the "disturbing INFLUENCE." At no time have I or Allan or Jeff (or anyone
else of whom I am aware) believed that the ECS or any aspect of it can
have information about the disturbing variable. And yet, when you want
to show how silly I am, you always start by asserting that I'm trying to
demonstrate the absurd proposition that the perceptual signal contains
information about the disturbing variable. And of course, it's terribly
easy to knock holes in that proposition.

Even now, you can say, with no apparent tongue-in-cheek:

The basic problem in the "information about disturbance" argument is
that you keep forgetting that a given fluctuation of the CEV can be
produced by many different independent variables in the environment,
acting through many different paths, even from one moment to the next.

I find this more than somewhat frustrating, and more than a little bizarre.

···

------------------

What was said in 1992 seems to be mostly a matter of what you now think
were your most important points then, and thus what you select from the
stream of communications to show that you were right all along.

I selected, of course. But I tried hard to select fairly. One can't, of
course, be sure that one is successful in such an attempt, but I did try.
I didn't omit any statements that I came across that were to the point, on
either side of the issue.

    "The central theme of PCT is that a perception in an ECS should be
    maintained as close as possible to a reference value. In other
    words, the information provided by the perception, given knowledge
    of the reference, should be as low as possible."

This is not and never was the central theme of PCT as I see it. The
central theme of PCT is that organisms control their perceptions by
acting on the environment. How well they control them depends on the
parameters of the control system. There is no "should" involved.
Organisms control as well as they control, neither better nor worse.

Yes, and perhaps I should have requoted the discussion that followed
your similar interpretation of "should" at the time. But my posting was,
out of exasperation, far too long anyway. As we pointed out then, there
was, in "should", no moral connotation. It was and is a statement that
good control results in....

And I believe that the central theme of PCT is that perceptions are
_controlled_, and I think "controlled" means to try to keep the controlled
variable near its reference value. In other words to minimize the
information provided by the perception, given knowledge of the reference
value.
----------------

I have no idea what you mean by saying "the information provided by the
perception, given knowledge of the reference". Information and knowledge
are not the same thing, and anyway what is there in a control system
that can evaluate the information in a perception, with or without
"knowledge" of the reference? You're talking gobbledygook.

To you, gobbledygook. You refuse to accept the technical definitions of
information theory, so what else could it be but gobbledygook? And because
you refuse to accept them you necessarily have "no idea what [I] mean by
saying 'the information provided by the perception, given knowledge of the
reference'". A person who refused to accept the idea that a perception
could be a time-varying scalar value would find many statements about PCT
to be gobbledygook, too.

To be simple: "The uncertainty of the value of the perceptual signal, given
the historical waveform and current value of the reference, is less than
the uncertainty of the perceptual signal in ignorance of the history and
value of the reference, where uncertainty has the definition provided by
Claude Shannon."
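
In Shannon's notation the statement reads roughly as follows. The labels P
and R (perceptual and reference signals treated as discretized random
variables) are assumed here only for the sake of writing it down. The
standard inequality is "less than or equal to", with equality when P and R
are statistically independent.

```latex
H(P) = -\sum_{p} \Pr(p)\,\log_2 \Pr(p), \qquad
H(P \mid R) = -\sum_{r}\sum_{p} \Pr(r,p)\,\log_2 \Pr(p \mid r),

H(P \mid R) \;\le\; H(P), \qquad
I(P;R) \;=\; H(P) - H(P \mid R) \;\ge\; 0 .
```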
------------------------

In fact there is no way to tell what the disturbing variable is ...
The reason is very simple: exactly the same
perturbation of the loop can arise from an infinity of different
disturbing variables acting (singly or together) through an infinity of
different disturbance functions.

There you go, putting "disturbing variable" in place of what we agreed on
your deck in Durango should be called "disturbance influence." You knew
then, as you should have known earlier, and certainly should know now, that
we are well aware that they are not the same thing. For other readers, I
have often produced postings that point out the problems in using a single
scalar variable to carry information about a multitude of different things.
It's called "multiplexing" when you do that, and multiplexing has never
been an issue in the "information passed by perception about the disturbance"
discussion. There are other issues about multiplexing, as recent discussions
have brought out.

Every now and then you seem to wake up and say "Oh, OF COURSE there is
no information about the CAUSE of a perturbation in the perceptual
signal. How could you ever have thought I would suggest such a silly
thing? Please read what I say and you will not attribute such foolish
ideas to me." And then you turn right back to the same theme and claim,
as above, that there really is information in the perceptual signal
about the disturbance, and that you proved it.

Every now and then you seem to wake up and realize that I'm not talking
about disturbing variables EVER, unless I specifically say so, and never
have been. It was a shock to me the first time you accused me of that, and
it never ceases to amaze me that, like a rubber band stretched and released,
you come back to the same position again and again, telling me I believe in
something I have never thought and never would think.

Your statement about information in the perceptual
signal about "the disturbance" cannot apply to the distal disturbing
variable. It applies trivially to the proximal variable, because the
proximal variable is exactly what we mean by a CEV.

Not so trivially, because the state of the CEV is also influenced by the
output of the control system, and because if there is good control, the
perceptual signal will correlate _very poorly_ with the disturbing influence
on the CEV. You have chosen to see this low correlation as a paradox. So
be it. One day, I think you will have an epiphany (as these flashes of
insight seem to be called). Or maybe not.
----------------------

             ...using experimental simulations agreed to be effective
    for the purpose.

They were not adequate for the purpose except in your own mind. You and
Randall agreed they were effective; Rick and I did not.

They were adequate before the fact, inadequate after the fact. I quoted a
fairly complete portion of the relevant transcript, leaving out no caveats
posted by you or Rick. Check it out. I include the relevant portions
below.

This somewhat scurrilous allegation rests on our initial difference in
conceiving the conditions of the "challenge." Rick and I were assuming
that you would be given only the state of the perceptual signal.

Really? Oh, sorry. I must have been misreading our interchanges at the
time, and now as well. The following is a bit long, but I'll mark the
relevant bits with \\...// to make the reading easier.

*[Martin Taylor 930330 11:20]
*In my thought experiment, I will take the ECS, and add a simple function
*that takes as its input the reference signal to the ECS (which I think we
*can agree has no information about the disturbance) and the perceptual
*signal, which Rick CAPITALIZES as having no information about the disturbance.
*Let us see whether a function can be constructed that takes these two
*inputs and produces a signal that matches the disturbance. If so, I
*would consider it conclusive evidence that information about the disturbance
*is to be found in the perceptual signal.

*If Signal X matches the disturbance, the perceptual signal must be the
*route from which the mystery function M(r, p) gets the information about
*the disturbance. Right?
*
*Now \\let the function M be identical to O(R-P).//

Were you really assuming that we were NOT using the form of the output
function when you responded (Bill Powers 930331.1030)

The diagram you gave, below, won't work:

            ----> Signal X (which should match the disturbance)
           |
      mystery function M(r, p)
       ^         ^
       |         |
       |         |         V (reference signal R(t) into ECS)
       |         |         |
       |         <---------|
       |                   V
       |---------->comparator------- error = P-R
       |                         |
   perceptual                 output
    signal P(t)            function O(error)
       ^                         |
       |                         V
       |                   output signal
       |       (accepted as mirroring the disturbance)

If the reference signal is zero, the signal X won't mirror the
disturbance. In general, it won't.

To which I responded (before running the experiment):

*Have another look. There are two inputs to function M. But there is
*only one input to function O. The input to function O is (P-R).
*\\M is defined as equivalent to O// except that it incorporates (R-P)
*as a first stage. \\It is functionally identical to O(R-P).// If you
*want to be even more general about it, you can remove the requirement
*for odd symmetry in O, and define M as -O(P-R).

If you then assumed that we were not going to use the form of the output
function, you must have been indulging in some pretty imaginative reading
of the experimental description. And to continue from the same posting
of mine:

Notice that I never claimed O mirrors the disturbance. That's Rick's
claim (I don't remember you making it without the necessary simplifying
assumptions). The diagram and thought experiment is intended to show
the inconsistency of simultaneously maintaining the two claims:

(1) The output mirrors the disturbance,
(2) There is no information about the disturbance in the perceptual signal.

Now, how scurrilous of me was it to suggest that Rick also accepted the
conditions INCLUDING use of the output function:

Rick Marken (930331.0800)

If I understand the
claim, oft repeated, Rick means that no function that takes as input
(a) the perceptual signal and (b) any other signal that is agreed to have no
information about the disturbance can reconstruct the disturbance, but
that nevertheless the disturbance is mirrored in the output.

I'll buy it.

In my thought experiment, I will take the ECS, and add a simple function
that takes as its input the reference signal to the ECS (which I think we
can agree has no information about the disturbance) and the perceptual
signal, which Rick CAPITALIZES as having no information about the disturbance.

Excellent!
...

Let us see whether a function can be constructed that takes these two
inputs and produces a signal that matches the disturbance. If so, I
would consider it conclusive evidence that information about the disturbance
is to be found in the perceptual signal.

OK!

If Signal X matches the disturbance, the perceptual signal must be the
route from which the mystery function M(r, p) gets the information about
the disturbance. Right?

Right!! I completely agree with your proposal as diagrammed above.
\\I think a good first candidate for M(r,p) would be the function
O(r,p), right? Ah, I see you think so too//:

Now \\let the function M be identical to O(R-P)//. Signal X will then be the
negative of the output signal, which is the disturbance.

It is at this point that experience will triumph over the "obvious"
conclusions of your thought experiment. I think it's time to fire up
the simulator; really!

The only question
here is whether O(error) is a function or a magical mystery tour goodie.

Your magical mystery tour will really begin when you run the simulation!

And a right excellent proof i'tis. Now try the simulation.

And Bill Powers (930331.1430 MST):

Martin Taylor (930331) --

\\Rick is right. Simulate your proposed setup with Simcon and see
what happens. It is not what you say happens. The signal X does
not reproduce the waveform of the disturbance.//

And Rick Marken (930331.2100)

no information about the disturbance
can be recovered from the perceptual signal all by itself (\\or, as
you will see from the simulation, with the help of the output function//).
...
Your method \\does not use knowledge of o to extract
the information about d from p// -- so I consider it quite a fair method.
It \\does use information about the output function (that transforms error
into the variable that affects the perception)//. But I think you imagine
that it is the output function that extracts (recovers) the information
from p. In fact it doesn't -- but that's what the simulations will show.

Do you still claim that both Rick and you assumed that we would be given
ONLY the perceptual signal, and that I then "proceeded to use your own
assumptions about the forms of all the functions, including the disturbing
function, and the values of all the variables and signals, including the
reference signal, to deduce the only remaining unknown, the disturbing
variable."

If so, all I can say is: BIZARRE.

Then we come to the next day -- Allan Randall (930401.1700 EST):

I've done Martin's "Mystery function" experiment, ... It is
based on the Primer code of Bill Powers.
...
You can see that the function
Mystery() uses only the percept and reference. Note that the mystery
function produces the negative of the output (compare the qo and the qX
columns). On the assumption, which Rick Marken has agreed to, that the
output contains almost 100% of the information about the disturbance, then
the percept must also contain this information.
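For readers who want to try this kind of experiment themselves, here is a
minimal Python sketch. It is not Allan's program and not Bill's Primer code.
The integrating output function, the gain k, the constant zero reference, the
smoothed-noise disturbance and every name in it are assumptions chosen only
for illustration. The function mystery() is given nothing but the reference
and perceptual signals, plus the known form of the output function.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def simulate(steps=20000, k=0.5, alpha=0.02, seed=1):
    """Run an assumed control loop; return disturbance d, percept p,
    reference r and output o as lists."""
    rng = random.Random(seed)
    d_val, o_val = 0.0, 0.0
    d, p, r, o = [], [], [], []
    for _ in range(steps):
        # Slowly varying disturbing influence (low-pass filtered noise).
        d_val += alpha * (rng.gauss(0.0, 1.0) - d_val)
        p_val = o_val + d_val      # CEV = output influence + disturbing influence
        r_val = 0.0                # constant reference (assumption)
        d.append(d_val); p.append(p_val); r.append(r_val); o.append(o_val)
        o_val += k * (r_val - p_val)   # integrating output function (assumption)
    return d, p, r, o

def mystery(r, p, k=0.5):
    """M(r, p): the same integrator as the output function, sign inverted.
    It sees only the reference and the perceptual signal."""
    x_val, x = 0.0, []
    for r_val, p_val in zip(r, p):
        x.append(x_val)
        x_val -= k * (r_val - p_val)
    return x

d, p, r, o = simulate()
x = mystery(r, p)
print("corr(p, d) =", round(pearson(p, d), 3))
print("corr(o, d) =", round(pearson(o, d), 3))
print("corr(X, d) =", round(pearson(x, d), 3))
```

Under these assumptions Signal X is the negative of the output, so it
correlates with the disturbance exactly as strongly as the output does, while
the perceptual signal itself correlates with the disturbance far less.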

The result of this was Rick Marken (930401.1800):

I am going to continue to insist that there is no
information about the disturbance in controlled perception so I hereby
RECANT and REJECT any claim I ever made that the output has ANY (let alone
most of the) information about the disturbance.

Oh, well. Now a waveform that may have a correlation as high as -0.98 with
the disturbing influence is said to have no information about the disturbing
influence--but ONLY after the data came in and showed that Rick's and Bill's
predictions of the previous day, made knowing ALL the conditions of the
experiment, were false.

How scurrilous can one get?

As for the rest, see my message of today to Rick.

You're saying that I can't criticize your experiment and
its conclusion if I can't correctly assert that any variable or function
in the control loop but the perceptual signal "varies in a manner
coordinated with variations in the disturbing influence on the CEV." If
I've untangled this set of nested negatives correctly, you're saying
that the perceptual signal _does_ vary in a "coordinated" way (whatever
that means) with the disturbing influence.

I'm saying that if one can use some function that takes as input any number
of arguments, all but one (called X) of which are either fixed or are known
to vary independently of some waveform Q, and if by using that function with
those arguments one can construct a waveform correlated with Q, then input X
conveys information about Q.

In the case in question, X is the perceptual signal. The other arguments
that are fixed or are varied by influences independent of the disturbance
are the output/feedback function and the reference signal.

Since the result of applying the function is exactly as highly correlated
with the disturbance waveform as is the output influence on the CEV (i.e.
very highly if control is good), it is incumbent on you either to show
that the output/feedback function and the reference signal can be used
by themselves to reconstruct the disturbance waveform as accurately, or to
acknowledge that the perceptual signal carries information about the
disturbance waveform.

I don't think you will be able to find out much about the disturbing
waveform by looking _only_ at the reference signal and the form of the
output function.

If I've missed something in that logic, I'd love to know, because it looks
copper-bottomed and iron-clad to me.
---------------------

But this is exactly what I am trying to tell you is your primary
mistake. The perceptual signal does NOT vary in a way that correlates
with any particular disturbing variable. At one moment there might be a
single disturbing variable acting through a simple linear function; at
the next there might be twelve disturbing variables acting through a set
of functions ranging from square to square root to exponential.

There's no mistake here, except in your repetitive misstatement of my
thinking.

As I've many, many times pointed out (including above), that's entirely
outside the area under discussion. It's completely agreed, accepted, taken
as basic, understood, common background knowledge...

How often and in how many ways do I have to say it?

I use "know" in a loose way, to be sure. I say that a system "knows"
about something outside it if there is a variable inside the system that
covaries with the external something. A photocell "knows" about light
intensity, but not about color. In a simple control system, the only
"knowledge" that exists is the perceptual signal. And it is "knowledge"
only in the sense that it represents the value of a function of some set
of input quantities.

Since this is the only knowledge that the system itself has, it is
absurd to say (as you have said) that the system "uses information" that
is "contained in" the perceptual signal. All the control system needs is
the perceptual signal itself.

See my response of today to Rick.

It does not have to perform any operations
to detect or manipulate measures of information.

Quite true. But then neither does a bouncing ball need to detect or to
manipulate measures of mass or modulus of elasticity. Planets do not need
to detect or to manipulate measures of distance or mass, or of G.

    You should be stating that "as the precision of opposition to the
    disturbance increases, so the information about the disturbance
    remaining in the perceptual signal decreases" and then you would
    see it as a perfectly straightforward, self-evident proposition, in
    place of a paradox contrary to reason.

But that is contrary to the idea that the control system uses the
information in the perceptual signal to construct an output that
precisely opposes the effects of the disturbance on the input quantity.

So you keep asserting. But your assertion is the only thing that leads to
paradox. Let it go! Accept the result of the demonstration experiment as
being _data_ that allow you to see that there is no contradiction.

The paradox lies in claiming that control -- the precise opposition to
the effects of an unknown disturbing variable or variables -- relies on
information in the perceptual signal, and also to say that the better
the control, the less information there is in the perceptual signal. In
the limit, according to this way of looking at the system, control would
be perfect if there were NO information in the perceptual signal. But in
that case, what would be the basis for constructing the output?

I know this gives you difficulty. Let's try to ease it, by showing that the
"paradox" shows up in a domain with which you are intimately familiar.

Translate it back into non-informational terms. Firstly, look at the last
pair of sentences, and recast them as follows: "In the limit, control would
be perfect if there were no error. But in that case, what would be the basis
for constructing the output?"

Look at that reformulation. Do you see a flavour of going to the limit and
dividing zero by zero, or multiplying zero by infinity? With any finite
gain the disturbance does introduce error, but there's a factor 1/(1+G)
that determines the degree of control (ignoring issues of loop delay).
The disturbance influences the error very little if control is good, but
there's always _some_ error. And that error influences the output in such a
way as to reduce the error--but not to zero (except after infinite time if
the disturbance is infinitely stable and the output function is a perfect
integrator).
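The 1/(1+G) factor can be checked with a few lines of arithmetic. The sketch
below assumes the simplest possible loop, purely for illustration: a
proportional output function o = G*(r - p), a perceptual signal p = o + d,
and no loop delay. Solving those two equations gives error = (r - d)/(1+G).
The function name steady_state and the sample gains are hypothetical.

```python
def steady_state(d, r=0.0, gain=10.0):
    """Steady state of an assumed proportional loop:
    p = o + d and o = gain * (r - p), with no delay."""
    error = (r - d) / (1.0 + gain)   # the 1/(1+G) factor
    output = gain * error
    percept = output + d
    return percept, error, output

for gain in (1.0, 10.0, 100.0, 1000.0):
    percept, error, output = steady_state(d=1.0, gain=gain)
    print(f"G={gain:>7}: error={error:+.5f}  output={output:+.5f}  "
          f"percept={percept:+.5f}")
```

As G grows the error shrinks and the output approaches the negative of the
disturbance, but for any finite G the error is not zero.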

Try putting it backward. Indeed, if control were perfect, there would be
no error; and indeed, if control were perfect, there would be NO information
about the disturbance in the perceptual signal. But realistically, no control
is perfect, and there is error, and there is _some_ information--less of both,
the better the control.

Now go further back in your paragraph:

The paradox lies in claiming that control -- the precise opposition to
the effects of an unknown disturbing variable or variables -- relies on
information in the perceptual signal, and also to say that the better
the control, the less information there is in the perceptual signal.

The key, perhaps, is in that word "in"--"in the perceptual signal." Perhaps
the paradox might be easier to resolve if, for the moment, you think of
the perceptual signal as a channel rather than as a repository. The better
the control, the less has to come through the channel in order to compensate
for whatever error there might be at the moment. Some other device, such
as our "Magical Mystery Function" might use what comes through the
channel to produce a signal that has the information "in" it. And in the
real control system, it is the real output system (which we perceive as a
function) that does it.

----------------------

Well, let's move on.

Yes, let's.

    Firstly, consider a predictable world. PCT is not necessary,
    because the desired effects can be achieved by executing a
    prespecified series of actions.

I thought this was silly in 1992 and I still do. If the world is
predictable, this does not mean that any organism is capable of
predicting it.

By "predictable" one necessarily means "predictable by something or someone."
It seems unnecessarily redundant to say "predictable by the organism or
system of which we speak."

Furthermore, as I pointed out back then, even if the
world is predictable, a control system is still the fastest and least
complex way to control it.

Not an issue. I believe you, but regard the comment as irrelevant. If there's
more than one way to do a job, it is not _necessary_ to use the simplest
and quickest. It may be most sensible to do so, but it's not _necessary_.

[(1)] So
even in a perfectly predictable world, the control system is still the
system of choice. [(2)] To say that the world is predictable is not to say
that it is simple or that a given organism is capable of predicting it.
[(3)] Your assumption is not tenable.

These three sentences are non-sequiturs. I agree with (1) and regard it as
irrelevant to the discussion. I agree with (2), separately from my agreement
with (1), but regard it as not only irrelevant but in contradiction to the
(necessary but unspoken) assumption that we are talking about an organism
that _is_ capable of predicting, not of any arbitrary or simple organism.
I disagree with (3), on the grounds that I stated above: if there are several
ways to do something, it is not _necessary_ that the simplest and most
efficient be chosen.

I don't understand your comments on control in a truly random world, one in
which the state at time t+delta is unpredictable from the state at time t,
no matter how small the delta. I asserted that control in such a world would
not be possible, but you assert that it would be, stating:

If the receiver is monitoring
the mean noise level of the sensor signals, acting at random can raise
or lower that noise level, since random acts imposed on a random world
will add in quadrature to the net effect. Control would still be
possible, if not very useful.

The quadrature addition would, on average, drive the perceptual signal
away from its reference value, would it not? Square root of sums of squares,
and all that...
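A quick numerical check of the quadrature rule, as a sketch only. It assumes
the "world" and the "acts" are independent zero-mean Gaussian noise with
arbitrary standard deviations 3 and 4. Those numbers and all the names are
illustrative and come from nowhere in the original exchange.

```python
import math
import random

def rms(values):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

rng = random.Random(0)
n = 100_000
world = [rng.gauss(0.0, 3.0) for _ in range(n)]    # random world, sd 3 (assumed)
action = [rng.gauss(0.0, 4.0) for _ in range(n)]   # independent random acts, sd 4
combined = [w + a for w, a in zip(world, action)]

print("rms(world)    =", round(rms(world), 2))
print("rms(combined) =", round(rms(combined), 2))
print("quadrature    =", math.hypot(3.0, 4.0))     # sqrt(3**2 + 4**2)
```

With independent sources the combined noise level comes out near the square
root of the sum of squares, which is never below the level of the world
alone.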

    Now consider a realistic (i.e. chaotic) world.

Fine. But you are assuming at this point that PCT would not be necessary
in a predictable world, which is false.

Not false. PCT would be effective and useful in a predictable world, but
not necessary.

You are equating "predictable" with "simple" or "understandable."

You may be. I'm not.

In fact, you are attributing predictableness to the
environment, as if it were a property of the environment and not a
function of the organism's capacities to predict.

Actually, I'm not. It's a joint property, and at the extreme of a "predictable
universe" it is a universe predictable to whatever organism we are assuming,
because we are assuming that it can predict.

    At time t one looks at the state of the world, and the
    probabilities of the various possible states at t+delta are thereby
    made different from what they would have been had you not looked at
    time t. If one makes an action A at time t, the probability
    distributions of states at time t+delta are different from what
    they would have been if action A had not occurred, and moreover,
    that difference is reflected in the probabilities of states of the
    sensor systems observing the state of the world. Action A can
    inform the sensors. PCT is possible.

When you start talking like a quantum physicist you lose me.

Sorry. But it's kind of basic.

And
anyway, I don't have to follow your arguments any further, since you
have made a basic mistake in saying that in a predictable world, PCT
would not be necessary.

If that's the only reason, don't let it stop you. But even if what you say
were true, it shouldn't stop you because the only reason for mentioning a
predictable world is to place the real world in a continuum between two
impossible endpoints (but endpoints that other people do sometimes seem
to treat as if they represented the real world). Concentrate on the real
case, the chaotic world in which there is divergence over time--predictability
in randomness or vice-versa, if you like to think in those terms. In that
real world, PCT is both possible and (I think) necessary.
------------------------------

    Things become more interesting when we go up a level in the
    hierarchy. Now we have to consider the source of information as
    being the error signals of the lower ECSs, given that the higher
    level has no direct sensory access to the world

Not the error signals: the perceptual signals. These are not the same
thing, even though you try to make them the same:

Hang on. It's the perceptual signals that are passed up the levels of the
hierarchy, but to a particular ECU, the _information_ depends on the
difference between what it asks for and what it gets, which is most easily
seen as the error signals in the next lower level of the hierarchy. I know
that this is a bit sloppy, but I think it's better to look at it this way
than to go into the whole rigmarole at every level. It's better than a
first approximation.

    Even though the higher ECSs may well take as sensory input the
    perceptual signals of the lower ECSs, nevertheless the information
    content (unpredictability) of those perceptual signals is that of
    the error, since the higher ECSs have information about their
    Actions (the references supplied to the lower ECSs) just as the
    lower ones have information about their Actions in the world.

This is patching up your argument as you go.

No it's not. It's a self-evident proposition that really should not need to
be stated. But we are in that peculiar position where even the most obvious
facts seem to need restating, and one doesn't ever know which ones will be
seen to be obvious and which will be questioned.

    (Unexpected events provide moments of high information content, but
    they can't happen often, or we are back in the uncontrollable
    world.)

So you are still assuming that disturbances have to be predictable for
control to work?

And have _you_ stopped beating your wife?

Disturbances have to be somewhat predictable for control to work, in the
sense that the value of the disturbance at t + delta must usually approach
the value at t, as delta approaches zero. There would be no control if the
disturbance value at t+delta were unrelated to the value at t, for all delta.
But disturbances don't have to be predictable in the sense of being able to
determine from historic values of the disturbance what it will be for all
future values.

The issue is one of bandwidth. To put it into language you know, a control
system can effectively compensate only for disturbances whose bandwidth is
below a critical value. If the bandwidth is greater than 1/(2L), where L is
the effective loop delay, you are bound to get frequency components that
have a phase lag that leads to positive feedback. (I think that 1/(2L) is
right, but even if it isn't, there is a critical bandwidth for which the
statement is true.) Limited bandwidth means predictability.
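The arithmetic behind the 1/(2L) figure can be sketched as follows, treating
the loop delay as a pure time delay and ignoring any additional lag from the
output function, which is an assumption. A pure delay of L seconds shifts a
component of frequency f by 360*f*L degrees, so the shift reaches half a
cycle at f = 1/(2L). The function name and the sample delay of 0.1 s are
hypothetical.

```python
import math

def delay_phase_lag_deg(freq_hz, loop_delay_s):
    """Phase lag in degrees that a pure delay adds at a given frequency."""
    return 360.0 * freq_hz * loop_delay_s

loop_delay = 0.1                       # assumed effective loop delay, seconds
critical = 1.0 / (2.0 * loop_delay)    # frequency at which the lag is 180 degrees

for freq in (0.1 * critical, 0.5 * critical, critical):
    lag = delay_phase_lag_deg(freq, loop_delay)
    print(f"f = {freq:5.2f} Hz  ->  lag = {lag:6.1f} degrees")
```

A lag of 180 degrees inverts the sign of the feedback for that component.
Whether the loop actually misbehaves there also depends on the loop gain at
that frequency, which this sketch does not model.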

    What does this mean? Firstly, the higher ECSs do not need one or
    both of high speed or high precision. The lower ECSs can take care
    of things at high information rates, leaving to the higher ECSs
    precisely those things that are not predicted by them--complexities
    of the world, and specifically things of a KIND that they do not
    incorporate in their predictions. In other words, the information
    argument does not specify what Bill's eleven levels are, but it
    does make it clear why there should BE levels of the hierarchy that
    have quite different characteristics in their perceptual input
    functions.

If information theory could really, out of its own premises, come up
with these predictions, that would be impressive. But it can't because
it didn't. You're solving a problem to which you already know the
answer,

Can't help that. It's a matter of history. But it doesn't make the statements
false. They do fall out naturally, and it is not (so far as I am aware)
true that:

You're solving a problem to which you already know the
answer, and throwing in all the assumptions needed to make your
"prediction" come out right.

What does information theory have to say about
"kinds" of perceptions? Nothing.

Not in those words, no. What does Newtonian gravitational theory say about
planets? Nothing.

    In your comment, you take it to refer to how a functioning ECS is
    to be designed, and that the perceptual bandwidth should be low.
    If the perceptual bandwidth is low, then the ECS will have
    difficulty matching the perceptual signal to the reference signal,
    and thus the error signal will have high information content.

First I have never said that the perceptual bandwidth should be low.

No, you said that I said it should be, which is the opposite of what I
did say--that in a well-functioning control system, the information rate
about the disturbance through the perceptual signal should be low. The
bandwidth of the perceptual function should be as high as possible, to
minimize the effective loop delay, if for no other reason (and there are
lots of other reasons).

    Well, given last year's experience, I didn't expect my information-
    theory posting to be understood, and I wasn't disappointed in my
    expectation. Is it worth trying some more?

No, it is not. You don't have a clear and rigorous argument that can be
built up from basic principles without any outside assumptions to carry
you across the rough spots. If you knew what you were talking about, you
would be able to explain it clearly.

I can't speak to my clarity, but I reject the comment. And I do agree that
it isn't worth trying any more, for reasons I expressed to Rick. I think I
do know what I am talking about, but then how can I be sure? Can you?

Most of our discussions on CSGnet are fun. Wasting time on history is not.

Martin