prediction and other cognitive processes

[From Bill Powers (960625.0830 MDT)]

Bruce (Rat Master) Abbott (960624.1900 EST) --

     As I understand information theory (and I'm no expert, but have
     read some elementary explications of it), it merely provides a way
     to quantify the ability of a system ("observer") to predict the
     value of the signal at some time t. For example, if a signal can
     take only one of two values, then without any further knowledge of
     the signal, I know that at time t it may have either one of those
     values (and no other), but I do not know which value it will be.
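
A small Python sketch of the quantity being described here (the
probability values are made-up assumptions): the observer's uncertainty
about a two-valued signal is at most one bit, and it shrinks as the two
values become unequally likely.

    # Sketch: uncertainty (in bits) about a signal that can take one of
    # two values, given the observer's probability estimate for the first.
    import math

    def binary_entropy(p):
        if p in (0.0, 1.0):
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    print(binary_entropy(0.5))   # 1.0 bit: no idea which value will occur
    print(binary_entropy(0.9))   # ~0.47 bits: much less uncertainty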

But why should the "system" generate a prediction of the signal at some
time t? Is there anything in a control process that REQUIRES this
prediction to be made? If the prediction is made ("at t = 0.5 seconds
from now, the signal is going to be 12.7 units"), what will the system
then do with this prediction, at the time it is made?

I'm not denying that some living systems sometimes make predictions;
they do. Making a prediction is a specific, complex, cognitive process.
It requires a high-level control system that can sample inputs at
intervals, fit a curve to the samples, and extrapolate to some future
expected value. The remainder of the system may then calculate some
action that is to be taken just before the time the predicted value is
due to occur, and then wait for its occurrence to produce the action.
We've all done things like this -- getting out your wallet, for example,
when you're second in line at the ticket counter.
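
That sample-fit-extrapolate sequence can be sketched in a few lines of
Python (the sampled values and the choice of a quadratic fit are purely
illustrative assumptions, not a model of how the nervous system does it):

    # Sketch: sample an input at intervals, fit a curve to the samples,
    # and extrapolate to an expected future value.
    import numpy as np

    t = np.arange(0.0, 2.0, 0.1)            # sample times (seconds)
    samples = 3.0 * t + 0.5 * t**2          # sampled input values (made up)

    coeffs = np.polyfit(t, samples, deg=2)  # fit a quadratic to the samples
    predicted = np.polyval(coeffs, 2.5)     # extrapolate to t = 2.5 s
    print(f"expected value at t = 2.5 s: {predicted:.2f}")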

But my point is that this sort of behavior is very complex, requiring
many control systems and considerable computing ability. I wouldn't have
any objection to speaking about making predictions if this were the sort
of thing that is meant. I have difficulty, however, when some simple
low-level function is spoken of as if it is carrying out this complex,
high-level kind of process. A perceptual signal, for instance, might
consist of components proportional to the actual value and the first
derivative of the sensed variable. This is sometimes spoken of as
"anticipation" or "prediction". There is some analogy with actual
anticipation or prediction, but it is only an analogy; an accurate
description would be p = k1*v + k2*(dv/dt).

The problem with using loose analogies with higher-level processes is
that an analogy carries with it other characteristics of the higher-
level process that are inappropriate. In the rate-plus-proportional
sensor, there is no involvement of a "future event." The system does not
"plan what to do when the predicted event occurs." The control behavior
is simply based on p = k1*v + k2*(dv/dt) as it stands in the current
instant. The controlled variable is k1*v + k2*(dv/dt). In fact, when the
future does arrive at time t + dt, the "prediction" is always wrong,
because by that time the influence of the output on the input and the
states of external disturbances have changed. It is always too late to
act on the basis of the "prediction."
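
A minimal Python sketch of such a loop (the gains, disturbance, and time
step are made-up values) shows the point: p = k1*v + k2*(dv/dt) is computed
and acted on in the current instant; no future value is stored or waited
for.

    # Sketch: control of a rate-plus-proportional perception.
    k1, k2 = 1.0, 0.05          # proportional and rate weights
    gain, dt = 10.0, 0.01       # output gain and time step
    reference, disturbance = 10.0, 5.0
    output, v_prev = 0.0, 0.0

    for step in range(500):
        v = output + disturbance        # sensed variable (unity feedback)
        dv = (v - v_prev) / dt          # first-derivative estimate
        p = k1 * v + k2 * dv            # the perceptual signal, right now
        error = reference - p
        output += gain * error * dt     # integrating output function
        v_prev = v

    print(round(v, 3), round(p, 3))     # both settle near the reference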

Whenever we try to explain a simple quantitative process in words, we
are necessarily creating a perception at a level higher than the simple
process. Words are symbols and they occur one at a time. They represent
stylized categories that either exist or don't exist. The means of
manipulating them involves logical, either-or processes which come out
with true-false values. This is the ideal medium in which to create a
picture of the world in which everything is an event that either happens
or doesn't happen, so you can speak in terms of probabilities and
logical relationships and past and future.

Well, I have more to say on that, but that's enough for one uncalled-for
diatribe for which you are not responsible (it's a run-on from my post
to Martin Taylor yesterday). Just remember that when you talk about
"reducing uncertainty" you have to specify WHOSE uncertainty you're
talking about. As you do:

     So for me information theory simply provides one metric for
     analyzing control-system performance. Information is not something
     the control system "uses," it is a description of how well the
     control system reduces unpredictable (for the observer) variation
     in the CV.

That I can buy. And I like your conclusion:

     Information theory may simply provide another, valid way to
     describe how control systems work and to quantify how well they do
     their jobs. Whether the view it provides of control system
     operation is useful, or more useful than other ways of describing
     such systems, is an empirical question whose answer may depend on
     what you are trying to understand about the system.


-----------------------------------------------------------------------
Best to all,

Bill P.

[Martin Taylor 960625 11:30]

Bill Powers (960625.0830 MDT) to Bruce Abbott

    So for me information theory simply provides one metric for
    analyzing control-system performance. Information is not something
    the control system "uses," it is a description of how well the
    control system reduces unpredictable (for the observer) variation
    in the CV.

That I can buy.

How come you can buy it when Bruce Abbott says it, but not when I say it?
I've tried often enough to say the same thing. Is it my wording, or your
preconceptions about what I must have been saying, which have led over
and over to my responding "I'm not saying X"?

Anyway, I'm happy that you do buy it. It should make my life easier in future.

And I like your conclusion:

    Information theory may simply provide another, valid way to
    describe how control systems work and to quantify how well they do
    their jobs. Whether the view it provides of control system
    operation is useful, or more useful than other ways of describing
    such systems, is an empirical question whose answer may depend on
    what you are trying to understand about the system.

Even better. That's something else you've steadfastly denied over the years.

We really are making progress!!! I'm very happy.

But Bruce was not right in saying, as you quote (I haven't seen intervening
postings sent over the last few days, having been busy):

Bruce (Rat Master) Abbott (960624.1900 EST) --

    As I understand information theory (and I'm no expert, but have
    read some elementary explications of it), it merely provides a way
    to quantify the ability of a system ("observer") to predict the
    value of the signal at some time t.

It can do that, sure. But it's actually a way of quantifying the relationship
among different variables, just as correlation is. If the variables happen
to be (1) the waveform prior to time t0 and (2) the value of the waveform
at some later time t, then it quantifies the ability to predict the
future of a waveform given its past. But the variables can be anything,
discrete or continuous, imagined or sensed. One way of looking at "information"
is as changes in nonlinear correlation, but of course the calculations
are quite different. There are lots of other ways of looking at it, just as
there are lots of ways of looking at correlation.
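
A small Python sketch of that kind of quantification (the joint-count
table is invented for illustration): the mutual information between two
discrete variables, computed from their joint distribution.

    # Sketch: mutual information between two discrete variables x and y,
    # estimated from a made-up table of joint counts.
    import math

    joint_counts = {(0, 0): 40, (0, 1): 10, (1, 0): 10, (1, 1): 40}
    total = sum(joint_counts.values())
    p_xy = {k: c / total for k, c in joint_counts.items()}
    p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
    p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

    mi = sum(p * math.log2(p / (p_x[x] * p_y[y]))
             for (x, y), p in p_xy.items() if p > 0)
    print(f"{mi:.3f} bits")   # > 0: knowing x reduces uncertainty about y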

Now I have to try to find time to go back to the weekend's messages and see
where this came from. And Bill says his message was a run-on from his post
to me yesterday, which I haven't seen--but I can't see anything in this one
that speaks to my ideas, so I'll have to seek out the original.


------------------

With trepidation, here's a proposed simulation experiment. I may have time
to set it up, but probably not soon, and someone else might like to. But
for now, think of it as a thought experiment, and give me your comments
on what it might show if it were ever to be done.

What we are using as the conceptual basis of the proposed experiment is
the pair of PCT Mantras: "All you know about the real world is perception",
and "It's the real world, not your perception of it, that hits you." We
are going to look at a situation where the perceptual function doesn't
tell you everything about the value of the real-world variable it defines.

I propose to study a trivially simple control system. For the sake of
ensuring an accurate simulation, this is a sampled, bang-bang control
system. Its circuit diagram is the usual, but the error value is sampled
only at discrete intervals, separated by delta(t). All variables are
allowed only integer values (positive, negative, or zero).
This is a feasible control system to construct out of hardware, if anyone
wanted to do it, and it is _not_ intended as a simulation of a continuously
sampled, continuous-valued control system.

The output function of this system is the normal integrator, or I should
say summator, since everything internal to the control system is quantized
to unit values and discrete time samples. The output is increased by
k*(error) at each time step, where k is called the output gain. k can
take on any real value, fixed for the duration of any one simulation run.

To simplify things, the feedback transfer function is the unity function.
The perceptual function is an object of the experiment, but its output is
of the same dimensionality as its input.

Throughout the experiment, the reference level for the perception is zero
(one can easily extend the experiment to make it variable, but that's not
the point at the moment).

The effect of the perceptual function, as I said, is the object of the study.
The behaviour of the disturbing influence is an experimental variable, as
are one parameter of the perceptual function, and the gain of the output
summator.

The perceptual function does NOT produce a replica of the physical situation.
It produces a probabilistic result. For example, if the real value of the
physical variable is 3, the perceptual function will sometimes produce a
value 3, sometimes 2, sometimes 4, occasionally 1...

For simplicity of simulation, we make this probability function triangular
(quite unrealistically), and for any value s of the physical variable as
presented to the sensor input of the perceptual function, the output will
be p, where

P(p-s) = 0 for |(p-s)| > z
P(p-s) = pmax for p = s
P(p-s) = pmax*(1-|(p-s)|/(1+z)) for 0 < |(p-s)| <= z

pmax is chosen so that the sum over integer values of p-s is unity. If
z = 0, then pmax = 1 and p=s always. That's the situation usually treated
in our discussions of control systems. If z = 1, then the probability
p=s is 0.5, and the probability that p = s-1 or p = s+1 is 0.25. Similar
calculations can be made for all integer values of z. One can use non-integer
values of z, too, and one could choose non-triangular functions. But this
is OK for now.
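
A minimal Python sketch of that perceptual function (assuming integer z;
the function name and sampling method are mine, not part of the proposal):

    # Sketch: noisy integer perception p of the physical value s, with
    # the triangular probability function described above.
    import random

    def perceive(s, z):
        """Noisy integer perception of s; z = 0 gives p = s exactly."""
        offsets = list(range(-z, z + 1))
        weights = [1 - abs(d) / (1 + z) for d in offsets]   # triangular shape
        # random.choices normalizes the weights, which sets pmax implicitly
        return s + random.choices(offsets, weights=weights)[0]

    # With z = 1: P(p = s) = 0.5 and P(p = s - 1) = P(p = s + 1) = 0.25
    print(sum(perceive(3, 1) == 3 for _ in range(10000)) / 10000)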

The first parameter of the experiment is obviously the value of z. If the
physical variable is undisturbed by external influences, what will be its
behaviour under the influence of the control system's output? Can one
describe its fluctuations (if any) as a function of the output gain value?
(Remember that the output function is a summation device that adds k*error
to its output once every delta(t); the output gain k is the second
experimental parameter.) Does the value of k matter in this system
when z = 0? When z > 0?

The third experimental parameter is the disturbance, which is taken to be
a step-wise random walk in which a unit step in a random direction is made
every now and then. For simplicity, we will say that the steps occur
every m*delta(t) seconds, where m is not necessarily an integer.
What will be the fluctuation pattern of the physical variable if m is very
large? If m = 2? If m = 1? If m << 1? How will these results relate to
the value of z?
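
And one possible skeleton for a run, reusing the perceive() sketch above
(the parameter values, the rounding of the summed output, and the handling
of the stepping moments are my own illustrative choices):

    # Sketch of a single simulation run; reference level is zero,
    # feedback function is unity.
    import random

    def run(z, k, m, steps=1000):
        output_sum = 0.0    # real-valued accumulator, since k can be any real
        disturbance = 0
        next_step = 0.0     # next stepping moment, in units of delta(t)
        history = []
        for t in range(steps):
            while t >= next_step:                  # step-wise random walk
                disturbance += random.choice((-1, 1))
                next_step += m
            physical = round(output_sum) + disturbance   # the physical variable
            p = perceive(physical, z)                    # noisy integer perception
            error = 0 - p                                # reference level is zero
            output_sum += k * error                      # summating output, gain k
            history.append(physical)
        return history

    cv = run(z=1, k=0.5, m=4)
    print(min(cv), max(cv))   # rough picture of how far the variable wanders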

I'll await your (the wider readership's) comments.

Martin

[Hans Blom, 960625]

(Bill Powers (960625.0830 MDT))

But why should the "system" generate a prediction of the signal at some
time t? Is there anything in a control process that REQUIRES this
prediction to be made? If the prediction is made ("at t = 0.5 seconds
from now, the signal is going to be 12.7 units"), what will the system
then do with this prediction, at the time it is made?

You forget that the prediction made by the control system depends on
how it acts. Its prediction will be something like: "If I tense my
arm muscles, my hand is going to be 12.7 inches above the table top
in 0.5 seconds; if I relax my arm muscles, my hand will be on the
table in 0.5 seconds." Thus, position depends on tension, and given a
wanted position the required tension can be computed. A control
system may not actually do things this way, but it might (a model-
based controller would do it this way).

I'm not denying that some living systems sometimes make predictions;
they do. Making a prediction is a specific, complex, cognitive process.

You look at prediction as a high-level process. It also operates at
the lowest levels, however, although we might not call it prediction
there. But the mechanism isn't different. Prediction is present wherever
you can write a formula

    p = f (u, d)

where p is the perception at some time, u the action and d a
disturbance (if it exists). The formula states that, within the
limits of the disturbance, I can know (predict) the value of p when
inserting a certain value for u. Control is inversion of this formula

    u = f' (p, d')

where, given a required perception p_opt, a "best" (within the
uncertainty limits given by d') u, u_opt can be computed. This
scheme works only, of course, if the world that we live in is
predictable (has discoverable laws).
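
A bare-bones Python sketch of that forward/inverse pair (assuming, purely
for illustration, a linear world model p = a*u + d with known estimates of
the coefficients):

    # Sketch: model-based control as inversion of a learned world model.
    a_est, d_est = 2.0, 1.0     # current estimates of the model p = a*u + d

    def predict(u):
        """Forward model: predicted perception for a candidate action u."""
        return a_est * u + d_est

    def act_for(p_opt):
        """Inverse model: action expected to produce the wanted perception."""
        return (p_opt - d_est) / a_est

    u_opt = act_for(12.7)          # action chosen now for a wanted perception
    print(u_opt, predict(u_opt))   # 5.85 12.7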

high-level kind of process. A perceptual signal, for instance, might
consist of components proportional to the actual value and the first
derivative of the sensed variable. This is sometimes spoken of as
"anticipation" or "prediction". There is some analogy with actual
anticipation or prediction, but it is only an analogy; an accurate
description would be p = k1*v + k2*(dv/dt).

Much more extensive predictions can be made, not just for the single
next observation but for very long time periods, depending upon what
I know about the "laws" of the world. If the world is unchanging and
I have seen a few periods of a sine wave, I can predict that the sine
wave will continue indefinitely and thus I am able AT THIS MOMENT to
essentially predict the signal for all times. In fact, this is how
sunrise and sunset can be given so accurately by newspapers.
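
The sine-wave case can be sketched directly in Python (the signal
parameters and the FFT-plus-least-squares fitting method are illustrative
assumptions): once the parameters are fitted, the value at any future time
follows.

    # Sketch: observe a few periods of a sine wave, fit its frequency,
    # amplitude and phase, then predict its value far in the future.
    import numpy as np

    t_obs = np.linspace(0, 4, 400, endpoint=False)       # four seconds observed
    y_obs = 2.0 * np.sin(2 * np.pi * 1.5 * t_obs + 0.3)  # the "world" (made up)

    dt = t_obs[1] - t_obs[0]
    spectrum = np.abs(np.fft.rfft(y_obs))
    f_est = np.fft.rfftfreq(len(y_obs), dt)[np.argmax(spectrum[1:]) + 1]

    # amplitude and phase by linear least squares at the estimated frequency
    A = np.column_stack([np.sin(2 * np.pi * f_est * t_obs),
                         np.cos(2 * np.pi * f_est * t_obs)])
    a, b = np.linalg.lstsq(A, y_obs, rcond=None)[0]
    amp, phase = np.hypot(a, b), np.arctan2(b, a)

    t_future = 100.0
    print(amp * np.sin(2 * np.pi * f_est * t_future + phase))  # ~0.59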

instant. The controlled variable is k1*v + k2*(dv/dt). In fact, when the
future does arrive at time t + dt, the "prediction" is always wrong,
because by that time the influence of the output on the input and the
states of external disturbances have changed. It is always too late to
act on the basis of the "prediction."

You must live in a very unpredictable world, without regularly
occurring sunrises and sunsets, with chaotic planetary orbits, and
with an unreliable gravity. I just don't see it that way. Much of
what I see around me is lawlike, and where I see what you call
"disturbances", I try to discover the laws that govern them. I do not
accept that the world is unpredictable; I want to discover its laws!

That concerns control, too. I think that all (higher) organisms act
this way: they continually improve their control quality. By
learning. When you disregard learning -- you seem to want to analyze
only static control systems -- I think you miss out on what goes on
in the more interesting species.

Greetings,

Hans

[Bruce Gregory (960625.1430 EDT)]

(Hans Blom, 960625) to
(Bill Powers (960625.0830 MDT))

You must live in a very unpredictable world, without regularly
occurring sunrises and sunsets, with chaotic planetary orbits, and
with an unreliable gravity. I just don't see it that way. Much of
what I see around me is lawlike, and where I see what you call
"disturbances", I try to discover the laws that govern them. I do not
accept that the world is unpredictable; I want to discover its laws!

If I have been understanding what Rick and Bill are saying, we
must ask which system is doing the controlling and for what. An
astronomer selects a galaxy to observe and points her telescope
toward the location on the sky where the galaxy is "predicted to
be". She is controlling for the perception of the image of the
galaxy lying on the slit of the spectroscope, but before she
attempts to control for the image on the slit, she must first
control for the telescope pointing in the "right direction".
She has a reference level for the telescope's orientation which
she acquired by reading the sidereal time and making a
calculation. But that reference level is not a prediction of
anything. It is a present-time goal. Before moving the
telescope she must control for seeing that the sky is clear and
for opening the dome. This chain of controlled actions leads to the
observations she plans to make, but each step involves only the
"present" of the subsystem doing the controlling. If we ask her
what she is doing she will tell us a story similar to the one I
have outlined above, but the systems actually doing the controlling
are free to ignore the story and "worry" only about the reference
levels provided by the higher-order control systems.
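
For concreteness, the "calculation" step might look roughly like the
Python sketch below (the coordinates and latitude are invented, and real
pointing software adds refraction, precession, and so on); the result is a
present-time reference value, not a prediction.

    # Sketch: catalogued position plus local sidereal time -> a pointing
    # reference (altitude), using the standard hour-angle formula.
    import math

    ra_hours, dec_deg = 12.5, 30.0     # catalogued position of the galaxy
    lst_hours, lat_deg = 14.0, 32.0    # local sidereal time, observatory latitude

    ha = math.radians((lst_hours - ra_hours) * 15.0)   # hour angle
    dec, lat = math.radians(dec_deg), math.radians(lat_deg)

    alt = math.asin(math.sin(dec) * math.sin(lat)
                    + math.cos(dec) * math.cos(lat) * math.cos(ha))
    print(f"reference altitude: {math.degrees(alt):.1f} degrees")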

Regards,

Bruce