end of information in perception

[From Bill Powers (960625.1530 MDT)]

Martin Taylor 960625 14:00 --

Writing to Rick Marken:

     Actually, we used three things: (1) f(), The unvarying output-to-
     CEV function (which I sometimes label the output/feedback
     function), (2) The reference waveform r(t), and (3) the perceptual
     signal waveform p(t).

If you are given Fi (input function), Fo (output function), Fe
(environmental feedback function), r (reference signal), and the
behavior of any signal in the loop (p, e, or o), you can then calculate
the part of the state of the perceptual signal that is due to the sum of
all independent disturbances acting through their respective disturbance
functions. This has nothing to do with information theory, Martin: it is
just plain mathematics. And this is all you showed with your "magical
mystery function." I never doubted that this could be done; I'm sure
Rick never did, either. That calculation doesn't depend on "the
information in" anything -- it's just algebra.

I think that the problem was that Rick and I were expecting you to pull
some sort of rabbit out of your hat, something that would show that the
"Information" in the perceptual signal had something fundamental to do
with the way this system works. I know I was thinking you must have some
sort of short-cut, the way dynamical equations can be magically picked
out of a simple statement of a system's kinetic and potential energies
and the conservation of linear and angular momentum. But what did you
come up with? Nothing but
a solution of a set of algebraic equations in which there was only one
unknown.

This piddling result didn't seem to jibe with your grandiose claims
about the power of information theory. There wasn't even a mention of
"information" in your triumphant demonstration. The calculations didn't
involve any equations for Information, per se, or any relations
involving Information. Nothing at all was derived from which one could
calculate the information in the perceptual signal about the
disturbance. And don't forget that at the time, you were defining
"disturbance" very differently from the way Rick and I were defining it.
As later events showed, your definition of "disturbance" is simply "that
part of the value of the CEV (or perceptual signal) not attributable to
the output." Since by your own admission this conceptual subdivision of
the input plays no role in the actual operation of a control system, the
point of calculating this kind of "disturbance" is rather hard to see.
Your line of argument is something like the one I would use in showing
that information theory is irrelevant to control theory (if I believed
that).

Your position on information has changed considerably over the years
since 1993. Back then, you said things like

     Throughout, I am trying to take the position that the only
     probabilities that can be observed are based within the observing
     entity. ...

     Sometimes I slip, I acknowledge. But that's a simple mistake, not
     a failure of principle. In this case, the reference signal and
     the perceptual signal are both known within the ECS. If you
     remember a long way back, this came up. There is no need for an
     external evaluation of the probability distribution, any more than
     there is a need for an evaluation of a neural current that is based
     on a rate of neural impulses. I suppose it might be possible for
     an external observer with a probe to make the analyses, and
     sometimes it is didactically easier to posit such an observer.
     But in practice there isn't one, and it is not necessary to think
     of one. ...

Now, 1996, you are saying that of course the information is not known to
the system itself, and that all this is just an external observer's way
of evaluating what is going on inside the control system. But the
kicker, to me, is this passage:

     ... I really do think that I can use information theory to identify
     that the PCT structure was correct, at least feasible. When you
     put in the appropriate perceptual input functions, gains, and
     delays, you get the same model that you and/or Tom would produce
     without information theory, so it should make the same predictions
     in any specific case. So why should I try to do better, when I
     anticipate the result being identity?

How do you know that "When you put in the appropriate perceptual input
functions, gains, and delays, you get the same model that you and/or Tom
would produce without information theory?" Are you saying that you have
actually "put them in", but simply haven't shown your results to us? Or
do you mean that you can see how it would be done in principle, and are
so sure that the results would be identical that you don't see any
reason actually to do the calculations? Or do you mean that a model
based on information theory would not actually contain any calculations
of information, and would simply be the same model we already have?

I don't see how any of these possibilities lends support to your claim
that information theory shows WHY PCT WORKS. We have only your word,
your claims, that it does. The models you pick as examples are so
contrived, so obviously tailored to the special needs of informational
calculations, that they hardly seem general -- but even so, if you could
show how information theory itself -- not algebraic or sequential
calculations -- can be used to demonstrate some essential point, that
would be at least a step in the right direction. But even in the latest
round on this subject, you say

     With trepidation, here's a proposed simulation experiment. I may
     have time to set it up, but probably not soon, and someone else
     might like to. But for now, think of it as a thought experiment,
     and give me your comments on what it might show if it were ever to
     be done.

I am simply not interested in thought experiments that will never
actually be done. All that thought experiments do is to reveal your
beliefs. I claim that you do not know how to do informational
calculations for a working control system and that you have no way of
showing that they in any way "explain" how a control system works. This
is just more bluffing and procrastination. I don't believe that you will
ever get around to actually setting up this model or working out the
informational calculations that would be needed to draw any conclusions.
After three years, I am still waiting to see an orderly analysis of
information in a control system that simply proceeds from A to Z without
detours for verbal explanations and arm-waving, and without heaping ad-
hoc assumptions upon assumptions to take care of problems encountered
along the way.

I hate being in this relationship to you, Martin. I don't expect that
you enjoy it, either. I feel that I'm questioning someone's religious
faith, and you've done little to relieve me of that onus. I think that
we are at an impasse, and that communication between us on this subject
is so poor that there is no hope of resolving the problem. I suppose
that all I can do is find the guts just to stay out of it. I hope I can
do that, although the present post makes me a liar.


-----------------------------------------------------------------------
Best,

Bill P.

[Martin Taylor 960627 14:15]

Bill Powers (960625.1530 MDT)

I'm sorry, but I can't figure out what you want from me. When I have tried to
use mathematical forms, you have said that they don't relate to what you
call "uncertainty" and I should explain. When I've tried to use words, you
want mathematical forms. All I can see from your current posting is that
you are angry, apparently at the notion that there is no conflict between
an information-theoretic approach and PCT, when you want there to be a
conflict. Why you want there to be, I can't guess. But there isn't, and
I'm sorry.

     If you are given Fi (input function), Fo (output function), Fe
     (environmental feedback function), r (reference signal), and the
     behavior of any signal in the loop (p, e, or o), you can then calculate
     the part of the state of the perceptual signal that is due to the sum of
     all independent disturbances acting through their respective disturbance
     functions. This has nothing to do with information theory, Martin: it is
     just plain mathematics.

Most analysis is: Laplace, Fourier, information...

     And this is all you showed with your "magical
     mystery function." I never doubted that this could be done; I'm sure
     Rick never did, either.

I'm glad to hear it. Both of you said, most explicitly, that under these
conditions it could NOT be done. I've twice in the last few days quoted
the relevant parts of the discussion. (I wish I hadn't felt the need to
do so, but I did). But I'm glad to know you didn't mean what you said.

Clearly, if the reproduction can be done algebraically (as you correctly say
we did), then something in the process of reproduction carries the
necessary information.

Oh, I suppose it isn't information if it is precise? Is that the issue?

No, I can't imagine that, because at least Rick spent a lot of time saying
that there was no information because the reproduction is not precise. As
it isn't--it's only as good as the control.

So, if something in the process of reproduction carried the information,
what is it?

     If you are given Fi (input function), Fo (output function), Fe
     (environmental feedback function), r (reference signal), and the
     behavior of any signal in the loop (p, e, or o), you can then calculate
     the part of the state of the perceptual signal that is due to the sum of
     all independent disturbances acting through their respective disturbance
     functions.

Let's go through these. Of the set you mention, we did not use e or o, so
drop them.

1. Fi (input function). Does this fluctuate in some way related to the
disturbing variable?

2. Fo (output function). Does this fluctuate in some way related to the
disturbing variable?

3. Fe (environmental feedback function). Does this fluctuate in some way
related to the disturbing variable?

4. r (reference signal). Does this fluctuate in some way related to the
disturbing variable?

5. p (perceptual signal). Does this fluctuate in some way related to the
disturbing variable?

If none of these do, then using them cannot result in a reconstruction in
any way correlated with the disturbing influence (note: "influence" here,
not "variable"). Let's consider them.

1. No. It's fixed for the duration of the test.

2. No. It's fixed for the duration of the test.

3. No. It's fixed for the duration of the test.

4. Possible, but in the test conditions it is stated to be independently
variable, so under the test conditions the answer is No.

5. Possible. If any reconstruction is possible, then the answer must be Yes.
That's the question at issue: is the answer to question 5 Yes?
The physical situation suggests that it might be, since p is derived from
sensory data about an aspect of the outer world known to be influenced by
the disturbing variable. If reconstruction were impossible, as Powers and
Marken assured us it would be under these conditions, then the answer to
question 5 would be No, and we would all agree that the perceptual signal
carries no information about the disturbing influence. (A small numerical
check of questions 1-5 is sketched below.)
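
Here is the small check just mentioned: a toy loop of the same illustrative
kind as the sketch in your post (identity Fi and Fe, an integrating Fo; all
the particular numbers are assumptions made for the illustration, not
anything from our earlier exchanges), run twice with the same r, once with
and once without the disturbing influence. Only p differs between the runs.

import numpy as np

def run_loop(d, r, gain=50.0, dt=0.01):
    """Toy loop: identity Fi and Fe, integrating Fo; returns p(t)."""
    n = len(d)
    p = np.zeros(n)
    o = np.zeros(n)
    for t in range(1, n):
        o[t] = o[t - 1] + gain * (r[t - 1] - p[t - 1]) * dt
        p[t] = o[t] + d[t]
    return p

n = 5000
rng = np.random.default_rng(2)
r = np.sin(np.linspace(0, 4 * np.pi, n))    # question 4: r is generated
                                            # independently of d and is the
                                            # same in both runs
d = np.cumsum(rng.normal(0, 0.05, n))       # the disturbing influence

p_with = run_loop(d, r)                     # Fi, Fo, Fe fixed throughout
p_without = run_loop(np.zeros(n), r)        # (questions 1-3: No)

# Question 5: p does fluctuate in a way related to the disturbing influence.
print("p changed when d changed:", not np.allclose(p_with, p_without))
print("largest change in p:", np.max(np.abs(p_with - p_without)))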

     But what did you come up with? Nothing but
     a solution of a set of algebraic equations in which there was only one
     unknown.

Which demonstrated the ONLY thing that was at issue: whether the perceptual
signal carried information about the fluctuations in the disturbing
influence.

     This piddling result didn't seem to jibe with your grandiose claims
     about the power of information theory. There wasn't even a mention of
     "information" in your triumphant demonstration.

No, that was what made it so powerful. The demonstration was done entirely
on your own ground and on your terms.

     And don't forget that at the time, you were defining
     "disturbance" very differently from the way Rick and I were defining it.

I don't think so, except that I was using the term "disturbance" for what
we have since come to label the "disturbing influence", as was made clear
every time you tried to impute to me the intention of discussing what we
now call the "disturbing variable."

     As later events showed, your definition of "disturbance" is simply "that
     part of the value of the CEV (or perceptual signal) not attributable to
     the output." Since by your own admission this conceptual subdivision of
     the input plays no role in the actual operation of a control system, the
     point of calculating this kind of "disturbance" is rather hard to see.

Nobody argued that the _control system_ makes this kind of subdivision,
or calculates the disturbing influence separately. The demonstration only
showed _that it could be done_ and therefore that the information about the
disturbance was carried by the perceptual signal.

My "own admission" is/was that if the perceptual signal did not carry this
information, then the control system would fail to control, not that in the
process of control the effects of the disturbing influence were segregated
from the effects of the output. I can't count how often this was said, in
all sorts of ways. It was, and remains, very frustrating that you link the
two ideas.

     Your position on information has changed considerably over the years
     since 1993. Back then, you said things like

          Throughout, I am trying to take the position that the only
          probabilities that can be observed are based within the observing
          entity. ...

          Sometimes I slip, I acknowledge. But that's a simple mistake, not
          a failure of principle. In this case, the reference signal and
          the perceptual signal are both known within the ECS. If you
          remember a long way back, this came up. There is no need for an
          external evaluation of the probability distribution, any more than
          there is a need for an evaluation of a neural current that is based
          on a rate of neural impulses. I suppose it might be possible for
          an external observer with a probe to make the analyses, and
          sometimes it is didactically easier to posit such an observer.
          But in practice there isn't one, and it is not necessary to think
          of one. ...

In what way have my ideas changed? What you quote seems fine to me now.

I don't see where you see a difference between that and what I have been
saying recently.

It's all VERY puzzling.

     Now, 1996, you are saying that of course the information is not known to
     the system itself, and that all this is just an external observer's way
     of evaluating what is going on inside the control system.

This is weird. Are you saying that when we make a Fourier analysis of the
behaviour of a control loop, we are saying that the control system knows
the spectrum of the signal waveforms, or the bandwidths and spectrum shape
of its filters? That seems to be your position--either the control system
(not the external analyst) knows them, or they can't affect the system's
behaviour.

     I don't see how any of these possibilities lends support to your claim
     that information theory shows WHY PCT WORKS. We have only your word,
     your claims, that it does.

Here you seem to be wanting analytical expressions--algebra. Is that right?
Would you be happy with that?

     The models you pick as examples are so
     contrived, so obviously tailored to the special needs of informational
     calculations, that they hardly seem general -- but even so, if you could
     show how information theory itself -- not algebraic or sequential
     calculations

Oh, so you wouldn't be.

What DO you want? I've tried, as you point out, to choose models that
demonstrate the essential points from various angles, always aiming to show
the general principles in a simplified situation, rather as you did in your
"Models and their Worlds" paper (which, by the way, speaking as editor, I
still don't have). You don't like ANY of the approaches I have taken.

     After three years, I am still waiting to see an orderly analysis of
     information in a control system that simply proceeds from A to Z without
     detours for verbal explanations and arm-waving, and without heaping
     ad-hoc assumptions upon assumptions to take care of problems encountered
     along the way.

One person's ad-hoc assumption is another's basic principle. You can find
my ad-hoc assumptions in Shannon and Weaver, for the most part. But you
don't like the approach, _a priori_, so I can't expect you to accept ANY
result I might give you.

You quote the introduction to my proposed simulation experiment, which I
offered precisely to provide a testing ground in which you, Bill Powers,
would not be able to say that the information calculations were wrong
because the simulation of a continuous system was actually discrete. And
it annoys you.

     I hate being in this relationship to you, Martin. I don't expect that
     you enjoy it, either. I feel that I'm questioning someone's religious
     faith, and you've done little to relieve me of that onus.

Me too, but in my case, I feel as if Jesus thinks I am trying to
weaken his faith, and that I am the Satan to whom he must say "Get thee
behind me." I see myself more in the role of a carpenter offering to add
a drill to the hammer, chisel and saw already owned by the disciples.

Simple approaches don't work because they are "oversimplified and full of
assumptions." Full-blown approaches don't work because you want something
you can understand without all the maths, which you distrust. Without the
detail, it's false faith, and with the detail it's gobbledygook.

Not a good basis for coming to an understanding.


------------------------------

Now, if you remember, lots of times I have pointed out that if you know
more about the situation, you should use that knowledge. If you know the
control system to be linear, if you know the gain function, if you know
the parameters, use them. Information theory will almost always make either
the equivalent prediction or a less constraining one. The "grandiose
claims" are far from grandiose when it comes to a single scalar control
loop. They become more interesting when we deal with multiple parallel
control loops, and much more so when we deal with a hierarchy. As you have
seen.
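
As one sketch of what an external analyst's informational bookkeeping could
look like for such a single scalar loop, here is a rough calculation on the
same illustrative toy loop used above. The binned estimator, the bin count,
and the loop parameters are all assumptions made for the illustration; a
serious analysis would have to deal with estimator bias, temporal structure,
and the continuous case.

import numpy as np

def mutual_information(x, y, bins=32):
    """Crude binned estimate of I(X;Y) in bits from paired samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# The same illustrative toy loop: identity Fi and Fe, integrating Fo.
dt, n, gain = 0.01, 20000, 50.0
rng = np.random.default_rng(1)
d = np.cumsum(rng.normal(0, 0.05, n))       # disturbing influence
r = np.sin(np.linspace(0, 16 * np.pi, n))   # independent reference
p = np.zeros(n)
o = np.zeros(n)
for t in range(1, n):
    o[t] = o[t - 1] + gain * (r[t - 1] - p[t - 1]) * dt
    p[t] = o[t] + d[t]

# d and r are generated independently, so this should be near zero
# (not exactly zero, because the estimator is biased upward).
print("I(d; r)     =", round(mutual_information(d, r), 3), "bits")
# Sample-wise information between d and the raw perceptual signal.
print("I(d; p)     =", round(mutual_information(d, p), 3), "bits")
# In this toy loop p - o reproduces d exactly, so this is close to the
# binned entropy of d: all the information the binning can register.
print("I(d; p - o) =", round(mutual_information(d, p - o), 3), "bits")

In the single scalar loop these numbers add nothing that the algebra did not
already give us, which is just the point of the paragraph above.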

Martin