[From Bill Powers (930308.1520)]
Martin Taylor (930308.1200) --
What I like about you, Martin, is the civilized way you respond
when people attack your life's work with bludgeons.
I'm looking forward to seeing your paper on IT and PCT. It is
going to be different from the sort of IT work up with which I
have the most difficulty putting -- that seems clear from your
discussions, e.g.:
> Probability is something relating only to the information
> available at the point where that information is used. It
> relates to individual possible events that might be detected at
> that point, not to frequency.
I wonder whether you have considered this point yet:
In your terminology, the perceptual signal in an ECS, an
elementary control system, stands for the state of a CEV -- a
complex environmental variable. As we all seem to agree now, that
CEV is a construct created primarily by the forms of all the
perceptual functions lying between the primary sensory interface
and the place where the perceptual signal finally appears.
When we speak loosely (for coherence and convenience), we say
that the perceptual signal is an analog of some aspect of the
environment. This sets up the picture of the environment and the
aspect of it that exists outside the organism, and a perceptual
signal that represents that aspect. It is natural, then, to say
that the perceptual signal contains information ABOUT that aspect
of the environment.
From there, we can go on to ask how well the perceptual signal
represents that aspect. This is where the difficulty starts. In
order to answer that question, we must have an independent
measure of the aspect of the environment in question, one that
tells us its true state so we can compare that state with the
representation in the form of the magnitude of the perceptual
signal (or any other form, come to think of it).
But this contradicts the previous statements, which say that the
CEV is a construction by the nervous system. From the internal
point of view, the perceptual signal is always a PERFECT
representation of the CEV -- in fact, that signal plus all the
perceptual functions leading to its production defines the CEV.
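A minimal sketch of what I mean, with invented names and weights
that stand for no particular perceptual function:

  # The CEV is whatever combination of environmental variables the
  # perceptual function happens to compute. Names and weights here
  # are purely illustrative.
  env = {"v1": 3.0, "v2": -1.0, "v3": 0.5}     # physical variables "out there"
  weights = {"v1": 0.6, "v2": 0.3, "v3": 0.1}  # invented weighting

  def perceptual_function(environment):
      # The form of this function is what DEFINES the CEV.
      return sum(weights[k] * environment[k] for k in weights)

  p = perceptual_function(env)  # the perceptual signal
  # Seen from inside the system, p IS the state of the CEV; there is
  # no separate "true" CEV against which to check it.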
What, then, if the perceptual signal is noisy? We could say that
this defines a noisy CEV, or we could say that there is a
noiseless CEV out there, and that the perceptual signal is being
derived through a noisy channel -- or any combination of these
effects.
Consider a television set tuned to a very distant station. The
screen shows a picture with a lot of noise dots dancing over its
surface. Inside the person looking at the TV set, presumably, is
a set of signals representing the state of the TV screen. This
set of signals would show an amplitude envelope that would be
varying at a high frequency, like some amount of noise
superimposed on an average picture. These noise signals don't
reflect the state of affairs we would measure with optical
instruments at the face of the TV screen -- for one thing, they
don't show the frame rate, and the highest-frequency noise dots
are smoothed out. So the human perceptions of the TV screen
aren't quite as noisy as the screen itself would look to fast-
responding physical instruments.
What the human being experiences is a noisy set of signals. Some
part of this noise, theoretically, is channel noise in the
perceptual system. So the CEV is defined as being noisy in
exactly the way the perceptual signals are noisy. This implies
that the noisy perceptual signal is a noiseless representation of
the hypothetical CEV.
Of course the person is trying to ignore the noise in the
perception -- meaning that some higher perceptual system must be
constructing something closer to a noiseless CEV, extracting the
average picture, which presumably is closer to what the TV
station is transmitting.
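A crude sketch of that extraction, assuming the noise is simply
additive and the higher system is nothing more than a leaky
integrator (all the numbers are invented):

  import random

  true_level = 5.0   # what the station is "really" transmitting (assumed)
  smoothed = 0.0     # higher-level signal: the "average picture"
  alpha = 0.05       # smoothing factor (invented)

  for t in range(2000):
      noisy_p = true_level + random.gauss(0.0, 2.0)  # noisy lower-level perceptual signal
      smoothed += alpha * (noisy_p - smoothed)       # leaky-integrator average

  print(round(smoothed, 2))  # near 5.0: closer to the noiseless CEV
                             # than any single noisy sample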
How, then, do we define information in this situation?
--------------------------------
Some of our disputes seem to depend on the inherent noisiness or
ambiguity of the information being transmitted; others seem to
come down to the nature of the transmitting channel itself (the
discrete-impulse nature of neural signals).
In the first case, the brain itself could be completely free of
noise and ambiguity, and still have a problem with deciding what
the message is. The brain may have to rely on (noise-free)
estimations of probabilities to resolve such problems, but we're
talking about learned algorithms now, not fundamental problems of
information theory or probability in the brain's own operation.
In the second case, the importance of the channel noise depends
on the signal amplitudes (frequencies) involved in the brain.
Here the problem is dynamic range -- how small an average signal
can be detected in the presence of the irreducible channel noise.
In this context, information theory is simply a version of
statistical analysis cast in terms of logarithms, isn't it? Now
the problem can be handled in many ways. Electronically, we would
handle it in terms of noise power spectra and filters designed to
favor the signal spectrum over the noise spectrum. In analyzing a
system operating in this region, we could use a statistical
analysis or a Bode diagram or probably many other methods -- the
results are equivalent. The statistical approach would call the
unpredictable component of the signal "uncertainty" while an
electronics analysis would call it "noise" -- but the phenomenon
is the same.
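A small illustration of the "logarithms" point, assuming Gaussian
channel noise: the uncertainty of the noise, in bits, is just a
logarithm of its variance, so each doubling of the noise amplitude
costs exactly one bit of dynamic range.

  import math

  def gaussian_uncertainty_bits(sigma):
      # Differential entropy of Gaussian noise with standard deviation sigma.
      return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

  for sigma in (0.5, 1.0, 2.0, 4.0):
      print(sigma, round(gaussian_uncertainty_bits(sigma), 3))
  # Each doubling of sigma adds exactly one bit: "uncertainty" and
  # "noise power" are the same quantity in different notation.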
In this second case I doubt that we have any important
divergences of understanding.
---------------------------------
RE: the challenge experiment
> If you change, say, the contrast of a tracked target (or the
> tracking cursor), or the bandwidth of the disturbance, or put a
> grid on the visual surface...is the integration factor the only
> thing that changes?
Another way to ask this question is to ask about the conditions
under which such changes would make a difference. We can try
these things later, if you wish. In previous posts, the question
of disturbance bandwidth has already been raised, and the answer
is that the best-fit integration factor varies with bandwidth.
This could be treated as a problem with nonlinearity in the
system, or as a problem of information theory. I would take the
approach of trying to define a nonlinear function -- perceptual
or output -- as a way of making the same model work over a range
of disturbance bandwidths. The least nonlinearity possible would
be the addition of a square term in the output function, and that
is what I would try first.
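For concreteness, the sort of fit I have in mind looks roughly like
this; the sampling rate, parameter values, and disturbance waveform
are invented for illustration, not fitted to any data:

  import math

  dt = 1.0 / 60.0  # 60 samples per second (assumed)
  k = 8.0          # integration factor (the fit parameter)
  c = 0.0          # coefficient of the square term; set c > 0 to try the nonlinearity

  handle = 0.0
  target = 0.0
  for step in range(3600):                            # one minute of tracking
      d = math.sin(2 * math.pi * 0.2 * step * dt)     # slow disturbance (invented waveform)
      cursor = handle + d                             # environmental feedback connection
      p = cursor - target                             # perceptual signal: cursor-target separation
      e = 0.0 - p                                     # error relative to a zero reference
      handle += k * (e + c * e * abs(e)) * dt         # integrating output, optional square term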
As to varying contrast or background, I suspect that these
factors would not make any difference until some extreme was
reached -- very low contrast, or very strong interfering
background patterns. And there, information theory might surprise
me with some useful predictions. This, however, would bring us
into the realm of behavior near the limits of perception, which
isn't a consideration in most real behaviors.
> ... the challenge is really for me to show whether information
> theory could predict the change of integration factor across
> experiments, taking into account the conditions of the
> experiment. I don't know how I would do that. The integration
> factor is not a construct that I have so far analyzed in my
> thinking.
The use of an integration factor is motivated by PCT. You may
find an equivalent motivation in IT, but the way you analyze the
situation for your purposes is independent of PCT. If the
integration factor isn't a natural part of your analysis, there's
no rule that says you have to use one.
In making an analysis for a specific experiment, I would use
other experiments to obtain data--perceptual discrimination
experiments, for instance, or control experiments that use
presumed lower-level control systems common to the specified
experiment.
You have two conditions, with data, to use as you wish -- one
without feedback and one with (conditions 2 and 3, and also 1 if
you can make any use of it). The PCT analysis does not need any
other experiments with perceptual discrimination -- that is, it
uses simple assumptions about perceptual discrimination on which
its performance depends. You can make any reasonable assumptions
you like. You can, for example, assume that given an output
signal, the handle will come instantly to a position proportional
to it. The PCT model assumes that.
I suspect you will find some difficulties at the point where the
disturbance effects and the regulator's output effects converge
to produce a net effect on the essential variable.
I'm glad you're going to do it. Don't feel pressed to hurry: I'll
take your IOU.
----------------------------------------------------------------
Allan Randall (930308) --
RE: challenge
> In order for this to be interesting, I think you need to do
> one of two things:
> 1) Explain in a little more detail why you think information
> theory predicts that the compensatory system will perform
> better.
I'm just guessing and could be wrong. The object is for an
information theorist to show that I am wrong by showing exactly
how IT would be used to arrive at a correct prediction.
My reason for suspecting that IT will produce the wrong result is
Ashby's analysis on p. 224 of Intro to Cybernetics. On reading
this, I realized that the kinds of measures involved in
information theory or "variety" calculations are such that they
can't predict the equilibrium state of a system with a negative
feedback loop in it. As I just said to Martin above, there is a
problem when two information channels come together to produce an
output signal that affects the essential variable. The nature of
these statistical calculations is that they can say nothing about
the coherence of the two converging signal channels. If the
variables are treated only in terms of information content or
probabilities, they will add in quadrature, so the variance of
the signal from T to E will be larger than the variance of either
signal entering T. It is hard, in fact, to see how even the
disturbance-driven regulator could be predicted to reduce the
variance of E.
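Here is a toy demonstration of the coherence point, with invented
numbers: treat the two channels converging at T as statistically
independent and their variances add in quadrature; let the output
come from a working negative feedback loop and the variance at E
collapses instead.

  import random

  random.seed(1)
  n = 10000
  d = [random.gauss(0.0, 1.0) for _ in range(n)]             # disturbance channel into T
  o_independent = [random.gauss(0.0, 1.0) for _ in range(n)]  # output treated as an unrelated channel
  o_from_loop = [-x + random.gauss(0.0, 0.1) for x in d]      # output of a good negative feedback loop

  def variance(xs):
      m = sum(xs) / len(xs)
      return sum((x - m) ** 2 for x in xs) / len(xs)

  print(variance([a + b for a, b in zip(d, o_independent)]))  # about 2.0: quadrature addition
  print(variance([a + b for a, b in zip(d, o_from_loop)]))    # about 0.01: the loop cancels the disturbance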
> 2) Find an information theorist that actually does make this
> claim.
Ashby made it: the disturbance-driven regulator was said to be
potentially perfect, while the error-driven regulator was said to
be inherently imperfect. But I'm simply voicing a strong
suspicion -- that if an information theorist were to analyze this
as a problem in information flow, the wrong answer would be
found. Anybody can prove my suspicion wrong right here, in
public.
----------------------------------------------------------
Best to all,
Bill P.