PCT and IT: The trapdoor effect

[From Bill Powers (930404.0800)]

Allen Randall and Martin Taylor (past week of posts):

Rick Marken yelled

NO INFORMATION GETS THROUGH TO THE OUTPUT.

And Allen replied:

Yes! This was exactly the entire point of Ashby's Law. Ashby
defined control as the blockage of disturbance information.

It seems to me that there's a contradiction in saying that
control blocks information from entering the control system, and
then in another context insisting that there is enough
information in the control system to allow reconstruction of the
disturbance. This is saying that you can reconstruct the
disturbance by using only a fraction of the information in the
disturbance. It seems to me that if you block some of the
information from entering the control system, your reconstruction
is going to be inaccurate to the same degree. If you block 90% of
the information, how can the remaining 10% be sufficient to
reconstruct the disturbance waveform?


---------------------------------
A different problem: Ashby didn't define control as blockage of
information. He _characterized_ the effect of control as blockage
of information. There's a big difference.

Information itself doesn't specify things like waveforms or
quantitative relationships among variables. The calculations of
information theory discard that sort of information. You can have
two completely different waveforms that contain exactly the same
amount of information. You can go from the waveform to the
information content, but you can't get back from the information
content to the same waveform.

Consider the definition I = log2(R/r). This is the amount of
information in a single sample of a waveform whose amplitude
spans a range R and is resolved no finer than r. If the waveform
is being sampled at some frequency f, equal to the optimum
(Nyquist) rate, then the information rate is f*I. Some aspects of
the waveform can be specified by computing the probability of a
given amplitude within the range, so that the information becomes
the sum, over all amplitudes occurring in an interval of time, of
the information contained in a given amplitude times the
probability of occurrence of that amplitude.
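As a sketch of that calculation (the helper name and the sampling
rate are illustrative, not from the post), the per-sample measure
and the rate can be computed like this:

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy of the amplitude distribution, in bits per
    sample: H = -sum over amplitudes of p(a) * log2(p(a))."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A range R resolved into R/r = 8 equally likely levels carries
# log2(8) = 3 bits per sample, matching I = log2(R/r).
uniform = list(range(8))
print(entropy_bits(uniform))        # 3.0

# The information *rate* is samples/second times bits/sample:
f = 1000.0                          # sampling frequency, Hz (illustrative)
rate = f * entropy_bits(uniform)    # bits per second
print(rate)                         # 3000.0
```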

But we are still left with the fact that this information measure
is basically a statistical sort of measure, which applies to an
ensemble of samples and is completely insensitive to the order of
their occurrence. If a waveform is represented by the series of
numbers 12, 15, 20, 15, -5, -10, we could calculate some
information content. But we would calculate exactly the same
information content if we rearranged these numbers at random:
15, -5, 15, 12, -10, 20. There would be no way to reconstruct the
order of the samples from knowledge of the information content.
The ordering of the samples is not used in the calculations of
information content. But it is vital to any explanation of the
operation of a real system.
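The point can be checked directly. An entropy-style measure of the
two orderings above comes out identical, because it depends only
on which amplitudes occur and how often, never on their order (a
minimal sketch; entropy_bits is an illustrative helper):

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy of the amplitude distribution (bits per
    sample). It sees only how often each value occurs."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

original   = [12, 15, 20, 15, -5, -10]
rearranged = [15, -5, 15, 12, -10, 20]   # same values, shuffled

h1, h2 = entropy_bits(original), entropy_bits(rearranged)
print(h1, h2)   # identical values: the measure ignores ordering
```

Yet as waveforms the two series are entirely different time
functions, and a real system would respond to them differently.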

It seems to me that there's a trapdoor effect here. Given a
particular design of a control system, you can show that the
calculations of information content, flow, or whatever result in
certain distributions of information within the system. Even if
we haven't got those calculations exactly straight yet, let's
assume there is some correct calculation.

But once having gone from the control-system design to the
information calculations, you can't get back again. Provided with
a different set of results from information calculations, you
can't reason backward to deduce the kind of control system that
was present -- or, I'll wager, even that this distribution was
found in a control system.

The reason for the trapdoor effect is exactly the same as it
would be if you were calculating probabilities. Given a series of
throws of the dice, you can come up with probability calculations
for a string of throws. You could calculate the effects on
parallel strings of throws with different numbers of dice. But
given only those probabilities, you can't reconstruct what the
original sequence of throws was. Once you've crossed the border
into the world of probabilities, you can't get back again to the
world of specific relationships and processes. All you can do is
look for more relationships among the probabilities.
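The dice case can be made concrete (the sequences below are
invented for illustration): two different strings of throws can
yield exactly the same empirical probabilities, so the
probabilities alone cannot tell you which string occurred.

```python
from collections import Counter

# Two different sequences of die throws with the same face counts:
seq_a = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
seq_b = [6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1]

def empirical_probs(throws):
    """Relative frequency of each face -- all that a probability
    description retains about the sequence of throws."""
    n = len(throws)
    return {face: c / n for face, c in sorted(Counter(throws).items())}

print(empirical_probs(seq_a) == empirical_probs(seq_b))  # True
print(seq_a == seq_b)                                    # False
```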

I think the same thing is going on with calculations of
information. Once you start using information measures instead of
analog measures of specific variables behaving in specific ways,
you're stuck in that world and can't get back. You might be able
to predict the effect of a given change in a control system on
the measures of information you use, but you wouldn't be able to
say, given only a change in an information measure, what change
in the control process caused it.

When you speak of getting information about the disturbance from
information calculations, you can only mean that given the
information in the control system, you can say how much
information there was in the disturbance. That is not
"reconstructing the disturbance" as Rick and I think about it. It
is reconstructing it to the satisfaction of an information
theorist, but not to the satisfaction of an engineer. You could
deduce that there was some abstract quantity in the input
waveform, but you couldn't specify which of all waveforms
containing that same amount of the quantity was present. You can
get from a specific waveform to a quantity of information, but
you can't go back from the quantity of information to that same
waveform. Once you're in the world of information, you're stuck
there. I repeat myself, but I hope to a good end.

Our arguments about reconstructing the disturbance have been
mixing two levels of abstraction. At the level of signal
magnitudes and time functions, we understand all the
relationships in a control system, and it's clear that a control
system can control its input quantity even if the causes of
disturbances are continually changing, one disturbing variable
being replaced by another in unending succession. At this level
of abstraction we have no problems.

Our problems arose when the word "information" was introduced.
Now we had to think about the variables and signals not just as
specific physical quantities related in specific ways, but about
the information content of these quantities. Unfortunately, the
word "information" kept slipping back and forth between two
meanings: the meaning of an analog measure, and the meaning of a
global measure of a distribution of quantized samples. It's
perfectly clear that given the value of the output and the value
of the perceptual signal, and the knowledge that p depends on d
and o according to p = o + d, we could always solve for d. In
fact, given the design of the control system and knowledge of the
reference signal, we could deduce d when given the state of any
other variable in the loop. All that is just algebra or
differential equations. There was no confusion until the
information theorists got on board.
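That algebra can be exhibited in a toy simulation (a minimal
sketch, not Powers's own code; all parameters are illustrative):
a one-level loop with p = o + d controls p near the reference
while the disturbance keeps changing, and d is recovered exactly
as d = p - o, with no information-theoretic machinery at all.

```python
import math

dt, gain, r = 0.01, 50.0, 10.0   # time step, loop gain, reference
o = 0.0                          # output quantity
worst_error = 0.0                # largest |r - p| after the transient
worst_recon = 0.0                # largest error in reconstructing d
for step in range(2000):
    t = step * dt
    d = 5.0 * math.sin(math.pi * t)   # continually changing disturbance
    p = o + d                         # perceptual signal: p = o + d
    worst_recon = max(worst_recon, abs((p - o) - d))  # d = p - o
    o += gain * (r - p) * dt          # integrating output opposes error
    if t > 1.0:                       # after the startup transient
        worst_error = max(worst_error, abs(r - p))

print(worst_recon)   # effectively zero: plain algebra recovers d
print(worst_error)   # small next to the 5-unit disturbance: p stays near r
```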

What hasn't been clear to me until now is that introducing
information theory made ABSOLUTELY NO DIFFERENCE in the way we
did our systems analyses and predictions. All it did was to add a
level of abstract representation of what was going on. The
abstract representation relies completely on the level of
analysis that we're already using. It simply takes the given
relationships and analyzes them in terms of global measures of
information and probabilities.

I understand what the information theorists hope for. They hope
that by translating the physical measures into abstract
informational quantities, they will discover regularities and
constraints that are not visible at the lower level of
representation, in much the way physics uses conservation laws to
provide short-cuts in the analysis of physical processes. But I
am not sure that they have succeeded in finding any comparable
constraints -- any constraints that have the force of a
conservation law in the behavior of the system as seen at the
lower level of abstraction. There is no basic reason why
manipulating abstract representations should reveal any
fundamental property of nature. Sometimes such manipulations seem
to work out, but I think that most of them are just
manipulations.

Of course if such constraints could actually be found, they might
be of great usefulness in constructing and analyzing models of
behavioral organization. Consider how conservation laws are used
to aid the analysis of robot arm dynamics. The equations of
motion are most directly constructed by starting with the law
that the sum of potential and kinetic energy is a constant. This
constraint leads directly (if not easily) to all the equations of
motion. So I don't question the motivation. All I question is
whether information theory is capable of providing
generalizations of such power and usefulness -- whether it has
done so yet, or whether it will do so in the future.
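The force of such a constraint is easy to demonstrate numerically
(an illustrative sketch, not from the post): in a frictionless
pendulum, T + V stays essentially constant along the whole motion,
which is exactly the kind of invariant that makes energy methods a
shortcut to the equations of motion.

```python
import math

g, L = 9.81, 1.0                 # gravity, pendulum length (unit mass)
theta, omega = 0.5, 0.0          # initial angle (rad), angular velocity
dt = 1e-4                        # time step (semi-implicit Euler)

def energy(theta, omega):
    T = 0.5 * (L * omega) ** 2           # kinetic energy
    V = -g * L * math.cos(theta)         # potential energy
    return T + V

E0 = energy(theta, omega)
drift = 0.0
for _ in range(100000):                  # simulate 10 seconds
    omega -= (g / L) * math.sin(theta) * dt
    theta += omega * dt
    drift = max(drift, abs(energy(theta, omega) - E0))

print(drift)   # tiny: T + V is conserved to numerical accuracy
```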
--------------------------------------------------------------
Best,

Bill P.