[From Bill Powers (930404.0800)]

Allen Randall and Martin Taylor (past week of posts):

Rick Marken yelled:

NO INFORMATION GETS THROUGH TO THE OUTPUT.

And Allen replied:

Yes! This was exactly the entire point of Ashby's Law. Ashby defined control as the blockage of disturbance information.

It seems to me that there's a contradiction in saying that control blocks information from entering the control system, and then in another context insisting that there is enough information in the control system to allow reconstruction of the disturbance. This is saying that you can reconstruct the disturbance by using only a fraction of the information in the disturbance. It seems to me that if you block some of the information from entering the control system, your reconstruction is going to be inaccurate to the same degree. If you block 90% of the information, how can the remaining 10% be sufficient to reconstruct the disturbance waveform?


---------------------------------

A different problem: Ashby didn't define control as blockage of information. He _characterized_ the effect of control as blockage of information. There's a big difference.

Information itself doesn't specify things like waveforms or quantitative relationships among variables. The calculations of information theory discard that sort of information. You can have two completely different waveforms that contain exactly the same amount of information. You can go from the waveform to the information content, but you can't get back from the information content to the same waveform.

Consider the definition I = b*log(R/r), where R is the full range of a signal, r is the smallest resolvable difference, and b fixes the units (with base-2 logarithms and b = 1, I is in bits). This is the amount of information in a single sample of a waveform. If the waveform is being sampled at some frequency f, equal to the optimum rate, then the information rate is f*I. Some aspects of the waveform can be specified by computing the probability of a given amplitude within the range, so that the information becomes a sum over all amplitudes: the information contained in each amplitude, weighted by the probability of occurrence of that amplitude, accumulated over an interval of time.
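As a minimal sketch of that arithmetic (the values of R, r, and f below are arbitrary examples, and base-2 logarithms are assumed so that I comes out in bits):

```python
import math

# One sample resolves R/r distinguishable levels, so it carries
# I = log2(R/r) bits; sampled at rate f, the information rate is f * I.
# R, r, and f here are arbitrary example values.
def bits_per_sample(R, r):
    return math.log2(R / r)

def info_rate(R, r, f):
    return f * bits_per_sample(R, r)

I = bits_per_sample(R=1024.0, r=1.0)         # 10.0 bits per sample
rate = info_rate(R=1024.0, r=1.0, f=8000.0)  # 80000.0 bits per second
```

Note that nothing in these two numbers depends on what the waveform actually looked like, only on its range, resolution, and bandwidth.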

But we are still left with the fact that this information measure is basically a statistical sort of measure, which applies to an ensemble of samples and is completely insensitive to the order of their occurrence. If a waveform is represented by the series of numbers 12, 15, 20, 15, -5, -10, we could calculate some information content. But we would calculate exactly the same information content if we rearranged these numbers at random: 15, -5, 15, 12, -10, 20. There would be no way to reconstruct the order of the samples from knowledge of the information content. The ordering of the samples is not used in the calculations of information content. But it is vital to any explanation of the operation of a real system.
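The order-insensitivity is easy to demonstrate with Shannon's entropy of the empirical amplitude distribution (the function below is my own illustration, not anything from the correspondence):

```python
import math
from collections import Counter

def entropy_bits(samples):
    # Shannon entropy of the empirical amplitude distribution.
    # Only the relative frequency of each amplitude enters the sum;
    # the order of the samples is discarded entirely.
    n = len(samples)
    counts = sorted(Counter(samples).values())
    return -sum((c / n) * math.log2(c / n) for c in counts)

original = [12, 15, 20, 15, -5, -10]
shuffled = [15, -5, 15, 12, -10, 20]
same = entropy_bits(original) == entropy_bits(shuffled)  # True
```

Any permutation of the series gives the same number, which is exactly why the number cannot tell you which permutation produced it.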

It seems to me that there's a trapdoor effect here. Given a particular design of a control system, you can show that the calculations of information content, flow, or whatever result in certain distributions of information within the system. Even if we haven't got those calculations exactly straight yet, let's assume there is some correct calculation.

But once having gone from the control-system design to the information calculations, you can't get back again. Provided with a different set of results from information calculations, you can't reason backward to deduce the kind of control system that was present -- or, I'll wager, even that this distribution was found in a control system.

The reason for the trapdoor effect is exactly the same as it would be if you were calculating probabilities. Given a series of throws of the dice, you can come up with probability calculations for a string of throws. You could calculate the effects on probabilities of adding more dice, or adding biases, or combining parallel strings of throws with different numbers of dice. But given only those probabilities, you can't reconstruct what the original sequence of throws was. Once you've crossed the border into the world of probabilities, you can't get back again to the world of specific relationships and processes. All you can do is look for more relationships among the probabilities.

I think the same thing is going on with calculations of information. Once you start using information measures instead of analog measures of specific variables behaving in specific ways, you're stuck in that world and can't get back. You might be able to predict the effect of a given change in a control system on the measures of information you use, but you wouldn't be able to say, given only a change in an information measure, what change in the control process caused it.

When you speak of getting information about the disturbance from information calculations, you can only mean that given the information in the control system, you can say how much information there was in the disturbance. That is not "reconstructing the disturbance" as Rick and I think about it. It is reconstructing it to the satisfaction of an information theorist, but not to the satisfaction of an engineer. You could deduce that there was some abstract quantity in the input waveform, but you couldn't specify which of all waveforms containing that same amount of the quantity was present. You can get from a specific waveform to a quantity of information, but you can't go back from the quantity of information to that same waveform. Once you're in the world of information, you're stuck there. I repeat myself, but I hope to a good end.

Our arguments about reconstructing the disturbance have been mixing two levels of abstraction. At the level of signal magnitudes and time functions, we understand all the relationships in a control system, and it's clear that a control system can control its input quantity even if the causes of disturbances are continually changing, one disturbing variable being replaced by another in unending succession. At this level of abstraction we have no problems.

Our problems arose when the word "information" was introduced. Now we had to think about the variables and signals not just as specific physical quantities related in specific ways, but also in terms of the information content of these quantities. Unfortunately, the word "information" kept slipping back and forth between two meanings: the meaning of an analog measure, and the meaning of a global measure of a distribution of quantized samples. It's perfectly clear that given the value of the output and the value of the perceptual signal, and the knowledge that p depends on d and o according to p = o + d, we could always solve for d. In fact, given the design of the control system and knowledge of the reference signal, we could deduce d when given the state of any other variable in the loop. All that is just algebra or differential equations. There was no confusion until the information theorists got on board.
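A toy discrete-time loop makes the algebraic point concrete (the gain and the disturbance waveform below are arbitrary assumptions of mine, not anyone's model): given o and p at the same instant, d = p - o recovers the disturbance to within floating-point rounding.

```python
import math

# Minimal discrete-time control loop (all parameters are made-up
# examples): the input quantity obeys p = o + d, and the output
# integrates against the error, o <- o + gain * (r_ref - p).
def run(disturbance, r_ref=0.0, gain=0.2):
    o = 0.0
    recovered = []
    for d in disturbance:
        p = o + d                # p depends on output and disturbance
        recovered.append(p - o)  # plain algebra: d = p - o
        o += gain * (r_ref - p)  # output acts to oppose the error
    return recovered

dist = [10.0 * math.sin(0.3 * k) for k in range(50)]
reconstructed = run(dist)
# reconstructed matches dist sample for sample, to rounding error
```

No informational quantity appears anywhere in the reconstruction; it is system equations and known signal values doing all the work.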

What hasn't been clear to me until now is that introducing information theory made ABSOLUTELY NO DIFFERENCE in the way we did our systems analyses and predictions. All it did was to add a level of abstract representation of what was going on. The abstract representation relies completely on the level of analysis that we're already using. It simply takes the given relationships and analyzes them in terms of global measures of information and probabilities.

I understand what the information theorists hope for. They hope that by translating the physical measures into abstract informational quantities, they will discover regularities and constraints that are not visible at the lower level of representation, in much the way physics uses conservation laws to provide short-cuts in the analysis of physical processes. But I am not sure that they have succeeded in finding any comparable constraints -- any constraints that have the force of a conservation law in the behavior of the system as seen at the lower level of abstraction. There is no basic reason why manipulating abstract representations should reveal any fundamental property of nature. Sometimes such manipulations seem to work out, but I think that most of them are just manipulations.

Of course if such constraints could actually be found, they might be of great usefulness in constructing and analyzing models of behavioral organization. Consider how conservation laws are used to aid the analysis of robot arm dynamics. The equations of motion are most directly constructed by starting with the law that the sum of potential and kinetic energy is a constant. This constraint leads directly (if not easily) to all the equations of motion. So I don't question the motivation. All I question is whether information theory is capable of providing generalizations of such power and usefulness -- whether it has done so yet, or whether it will do so in the future.
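As a toy illustration of what a constraint with that kind of force looks like (a frictionless pendulum rather than a robot arm; all parameters are arbitrary example values): the sum of kinetic and potential energy stays essentially constant along the simulated trajectory, and that single fact pins down the dynamics.

```python
import math

# Frictionless pendulum, theta'' = -(g/L) * sin(theta).
# m, L, g, dt are arbitrary example values.
m, L, g, dt = 1.0, 1.0, 9.81, 1e-3

def energy(theta, w):
    # kinetic + potential energy of the bob
    return 0.5 * m * (L * w) ** 2 + m * g * L * (1.0 - math.cos(theta))

theta, w = 1.0, 0.0            # released from rest at 1 radian
E0 = energy(theta, w)
for _ in range(10_000):        # ten seconds of simulated time
    w += -(g / L) * math.sin(theta) * dt  # velocity update first...
    theta += w * dt                       # ...then position (symplectic Euler)
drift = abs(energy(theta, w) - E0) / E0   # relative energy drift stays small
```

The conservation law here constrains every instant of the specific trajectory; a global information measure, by contrast, says nothing about any particular instant.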

--------------------------------------------------------------

Best,
Bill P.