[Bill Powers (930315.0700)]

Allan Randall (930314.1830) --

This post goes on for a little more than 6 pages, and I

have already cut out large parts of it. I don't seem to have had

enough to do today. Just a warning to those who would like to use

the delete key and make some space.

To my knowledge, calculus actually CANNOT make the kind of

predictions you are asking of information theory. Calculus can

be used to make predictions only in combination with a physical

model such as Newtonian mechanics.

What are physical models but mathematical forms, manipulated

according to mathematical rules, that model or idealize

observations? Given that each element of mass attracts each other

element with a force proportional to (exactly) the product of the

masses and (exactly) the inverse square of the distance between

them, and given expressions for the conservation of potential +

kinetic energy, one can apply the calculus and derive the fact

that orbits are conic sections. That is the kind of prediction I

am asking of information theory: a prediction of how the system

will actually behave through time.

···

----------------------------

If I were trying to model behavior that takes place under

difficult conditions, this analysis might offer more of

interest by way of predicting limits of performance.

You seem to be admitting here that information theory might

have something useful to say about control systems.

Insofar as information theory could predict the limits of

performance given signals and signal-handling devices with

certain characteristics and in a known organization, sure.

So I guess I still don't understand exactly where you stand. Is

information theory completely wrong-headed or is it correct,

but of little use to PCT?

Information theory rests on definitions and mathematical

manipulations. Unless someone has made an undetected mathematical

blunder, the calculations of information theory follow correctly

from the premises. It's unlikely to be "incorrect" in those

terms. The problems I see come not in the internal consistency of

IT, but in its applications to observations. Premises can be

wrong; when they are wrong, no amount of mathematical correctness

will make the conclusions right. I don't yet see how IT is

actually linked in any rigorous way to specific physical

situations.

--------------------------------------

RE: statement of the challenge.

Using the three conditions (prior to the experiment being

performed), make some assumptions about the bandwidths of the

system, and compute in information theory terms the amount of

control required for the three conditions.

The challenge was a response to the assertion that PCT could be

derived from information theory. The prediction I'm asking for is

not how much control is required, but how much control there will

be in the two situations. To use a theory to derive the fact that

control will result from either arrangement means to make

predictions by manipulations that follow the rules of the theory.

I'm going to skip a lot that I wrote here, because prediction is

the real point.

------------------------------------------------------------

... from Ashby's diagrams + information theory, one cannot

predict what exactly R, the regulator, is doing. You cannot

predict that R is going to oppose the disturbance. Whether this

will meet your requirements for the challenge is the main point

I'd like clarified before accepting.

If you stick with these conclusions, the challenge is unnecessary

because you have agreed to my original claim. You are agreeing

that information theory can't provide the predictions of behavior

that control theory provides, but can only be applied once those

predictions are known and verified.

-----------------------------------------------------------

The very concept of a "real physical entropy" distinct from the

"formal entropy" is quite meaningless. There is no reason to

separate the physical entropy from the informational entropy.

You're just reasserting the claim that they are the same. This

claim is based on nothing more (I claim) than a similarity in

mathematical forms. In all the information-theoretic stuff I have

seen, the assumption is that information and energy flow the same

way. I was showing that energy does not go in the direction that

is commonly assumed in the nervous system. But in addition to

that, it's possible to show that messages can be sent with energy

flowing EITHER way (for instance, sending a message to the guy

operating a winch by intermittently putting a frictional drag on

the cable). If information can flow one way with energy going

either way, then the "entropy" involved in information flow is

not physical entropy. Physical entropy is always transferred

opposite to energy flow.

When the information is cancelled, this means, in my

terminology, that E has experienced "information loss," and

thereby has high information content!

Are you saying that the system begins with a certain information

content, and after losing information it has a greater

information content? I don't follow your arithmetic.

The transfer of one bit *requires* an increase in entropy to

accomplish the local decrease involved in the transmission of

information.

This seems to imply that at the terminus of the transmission,

there is a local decrease in entropy. I would like to know where

that occurs. I can see no point in the transmission of a neural

impulse where there is anything but a loss of energy (and an

increase in physical entropy) in the next place that the impulse

reaches. Where is this final place where we suddenly get the

opposite effect (taking energy FROM the arriving signal)?

This usage of "requires" is odd: I think you have the logical

implication backward. Transmission of a bit PREDICTS a change of

entropy, but it does not follow that a change of entropy PREDICTS

transmission of a bit. Entropy is not a cause, but an effect.

The logical implication is "It is not the case that a bit is

transferred and entropy does not change."

There is no requirement that this be anywhere

near the theoretical lower limit (work must be done to transmit

information, but it need not be done with near-perfect

efficiency).

But aren't you assuming that this work is done by the transmitter

on the receiver? I can think of numerous methods to transmit a

message in a way that requires the receiver to do work on the

transmitter. For example, I can detect the presence of a control

system by pushing and seeing if there is resistance to the push.

I have to do work on the system to receive that message, and the

message can be extracted from the amount of work I do. So by

resisting and not resisting, you can send me 1's and 0's. 1 means

I'm doing some work on you and you are controlling; 0 means I'm

not and you're not.

-------------------------------------------------------------

If the system is controlling, the percept signal carries

few bits. The channel carries more bits only if there is a

disturbance.

The latter is true only for homeostasis. Changing the reference

signal to a new value will also cause more bits to be carried,

for a moment, by the percept signal. Even without a disturbance,

the system will then reduce those bits as closely as possible to

zero. The action is based on the state of the percept, not the

state of the disturbance. And the state of the percept does not

depend on the state of the disturbance alone. This means that it

does not depend on the disturbance in a known way AT ALL.

As Rick Marken has been trying to say, if you know only the sum

of two numbers, you know nothing about either of the numbers by

itself. The percept represents the sum of effects from the

system's own output and from the disturbance. A percept of 3

units could represent the sum of 6 and -3, 6000 and -5997, -200

and 203, and so forth. There is NO information in the percept

about the disturbance, in a control system.
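The ambiguity can be checked by brute force. A minimal sketch (the numeric range is arbitrary; any bounded range gives the same point):

```python
# A percept of 3 units is consistent with a great many (output, disturbance)
# pairs. Enumerate integer pairs in an arbitrary range whose sum is 3.
percept = 3
pairs = [(o, d) for o in range(-300, 301) for d in range(-300, 301)
         if o + d == percept]

print(len(pairs))            # hundreds of equally consistent pairs
print((6, -3) in pairs)      # True
print((-200, 203) in pairs)  # True
```

Knowing only the sum, every one of these pairs is an equally good candidate, so the percept by itself singles out no value of the disturbance.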

-----------------------------------------------------------

E is the thing under control, according to Ashby, so it HAS to

be the percept, doesn't it? You seem to be placing the thing

controlled in the external environment. Ashby places it clearly

internal to the control system.

It's clearly inside the control system according to PCT. I doubt

that Ashby saw it that way, as he made the "regulator" into a

different unit. I was taking the external view in treating E as

the environmental controlled quantity, with the sensor and

percept being internal details of the regulator. But your

interpretation is just as good.

--------------------------------------------------------------

Perhaps we are using the term "disturbance" differently? I

would define it as the net effect of things in the world on the

CEV.

I have spoken of this ambiguity before. I use the term

"disturbing variable" to mean the independent CAUSE of the

disturbance. If control is tight, a 100-unit change in the

disturbing variable might produce only a 1-unit disturbance of

the controlled variable, where without the control it would

produce a 100-unit disturbance. Just speaking of "the

disturbance" is ambiguous -- do you mean the applied force, or

the movement that results from it?
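The 100-to-1 attenuation can be made concrete with a steady-state sketch of a simple proportional loop (the unit environment links and the loop gain of 99 are my illustrative assumptions, chosen so the numbers above come out exactly):

```python
# Steady-state effect of a disturbing variable d on a controlled
# variable qc in a proportional control loop with unit environment
# links: qc = d + output, output = G * (r - qc).

def controlled_variable(d, G, r=0.0):
    """Solve qc = d + G*(r - qc) for the steady-state value of qc."""
    return (d + G * r) / (1.0 + G)

print(controlled_variable(100.0, G=99.0))  # 1.0   (tight control)
print(controlled_variable(100.0, G=0.0))   # 100.0 (no control)
```

With a loop gain of 99, a 100-unit change in the disturbing variable moves the controlled variable by only 1 unit; remove the control and the full 100-unit disturbance appears.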

You can unambiguously specify the disturbing variable without

considering what the system it affects is doing. I can specify

that I am pushing on your arm with a force of 20 pounds, or a

slowly-changing sine-wave force with an amplitude of 20 pounds

and a certain frequency. By itself, this doesn't say anything

about how your arm will move; that depends on what muscle forces

are also acting at the same time and how they are changing. You

can pull back on your end of the rubber bands, but you can't say

how the knot will move as a result.

When multiple disturbing variables exist, you can reduce them to

a single equivalent disturbing variable acting through some

equivalent path on a one-dimensional controlled variable. But you

can't say what changes in the controlled variable will actually

occur without knowing how the output of the system is acting.

That output also affects the controlled variable; it can cancel

most of the effect that the disturbing variable would have had in

the absence of control. You can't arbitrarily specify the actual

disturbance of the controlled variable; that depends on the

properties of the system being disturbed. All you can specify

arbitrarily is the state of the disturbing variable.

When you model a control system, you MUST apply modelled

disturbances via a disturbing variable. If you simply assume a

given change in the controlled variable, you're breaking the

loop: you're saying that the output makes no contribution to the

state of the controlled variable. The amount of change in the

controlled variable is one of the effects to be calculated, not

an independent variable.
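A minimal time-step sketch of what "applying the disturbance via a disturbing variable" means in a model (gains, slowing factor, and run length are all arbitrary choices of mine):

```python
# Each step, the controlled variable qc is COMPUTED from both the
# disturbing variable d(t) and the system's own output -- it is an
# effect of the loop, never an independently specified input.

def simulate(d_series, gain=50.0, slowing=0.02, r=0.0):
    output, trace = 0.0, []
    for d in d_series:
        qc = d + output                              # sum of effects
        error = r - qc
        output += slowing * (gain * error - output)  # leaky integrator
        trace.append(qc)
    return trace

trace = simulate([100.0] * 200)       # 100-unit step in d(t)
print(trace[0], round(trace[-1], 2))  # 100.0 1.96
```

The controlled variable starts at the full disturbance value and settles near d/(1 + gain); assuming a fixed change in qc instead would amount to cutting the loop.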

-----------------------------------------------------------

Note that I never said that the control system gets "direct

information" about D - that whole notion is meaningless. But it

most definitely DOES get information about D, however you

choose to arbitrarily define it.

This is not true. If I send you a Morse Code message and John

simultaneously sends you a Morse Code message, and you receive

only the OR of these two messages, how much information do you

have about either John's message or mine? None at all. If the

receiver gets only the OR of the messages, it has no way to sort

out which dot or dash should be attributed to me or to John. The

only information it can get is about the resulting combined

message -- which in fact will be semantic gibberish.
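The point can be counted directly. A sketch with short on/off streams standing in for the Morse messages (the stream length and the received pattern are arbitrary):

```python
# Two senders each transmit a 6-symbol on/off stream; the receiver sees
# only the bitwise OR. Count how many distinct message pairs are
# consistent with one observed OR pattern.
from itertools import product

received = (1, 1, 0, 1, 0, 1)  # an arbitrary observed OR pattern

consistent = [(mine, johns)
              for mine in product((0, 1), repeat=6)
              for johns in product((0, 1), repeat=6)
              if tuple(a | b for a, b in zip(mine, johns)) == received]

print(len(consistent))  # dozens of pairs fit -- no way to attribute
                        # any mark to either sender
```

Every received mark could have come from either sender or both, so the OR alone carries no information about either component message.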

A control system's input function receives information only about

the controlled variable. It can't tell how much of the amplitude

of that variable is due to an independent disturbance and how

much is due to its own output function. It experiences only the

sum.

This information is definitely contained in the perceptual

signal, or the organism would be unable to control against the

disturbance.

You're toying with the same paradox that got Ashby. Suppose the

disturbance is transmitting 100 bits per second to the controlled

variable. According to the Law of Requisite Information, the

output must also transmit 100 bits per second to the controlled

variable if perfect control is to be achieved. This is clearly

impossible, because then there would be zero bits per second

coming from the controlled variable into the system's output

function, while the output function is producing 100 bits per

second of information. So what level of control would be

possible? Suppose that the output function transmitted only 50

bits per second, the amount required by the Law to "block" 50 bits

per second from the disturbance. That leaves 50 bits per second

unblocked, reaching the controlled variable, which is just

sufficient to cause a perfect output function to produce the 50

bits per second assumed. On this basis you would predict that the

error-driven control system could reduce the information flow

from the disturbing variable to the controlled variable by at

most one half.

At the same time, a compensating regulator could be perfect: 100

bits per second could pass from the disturbing variable to the

controlled variable and also to the regulator. A perfect

regulator would pass the whole 100 bits per second to its output,

which according to the Law is just sufficient to block the 100

bits from the disturbance entirely. So no bits would reach the

controlled variable.
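The arithmetic of this comparison reduces to a self-consistency condition, which can be sketched as follows (this is my reading of the Law, with all quantities in bits per second):

```python
# Error-driven system: the output can transmit only the bits that reach
# it from the controlled variable, i.e. the unblocked remainder, and it
# blocks exactly what it transmits. Self-consistency: b = D - b.
def error_driven_bits_unblocked(D):
    blocked = D / 2.0          # solution of blocked = D - blocked
    return D - blocked         # bits still reaching the controlled variable

# Compensating regulator: it receives the full D bits directly from the
# disturbing variable and can block them all.
def compensator_bits_unblocked(D):
    return D - D               # zero bits reach the controlled variable

print(error_driven_bits_unblocked(100.0))  # 50.0
print(compensator_bits_unblocked(100.0))   # 0.0
```

On this accounting the error-driven system can never block more than half the flow, while the compensator blocks all of it, which is the prediction the experiment should contradict.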

By this reasoning, the disturbance-based system has a wide margin

of performance over the error-based system even with imperfect

signal transmissions.

I claim that the experiment will show that this is not true. If

it does, something is wrong with the Law of Requisite

Information.

...The control system needs no information about them

[multiple disturbances], singly or collectively.

How can you say this, when the sole purpose of the control

system is to oppose the disturbance? It can't oppose something

it has NO information about. It simply cannot.

When you get this figured out, you can finally claim to

understand PCT.

------------------------------------------------------------

RE: tautology in defining compensating control

No. This is a tautology, as you well know. I wasn't trying to

say that if such a thing were possible, then it would be

possible. I was simply trying to state that it *is* possible.

That's what I said. You asserted that it is possible, but you

made your assertion sound like a deduction because you didn't

fill in all the premises that the deduction would require.

You did present the statements as a (partial) deductive argument

(930310):

If I can actually have complete knowledge of the disturbance D,

it is theoretically possible for me to respond appropriately

before it has had an effect on the controlled variable.

I was pointing out that the single "if" you supplied was

insufficient; you might have complete knowledge of D but be

unable to calculate or produce an output of the exactly-required

amount, or you might respond a little early or a little late, and

so forth. To make your conclusion true (I could respond

appropriately) you must supply all the required premises, among

which are those that define what "appropriately" means.

If you had filled in all the premises, you would have found that

your assertion had to be one of them. So you were simply making a

groundless assertion, and the rest is window-dressing.

Anything can be made into a tautology in the manner that you

just did. If I state A, you reword it into "If A then A," and

you have your tautology. This is a common debating tactic, but

it holds no water.

It's a common debating tactic, but it's by no means possible to

make ANY statement into a tautology:

If it rains tomorrow, my car will get wet. It will rain tomorrow;

I conclude that my car will get wet.

It is not tautological to say that my car will get wet, because

that statement does not depend on assuming that it will get wet.

It depends on assuming a fact: that it will rain tomorrow. So the

conclusion becomes testable; the car might not get wet tomorrow.
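The distinction can be checked mechanically with a truth table (a sketch; "rain" and "wet" are just the two propositions above):

```python
from itertools import product

# All truth assignments to (rain, wet).
rows = list(product((False, True), repeat=2))

# "If A then A" is true in every row: a genuine tautology.
assert all((not a) or a for a, _ in rows)

# "My car will get wet" is NOT a tautology: it is false in some rows.
assert not all(wet for _, wet in rows)

# It follows only where the factual premises hold: in every row where
# (rain -> wet) and rain are both true, wet is true (modus ponens).
assert all(wet for rain, wet in rows if ((not rain) or wet) and rain)
print("checked")
```

The conclusion depends on a contingent premise, so it can fail; "If A then A" cannot, which is exactly what makes it empty.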

When you say that a perfect compensator compensates perfectly,

you are just asserting the same statement twice. The premises on

which a perfect compensator depends contain exactly, and only,

the assumption of perfect compensation. The only way to make the

argument non-tautological is to introduce factual premises: it is

possible for a system to (fill in the requirements on each part).

If those premises should hold true, then perfect compensation is

possible. But perfect compensation may not be possible, because

the assumed premises may not be factual. That's the only way to

say something meaningful about perfect compensation.

A tautology is simply an argument that looks like a deductive

argument but contains no possibility of being false. I agree that

people often present such arguments, trying to make them seem to

lead to a necessary conclusion and to disguise the hidden

assertion of the conclusion. That man must have been guilty of

something; the police arrested him, didn't they?

-------------------------------------------------------------

Best,

Bill P.