# H(S)

[Allan Randall (930618.1400 EDT)]

Bill Powers (930612.0930 MDT)

When we say "disturbance" in equations, we mean the STATE OF THE
DISTURBING VARIABLE:
...We do NOT mean the amount by which the knot (X) is displaced from
the target position.

In spite of all the other misunderstandings, I don't think this is a
problem. I do not mean the amount by which the knot is moved. I mean
the force(s) acting on the knot from the external environment, whether or
not the knot actually moves.

Bill Powers (930613.2200 MDT)

>Bill, I have a real problem with this whole log(D/r) thing...
I got it from Martin Taylor... I'm just following orders.

Well, all I can say is that this is not the standard definition, which
is -log(probability), as I've described many times in the past. I suspect
you are using log(D/r) differently than Martin, but I will let him speak
for himself. The way you are using it, it is not equivalent to information.
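
For concreteness, here is a minimal sketch of the standard measure in Python. The name `self_information` is my own label, and treating D/r as a count of 10 equally likely distinguishable values is an assumption made only for illustration. Under that assumption, and only under it, log(D/r) agrees with -log(probability).

```python
import math

def self_information(p: float) -> float:
    """Information in bits of an event with probability p: -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)

# Assumed for illustration: a range D = 1.0 measured at resolution r = 0.1,
# giving D/r = 10 distinguishable values.
levels = 10

# If all 10 values are equally likely, each has probability 1/10 and
# -log2(1/10) equals log2(10), i.e. log2(D/r).
uniform_bits = self_information(1 / levels)

# If one value turns up 90% of the time, observing it carries far less
# information, although D/r has not changed at all.
common_bits = self_information(0.9)

print(uniform_bits, common_bits)
```

The second figure is the point: the number of possible values is the same in both cases, but the information is not.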

>If two signals have wildly different scalings of amplitude, but
>are otherwise identical, then I can write a very short program
>to convert one to the other...

Suppose you have the relationship X(t) = Y(t)/10. ... No matter
how you write the program, when you compute X from Y you will get
a waveform with less relative resolution than there was in Y.
If you multiply X by 10, you will not get Y back; ...

Not necessarily. It depends on the representation used for the numbers. E.g.:

00.51928374 / 10.00000000 --> 00.05192837 * 10.00000000 -->
00.51928370 LOSS

But:
.51928374E+00 / .10000000E+02 --> .51928374E-01 * .10000000E+02 -->
.51928374E+00 NO LOSS

So if I scale a sequence of these numbers, I do not have to also scale
the resolution at which they were measured nor the resolution at which
they are represented in the computer in order to retain information.
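
The two cases above can be reproduced with Python's `decimal` module. This is only a sketch: I am assuming the fixed-point format keeps 8 digits after the decimal point and truncates, and that the floating-point format keeps 8 significant digits, which is how I read the two examples.

```python
from decimal import Decimal, Context, ROUND_DOWN

x = Decimal("0.51928374")
ten = Decimal(10)

# Fixed-point (assumed): exactly 8 digits after the decimal point,
# anything beyond that is truncated.
step = Decimal("0.00000001")
fixed_scaled = (x / ten).quantize(step, rounding=ROUND_DOWN)
fixed_back = (fixed_scaled * ten).quantize(step, rounding=ROUND_DOWN)

# Floating-point (assumed): 8 significant digits, with the exponent
# carried separately.
ctx = Context(prec=8)
float_scaled = ctx.divide(x, ten)
float_back = ctx.multiply(float_scaled, ten)

print(fixed_scaled, fixed_back, fixed_back == x)
print(float_scaled, float_back, float_back == x)
```

In the fixed-point case the last digit is gone after the round trip. In the floating-point case the original value comes back unchanged.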

I don't want to make a federal case out of this - it's not a really
important point. I was only trying to illustrate that your statement
about log(D/r), while true about THAT measure, is not IN GENERAL true
about information. D/r is just the number of possibilities inherent in
the numbering system or the measuring apparatus. If you considered this to
be the inverse of the probability, then I would have no problem. But this is
not how you've been using it. You are tying the whole thing to the way
numbers are represented - to the precision at which measurements are made.
But many other schemes are also possible - this is why the ideas of computer
programs/languages and probabilities/distributions are much better ways to

Another way of viewing it: D/r *can* be used to compute information on a
sequence of numbers *after* they have been compressed to their shortest
possible representation in whatever language we're using. Only then does
it make sense to use the precision of the numbers themselves to compute
the information. Then your comments would be more or less accurate.
But this is a BIG difference. This compressibility has not been
addressed in the measure log(D/r) as you are using it.

>...tell me how to compute log(D/r) for the
>following sequence S:

> .0 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .2 .7 .7
>.7 .7 .7 .7 .7

Before I could do that, you would have to tell me the size of r.
There isn't any inherent "resolution" in an arbitrary sequence of
numbers like the one above.

Sorry - I had thought it implied that the resolution was .1.
If r had been .001, I would have written .000 .500 .500, etc.
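
To make the compressibility point concrete for S with r = .1, here is a rough Python sketch. Several things in it are my assumptions, not part of the original question: D is taken to be 1.0 (so D/r = 10), the "language" is a simple run-length code, and each run length is charged log2(len(S)) bits. Run-length coding is certainly not the shortest possible representation of S. It only gives an upper bound, which is enough to show how far off the uncompressed count is.

```python
import math
from itertools import groupby

# Sequence S as written in the post.
text = (".0 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .2 .7 .7 "
        ".7 .7 .7 .7 .7")

# Express each sample in units of r = 0.1, so .5 becomes the integer 5.
S = [round(float(token) * 10) for token in text.split()]

levels = 10                      # assumed D/r, with D = 1.0 and r = 0.1
bits_per_value = math.log2(levels)

# log(D/r) applied to every sample independently, ignoring structure.
naive_bits = len(S) * bits_per_value

# Hypothetical run-length language: each run is a (value, length) pair.
runs = [(value, len(list(group))) for value, group in groupby(S)]
bits_per_length = math.log2(len(S))   # a length is one of len(S) choices
rle_bits = len(runs) * (bits_per_value + bits_per_length)

print(runs)
print(naive_bits, rle_bits)
```

Even this crude code needs far fewer bits than the sample-by-sample count, and a better language could do better still.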

...What they mean depends on the
physical situation they are taken to represent.

Sure - and that depends on our *model* of the physical situation.
When you say: "...depends on the physical situation they are
taken to represent," it sounds like the same thing I mean when I say
"...depends on the chosen language."

If this is a
series of measurements of some physical variable, then we have to
talk about the measuring device's resolution.

If that is the extent of your model of the physical situation, then fine.
As I said, if your model or language is restricted to the number system or
measuring apparatus, then log(D/r) will work. However, the "physical
situation" the numbers are "taken to represent" is generally more involved
than just the measuring resolution. In a human, it is likely to be something
like a hierarchy of control systems.

My problem is that I can't find any link between the
manipulations you talk about and any PHENOMENON.

You misunderstand. The arbitrary hypothetical examples are my attempt to
explain a mathematical system called information theory. This is not
meant to be connected to any particular physical phenomenon. It has
nothing to do with control theory or thermal systems or any other
physical system. It is a mathematical notion. Now I have ALSO talked about
applying this theory to control systems, but in that case, I was not making
up arbitrary hypothetical examples - I was using the control hierarchy as
the language.

... the only answer I've received so far is
"Well, that depends on what you assume."

This is not a fuzzy, ill-defined answer. *Given* a language or model to work
with, information *is* well-defined. But, you said yourself that what the
numbers mean depends on the physical situation we take them to represent.
And this *does* depend on what is assumed.

... How do you
decide what is the most plausible way to set it up for a control
system model of a specific example of behavior? So far I'm
drawing a complete blank on that. And, apparently, so are you.

No - I've said many many times that the control hierarchy is the language.
The general notion of information *is* completely open-ended as
you describe, but as applied to control theory, it is much better defined.

-----------------------------------------
Allan Randall, randall@dciem.dciem.dnd.ca
NTT Systems, Inc.
Toronto, ON