Ashby's uninformative information

[From Rick Marken (960701.1210)]

Peter Cariani (960701) --

Your example is one of transmitting a (known) signal through a noisy channel,
and this assumes that the original signal is known by someone, an external
observer looking at the system.

Right. It's known to the person who is going to measure the amount of
information communicated about the possible sensor states when a particular
sensor state is observed.

Ashby's analysis of uncertainty reduction, however, is only from the
perspective of the signal receiver and its "model" of what its sensors will
register. They are very different perspectives that deal with different
aspects of information (signal transmission vs. uncertainty reduction/
interaction with the external world).

I guess I don't understand Ashby's approach.

In your previous post you said:

The informational content of the measurement is the degree to which
information has been reduced from before the measurement (t0) to afterwards
(t1):

Information gain (bits) = U(t0) - U(t1)

where U (in bits) = -sum over all states i of p(i) * log2(p(i)), p(i) being
the probability of getting a given reading i under the observer's model of
the outcome, i.e., the observer's expectations of what will happen;
p(i = outcome state) = 1 after the measurement is made and the outcome is
known. (A very simple model would be to assume that the probability of a
specific outcome is the same as the observed frequency over the last 100
outcomes, as above.)

Let's say that my model is that the probability of each sensor measurement
is what you said it was in your earlier post:

A: .2
B: .2
C: .2
D: .39
E: .01
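
(To make the arithmetic below concrete, here is a minimal Python sketch of
the U computation, using the formula quoted above; the helper and variable
names are illustrative, not from either post:)

import math

def uncertainty(probs):
    # U = -sum(p * log2 p) in bits; zero-probability states contribute nothing
    return -sum(p * math.log2(p) for p in probs if p > 0)

# the five-state model listed above
model = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.39, "E": 0.01}
print(round(uncertainty(model.values()), 2))  # -> 1.99 (bits)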

Now I can use your formula and these probability estimates to calculate
U(t0), the uncertainty about the observation before the observation is made
(the value of U(t0) is approximately 1.99 bits). But I don't understand how I
compute U(t1), the uncertainty after the observation. According to your
explanation, it seems that the probability of the observation that actually
occurs is 1.0, so U(t1) will always be 0.0. So the information gain
[U(t0) - U(t1)] is always precisely U(t0), which is the information in the
"model" of the probability distribution of the possible observations. So the
amount of information I get from an observation is determined completely by
my model; it doesn't matter what observation is actually made.
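
(A sketch of this point in Python: with p(outcome) = 1 after the measurement,
U(t1) = 0, so the gain comes out the same whichever state is observed; the
names are illustrative:)

import math

def uncertainty(probs):
    # U = -sum(p * log2 p) in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

model = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.39, "E": 0.01}
u_t0 = uncertainty(model.values())

for outcome in model:
    u_t1 = uncertainty([1.0])  # p(outcome) = 1, every other state 0
    print(outcome, round(u_t0 - u_t1, 2))  # 1.99 bits for every outcome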

So it seems to me that Ashby is saying that the amount of information
communicated by an observation depends completely and totally on one's model
of the environment. If I think observation A is a certainty, that is, if my
model of the situation is:

A: 1.0
B: 0
C: 0
D: 0
E: 0

Then U(t0) = -(1.0 * log2(1.0)) = 0 (the zero-probability states contribute
nothing), so I get no information at all no matter what sensory measure I
actually observe. If I observe measure B, I get no information. If I observe
A, I get no information. This is a most remarkable view of information,
indeed. It certainly shows that I was wrong to say that information theories
assume that perceptions communicate information about the real world.
Apparently, some information theories (like Ashby's) assume that perceptions
(observations) communicate no information at all about the real world;
information exists only in what we expect to happen.

Anyway, this is how I understand "information" in the context of sensing/
measurement, and it does not involve any assumptions that "information about
something else" resides in the signal.

That's for sure. It sounds to me like Ashby would have to say not only that
there is no information about the disturbance in the perceptual signal,
but that there is no information about _anything_ in the perceptual signal.
This is farther than I'm willing to go. There is certainly information about
the state of the controlled variable in the perceptual signal, isn't there?

Best

Rick

···

Date: Mon, 01 Jul 1996 12:09:01 -0700
From: Richard Marken <marken@aerospace.aero.org>

From Peter Cariani, July 1, 1996:

Rick Marken:
In your previous post you said:

>The informational content of the measurement

>is the degree to which information has been reduced
>from before the measurement (t0) to afterwards (t1):

I should have said the degree to which uncertainty is
reduced from before to after the measurement. (Sorry.)

Let's say that my model is that the probability of each sensor measurement
is what you said it was in your earlier post:

>A: .2
>B: .2
>C: .2
>D: .39
>E: .01

Now I can use your formula and these probability estimates to calculate
U(t0), the uncertainty about the observation before the observation is made
(the value of U(t0) is approximately 1.99 bits). But I don't understand how I
compute U(t1), the uncertainty after the observation. According to your
explanation, it seems that the probability of the observation that actually
occurs is 1.0, so U(t1) will always be 0.0. So the information gain
[U(t0) - U(t1)] is always precisely U(t0), which is the information in the
"model" of the probability distribution of the possible observations. So the
amount of information I get from an observation is determined completely by
my model; it doesn't matter what observation is actually made.

Uncertainty after the measurement could be greater than 0 if one knew the
outcome was one of several states (e.g., either A or B), but I agree with
you: somewhere I think I've made an error in explaining the computation. It
does not just depend on the model, nor just on the outcome. I'll have to
think about it, look it up, and get back to you.
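
(One way to read the "one of several states" case, as a sketch: if the
measurement only narrows the outcome to {A, B}, renormalizing the model over
the surviving states gives a U(t1) greater than zero. The renormalization
step is an assumption of mine, not something either post specifies:)

import math

def uncertainty(probs):
    # U = -sum(p * log2 p) in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

model = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.39, "E": 0.01}

# the measurement narrows the outcome to A or B: renormalize over survivors
survivors = [model["A"], model["B"]]
posterior = [p / sum(survivors) for p in survivors]  # [0.5, 0.5]

u_t0 = uncertainty(model.values())  # about 1.99 bits
u_t1 = uncertainty(posterior)       # 1.00 bit remains after the measurement
print(round(u_t0 - u_t1, 2))        # gain of about 0.99 bits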

The error is most certainly mine and not Ashby's. I came up with the
example off the top of my head, and it's based on an approach I think
Ashby would approve of, but it's been a long time since I've actually
computed entropies for anything, so I've probably omitted a term...

If an event is completely predictable, then no uncertainty is reduced by
observing it; if an event is very improbable and it happens, then a great
deal of uncertainty is reduced...
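
(What this sentence describes is the standard per-outcome quantity, the
surprisal -log2 p(outcome), which does depend on which state actually occurs;
the term and the sketch are editorial additions, since neither post names it:)

import math

model = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.39, "E": 0.01}

# surprisal: -log2 p(outcome); a certain event (p = 1) carries 0 bits,
# while the improbable E (p = .01) carries about 6.64 bits
for outcome, p in model.items():
    print(outcome, round(-math.log2(p), 2))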

Despite my mistake, the following interpretations certainly don't obtain:

Then I get no information at all no matter what sensory measure I actually
observe. If I observe measure B, I get no information. If I observe A, I get
no information. This is a most remarkable view of information, indeed. It
certainly shows that I was wrong to say that information theories assume
that perceptions communicate information about the real world. Apparently,
some information theories (like Ashby's) assume that perceptions
(observations) communicate no information at all about the real world;
information exists only in what we expect to happen.

>Anyway, this is how I understand "information" in the context of sensing/
>measurement, and it does not involve any assumptions that "information about
>something else" resides in the signal.

That's for sure. It sounds to me like Ashby would have to say not only that
there is no information about the disturbance in the perceptual signal,
but that there is no information about _anything_ in the perceptual signal.
This is farther than I'm willing to go. There is certainly information about
the state of the controlled variable in the perceptual signal, isn't there?

More comments when I get a chance,

Peter