summation of IT polarization

bnhpct · April 7, 1993, 2:02pm

[From: Bruce Nevin (Wed 93047 08:45:25)]

A number of people, most recently Bill, have pointed out the
perilous ambiguity of the term "information". Let us substitute
the expression log(R/r) for one meaning, and the word
"representation" (or possibly "knowledge") for the other.

PCT says o = -d, or very near to it, for continuous values of
output o and disturbance d.

IT says that this means there is a representation of d in o.
IT says that since o = r - p (for continuous values of reference
signal r and perceptual input p), and p = o + d, a representation
of d must be present in p. Otherwise, how could there be a
representation of d in o?

PCT responds that log(R/r) for d is not a representation of d
(the waveform is not reconstructable, etc.).

IT responds that there must be some way to reconstruct a
representation of d from log(R/r) for p, since the representation
of o is in such close correspondence with the representation of
d, and any correspondence is measured by log(R/r).

Probably I haven't got the translation right, but surely
equivocation over the term "information" is making this
discussion needlessly tangled.

Now this is what I think, and I'm sure of course that this will
convince everyone and we can move on to more interesting and
fruitful topics.

"Information" in the sense log(R/r) is a measure of exclusion
from set membership. This is commonly talked of as reduction in
uncertainty as to set membership, but such talk begs the question
of a possible relationship between the two senses of information,
insofar as "uncertainty" is a term appropriate to the "knowledge"
sense of information. There is nothing in the quantum `amount of
information' log(R/r) to indicate what it is that one was
uncertain about, or what it was that one's uncertainty was
reduced about. We know that the number of candidates for set
membership was reduced by log(R/r), but we don't know what
members of what set. (Nor is there anyone in log(R/r) to perceive
the uncertainty, and log(R/r) is not, so far as we know or so far
as anyone has proposed, the value of any signal with respect to
other signals in a control system.)
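To make the point concrete, here is a minimal sketch. It assumes
(my reading, not something the post states) that R is the number
of candidates before and r the number after, and that the log is
taken to base 2. The function name bits_excluded and the two
example sets are hypothetical.

```python
import math

def bits_excluded(candidates_before, candidates_after):
    """log(R/r) in base 2, reading R as candidates before and r as after.

    This reading of R and r is an assumption made for illustration.
    """
    return math.log2(candidates_before / candidates_after)

# Two unrelated situations: 8 candidate letters narrowed to 2,
# and 8 candidate suspects narrowed to 2.
letters_bits = bits_excluded(8, 2)
suspects_bits = bits_excluded(8, 2)

# The two quantities are identical; nothing in the number says
# which set, or which members, the narrowing was about.
print(letters_bits, suspects_bits, letters_bits == suspects_bits)
```

The function receives only two counts, so the identity of the set
cannot be recovered from its result.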

A measurement is not `about' the thing measured in the same sense
that uncertainty is or knowledge is `about' the thing. The
quantity of information is not the information quantified. The
map is not the territory. They are incommensurate, of different
logical type, and to equivocate between them is a serious
epistemological mistake.

If the intent is to talk about the information content of d and
p, then I submit that some way of talking about information
content must be determined first. Quantity of information
is not it. Harris' finding of linguistic information in language
is a candidate for some aspects, but goes only so far as language
goes.

The observation is that o correlates closely with d. The
proposed hypothesis is that information content that is in d is
conveyed from d to the control system by way of p: that p in some
sense contains a representation of d, and this representation is
passed along to o, by way of o=r-p. The hypothesis is that the
control system then calculates the negative of the
representation of d that is in p (the information content that
previously was in d) as the value of o.

But there is no need for the concept of information content or
representation (which as I noted we don't know how to talk about
anyway). The instantaneous comparison of the instantaneous
values r(t) and p(t) produces instantaneous error/output o(t) at
moment t, and in the environment the instantaneous value o(t) and
other environmental factors abstracted to a single value d(t)
are combined as p(t), moment by moment continuously--and this suffices.
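The moment-by-moment loop can be sketched as a small simulation.
Several things here are assumptions and not part of the post: the
output is made an integrator with a hypothetical per-step gain k
(taken literally with unit gain, o = r - p and p = o + d give
o = (r - d)/2, so some such output function is needed to get o
near -d), the reference is held at 0, and the disturbance is a
slow sine wave. The names comparator, simulate and correlation
are illustrative only.

```python
import math

def comparator(r_t, p_t):
    """Instantaneous comparison: sees only the two current values."""
    return r_t - p_t

def simulate(disturbance, reference=0.0, k=0.5):
    """Run the loop one instant at a time; no history is ever consulted."""
    o = 0.0
    outputs, perceptions = [], []
    for d_t in disturbance:
        p_t = o + d_t                        # environment: p(t) = o(t) + d(t)
        outputs.append(o)
        perceptions.append(p_t)
        o += k * comparator(reference, p_t)  # assumed integrating output
    return outputs, perceptions

def correlation(xs, ys):
    """Pearson correlation, as an outside observer would compute it."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Slow sinusoidal disturbance: 4 cycles over 2000 steps.
d = [math.sin(2 * math.pi * 0.2 * n * 0.01) for n in range(2000)]
o, p = simulate(d)

corr_od = correlation(o, d)          # computed only after the fact
max_abs_p = max(abs(x) for x in p)   # how far p strays from r = 0
print("correlation of o with d:", corr_od)
print("largest |p|:", max_abs_p)
```

Nothing inside the loop stores or inspects a time-series. The
correlation is computed afterwards, from recordings, by code that
stands outside the loop.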

The correspondence of variation over time in d(t) with variation
over time in o(t) is a result of continuous entrainment, not of
the transmission of information content (whatever that is).

The concept of information content or representation thereof is
abstracted from a time-series of values of d(t). It is a
configuration or pattern of some sort, like a waveform. But the
comparator at time t has no access to the patterning that may be
findable in a time-series of values of p(t). At any given instant
it has access only to one value at each input: p(t) for instant t
and r(t) for instant t. The result is one value o(t) for instant t.
As d(t) and o(t) vary over time, the instant-by-instant
correspondence p(t)=o(t)+d(t) and the instant-by-instant
correspondence o(t)=r(t)-p(t), always and continuously in
effect, result in the correspondence of whatever pattern
may be in d (over time) to a pattern in p (over time). But the
pattern (the information content) is detectable only by an
outside observer, and IT IS A BYPRODUCT OF CONTROL instant by
instant.

Just so, by manipulating a disturbance in a rubber-band demo the
experimenter can get the participant to write the experimenter's
name in the experimenter's own handwriting. The information
content (pattern=signature) in the disturbance can be observed by
the audience to be present, with a high correlation, in the
output.

But the participant is controlling a perception of relationship
between knot and mark on whiteboard. A trace of that
relationship correlates with how well the participant is
controlling. If control is perfect, the trace is steady, with no
pattern whatsoever. If control is imperfect, there is a
correlation of the trace of the relationship (as viewed by the
audience) with a derivate of the presumed error signal within the
participant, and deviations from perfect control (the outsider's
perception of the relationship that the participant is
controlling) correlate with deviations from perfect
correspondence of the experimenter's signature with the
participant's byproduct rendition of that signature (though noise
from various sources would have to be factored in as well). But
the presence of the pattern matters only to the experimenter (her
controlled perception) and the audience. It is not a factor at
all in the participant.
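A rough numerical illustration of the trace of the relationship
under good and poor control follows. It is a sketch under stated
assumptions, not a model of the actual demo: the participant is
reduced to a single integrating loop with a hypothetical per-step
gain k, the "signature" is stood in for by a two-component
waveform, and noise is left out.

```python
import math

def trace_of_relationship(disturbance, k, reference=0.0):
    """Return the controlled perception p(t), one value per instant.

    k is a hypothetical per-step output gain; larger k means tighter control.
    """
    o = 0.0
    trace = []
    for d_t in disturbance:
        p_t = o + d_t               # knot position relative to the mark
        trace.append(p_t)
        o += k * (reference - p_t)  # assumed integrating output
    return trace

def spread(values):
    """Peak-to-peak excursion of a recorded trace."""
    return max(values) - min(values)

# Stand-in for the experimenter's signature (an assumption).
signature = [math.sin(0.0126 * n) + 0.5 * math.sin(0.0378 * n)
             for n in range(2000)]

good = spread(trace_of_relationship(signature, k=0.5))
poor = spread(trace_of_relationship(signature, k=0.02))
print("spread of signature:", spread(signature))
print("trace spread, good control:", good)
print("trace spread, poor control:", poor)
```

With the larger gain the trace stays nearly flat even though the
disturbance carries the whole pattern. With the smaller gain the
trace wanders, and it is an observer comparing recordings who
sees that.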

Perception of pattern could be controlled at a higher level, but
that perceptual signal could not be fed back down to the lower
level to guide or influence control at the lower level, for reasons
that Bill has discussed at various times. Nor would the
correspondence be as good e.g. in the rubber band experiment,
because there are just too many lower-level perceptions (angle,
pressure, speed, shape, all sorts of relationships) that must be
controlled in mimicking someone's signature.

So the information content of a perception controlled over time
is not a factor in the instant-by-instant control of that
perception, and the information content of o is in close
correspondence with the information content of d over time as a
byproduct of instant-by-instant control of p(t)=o(t)+d(t).

Or, d is outside of the control system, and p is inside it. This
is the meaning of Chuck's summation.

Bruce
bn@bbn.com