The reality of "information"

[Martin Taylor 2009.04.06.14.23]

This comment

[From Bill Powers (2009.04.05.1650 MDT)]

In the Schouten experiment, you imagine (I am guessing) that there is
something gradually accumulating information about something, and that
the longer the information is accumulated the more precisely the
something is “known”, whatever that term is supposed to mean. I
can account for the data by imagining something different going on, but
that doesn’t mean I’m right. What it does mean is that the experiment as
it stands is insufficient to let us choose. Given that we have more than
one explanation, the next step has to be to modify the experiment to
pick one over the other.

and this exchange
[Martin Taylor 2009.04.01.14.57]
If you don’t like to think that
“information” really exists, what do you think about “RMS
error”. Does that really exist?
[From Bill Powers (2009.04.01.1106 MDT)]
No, it doesn’t. It’s a computation that gives a number useful
for
measuring the average variability in a signal. There is nothing in the
environment that corresponds to it, just as there is nothing that
corresponds to an “average.”
may possibly have led me to an insight as to why we have had so much
trouble communicating over the concept of information.
I hope my so-called insight is true, because it has always puzzled me
why Bill and others are quite happy to use measures like RMS variation
and correlation, but are so very antagonistic to using measures related
to information and uncertainty. I’ve long wondered why Bill has kept
writing as though the use of an information measure implied that the
measured perception was happening at some high logical level rather
than being simply an ordinary perception. Why should any perception
subject to an information measure have to be a high-level perception? I
have privately mused as to whether Shannon might have insulted Bill in
some distant past episode, leading to this unwavering opposition to any
use of Shannon’s discoveries. But now I think I see that there is
only a simple misunderstanding about the concept and computation of
“information”.
The first comment quoted above puzzled me, because I could not think
what Bill could mean by “something different going on”, other than that
he disliked measuring information. He has never described anything
“different going on” from what I thought of as “going on”, namely a
rise in a perceptual signal out of a noise floor. So this was a puzzle
until I thought more about his strange wording “something gradually
accumulating information” in the first sentence.
That’s when the insight hit me that he might possibly be thinking of
“information” as some sort of a magic fluid that could be stored and
released. That conception would make sense of something he said in an
earlier message about “phlogiston”, which threw me for a loop at the
time, because I could not see any analogy between “information” and
“phlogiston”.
From my point of view, measuring the informational relationship between
two variables has the same “reality” status as measuring their
correlation or the RMS difference between their values. I asked Bill
the question as to whether he thought “RMS error” really exists,
because if he said “No”, I would then understand why he says
“information” doesn’t really exist, but if he said “Yes” I would then
have had to ask why he thought “information” does not exist. To me, all
three – correlation, RMS error, and information and its companion
measures – are perceptions in the mind of an analyst. Each, in
conjunction with other perceptions, can be used to infer things about
the entities measured, but none of them taken alone tells you more than
that there is a particular kind of relationship between or among the
variables concerned. Like any other perception that is a function of
several variables, the analyst perceives them as having a reality in
the environment.
Now Bill says: "
In the Schouten experiment, you imagine (I am guessing) that there is
something gradually accumulating information about something, and that
the longer the information is accumulated the more precisely the
something is “known”, whatever that term is supposed to mean."
Here is a clue that I should have picked up months or years ago, that
Bill thinks of “information” as some magical fluid that is transported
alongside physical signals. Then he says “I
can account for the data by imagining something different going on,”
which, from previous postings, I take to be the rise of a perceptual
signal above a noise floor. I haven’t commented on this in prior
discussions, other than to point out that Schouten had the same
explanation. I just took it for granted that this is what was
happening. I don’t remember saying so explicitly, though, as I took it
as almost self-evident. I was more concerned to make the point that the
information measure was a measure of how fast that signal was rising
relative to the noise floor. “Relative to the noise floor” is what
makes the informational measure different from a simple measure of the
magnitude of the rising perceptual signal.
Now let’s consider a bit more closely what I think I’m talking about.
First, think of correlation, because that’s a fairly close analogue to
information. You can take a bunch of measures of two variables, xi and
yi, and create a measure called “correlation” = c(xi, yi) or another
called “RMS difference” = r(xi, yi), where c and r are the appropriate
functions over the values of i. You can take that same set of measures
xi and yi and determine various information measures, such as H(x|y) or
H(x,y) (uncertainty of x when you know y, or joint uncertainty of the
distribution of x and y).
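A minimal sketch of how an analyst might compute these parallel measures
from a set of paired samples xi, yi. The histogram binning, the
common-influence example data, and all names are illustrative assumptions,
not anything from the Schouten experiment or the exchanges above:

```python
import numpy as np

def analyst_measures(x, y, bins=16):
    """Correlation, RMS difference, and plug-in uncertainty measures for paired samples."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    corr = np.corrcoef(x, y)[0, 1]               # "correlation" = c(xi, yi)
    rms_diff = np.sqrt(np.mean((x - y) ** 2))    # "RMS difference" = r(xi, yi)

    # Discretize to estimate the joint and marginal distributions.
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    py = pxy.sum(axis=0)

    def H(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    H_joint = H(pxy.ravel())        # H(x,y): joint uncertainty of the distribution
    H_x_given_y = H_joint - H(py)   # H(x|y): uncertainty of x when you know y
    return corr, rms_diff, H_joint, H_x_given_y

# Example: two variables that cannot influence one another but share a common influence.
rng = np.random.default_rng(0)
common = rng.normal(size=5000)
x = common + 0.3 * rng.normal(size=5000)
y = common + 0.3 * rng.normal(size=5000)
print(analyst_measures(x, y))
```

All four numbers are, in the sense discussed next, just algorithms applied
to sets of variables.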
So far, there’s no “reality” difference among the measures. They are
all just algorithms applied to sets of variables. They all tell you
what you can know about one variable if you know the other. None of
them say anything about whether one variable has any causal
relationship with the other. Causality between them may be one-way
(either way), two-way, or no-way. You can’t tell simply from measuring
the variables.
If you know from other sources that the two entities can NOT influence
one another, then you can make inferences. If H(x|(knowing y)) is much
less than H(x|(y unknown)), or if the correlation is high between x and
y, or if the RMS difference between x and y is much less than their
individual standard deviations, then if they cannot influence one
another it is almost certain that they are both subject to a common
influence. Even though they cannot influence each other,
if you know the values of one, you know more about the values of the
other than you otherwise would. There is an informational relationship
between them with no signal transmission between them. By observing
both, you may learn something about how the common influence is
varying, though you may not know what the common influence is.
At the other extreme, if you know that X (whose successive values are
xi) causally influences Y (values yi) and that there is nothing that
independently influences both X and Y, then the correlation, RMS
difference (after scaling), or informational relation between X and Y
lets you know whether there are other influences on Y (e.g., noise in
the channel of causal influence, or independent causal influences). The
measures tell you how important those other influences on Y are,
relative to the causal influence of X. The measures don’t tell you what
the other influences might be, but they guide you in considering how
important it is to look for them. Correlation is a relative measure –
a correlation of 0.5 means the other influences on Y outweigh that of X
by 3:1. Uncertainty and RMS variation (which are interconvertible if
the distributions are Gaussian) are absolute measures, which can be
made relative by comparing them to information and scaled RMS deviation.
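The Gaussian interconvertibility mentioned here can be made concrete with
the textbook relation for jointly Gaussian variables, I = -(1/2) log2(1 - rho^2);
that relation is not stated above and is used here only as an illustrative
assumption. The sketch also shows where the 3:1 figure for a correlation of
0.5 comes from when it is read in variance terms:

```python
import numpy as np

def gaussian_mutual_info_bits(rho):
    """Mutual information between jointly Gaussian X and Y with correlation rho."""
    return -0.5 * np.log2(1.0 - rho ** 2)

def other_to_x_variance_ratio(rho):
    """Ratio of the variance in Y not explained by X to the variance explained by X."""
    return (1.0 - rho ** 2) / rho ** 2

for rho in (0.5, 0.9, 0.99):
    print(rho, round(gaussian_mutual_info_bits(rho), 2), "bits;",
          "other influences : X =", round(other_to_x_variance_ratio(rho), 1), ": 1")
# rho = 0.5 -> about 0.21 bits, and the other influences outweigh X by 3:1.
```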
The details of all this are unimportant. I mention them only to
illustrate the parallels between the measures based on Gaussian
variances and those based on uncertainty measures. In fact, Garner and
McGill (The relation between information and variance analyses.
Psychometrika, 21, 1956, 219ff) showed that everything in an
analysis of variance can be generalized by using information measures,
and that an analysis of information can be more reliable than an
analysis of variance because it does not depend on distributions being
joint Gaussian.

Now I don’t think anyone believes that there is a magic fluid called
“correlation” that flows between an input to a noisy channel such as a
nerve fibre and its output. Why then should anyone think that there is
a magic fluid called “information” that flows along such a connection?
The waveform that emerges from the output is related to the time
pattern of the inputs, and there are many ways of measuring that
relationship, correlation and information being two.

There are also many ways of describing properties of the connection
channel itself. One of them is bandwidth, which tells you how the
channel deals with signals that vary at different rates. Another is
precision, which tells you how variable the output is likely to be for
a given input. Another is channel capacity, which is a kind of amalgam
of the previous two. A communication channel has a higher capacity if
it is more precise or if it can follow quicker variations in the input.
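A standard way to make "capacity as an amalgam of bandwidth and precision"
concrete is the Shannon-Hartley relation C = B log2(1 + S/N); that formula is
not stated in the post and is used here only as an illustrative assumption,
with telephone-like numbers plugged in:

```python
import math

def channel_capacity_bps(bandwidth_hz, signal_to_noise_ratio):
    """Shannon-Hartley: capacity rises with bandwidth (speed) and with S/N (precision)."""
    return bandwidth_hz * math.log2(1.0 + signal_to_noise_ratio)

# A telephone-like channel: ~3000 Hz of bandwidth and ~30 dB signal-to-noise (ratio ~1000).
print(round(channel_capacity_bps(3000, 1000)))   # roughly 30,000 bits per second
```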

When we talk about perception and the perceived environment, we assume
that variations in the environment have a direct causal influence on
the related perception, rather than that something else independently
influences both perception and the environment. We can be sure, since
this is a physical world, that there is some
limit on how precisely a perceptual signal matches the environment of
which it is a function. Where the limitation matters depends on the
circumstances. In this causal link, channel capacity may be the
important limiting quantity, or the limitation might be in bandwidth,
or it might be in ultimate precision achievable after long observation
of something. The momentary value of a perceptual signal is what it is,
but over time, it may fluctuate even when the input is steady. That
fluctuation would be measured as uncertainty in the perceptual signal
if we could ever track the values of a perceptual signal. The
fluctuations that follow fluctuations in the environmental variables of
which the perceptual signal is a function are the transmitted
information. They don’t mimic some magic flow of information; the
information transmitted is their measure.

I assume the misunderstanding about some kind of magic information
fluid flow comes from the phrasing that we tend to use. We always talk
about information transmission instead of signal transmission, because
it is a more general way of dealing with the relationships that are a
consequence of the signals that pass between entities. That should not
be taken as implying that anything is transmitted other than the
signals. Information, uncertainty, entropy, and the like, are
perceptions in the mind of an analyst, measures that may tell much or
little about the measured entities.

I hope this might help reduce the level of misunderstanding about
information as phlogiston, and help reduce miscommunication in future
discussions.

Martin

···


[From Rick Marken (2009.04.07.1430)]

Martin Taylor (2009.04.06.14.23)--

We can be sure, since this is a physical world, that there is
some limit on how precisely a perceptual signal matches
the environment of which it is a function.

The question of "how precisely a perceptual signal matches the
environment of which it is a function" does not really fit my
understanding of the PCT model of perception. The question assumes
that perception is a process of communicating to the behaving system
what is _really_ going on in the environment. Based on my
understanding of PCT, I view a perception as a construction based on
sensory input. The question in PCT isn't how precisely a perceptual
(or, at the lowest level, a sensory) signal represents (or matches) an
environmental variable. The question in PCT is what kind of
construction -- what function of the sensory consequences of the
environment -- is represented by the perceptual signal. When the
perception is one that is being controlled the question becomes "What
is the controlled variable"?

Where the limitation matters
depends on the circumstances. In this causal link, channel capacity may be
the important limiting quantity, or the limitation might be in bandwidth, or
it might be in ultimate precision achievable after long observation of
something. The momentary value of a perceptual signal is what it is, but over
time, it may fluctuate even when the input is steady. That fluctuation would
be measured as uncertainty in the perceptual signal if we could ever track
the values of a perceptual signal.

I think there is certainly some noise added to perceptual signals. The
research I'm doing right now might actually provide a way to get an
estimate of the magnitude of that noise. But noise is not really much
of a problem for control systems; at least not if the signal to noise
ratio is reasonably high, which it seems to be based on my modeling of
control phenomena, such as the low correlation between cursor traces
on different tracking trials with the same disturbance (and output).

The fluctuations that follow fluctuations
in the environmental variables of which the perceptual signal is a function
are the transmitted information. They don't mimic some magic flow of
information; the information transmitted is their measure.

That may be true but it is true only if the model of perception you
are using is correct. I don't think your model is correct and I think
there is evidence that it is not. One piece of evidence comes from my
studies of hierarchical perception and control
(http://www.mindreadings.com/ControlDemo/HP.html). This demonstration
shows that the exact same environmental situation can be perceived and
controlled in any of (at least) three different ways. Which one is the
correct match to the environment? I think these three perceptions are
three different constructions from the same sensory consequences of
the environment and that some of these perceptions take longer to
construct (the sequence) than others (the configuration and
transition), which is why the configuration perception can be
controlled at a faster rate than the sequence.

Best regards

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.04.13.23.01]

I’m trying to follow my own prescription of delaying responses so as to
give everyone time to rethink.

[From Rick Marken (2009.04.07.1430)]
Martin Taylor (2009.04.06.14.23)--

The fluctuations that follow fluctuations
in the environmental variables of which the perceptual signal is a function
are the transmitted information. They don't mimic some magic flow of
information; the information transmitted is their measure.
That may be true but it is true only if the model of perception you
are using is correct.

I think you still don’t understand about information as a measure. It
does not depend in the slightest on any model. All it depends on is
having two or more data sources.

I don't think your model is correct and I think
there is evidence that it is not. One piece of evidence comes from my
studies of hierarchical perception and control
(http://www.mindreadings.com/ControlDemo/HP.html). This demonstration
shows that the exact same environmental situation can be perceived and
controlled in any of (at least) three different ways. Which one is the
correct match to the environment? I think these three perceptions are
three different constructions from the same sensory consequences of
the environment and that some of these perceptions take longer to
construct (the sequence) than others (the configuration and
transition), which is why the configuration perception can be
controlled at a faster rate than the sequence.

What does this evident truth have to do with whether the fluctuations
in those perceptions are related to fluctuations in the environment?
The exact same environmental situation can be perceived and controlled
in a literally infinite number of different ways. In fact it’s the
third infinity, greater than the number of real numbers, which itself
is greater than the number of rational fractions. In any case, whether
the perceptual fluctuations are or are not related to fluctuations in
the environment, you can still measure the information you can get
about one by observing the other. If there’s a causal link, one would
be likely to call it the “information transmitted” from the environment
to the perceptual signal, at any and all levels of perception. Remember
that “information transmitted” does not mean
information actually flows. It’s just a measure that relates how much
you can tell about variation in the perceptual signal given the
environmental variation, or vice-versa. The “transmission” is a
reference to the causal path that is assumed to connect the environment
to the perceptual signal.

What information is transmitted from environmental variation to the
perceptual signal with any of these infinitely many functions depends
partly on the function and partly on the noise. “Partly on the
function” refers to the partial derivative of the perceptual signal
with respect to some environmental variable. If it’s too small, noise
is likely to overwhelm the contribution of that variable to the
signal. Information transmission from the environment can be measured
for each of your three perceptions, and moreover, the information
available from each about any of the others can also be computed. If
one perception is a precursor or an input to another, we can talk about
the information transmitted between them, too. In HPCT, with however
many levels are eventually accepted as realistic,
every environmental situation will be perceived in at least that number
of ways, multiplied by all the ways that the environmental situation is
perceived by the many perceptual functions at any one level. And for
each of them, the information transmitted from the
environmental situation to the perception at that level could, in
principle, be computed.
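One way to see the dependence on the partial derivative and on the noise is
to simulate it. Everything in the sketch below is an illustrative assumption
(a Gaussian environmental variable, additive noise, a simple histogram
estimator); it is not a model of any particular perceptual function:

```python
import numpy as np

def mi_bits(a, b, bins=32):
    """Plug-in estimate of the information (in bits) one variable gives about the other."""
    pab, _, _ = np.histogram2d(a, b, bins=bins)
    pab /= pab.sum()
    pa, pb = pab.sum(axis=1), pab.sum(axis=0)
    nz = pab > 0
    return float(np.sum(pab[nz] * np.log2(pab[nz] / np.outer(pa, pb)[nz])))

rng = np.random.default_rng(1)
env = rng.normal(size=20000)       # an environmental variable
noise = rng.normal(size=20000)     # noise in the perceptual pathway

for gain in (0.1, 1.0, 10.0):      # the "partial derivative" of the perception w.r.t. env
    p = gain * env + noise         # perceptual signal
    print(gain, round(mi_bits(env, p), 2), "bits transmitted (estimate)")
# When the gain is too small, the noise overwhelms the contribution of env
# and the estimated transmitted information is close to zero.
```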

We can talk this way for theoretical purposes, and can measure
information transmission in simulations, but in the real world, one can
only find lower bounds, because we cannot directly measure fluctuations
in any perceptual signal, and must rely on surrogate measures such as
observations of the outputs to the environment when people are
controlling some related perception. Lower bounds found this way might
well be quite conservative, but one can’t tell without some kind of
comparison test. That’s where experiments like the one you reference
are useful, because parts of the control systems that generate the
observable output appear to be much the same for all the different
perceptions that are controlled. That assumption (which you make)
allows us reasonably to allot the differences in control rates to the
information capacities of the different perceptual pathways, each
depending on its lower-level predecessor. Furthermore, in your case, it
seems likely that the differences in information capacity are due to
differences in bandwidth, though again I should repeat the caveat that
such measures are only lower bounds. Slower control could, in
principle, be caused also by increased transport lag, with no change in
information capacity.

Martin

···


[From Rick Marken (2009.04.15.2200)]

Martin Taylor (2009.04.13.23.01) --

I think you still don't understand about information as a measure. It does
not depend in the slightest on any model. All it depends on is having two or
more data sources.

I think the idea that perception is based on transmitting data from
one place (source) to another (receiver) is a model. My model of
perception has nothing to to with transmitting information; it's about
constructing representations.

The exact
same environmental situation can be perceived and controlled in a literally
infinite number of different ways.

That seems to rule out the idea that perception is a process of
communicating to the mind what is actually out there in the
environment. If the same environmental situation can be perceived in
an infinite number of ways, then there is no information to be
transmitted about it. Information theory assumes that there is a
message to be transmitted and received. The message might be a binary
sequence like 1011010. There are 7 bits of information to be
transmitted in this message. If the received message is 1011010 then
we can say that 7 bits of information have been transmitted. If there
is noise in the transmission channel then the message received might
be 10x101x; only 5 bits were transmitted successfully. In this
situation, measuring the amount of information carried by a
transmission channel makes sense and it might even have practical
value; it can tell us how many times a message should be repeated over
a channel so that we can be sure it was received successfully.
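That repetition point can be sketched numerically. The bit-error probability,
the majority-vote decoding, and the target success probability below are
arbitrary illustrative choices; only the 7-bit message comes from the example
above:

```python
import math

def repeats_needed(message_bits, bit_error_prob, target_success):
    """Smallest odd number of repeats (majority vote per bit) that gets the whole
    message through correctly with at least the target probability."""
    n = 1
    while True:
        k = n // 2 + 1   # a bit is decoded wrongly if at least k of its n copies are flipped
        p_bit_wrong = sum(math.comb(n, j) * bit_error_prob**j * (1 - bit_error_prob)**(n - j)
                          for j in range(k, n + 1))
        if (1 - p_bit_wrong) ** message_bits >= target_success:
            return n
        n += 2           # keep n odd so the majority vote is never tied

print(repeats_needed(message_bits=7, bit_error_prob=0.1, target_success=0.99))
```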

This is not the way I think perception works. Perception is not a
channel that brings a message about the "true" state of the
environment into the brain. The "true" state of the environment could
be represented as a binary "message" like 1011010. But I don't think
of this as a real message; it is just the state of a set of physical
variables. If what is perceived is, say, some linear combination of a
subset of the elements of this "message", then it makes no sense (it
seems to me) to ask how much information about the state of the
environment is communicated by the perceptual signal. It just
doesn't seem like a relevant question.

In any case, whether the perceptual fluctuations are or
are not related to fluctuations in the environment, you can still measure
the information you can get about one by observing the other.

You might be able to measure it but would it make any sense? Suppose
the environment is a 7 bit sequence, 1011010 being one possible state
of that environment. Also suppose that a perception of that
environment is p = a1b1+a2b2+...a7b7, where the a's are constants and
the b's are the bits in the environmental "message". I presume you
could measure whether the fluctuations in p are related to variations
in the 7 bit environmental message. But I don't think that would tell
you much about what is being perceived or how perception works. It
would be misleading, actually, because it would give the impression
that perception is about carrying information about the state of the
environment to the brain when that's not what's happening.
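To make the "could you measure it" question concrete, here is a sketch that
takes the 7-bit environment and the weighted-sum perception literally; the
equal coefficients are an assumption (borrowed from the a1 = a2 = 1 example
that appears later in this thread):

```python
import itertools, math
from collections import Counter

a = [1] * 7   # assumed coefficients a1..a7 of the perceptual function

states = list(itertools.product([0, 1], repeat=7))            # all 128 equally likely states
p_of = {s: sum(ai * bi for ai, bi in zip(a, s)) for s in states}

groups = Counter(p_of.values())                               # states that share one p value
H_env = math.log2(len(states))                                # 7 bits of environmental uncertainty
H_env_given_p = sum(n / len(states) * math.log2(n) for n in groups.values())

print(round(H_env - H_env_given_p, 2), "of the 7 bits are recoverable from p")
# With equal coefficients p only counts how many b's are 1, so far fewer than
# 7 bits get through.
```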

If there's a
causal link, one would be likely to call it the "information transmitted"
from the environment to the perceptual signal, at any and all levels of
perception. Remember that "information transmitted" does not mean
information actually flows. It's just a measure that relates how much you
can tell about variation in the perceptual signal given the environmental
variation, or vice-versa. The "transmission" is a reference to the causal
path that is assumed to connect the environment to the perceptual signal.

Yes, and it makes it seem that perception is a particular kind of
causal path: a communication channel. I think perception is more
properly modeled as functions that map aspects of the environment into
perceptual signals that vary with variations in the degree to which
these aspects can be constructed given the state of the environment.

What information is transmitted from environmental variation to the
perceptual signal with any of these infinitely many functions depends partly
on the function and partly on the noise.

OK, I think I can agree that if you _know_ the perceptual function
then you can measure the extent to which variations in that signal are
due to noise rather than environmental variations. I think I can do
that pretty well without using any information theory at all; I do
have to use control theory though.

I just don't see any value in information theory for the study of
purposeful (control) behavior. It may be of value in communication
engineering but I don't see how it could help me in my work on living
control systems. But if you like it, go for it.

Best regards

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com

[From Bill Powers (2009.04.16.0819 MDT)]

[Rick Marken (2009.04.15.2200)]

Martin Taylor:

The exact same environmental situation can be perceived and controlled in
a literally infinite number of different ways.

RM:

That seems to rule out the idea that perception is a process of
communicating to the mind what is actually out there in the environment.
If the same environmental situation can be perceived in an infinite number
of ways, then there is no information to be transmitted about it.
Information theory assumes that there is a message to be transmitted and
received. The message might be a binary sequence like 1011010. There are 7
bits of information to be transmitted in this message. If the received
message is 1011010 then we can say that 7 bits of information have been
transmitted. If there is noise in the transmission channel then the message
received might be 10x101x; only 5 bits were transmitted successfully. In
this situation, measuring the amount of information carried by a
transmission channel makes sense and it might even have practical value;
it can tell us how many times a message should be repeated over a channel
so that we can be sure it was received successfully.

BP:
I think you’re getting close to something here. Electrical engineers, or
most people (like me) when they’re being engineers, are naive realists.
We assume that the soldering iron is really there, that the circuit
components are really what they appear to be, and so on. And the
communications engineer assumes that the dots and dashes the telegrapher
is sending are really in the sequence that appears to be happening.
Shannon’s job at Bell Labs was to figure out how faithfully, and how
fast, that sequence could be transmitted via some particular channel to
its destination. Fidelity is determined by comparing the message that was
sent against the message that was received. To define information
transfer, or determine channel capacity, you have to know both. If you
receive a message that says “Mary had a libble limb”, for all
you know that is exactly the message that was transmitted, and the
channel capacity was not exceeded. But if the original message was
“Mary hab a labble lamb,” the message was not transmitted
faithfully, regardless of what you expected the original to be. To know
what the channel capacity is you have to have a way of knowing what is
really Out There – what message was really sent.
Determining channel capacity, as you say, is a pretty simple proposition
– no metaphysics needed. But information theory introduces metaphysics,
and that is where IT and I part company. The concept that information is
a reduction in uncertainty comes from confusing an equation used to
describe a phenomenon with the phenomenon itself. I saw this happen in
cybernetics, with Ashby’s “Law of Requisite Variety.” The whole
concept of uncertainty in physics or in casinos is metaphysics. The fact
that we sometimes say we are “uncertain” about something has no
meaning outside our private experiences. It doesn’t mean that there is
something in nature called uncertainty and we are sensing it. And
reducing uncertainty can be accomplished by many means, including getting
a good night’s sleep or regaining one’s self-confidence (justifiably or
not).
Here is a quote from a Wiki article:

http://en.wikipedia.org/wiki/Information_entropy

···

=============================================================================
Shannon’s entropy represents an absolute limit on the best possible
lossless compression of any communication, under certain constraints:
treating messages to be encoded as a sequence of independent and
identically-distributed random variables, Shannon’s source coding theorem
shows that, in the limit, the average length of the shortest possible
representation to encode the messages in a given alphabet is their
entropy divided by the logarithm of the number of symbols in the target
alphabet.
A fair coin has an entropy of one bit. However, if the coin is not fair,
then the uncertainty is lower (if asked to bet on the next outcome, we
would bet preferentially on the most frequent result), and thus the
Shannon entropy is lower. Mathematically, a coin flip is an example of a
Bernoulli trial, and its entropy is given by the binary entropy function.
A long string of repeating characters has an entropy rate of 0, since
every character is predictable. The entropy rate of English text is
between 1.0 and 1.5 bits per letter,[1] or as low as 0.6 to 1.3 bits per
letter, according to estimates by Shannon based on human experiments.[2]

BP:

My immediate reaction to the first sentence is to start looking for
exceptions to this wild generalization. What do you mean, the “best
possible lossless compression of any communication?” Who says you
have exhausted all the possibilities ever known or that will ever be
known? You can do this only by defining some small universe with only a
few possibilities so you can be sure nothing has been left out – and
this is exactly what information theory does.

That is why Shannon has to say “in a given alphabet”. As soon
as he said that, I knew two things: (1) information theory is not about
the real world, and (2) neither Shannon nor anyone else had any idea of
the size of the alphabet needed to encode all possible messages.

Channel capacity is a physical property of the transmission channel
itself – it does not change when you change alphabets. For example what
is the message you get if you call someone on the telephone and there is
no answer? It doesn’t matter what alphabet you expect the answer to be
written or spoken in: no answer gives you the information that nobody is
answering that telephone. You don’t know why, but there are endless
possibilities, including a mass murder, a fire, or a fickle friend.
Considering all the things that might account for the lack of an answer,
it is clearly impossible to find any finite alphabet in which every
answer could be encoded. So information in the sense of knowledge about
the world is not the same thing as Shannon information. Channel capacity
does not tell you how much information the world has to give us, or how
fast it is generating that information.

An interesting thing happened on the way to the internet. Here’s another
reference, clearly somewhat dated:

[
http://www.skepticfiles.org/cowtext/comput~1/9600info.htm

](http://www.skepticfiles.org/cowtext/comput~1/9600info.htm)And some quotes from it:

=================================================================================


The roughly 3000-Hz available in the telephone bandwidth poses few problems
for 300 bps modems, which only use about one fifth of the bandwidth. A full
duplex 1200 bps modem requires about half the available bandwidth,
transmitting simultaneously in both directions at 600 baud and using phase
modulation to signal two data bits per baud. "Baud rate" is actually a
measure of signals per second. Because each signal can represent more than
one bit, the baud rate and bps rate of a modem are not necessarily the same.
In the case of 1200 bps modems, their baud rate is actually 600 (signals per
second) and each signal represents two data bits. By multiplying signals per
second with the number of bits represented by each signal one determines the
bps rate: 600 signals per second X 2 bits per signal = 1200 bps.

In moving up to 2400 bps, modem designers decided not to use more bandwidth,
but to increase speed through a new signalling scheme known as quadrature
amplitude modulation (QAM). In QAM, each signal represents four data bits.
Both 1200 bps and 2400 bps modems use the same 600 baud rate, but each 1200
bps signal carries two data bits, while each 2400 bps signal carries four
data bits: 600 signals per second X 4 bits per signal = 2400 bps.

ECHO-CANCELLATION

This method solves the problem of overlapping transmit and receive channels.
Each modem's receiver must try to filter out the echo of its own transmitter
and concentrate on the other modem's transmit signal. This presents a
tremendous computational problem that significantly increases the complexity
-- and cost -- of the modem. But it offers what other schemes don't:
simultaneous two-way transmission of data at 9600 bps.

The CCITT "V.32" recommendation for 9600 bps modems includes
echo-cancellation. The transmit and receive bands overlap almost completely,
each occupying 90 percent of the available bandwidth. Measured by
computations per second and bits of resolution, a V.32 modem is roughly 64
times more complex than a 2400 bps modem. This translates directly into
added development and production costs, which means that it will be some
time before V.32 modems can compete in the high-volume modem market.


=================================================================
BP:

… and now we have dial-up modems that run at 56000 bits per second by
compressing the message before transmission and decompressing the
received message. Net-Zero and Juno, I read, can compress text (in the
server) to 4% of its original size and achieve another factor of 25.
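The arithmetic behind the quoted figures, restated as a tiny sketch (the baud
and bits-per-signal values come from the quote; treating the 4% compression
figure as exact is a simplification):

```python
def bps(baud, bits_per_signal):
    return baud * bits_per_signal

print(bps(600, 2))     # 1200 bps modem: 600 signals per second, 2 bits per signal
print(bps(600, 4))     # 2400 bps modem: same baud, QAM packs 4 bits per signal

# Compressing text to 4% of its size before sending it over a 56 kbit/s line
# delivers roughly 25 times as many bits of original text per second.
print(56_000 / 0.04)   # about 1.4 million bits of original text per second
```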

A last quote from http://en.wikipedia.org/wiki/Modem:

================================================================================

List of dialup speeds

Note that the values given are maximum values, and actual values may be
slower under certain conditions (for example, noisy phone lines).[4] For a
complete list see the companion article List of device bandwidths.

Connection                                               Bitrate
Modem 110 baud                                           0.1 kbit/s
Modem 300 (300 baud) (Bell 103 or V.21)                  0.3 kbit/s
Modem 1200 (600 baud) (Bell 212A or V.22)                1.2 kbit/s
Modem 2400 (600 baud) (V.22bis)                          2.4 kbit/s
Modem 2400 (1200 baud) (V.26bis)                         2.4 kbit/s
Modem 4800 (1600 baud) (V.27ter)                         4.8 kbit/s
Modem 9600 (2400 baud) (V.32)                            9.6 kbit/s
Modem 14.4 (2400 baud) (V.32bis)                         14.4 kbit/s
Modem 28.8 (3200 baud) (V.34)                            28.8 kbit/s
Modem 33.6 (3429 baud) (V.34)                            33.6 kbit/s
Modem 56k (8000/3429 baud) (V.90)                        56.0/33.6 kbit/s
Modem 56k (8000/8000 baud) (V.92)                        56.0/48.0 kbit/s
Bonding Modem (two 56k modems) (V.92)                    112.0/96.0 kbit/s [5]
Hardware compression (variable) (V.90/V.42bis)           56.0-220.0 kbit/s
Hardware compression (variable) (V.92/V.44)              56.0-320.0 kbit/s
Server-side web compression (variable) (Netscape ISP)    100.0-1000.0 kbit/s

BP:
Entropy is not easy to define. A good discussion is in

http://www4.ncsu.edu/unity/lockers/users/f/felder/public/kenny/papers/entropy.html
Here is a quote:

If I were able to measure the complete, microscopic state of the air
molecules then I would know all the information there is to know about
the macroscopic state. For example, if I knew the position of every
molecule in the room I could calculate the average density in any
macroscopic region. The reverse is not true, however. If I know the
average density of the air in each cubic centimeter that tells me only
how many molecules are in each of these regions, but it tells me nothing
about where exactly the individual molecules within each such region are.
Thus for any particular macrostate there are many possible corresponding
microstates. Roughly speaking, entropy is defined for any particular
macrostate as the number of corresponding microstates.
To recap: The microstate of a system consists of a complete description
of the state of every constituent of the system. In the case of the air
this means the position and velocity of all the molecules. (Going further
to the level of atoms or particles wouldn’t change the arguments here in
any important way.) The macrostate of a system consists of a description
of a few, macroscopically measurable quantities such as average density
and temperature. For any macrostate of the system there are in general
many different possible microstates. Roughly speaking, the entropy of a
system in a particular macrostate is defined to be the number of possible
microstates that the system might be in. (In the appendix I’ll discuss
how to make this definition more explicit.)

BP:

Since the number of possible microstates of the outside world is rather
large, about all that we can conclude is that the entropy of any
macrostate is infinite.
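A toy, finite version of the quoted microstate/macrostate picture may help.
The two-halves-of-a-room setup and the molecule counts below are illustrative
assumptions; the point is only how fast the microstate count grows:

```python
import math

def macrostate_entropy_bits(n_molecules, n_left):
    """Entropy (bits) of the macrostate 'n_left of n_molecules are in the left half',
    i.e. log2 of the number of microstates compatible with it, via lgamma to avoid overflow."""
    return (math.lgamma(n_molecules + 1) - math.lgamma(n_left + 1)
            - math.lgamma(n_molecules - n_left + 1)) / math.log(2)

for n in (10, 100, 10_000, 10**23):
    print(n, round(macrostate_entropy_bits(n, n // 2)), "bits")
# The entropy grows roughly in proportion to N, so for ~10^23 molecules it is
# astronomically large -- effectively unbounded, as noted above.
```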

RM:

This is not the way I think perception works. Perception is not a channel
that brings a message about the “true” state of the environment into the
brain. The “true” state of the environment could be represented as a binary
“message” like 1011010. But I don’t think of this as a real message; it is
just the state of a set of physical variables. If what is perceived is, say,
some linear combination of a subset of the elements of this “message”, then
it makes no sense (it seems to me) to ask how much information about the
state of the environment is communicated by the perceptual signal. It just
doesn’t seem like a relevant question.

BP:

This is an important observation. If the taste of lemonade consists of
temperature, tartness, sweetness, and other sensations, the perception of
lemonade is not about some corresponding entity in the outside world.
There is no “message” about lemonade coming into the brain.
Instead, the perception is to the individual components as density is to
the positions of individual molecules. The world consists of microstates;
perceptions are the macrostates. One level of perception consists of the
microstates of the next level up, which relatively speaking consists of
the macrostates. Obviously we can’t go from the macrostates to the
microstates, although by considering many different macrostates derived
from the same microstates outside, we can begin to build a fuzzy picture
of the microstates. And the control process, plus reorganization, allows
us to manipulate microstates in such a way as to give us control of the
macrostates without even knowing what the microstates are.

Well, that is metaphysics, too: it’s one level talking about other
levels. I think the most important point you make here is that we can’t
consider perceptual signals as “messages” passed to higher
levels. The higher levels take whatever inputs they want from the
existing lower levels; they create a new set of perceptions from some set
of lower-perceptions, with the “taking” being done by the
receiving entity, not by some transmitting entity. The lower levels do
not decide what they want to say to the higher levels. Yet the higher
levels can tell the lower ones what they want to receive from
them.

Best,

Bill P.

[Martin Taylor 2009.04.17.10.58]

[From Rick Marken (2009.04.15.2200)]

I’m adding [MT] and [RM] on behalf of David Goldstein. I may have
missed some, but at least I tried :-)

Martin Taylor (2009.04.13.23.01) --

[MT] I think you still don't understand about information as a measure. It does
not depend in the slightest on any model. All it depends on is having two or
more data sources.
[RM] I think the idea that perception is based on transmitting data from
one place (source) to another (receiver) is a model. My model of
perception has nothing to to with transmitting information; it's about
constructing representations.

So is mine. But “constructing representations” (in the form of scalar
variable values) can’t happen unless information is transmitted (in the
sense of “transmitted” that I have described before) between the
environment and the perception. So your comment is self-contradictory.


[MT] The exact
same environmental situation can be perceived and controlled in a literally
infinite number of different ways.
[RM] That seems to rule out the idea that perception is a process of
communicating to the mind what is actually out there in the
environment. If the same environmental situation can be perceived in
an infinite number of ways, then there is no information to be
transmitted about it.

That’s a mind-boggling comment. I have no idea where such a notion
might come from, or what part of my statement causes you a problem.
It’s a simple fact that the same environmental situation can be
perceived and controlled in a literally infinite number of ways.
It appears to be a fact that perception relates in some way to “what is
out there”, if only because we seem able to control at least some of
our perceptions some of the time.
So what is your problem? Do you not think that all those functions
equally could represent something about the environment, or is the
problem just that you don’t understand what is meant by transmitted
information?
If the nth of these infinitely many possible perceptual functions is Pn
= pn(environment), then an analyst could measure the “information
transmitted” from the environment to Pn by looking at how variations in
Pn relate to variations in the environment. The analyst could do the
same for Pi, Pk, … and all the others at the same time (if he had
infinite analytic capacity :-)). Why would you say “there is no
information to be transmitted about it”?

[RM] Information theory assumes that there is a
message to be transmitted and received. ...

That is a major misunderstanding about information theory. You are
talking about the original application environment that interested
Shannon. But even he in his original writing on “The Theory of
Communication” was careful not to restrict it in that way. “Information
theory” is not a very good name, really. I think “Uncertainty Analysis”
would be better, but “Information Theory” has been around for a long
time, so we have to stick with it despite its tendency to mislead.

Uncertainty is the basic measure, and information the differential
measure that refers to how the uncertainty about X changes as a
consequence of knowing something about Y. Y could be a message from X,
but there’s no reason it has to be. Sometimes we might call Y an
“observation”, if the situation makes that a sensible name. Knowing the
value of Y (which could be a perceptual signal) may tell something
about the value of X (which may be a function of some environmental
variables) beyond what could have been known without considering the
value of Y. If so, Y can provide information about X, and vice-versa.
But only if X is causally related to Y would we say that X transmits
information to Y, and then we would NOT say Y transmits information to
X.
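In the standard notation this is I(X;Y) = H(X) - H(X|Y): the information Y
provides about X is the drop in uncertainty about X once Y is known. A
minimal numeric sketch, with an arbitrary made-up joint distribution:

```python
import numpy as np

def H(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# An arbitrary joint distribution over X (rows) and Y (columns), purely for illustration.
pxy = np.array([[0.30, 0.10],
                [0.05, 0.55]])

px, py = pxy.sum(axis=1), pxy.sum(axis=0)
H_x = H(px)                                # uncertainty about X with Y unknown
H_x_given_y = H(pxy.ravel()) - H(py)       # H(X|Y) = H(X,Y) - H(Y)

print("H(X)   =", round(H_x, 3), "bits")
print("H(X|Y) =", round(H_x_given_y, 3), "bits")
print("I(X;Y) =", round(H_x - H_x_given_y, 3), "bits")
```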

[RM] This is not the way I think perception works. Perception is not a
channel that brings a message about the "true" state of the
environment into the brain. The "true" state of the environment could
be represented as a binary "message" like 1011010. But I don't think
of this as a real message; it is just the state of a set of physical
variables. If what is perceived is, say, some linear combination of a
subset of the elements of this "message",

As well it might be … So far, so good, though using the label
“message” seems calculated to lead you astray if you aren’t careful. As
it seems to have done.

[RM continued] then it makes no sense (it
seems to me) to ask how much information about the state of the
environment is communicated by the perceptual signal. It just
doesn't seem like a relevant question.

There’s the nub. You don’t think it is a relevant question.

I think it’s highly relevant to the quality of control. If the
perceptual signal does not fluctuate in some reasonably consistent way
when the environmental variable influenced by the control output
changes, control will not be very good.

At one extreme, fluctuations in the perceptual signal might be
unrelated to changes in the real environment. (Remember that the real
environment is unknowable other than by way of perceptual signals, but
it is assumed to exist and to be that on which the output necessarily
acts). Under those conditions, the perceptual signal will be
uncontrollable.

At the other extreme, the perceptual signal mirrors exactly any changes
in some function of variables in the real environment, and, assuming a
similar consistency in the effect of the output on those environmental
variables, the ability to control against any specified disturbance or
to any specified reference waveform is limited by properties of the
feedback loop such as gain and transport lag. Between these two
extremes, the latter of which is physically unattainable, lies a range
in which there is a balance between (a) constraints based on the speed
and precision of the relation between the perceptual signal and the
real environment, and (b) constraints based on the properties of the
control loop. As the precision and speed of perception increase, the
properties of the rest of the loop come to dominate the constraints; as
the speed and precision (channel capacity) of the perceptual functions
decrease, these come to dominate the accuracy of control.

Let’s ask about a hypothetical experiment, which shouldn’t be too hard
to write and test. It’s a simple pursuit tracking based on Bill’s
“TrackAnalyze” (I might even be able to do it myself – I think I’m
getting a better understanding of Delphi programming – but not until
after I get back from a NATO meeting in the first week of May for which
I spending most of my time preparing, and perhaps not until after I
manage to complete a different tracking task I am trying to program).
The Environment is the same as in TrackAnalyze: a target moves smoothly
up and down, and the subject influences a cursor to move up and down
along a vertical track close to the track of the target. The subject
is asked to control a perception of target and cursor being at the same
level. What differs from the version of TrackAnalyze in LCS III is the
display resolution.

This figure represents the normal TrackAnalyze experiment as well as
the suggested revision. It deliberately omits description of whatever
control loops are involved in the subject’s operations, and subsumes
all the perceptual levels between the display and the controlled
perception into a single small arrow, one marked red for the target,
the other in black for the cursor manipulated by the subject. The red
pathway is the one of interest. I specifically include the “Display”
function in the diagram, since the experiment uses it to emulate
variations in the channel capacity of the perceptual pathways up to the
perceptual function that creates the controlled perception. The
objective is to determine how variations in the channel capacity of the
path Target->Controlled perception affect the parameters of the
modelled control loop, and the accuracy of control.

[Figure: Resolution_expt.jpg]

The “Real Target” and “Real Cursor” are exactly what they are in the
standard “TrackAnalyze” – objects whose vertical location in the
display space varies smoothly (within the limits of display
resolution). What is different between the suggested experiment and the
original is the role of the “Display” function.

I suggest two variants, both of which use the channel capacity of the
“Display” function as an experimental variable (IV in Rick’s
terminology). In both variants the “Real Target” and “Real Cursor” are
influenced smoothly in the usual way by the disturbance program and the
Mouse respectively. The display resolution is varied in space and/or
time. At one extreme is the standard TrackAnalyze, and at the other
extreme the displayed cursor does not move at all when the mouse moves.

1 (spatial information constraint) Between these extremes, the spatial
resolution of the display is varied. The displayed cursor is not a
pointed horizontal bar, but a rectangle that fills some vertical space
on the screen. At 1 bit, the cursor is represented by a rectangular
block that extends from the mid-line to either the top or the bottom of
the display, depending on whether the Real Cursor is at that moment
above or below the mid-line. At 2 bits, the subject can see in which of
four regions of the display the cursor is – the rectangle fills the
vertical extent of whichever region contains the Real Cursor (the
regions should really be chosen so that the target spends an equal
amount of time in each of the four, but this probably doesn’t matter,
numerically). At 3 bits, the subject can see in which of 8 regions the
Real Cursor is, and so forth.

2 (temporal information constraint). Between these extremes, the
temporal resolution of the display is varied. The displayed cursor is
the normal TrackAnalyze horizontal bar. It moves stepwise, the steps
being separated by regular time intervals. At each interval, the
subject could gain a certain amount of information, the same at each
step until the steps became close enough that the successive locations
showed some correlation. This type of constraint actually provides a
rough measure of the rate of the temporal degrees of freedom for
control in this situation, since at some point, the data will be
indistinguishable from the data using the standard (fully displayed)
TrackAnalyze.

Using both techniques, the actual information rate of the perceptual
path from the standard TrackAnalyze cursor to the controlled perception
could be measured reasonably accurately.
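A sketch of how the two display manipulations might be coded; the function
names, the position range, and the sine-wave cursor track are assumptions for
illustration, not anything in TrackAnalyze itself:

```python
import numpy as np

def spatial_quantize(real_cursor, bits, lo=-1.0, hi=1.0):
    """Variant 1: show only which of 2**bits equal regions contains the Real Cursor;
    the displayed position is the centre of that region."""
    n = 2 ** bits
    idx = np.clip(((real_cursor - lo) / (hi - lo) * n).astype(int), 0, n - 1)
    return lo + (idx + 0.5) * (hi - lo) / n

def temporal_quantize(real_cursor, step_samples):
    """Variant 2: update the displayed cursor only every step_samples samples
    (sample-and-hold), leaving it fixed in between."""
    displayed = np.array(real_cursor, dtype=float)
    for i in range(0, len(displayed), step_samples):
        displayed[i:i + step_samples] = displayed[i]
    return displayed

# Example: a smooth Real Cursor track degraded by each manipulation.
t = np.linspace(0, 10, 1000)
real = np.sin(2 * np.pi * 0.3 * t)
coarse_space = spatial_quantize(real, bits=2)           # 4 visible regions
coarse_time = temporal_quantize(real, step_samples=50)  # one update per 50 samples
print(np.sqrt(np.mean((real - coarse_space) ** 2)),
      np.sqrt(np.mean((real - coarse_time) ** 2)))      # RMS display error of each variant
```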

By varying the channel capacity of the “Display” function, the
experiment emulates the effect of different channel capacities of the
perceptual pathway that in previous messages we have called D->S2.
By finding the values for spatial and temporal resolution beyond which
improving the resolution makes no difference, we get a measure of the
informational limits of the perceptual pathway beyond the display from
sensors to the controlled perception.

A third variant is possible, in which the Display adds noise to the
location of the Real Target, by making the displayed target appear
above or below the correct location of the Real Target. Since the
subject can only control the perception, and can act only on the Real
Cursor, this added noise, which reduces the channel capacity of the
Target->Controlled Perception pathway, should have the effect of
reducing the subject’s ability to control the perception AS MEASURED BY
the Analysis Program.

[MT] In any case, whether the perceptual fluctuations are or
are not related to fluctuations in the environment, you can still measure
the information you can get about one by observing the other.
[RM] You might be able to measure it but would it make any sense?

Depends on the circumstances. Sometimes yes, sometimes no. Crudely put,
it would make sense under much the same circumstances in which
measuring correlations or RMS deviations would make sense. On the other
hand, if you can get a significant amount of information about
one thing from observing another that seems totally unrelated, you can
be fairly sure that either there is some communication between them or
they are both influenced by some common variable that might be worth
seeking out. The same is true, of course, for correlation.

[RM] Suppose
the environment is a 7 bit sequence, 1011010 being one possible state
of that environment. Also suppose that a perception of that
environment is p = a1b1+a2b2+...a7b7, where the a's are constants and
the b's are the bits in the environmental "message". I presume you
could measure whether the fluctuations in p are related to variations
in the 7 bit environmental message. But I don't think that would tell
you much about what is being perceived or how perception works.

Quite true. A version of “The Test” is needed in order to estimate
roughly what is being perceived, out of the infinity of possibilities
of what might be being perceived. You would need to model it using
different values of the “a” parameters. Of course, even if you did
that, you wouldn’t really know that what was being perceived was
actually a linear combination of the “b” variables, which is the
assumption you make above. Even using The Test with meticulous accuracy
and even assuming that the true perceptual function really is a linear
sum, I think you would be hard put to it to find accurate values for
the relative magnitudes of the “a” parameters. So you really don’t have
any way of telling what is being perceived using ordinary PCT
techniques, whether or not you include information-theoretic tools in
your toolkit.

[RM continued] It
would be misleading, actually, because it would give the impression
that perception is about carrying information about the state of the
environment to the brain when that's not what's happening.

Is it truly not what’s happening? I have your authoritative word for
that?

What experimental data do, or could even in principle, allow you to
make that assertion? If you have a thought-experiment or a real one
that supports this claim, I’d be delighted to know of it. I propose a
thought-experiment above, which I expect would show that perception is
carrying information about the state of the environment to the brain,
and which would moreover allow an estimate of how much information that
is in respect of a specific perceptual control function (even though
the perceptual input function might not be well specified); but you
apparently believe that manipulating the channel capacity of the
Display function in the channel will have no effect on the ability of
the subject to control in this pursuit tracking task. Or do I
misunderstand the implications of “that’s not happening”?

[MT] If there's a
causal link, one would be likely to call it the "information transmitted"
from the environment to the perceptual signal, at any and all levels of
perception. Remember that "information transmitted" does not mean
information actually flows. It's just a measure that relates how much you
can tell about variation in the perceptual signal given the environmental
variation, or vice-versa. The "transmission" is a reference to the causal
path that is assumed to connect the environment to the perceptual signal.
[RM] Yes, and it makes it seem that perception is a particular kind of
causal path: a communication channel. I think perception is more
properly modeled as functions that map aspects of the environment into
perceptual signals that vary with variations in the degree to which
these aspects can be constructed given the state of the environment.

Why “more properly”? You make it sound as though there were some kind
of contradiction between the two ways of labelling the same thing,
whereas your second sentence is actually just a description of a
particular kind of communication channel.

[MT] What information is transmitted from environmental variation to the
perceptual signal with any of these infinitely many functions depends partly
on the function and partly on the noise.

[RM] OK, I think I can agree that if you _know_ the perceptual function
then you can measure the extent to which variations in that signal are
due to noise rather than environmental variations. I think I can do
that pretty well without using any information theory at all; I do
have to use control theory though.
I just don't see any value in information theory for the study of
purposeful (control) behavior. It may be of value in communication
engineering but I don't see how it could help me in my work on living
control systems. But if you like it, go for it.

Thank you for your permission.

It would be more welcome if I felt that it came from some understanding
of the issues.

Martin

[From Rick Marken (2009.04.18.1020)]

Martin Taylor (2009.04.17.10.58)

Rick Marken (2009.04.15.2200)

[MT] The exact
same environmental situation can be perceived and controlled in a literally
infinite number of different ways.

[RM] That seems to rule out the idea that perception is a process of
communicating to the mind what is actually out there in the
environment.

[MT] That's a mind-boggling comment. I have no idea where such a notion might
come from, or what part of my statement causes you a problem.

[RM] Let me try to un-boggle your mind. Suppose that p = a1x1 + a2x2
where x1 and x2 are physical variables and a1 and a2 are coefficients
of the perceptual function. Let's say a1 = a2 = 1. Then p is
accurately communicating about the world when p = x1+x2. But there is
no one state of the world that corresponds to a particular value of p.
p, for example, will equal 2 if x1=1 and x2 =1 or if x1=.5 and x2
=1.5, etc. So there is no single environmental event that maps to a
perceptual event. In order to be able to measure the information in p
about the environmental situation represented by p you have to know
the perceptual function in some detail; that is you have to know, in
this example, that p is a function of only two environmental
variables, x1 and x2, that it is a linear sum of the two variables and
that the coefficients of this function are a1=1 and a2=1. In that case
you could measure the information about the environment transmitted by
p; but to get there you would have had to do what is basically a test
for the controlled variable (the perceptual variable, p). If you will
agree that testing for controlling perceptual variables is
propaedeutic to information measurement then I think we can get back
onto something like the same page, although I still think the very
idea that perception is "informative" is misleading.

[MT] If the nth of these infinitely many possible perceptual functions is Pn =
pn(environment), then an analyst could measure the "information transmitted"
from the environment to Pn by looking at how variations in Pn relate to
variations in the environment.

[RM] Yes, he could if, as I note above, the analyst knows the
perceptual function pn(environment). In my example pn = a1x1 + a2x2;
so to determine the information transmitted by pn about a1x1 + a2x2
the analyst would have to know that pn is a function of only two
environmental variables, x1 and x2, that pn is a linear sum of these
two variables, and that the coefficients of this sum are 1.

[RM] Information theory assumes that there is a
message to be transmitted and received. ...

[MT] That is a major misunderstanding about information theory.

[RM] OK, I can accept this. But if this is the case then, to be
useful, the information analyst has to have a pretty darn good idea of
the perceptual functions involved before any information transmitted
via those functions can be measured. The best way to find that out is
by testing for controlled variables. I am not aware of any such
testing having been done by information theorists.

[MT] I think it's highly relevant to the quality of control. If the perceptual
signal does not fluctuate in some reasonably consistent way when the
environmental variable influenced by the control output changes, control
will not be very good.

[RM] Certainly noise reduces the quality of control somewhat. But, in
fact, noise has far less effect than you might imagine. The negative
feedback control process turns out to be a very effective filter.
Noise is a far bigger problem in open-loop systems where the input
drives the output. In that case, in order to produce outputs that are
accurate and, thus, useful to the user of the system, serious steps
must be taken to reduce the amount of noise in the system. This is far
less of a problem in closed-loop systems. Indeed, the research I am
doing now involves the use of a model to determine just how much noise I
have to inject into the system to mimic the levels of control I'm
actually observing. It turns out that even when I inject fairly
substantial levels of noise, control can still be quite good (and
similar to the subject's performance) if system gain and slowing are
adjusted properly.

[MT] The objective is to determine how variations in the
channel capacity of the path Target->Controlled perception affect the
parameters of the modelled control loop, and the accuracy of control.

[RM] Yes, that's exactly what I'm doing, although I'm doing this in a
compensatory version of the task.

1 (spatial information constraint) Between these extremes, the spatial
resolution of the display is varied. The displayed cursor is not a pointed
horizontal bar, but a rectangle that fills some vertical space on the
screen. At 1 bit, the cursor is represented by a rectangular block that
extends from the mid-line to either the top or the bottom of the display,
depending on whether the Real Cursor is at that moment above or below the
mid-line. At 2 bits, the subject can see in which of four regions of the
display the cursor is -- the rectangle fills the vertical extent of
whichever region contains the Real Cursor (the regions should really be
chosen so that the target spends an equal amount of time in each of the
four, but this probably doesn't matter, numerically ). At 3 bits, the
subject can see in which of 8 regions the Real Cursor is, and so forth...

Using both techniques, the actual information rate of the perceptual path
from the standard TrackAnalyse cursor to the controlled perception could be
measured reasonably accurately.

[RM] I think this could be a useful experiment. You can compare the
behavior of a basic control model to one that includes information
theoretic concepts and see which accounts best for the subjects'
perceptions. I predict that the basic PCT model, with no inclusion of
or reference to information theoretic concepts, can account for the
results nearly perfectly. I presume you would say that information
theory would in some way be needed to account for the results. So go
ahead and do the experiment. Let's see if information theory really
has anything to contribute.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.04.18.13.53]

You know what! I think we are actually converging, though still some
distance apart. That’s nice.

[From Rick Marken (2009.04.18.1020)]
Martin Taylor (2009.04.17.10.58)
Rick Marken (2009.04.15.2200)


[MT] The exact
same environmental situation can be perceived and controlled in a literally
infinite number of different ways.
[RM] That seems to rule out the idea that perception is a process of
communicating to the mind what is actually out there in the
environment.

[MT] That's a mind-boggling comment. I have no idea where such a notion might
come from, or what part of my statement causes you a problem.
[RM] Let me try to un-boggle your mind. Suppose that p = a1x1 + a2x2
where x1 and x2 are physical variables and a1 and a2 are coefficients
of the perceptual function. Let's say a1 = a2 = 1. Then p is
accurately communicating about the world when p = x1+x2. But there is
no one state of the world that corresponds to a particular value of p.

[MT] We have to think about what is meant by “one state of the world”.
If we are to believe physics since Dalton, objects consist of atoms
that vibrate all over the place, that are emitted from the object’s
surface and that come from elsewhere and adhere to the surface. The
“state of the world” doesn’t stay the same for a picosecond, and never
returns to what it was. But we don’t perceive it that way. We perceive
that the object has some permanence, and if it is of appropriate size
and weight, we can pick it up and put it elsewhere without perceiving
it to be a different object. When the object is in either place, we
perceive it itself as being one “state of the world”, though the object
in its context is perceived as being a “different state of the world”.

Suppose the object is a file folder, and we take it from one file
drawer to put it in another (because we are rearranging the physical
filing system without altering the conceptual filing system). One
“state of the world” that we perceive is “the folder is filed”, and
that state has not changed. So it is true that if p = a1x1 + a2x2, a
perceptual system perceiving x1 can find x1 changed, and one perceiving
x2 can find x2 changed while at the same time our p is unchanged.
Looking at p does not allow us to say anything about x1 or about x2
individually. On the other hand, if the value of p (and of a1 and a2)
is known, then looking at x1 will tell us exactly the value of x2. If
we didn’t know the value of p, then looking at x1 would tell us nothing
about x2. So finding the value of p does provide information about the
world of x1 and x2, even though it gives us no information about either
taken by itself. Looking at the object you have moved, you know nothing
about the location of any particular atom, but you do know that if you
see an atom of the object “here”, no atom currently of the object is
two miles away “there”, which might have been the case if you had lost
the object and couldn’t see it. You know, even if you have lost it,
that if you ever do find an atom that clearly is part of the object,
all the other parts will probably be very close. The ability to make
those deductions is information about the object’s “one state of the
world” obtained from the perception.

[RM] In order to be able to measure the information in p
about the environmental situation represented by p you have to know
the perceptual function in some detail; that is you have to know, in
this example, that p is a function of only two environmental
variables, x1 and x2, that it is a linear sum of the two variables and
that the coefficients of this function are a1=1 and a2=1.

[MT] I can see why you might think this, but it isn’t true. What you
say is true about finding the specific values of the variables, not
about measuring the information p can provide about them.

To see this, go back to the general case, in which you have no
knowledge that there might even be a causal connection between x1, x2,
and p. You observe p, x1, and x2 over some period during which their
values change, and you record those values in bins in a 3D matrix.
Looking at that matrix, you discover that some bins are more filled
than are others. This in itself tells you that the three variables
provide information about each other; at least that is true unless the
bin-filling ratios are independent of the value of one of the
variables. There’s nothing about your observations that says “this is
dependent on that”, but you can say that if you know the value of one
of them, you can expect more probably to see a particular relation
between the other two than a different relation. In the case of p =
a1x1 + a2x2, even when the perception is noisy, still if you observe
x1, you expect the bins near (p - a2x2)/a1 - x1 = 0 to be more filled than
bins far from that surface (of course, you don’t need to know anything
about the “a” values to observe such a clustering of the data around a
surface in the 3D space).

That is the basic proposition behind the Garner and McGill paper I
referenced a little while ago (“The relation between information and
variance analyses”, Psychometrika, 21 No 3, Sept 1956, 219-228). The
idea generalizes to N dimensions.
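To make the binning idea concrete, here is a minimal sketch of the
calculation, in Java (not code from the thread; the generating rule
p = x1 + x2 plus a little noise is used only to manufacture example
data, and the estimator itself never consults it). It drops observed
values of p, x1, and x2 into a 3D matrix of bins and estimates the
information p carries about the pair (x1, x2) from the bin counts alone.

    import java.util.Random;

    public class BinningInfo {
        // Drop a value into one of nBins equal-width bins spanning [lo, hi].
        static int bin(double v, double lo, double hi, int nBins) {
            int b = (int) ((v - lo) / (hi - lo) * nBins);
            return Math.max(0, Math.min(nBins - 1, b));
        }

        public static void main(String[] args) {
            int nBins = 8, nSamples = 200000;
            double[][][] count = new double[nBins][nBins][nBins]; // indexed by (p, x1, x2) bins
            Random rng = new Random(1);
            for (int i = 0; i < nSamples; i++) {
                double x1 = rng.nextDouble();                   // environmental variables
                double x2 = rng.nextDouble();
                double p = x1 + x2 + 0.05 * rng.nextGaussian(); // noisy perceptual signal
                count[bin(p, 0, 2, nBins)][bin(x1, 0, 1, nBins)][bin(x2, 0, 1, nBins)]++;
            }
            // Marginal bin probabilities for p alone and for the pair (x1, x2).
            double[] pP = new double[nBins];
            double[][] pX = new double[nBins][nBins];
            for (int a = 0; a < nBins; a++)
                for (int b = 0; b < nBins; b++)
                    for (int c = 0; c < nBins; c++) {
                        double pr = count[a][b][c] / nSamples;
                        pP[a] += pr;
                        pX[b][c] += pr;
                    }
            // I(p ; x1,x2) = sum of P(p,x1,x2) * log2[ P(p,x1,x2) / (P(p)*P(x1,x2)) ]
            double info = 0;
            for (int a = 0; a < nBins; a++)
                for (int b = 0; b < nBins; b++)
                    for (int c = 0; c < nBins; c++) {
                        double joint = count[a][b][c] / nSamples;
                        if (joint > 0)
                            info += joint * Math.log(joint / (pP[a] * pX[b][c])) / Math.log(2);
                    }
            System.out.printf("Estimated I(p ; x1,x2) = %.2f bits%n", info);
        }
    }

If p were unrelated to x1 and x2 the estimate would hover near zero
(apart from the small positive bias any binned estimator has); the
clustering described above shows up as a clearly non-zero figure.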

[RM] If you will
agree that testing for controlling perceptual variables is
propaedeutic to information measurement then I think we can get back
onto something like the same page, although I still think the very
idea that perception is "informative" is misleading.

[MT] I don’t think testing for controlled perceptual variables is a
prerequisite for information measurement, but I do think information
measurement can be useful in testing for controlled variables. If you
don’t know that the perceptual system and the rest of the part of the
control loop inside the skin is linear, then information measurement is
better justified than is correlation measurement (which is probably its
best analogue in the Gaussian analysis toolbox), and can be used in all
the same circumstances.
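A toy illustration of that point, not taken from the thread: with
y = x^2 and x spread symmetrically about zero, the correlation between
x and y is essentially zero, yet knowing x clearly tells you a great
deal about y, and a binned information estimate reflects that.

    import java.util.Random;

    public class NonlinearDependence {
        public static void main(String[] args) {
            int n = 100000, nBins = 10;
            Random rng = new Random(7);
            double[] x = new double[n], y = new double[n];
            for (int i = 0; i < n; i++) {
                x[i] = 2 * rng.nextDouble() - 1;   // symmetric about zero
                y[i] = x[i] * x[i];                // nonlinear, non-monotonic dependence
            }

            // Pearson correlation between x and y
            double mx = 0, my = 0;
            for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
            mx /= n; my /= n;
            double sxy = 0, sxx = 0, syy = 0;
            for (int i = 0; i < n; i++) {
                sxy += (x[i] - mx) * (y[i] - my);
                sxx += (x[i] - mx) * (x[i] - mx);
                syy += (y[i] - my) * (y[i] - my);
            }
            double r = sxy / Math.sqrt(sxx * syy);

            // Mutual information I(x;y) from a 2D histogram
            double[][] joint = new double[nBins][nBins];
            for (int i = 0; i < n; i++) {
                int bx = Math.min(nBins - 1, (int) ((x[i] + 1) / 2 * nBins));
                int by = Math.min(nBins - 1, (int) (y[i] * nBins));
                joint[bx][by] += 1.0 / n;
            }
            double[] px = new double[nBins], py = new double[nBins];
            for (int a = 0; a < nBins; a++)
                for (int b = 0; b < nBins; b++) { px[a] += joint[a][b]; py[b] += joint[a][b]; }
            double info = 0;
            for (int a = 0; a < nBins; a++)
                for (int b = 0; b < nBins; b++)
                    if (joint[a][b] > 0)
                        info += joint[a][b] * Math.log(joint[a][b] / (px[a] * py[b])) / Math.log(2);

            System.out.printf("correlation r = %.3f, estimated I(x;y) = %.2f bits%n", r, info);
        }
    }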


[MT] If the nth of these infinitely many possible perceptual functions is Pn =
pn(environment), then an analyst could measure the "information transmitted"
from the environment to Pn by looking at how variations in Pn  relate to
variations in the environment.
[RM] Yes, he could if, as I note above, the analyst knows the
perceptual function pn(environment).

[MT] No, there’s no such requirement. All that is needed are the data
of the variations.



[RM] Information theory assumes that there is a
message to be transmitted and received. ...
[MT] That is a major misunderstanding about information theory.
[RM] OK, I can accept this. But if this is the case then, to be
useful, the information analyst has to have a pretty darn good idea of
the perceptual functions involved before any information transmitted
via those functions can be measured.

[MT] Again, not so. All that is needed are the data.


[MT] I think it's highly relevant to the quality of control. If the perceptual
signal does not fluctuate in some reasonably consistent way when the
environmental variable influenced by the control output changes, control
will not be very good.
[RM] Certainly noise reduces the quality of control somewhat. But, in
fact, noise has far less effect than you might imagine. The negative
feedback control process turns out to be a very effective filter.
Noise is a far bigger problem in open-loop systems where the input
drives the output.

[MT] Yes, that is true. It’s one reason audio engineers used negative
feedback in high-quality systems when I was young (along with
linearizing the system response when using pentode amplifiers :-)). Since
I have known about this for some 50+ years, I wonder what you imagine I
might imagine to be the effect that the actual effect is far less than?

[RM] Indeed, the research I am
doing now involves the use of a model to determine just how much noise I
have to inject into the system to mimic the levels of control I'm
actually observing. It turns out that even when I inject fairly
substantial levels of noise, control can still be quite good (and
similar to the subject's performance) if system gain and slowing are
adjusted properly.

[MT] There’s the point, exactly! …“if system gain and slowing is
adjusted properly”. That’s precisely where estimates of channel
capacity come in handy. You can’t control to a precision better than
the precision with which the perception is influenced by variations in
the elements of the environment that are affected by the output. If
there is noise, perceptual precision takes time to develop, so you can
keep good control by gain reduction (avoiding the effects of
correcting for too much noise) and/or by reducing the output bandwidth
(avoiding the effects of high-frequency noise). You can, of course, use
PCT methods to adjust those parameters optimally, without any analysis,
but informational analysis may show you why, and perhaps where the
sensitive points are. I haven’t done such an analysis, so here I’m only
speculating.
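As a small illustration of the trade-off being pointed at here, the
sketch below (not from the thread; it assumes the standard
leaky-integrator output function of PCT tracking models,
o := o + slowing*(gain*error - o), and all parameter values are
placeholders) shows how the residual error of the controlled quantity
depends on gain and slowing when noise is added to the perceptual
signal.

    import java.util.Random;

    public class NoisyLoop {
        // One run of a compensatory loop with noise added to the perceptual signal;
        // returns the RMS deviation of the controlled quantity from the reference.
        static double rmsError(double gain, double slowing, double noiseAmp, long seed) {
            double dt = 0.01, o = 0, r = 0, sum = 0;
            int steps = 60000;
            Random rng = new Random(seed);
            for (int i = 0; i < steps; i++) {
                double d = 20 * Math.sin(2 * Math.PI * 0.2 * i * dt);  // smooth disturbance
                double qi = o + d;                                     // input quantity
                double p = qi + noiseAmp * rng.nextGaussian();         // noisy perception
                double e = r - p;                                      // error signal
                o += slowing * (gain * e - o);                         // slowed output
                sum += (qi - r) * (qi - r);
            }
            return Math.sqrt(sum / steps);
        }

        public static void main(String[] args) {
            double noise = 5.0;
            System.out.printf("gain=100, slowing=0.010 : RMS = %.2f%n", rmsError(100, 0.010, noise, 1));
            System.out.printf("gain=100, slowing=0.002 : RMS = %.2f%n", rmsError(100, 0.002, noise, 1));
            System.out.printf("gain= 20, slowing=0.010 : RMS = %.2f%n", rmsError(20, 0.010, noise, 1));
        }
    }

The point is only that different gain/slowing pairs handle the same
perceptual noise very differently, which is what makes "if system gain
and slowing are adjusted properly" the interesting part.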


[MT] The objective is to determine how variations in the
channel capacity of the path Target->Controlled perception affect the
parameters of the modelled control loop, and the accuracy of control.
[RM] Yes, that's exactly what I'm doing, although I'm doing this in a
compensatory version of the task.

[MT] Nice. Is it a TrackAnalyze variant? or a separately programmed
experiment?


1 (spatial information constraint) Between these extremes, the spatial
resolution of the display is varied. The displayed cursor is not a pointed
horizontal bar, but a rectangle that fills some vertical space on the
screen. At 1 bit, the cursor is represented by a rectangular block that
extends from the mid-line to either the top or the bottom of the display,
depending on whether the Real Cursor is at that moment above or below the
mid-line. At 2 bits, the subject can see in which of four regions of the
display the cursor is -- the rectangle fills the vertical extent of
whichever region contains the Real Cursor (the regions should really be
chosen so that the target spends an equal amount of time in each of the
four, but this probably doesn't matter, numerically ). At 3 bits, the
subject can see in which of 8 regions the Real Cursor is, and so forth...

Using both techniques, the actual information rate of the perceptual path
from the standard TrackAnalyse cursor to the controlled perception could be
measured reasonably accurately.
[RM] I think this could be a useful experiment. You can compare the
behavior of a basic control model to one that includes information
theoretic concepts and see which accounts best for the subjects'
perceptions.

[MT] They are the same model, so neither will account better than the
other. The question is how the model parameters must be varied to mimic
the behaviour of a subject as the channel capacity of the pathway (Real
Target → Controlled Perception) is varied.

I predict that the basic PCT model, with no inclusion of
or reference to information theoretic concepts, can account for the
results nearly perfectly.

[MT] I’m afraid I don’t know of a PCT model that predicts parameter
variation as a function of information constraints in the perceptual
pathway. Maybe you could point me in the proper direction, because it
would be nice to have a differential prediction.

I presume you would say that information
theory would in some way be needed to account for the results. So go
ahead and do the experiment. Let's see if information theory really
has anything to contribute.

[MT] I looked into the TrackAnalyze code, and I think I can see where
the Display function could be injected. But it’s not as easy as I
hoped, because some of the presentation code seems to be threaded into
what in MVC would be the “Model”. It will have to be teased out, I
think, and I don’t have time to do that before I leave for my NATO
meeting. I could be wrong about the MV separation, but I’m going to
have to put off looking into it any further right now.

Martin

[From Rick Marken (2009.04.18.1330)]

Martin Taylor (2009.04.18.13.53)--

We perceive that the object has some permanence,

Have you ever seen the 3D Necker cube, or the Ames trapezoid? What is
permanent when we see a "permanent" object is the perception -- the
output of a perceptual function -- not necessarily anything in the
environment.

[RM] If you will
agree that testing for controlling perceptual variables is
propaedeutic to information measurement then I think we can get back
onto something like the same page, although I still think the very
idea that perception is "informative" is misleading.

[MT] I don't think testing for controlled perceptual variables is a
prerequisite for information measurement, but I do think information
measurement can be useful in testing for controlled variables.

Could you give a nice, concrete example? How could information
measurement help in determining the optical variables controlled when
catching a ball, for example?

[RM] Information theory assumes that there is a
message to be transmitted and received. ...

[MT] That is a major misunderstanding about information theory.

[RM] OK, I can accept this. But if this is the case then, to be
useful,the information analyst has to have a pretty darn good idea of
the perceptual functions involved before any information transmitted
via those functions can be measured.

[MT] Again, not so. All that is needed are the data.

Take the rotating trapezoid as an example. It can be rotating
clockwise or counterclockwise. 1 bit of information about rotation,
right? So if it's rotating clockwise and I say that it's constantly
changing from clockwise to counterclockwise then I am getting no
information?

[MT] Nice. Is it a TrackAnalyze variant? or a separately programmed
experiment?

I'm using Bill's TrackAnalyze in a compensatory version. It collects
data with better resolution than my current java programs.

[RM] I predict that the basic PCT model, with no inclusion of
or reference to information theoretic concepts, can account for the
results nearly perfectly.

[MT] I'm afraid I don't know of a PCT model that predicts parameter
variation as a function of information constraints in the perceptual
pathway.

There are none because information theory and measures are irrelevant
to control theory. You describe your experiment in information
theoretic terms but that is irrelevant. What you are proposing is
simply to have a person do the regular tracking experiment but in one
case show only whether the cursor is above or below the target (1
bit of info in your jargon) and in another case show whether the
cursor is above by more or less than a certain amount or below by more
or less than a certain amount (2 bits of info). If a control model
mimics the behavior in these two conditions then the control model is
all you need. If you are saying that an information theoretic model
can predict what the control model parameter values will be then
that's interesting. Again, I'd like to see how that's done. If it's
true then why not tell us the equations that will allow us to predict
those values? I presume that the equations would predict the slowing
and gain parameter values from some measure of the subject's
performance in terms of information transmission. So tell me what the
measure of information transmission is that we should get from the
data and how this measure is related to gain and slowing.

[MT] I looked into the TrackAnalyze code, and I think I can see where the
Display function could be injected. But it's not as easy as I hoped, because
some of the presentation code seems to be threaded into what in MVC would be
the "Model". It will have to be teased out, I think, and I don't have time
to do that before I leave for my NATO meeting. I could be wrong about the MV
separation, but I'm going to have to put off looking into it any further
right now.

I can easily write up a java program that implements your 1 vs 2 bit
display idea. But you should be able to give me the equations I need
to test the predictions of information theory (regarding the control
parameters) before you leave. You wouldn't make the claim that
information theory can predict what the control parameters in such
tasks would be if you didn't know how to do that, now would you?

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com

[From Bill Powers (2009.04.18.1323 MDT)]

[Martin Taylor 2009.04.18.13.53]

[MT] We have to think about what
is meant by “one state of the world”. If we are to believe
physics since Dalton, objects consist of atoms that vibrate all over the
place, that are emitted from the object’s surface and that come from
elsewhere and adhere to the surface. The “state of the world”
doesn’t stay the same for a picosecond, and never returns to what it was.
But we don’t perceive it that way. We perceive that the object has some
permanence, and if it is of appropriate size and weight, we can pick it
up and put it elsewhere without perceiving it to be a different object.
When the object is in either place, we perceive it itself as being one
“state of the world”, though the object in its context is
perceived as being a “different state of the
world”.

This doesn’t mean that the object is the external counterpart of the
perception. “The world” is another set of perceptions, not
something outside that we can examine to see how it relates to a
perception. We do not ever see the x’s or the coefficients in Rick’s
equation p = a1x1 + a2x2. All we ever see is p.

To show the true situation we would have to imagine two observers, one
perceiving p1 and the other perceiving p2:

p1 = a11x1 + a12x2

p2 = a21x1 + a22x2

We assume that the two observers are perceiving the same environmental
variables. I don’t think I’m ready for this, but we could solve for p2 as
a function of p1, and we would find that the relationship is not a simple
unity transform. Then we could introduce a third observer, and this
observer would be able to say how the perceptions of the other two would
be related – but both would have to be expressed in terms of the third
observer’s coefficients.
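A hypothetical numeric case may make the point concrete (the numbers
are illustrative, not anything from the thread). Let p1 = x1 + x2 and
p2 = x1 + 2x2. If the first observer reports p1 = 2, the second
observer's perception is still undetermined: x1 = 2, x2 = 0 gives
p2 = 2, while x1 = 0, x2 = 2 gives p2 = 4. Only if the two sets of
coefficients were proportional would p2 reduce to a fixed multiple of
p1; in general each observer's signal is a different projection of the
same underlying variables, and can be related to the other's only with
knowledge that neither observer has.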

The observer can never observe his own coefficients. And the observer can
never know if the other observer has same number of coefficients in his
input function as the first observer has: the observer with the simpler
input will see the other’s perception projected into a lower-dimensional
space, and will see fewer changes in the perception than the other
sees.

All in all, I think Rick has it right: we can’t simply speak of the
transmitted and received messages, when all we ever can observe is the
received messages.

Suppose the object is a file
folder, and we take it from one file drawer to put it in another (because
we are rearranging the physical filing system without altering the
conceptual filing system). One “state of the world” that we
perceive is “the folder is filed”, and that state has not
changed. So it is true that if p = a1x1 + a2x2, a perceptual system
perceiving x1 can find x1 changed, and one perceiving x2 can find x2
changed while at the same time our p is unchanged. Looking at p does not
allow us to say anything about x1 or about x2 individually. On the other
hand, if the value of p (and of a1 and a2) is known, then looking at x1
will tell us exactly the value of x2.

Let’s not confuse the perceived state of the world with the state of the
world Out There. If a file folder is a perception that can have different
states one of which is its location in three-space, then filing it in a
different place changes its state. If the perceptual function does not
have that extra state, then we can’t know that we have changed the
location of the folder, so we can’t knowingly file it in a different
place. A person with a higher-dimensional input function will still see
different locations if we manage, unwittingly, to change where the folder
is. But we won’t know it.

[MT]If we didn’t know the value
of p, then looking at x1 would tell us nothing about x2. So finding the
value of p does provide information about the world of x1 and x2, even
though it gives us no information about either taken by itself.

[BP]If you know the value of p and the value of x1, you know nothing
about the value of x2 unless you know the form of the input function,
f(x1,x2). That’s Rick’s point. Maybe changing x1 causes a change in x2
directly.

How do you determine the form of that function? The Test takes us one
step in that direction, though we still don’t know the forms of the
functions through which we think we are observing x1 and x2.

[MT]Looking at the object you
have moved, you know nothing about the location of any particular atom,
but you do know that if you see an atom of the object “here”,
no atom currently of the object is two miles away “there”,
which might have been the case if you had lost the object and couldn’t
see it.

[BP]Sorry, but in my dictionary that is not “knowing,” it’s
“imagining.” In our world models, which are perceptions, we
find rules saying that the “atoms” that make up
“objects” tend to stay close together. That works for some
objects, but far from all: think of ocean waves and clouds and
explosions, or a letter on the computer screen in front of which you are
inserting spaces. Whatever rules we accept, they are not the external
situation itself; they are only the perceptions we have been able to
construct out of the array of intensity inputs (according to
HPCT).

You know, even if you have
lost it, that if you ever do find an atom that clearly is part of the
object, all the other parts will probably be very close. The ability to
make those deductions is information about the object’s “one state
of the world” obtained from the perception.

This is all just reasoning about internal perceptions using internal
rules derived from experiences with perceptions. And you’re considering
only the minority of examples that support your proposition. If you can’t
think of any counterexamples, it might be that you’re not looking for
any.
It seems to me that this discussion is leaving out the essential aspect
of the Test for the CONTROLLED Variable. You can’t do
the test unless the person is actually controlling it – otherwise you’ll
disturb the variable as the opening step in the test, and it will change
and just stay changed. The person must perform some action on the
environment that prevents the variable from changing as much as you would
expect it to change if there were no control system. Ideally, you won’t
be able to see it change at all.

The naive way to apply the Test is simply to look for variables that
resist your attempt to change them. That’s the Chinese Character way of
specifying controlled variables: you either recognize a variable that
doesn’t change, or you don’t. You can’t deduce its nature from anything
else. The other and probably better way is to try to go down a level and
define a set of variables on which a controlled variable might depend,
such as the xyz coordinates in space. Use an alphabet instead of
characters. Instead of trying out different definitions of a controlled
position for each possible location in space, you try to determine what
function of x, y, and z remains constant when you apply disturbances in a
few (at least 3) different directions. If you then look at the
relationships between amount and direction of resistive force (or other
influence) and the amount of resistance in each direction, you can
determine the controlled variable, its reference level, the loop gain,
and the dynamics of control all in one experiment.

But if you do none of these things, you have no basis for proposing any
relationship between external variables (even as lower-order perceptions)
and a perception that is a function of them. The TCV doesn’t just detect
variables that are under control; it helps you refine your definition. In
whatever space you are doing the measurement, the resistance will not be
exactly in line with the disturbance. That tells you that your definition
does not weight the underlying variables quite right, and perhaps doesn’t
include enough variables, or contains too many. When you’ve finished the
test, you can say that any disturbance of this variable on any axis along
which it can change will be maximally resisted. That gives a much better
definition of the CV than just guessing at the overall form.

[RM] In order to be able to
measure the information in p
about the environmental situation represented by p you have to know
the perceptual function in some detail; that is you have to know, in
this example, that p is a function of only two environmental
variables, x1 and x2, that it is a linear sum of the two variables and
that the coefficients of this function are a1=1 and
a2=1.

[MT] I can see why you might think this, but it isn’t true. What you say
is true about finding the specific values of the variables, not about
measuring the information p can provide about them.

[BP] I don’t understand that. If you don’t know the way in which p
depends on x1 and x2, how can observing p and x1 tell you anything about
x2? If you have no basis for expecting x2 to increase or decrease when x1
increases, how can you say any information about x2 has been obtained?
How do you even know there is an x2 without doing some sort of tests? How
do you know that there isn’t an x3 and an x4 as well? You seem to be
assuming that you have a way of looking at the external variables on
which the perception depends. You do, of course – but you can only see
perceptions of them, and you don’t know what those perceptions are
functions of, either. Not without a lot of experimental tests, and
the TCV is probably the most informative one in the present
context.

[MT]To see this, go back to the
general case, in which you have no knowledge that there might even be a
causal connection between x1, x2, and p. You observe p, x1, and x2 over
some period during which their values change, and you record those values
in bins in a 3D matrix.

[BP]OK, now you are doing tests. From such tests you can start to
generate a mathematical expression relating the variables. The best test
I know of when a control system is involved is the TCV. You can eliminate
potential x2s and x3s and find better ones. In effect, you’re getting the
other person to tell you what the perception depends on, although this
knowledge still has to be stated in terms of your own perceptual input
functions, which you can’t see.

I have no doubt that you can compute – I don’t know how to put this
since you say information doesn’t flow and isn’t any of the things we
normally think of as information, though you can calculate it – the
information measures that relate to different parts of this situation
(how’s that?). But what can we do with that information that we haven’t
already done on the way to getting it?

Looking at that matrix,
you discover that some bins are more filled than are others. This in
itself tells you that the three variables provide information about each
other; at least that is true unless the bin-filling ratios are
independent of the value of one of the variables. There’s nothing about
your observations that says “this is dependent on that”, but
you can say that if you know the value of one of them, you can expect
more probably to see a particular relation between the other two than a
different relation. In the case of p = a1x1 + a2x2, even when the
perception is noisy, still if you observe x1, you expect the bins near
(p - a2x2)/a1 - x1 = 0 to be more filled than bins far from that surface (of
course, you don’t need to know anything about the “a” values to
observe such a clustering of the data around a surface in the 3D
space).

What are all these expectations based on? I don’t see anything but common
sense here, of an idiosyncratic sort. I don’t see any reason either to
agree with these assumptions or reject them.

There’s too much here to comment on in one post, and more comments are
likely to be repetitious. I think I’ll leave it here. It seems to me that
information theory is most applicable in cases where the information is
very inadequate, or artificially masked with noise, or at the limits of
detection. It’s a little like the law of requisite variety – one wonders
how important it is when the controlling system has far more variety
available than it needs. Does the amount even matter then? What if the
perceptual information we get about the environment is nearly noiseless
in ordinary circumstances – what do we care if the channel capacity is
one megabit per second when the message is being keyed in by an old
telegraph operator at 80 baud?

The world I live in most of the time is clear, sharp, and free of any
background noise I can detect. The things I try to do are seldom at the
limits of my perceptions – that would be a rather exhausting way to
live, or anxiety-ridden if the lower limits were involved. I don’t
mean to say that nobody should care about such things. I have done
so myself while developing low-light-level television cameras for
astronomy. But most of the time, do upper and lower limits even
matter?

Best,

Bill P.

[From Bill Powers (2009.04.19.0141 MDT)]
Martin Taylor 2009.04.17.10.58 –
After some time to think, filled in partly by having to get help to
change a flat tire (getting old is a drag), I think a neuron has
struggled back to life long enough to suggest an idea that, aided by
other features of old age, woke me up.
When I think of “uncertainty,” I automatically think of noise,
of random variables, and of the shape of something becoming clearer as we
average the noise away and see the edges sharpening up. But there
are other meanings, and I think your proposed variation on TrackAnalyze,
plus Rick’s little equation about a perception being a function of two
variables, brought them to my attention.
Suppose you have a function of N unknowns, and through various means you
have established N-2 of them. This means you’re down to 2 unknowns, but
you still can’t determine the value of the function even by finding out
the value of one of the remaining ones. That’s because as long as even
one unknown is unknown, the value of the function is indeterminate – or
as they say in other contexts, uncertain. Only it’s the observer, not the
function, that is uncertain. Uncertainty is a psychological state;
indeterminacy is a mathematical state.
I think that indeterminate is the word I prefer. It means that the
required information for determining the value is simply not available.
Whether that makes me feel uncertain about something is another matter;
it could fill me with ambitious confidence because I know there’s only
one variable to go and I’m sure I’ll find a way to evaluate it.
More to the point, even knowing that there is only one variable left to
determine (but not knowing how it enters into the function), I don’t feel
any closer to knowing the value of the function than I was when I didn’t
know any of the N values of the variables. My uncertainty about the
value, if I felt any, wouldn’t have decreased at all. I still don’t know
the value. It still could be anything between plus and minus infinity.
Its indeterminacy hasn’t been reduced in the slightest. That range was
the original range of possible values, and it still is.
We learned in algebra that to solve for N unknowns, you need N
independent equations. If you have N-1 equations, independent or not, you
can’t solve for ANY of the unknowns. Indeterminacy is a binary variable;
either the set of equations is indeterminate or it is determinate. There
are no degrees of indeterminacy. (Let’s not quibble about terms in a
function with bounded values – that just occurred to me, but let’s just
say “polynomials”).
For me, that takes care of one meaning of uncertainty. I will say
indeterminate when I speak of a state of knowledge, and uncertain when I
say how I feel about that state (if it bothers me).
There is another meaning, which has to do with the determination of
magnitudes. This is where your proposal about TrackAnalyze comes in. You
propose to insert a function between the controlled variable and the
perceptual signal which reduces the number of possible states of the
perceptual signal when the controlled variable goes through its range of
values. I assume that you will leave the rest of the control model the
same, with the reference and error signals and the output and input
quantities being continuously represented as real numbers.
If the input quantity has a range from 0 to 100, as I understand you, the
proposition is to insert a function that generates a perceptual signal of
0 when the input quantity is between 0 and 50, and of 100 (for example)
when the input quantity is 50 or greater. Or you might divide the range
into any number of parts such as 4, 8, or any power of 2 (to get a whole
number of bits of information). The result would be a perceptual signal
that changes in a staircase pattern as the input quantity changes
smoothly from 0 to 100.
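A literal rendering of that inserted function might look like the
sketch below (illustrative code, not from TrackAnalyze; the value
assigned within each step is a free choice, and the step midpoint is
used here rather than the 0-or-100 example above so that the half-step
resolution limit discussed below reads off directly).

    public class StaircasePerception {
        // Map an input quantity in 0..100 onto one of 2^bits equal steps.
        static double quantize(double inputQuantity, int bits) {
            int steps = 1 << bits;                 // 2^bits steps
            double stepSize = 100.0 / steps;
            int k = (int) (inputQuantity / stepSize);
            if (k >= steps) k = steps - 1;         // clamp the top edge
            if (k < 0) k = 0;
            return (k + 0.5) * stepSize;           // midpoint of the occupied step
        }

        public static void main(String[] args) {
            for (double q = 0; q <= 100; q += 10)
                System.out.printf("input %5.1f -> p(1 bit) %5.1f, p(3 bits) %6.2f%n",
                        q, quantize(q, 1), quantize(q, 3));
        }
    }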
If you make that change alone, I predict that the model will begin to
oscillate between two values each of perceptual signal, error signal,
output quantity, and input quantity unless a great many steps of the
staircase are used. The reason is that at the points where the value of
perceptual input goes up by one step, the loop gain becomes infinite.
Between those steps, it is zero. If enough steps are used or the slowing
factor is made large enough (i.e., the resulting output changes are made
small enough), the oscillations will become invisible, and the slowing
factor in the output function will smooth them until the amplitude of the
oscillations is close to zero. That, in fact, is how we manage to
simulate a system made of continuous variables on a digital computer. We
insert, somewhere in the loop, a low-pass filter such that in the time
represented by one iteration, no significant change in the filtered
variable can happen. I have called oscillations at that frequency (one
whole cycle per two iterations) a “computational oscillation”
since it is not a real feature of the simulated system but is an artifact
caused by using a digital computer.
Given that we have a sufficient slowing factor, inserting the digitizing
filter into the input path will not destroy control. However, it will
reduce the resolution with which control can be carried out. If
there are steps in the input function, then clearly a disturbance will
have to change enough to raise the perceptual signal to the next step,
and changes of the disturbance smaller than that amount will be ignored.
The input quantity will be allowed to change by, on the average, half a
step without creating resistance to the disturbance.
So the function you propose to insert will, if the system is stabilized
by using the right slowing factor, simply determine the resolution with
which the controlled variable can be set to any desired value. If you
wish to keep the perception matching the reference signal within 2 units,
and the step size is 25 units, you will feel annoyed, or if otherwise
inclined, uncertain about the actual controlled value of the perception.
Emotional reactions aside, you will be unable to maintain the perceptual
signal any closer to the reference signal than plus or minus half a step.
It will jump from half a step too high to half a step too low, and
back.
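A quick simulation sketch (again illustrative, using the same
leaky-integrator output function and a staircase perceptual function;
the particular numbers are placeholders) behaves as predicted: with a
small enough slowing factor the loop still controls, but the input
quantity can be held only to within a fraction of a step of the
reference, so the achievable resolution is set by the step size.

    public class QuantizedControl {
        static double quantize(double q, int bits) {
            int steps = 1 << bits;
            double stepSize = 100.0 / steps;
            int k = (int) (q / stepSize);
            if (k >= steps) k = steps - 1;
            if (k < 0) k = 0;
            return (k + 0.5) * stepSize;           // midpoint of the occupied step
        }

        public static void main(String[] args) {
            for (int bits : new int[]{2, 4, 6}) {
                double gain = 100, slowing = 0.002, r = 63.0, d = 20.0, o = 0;
                double maxDev = 0;
                for (int i = 0; i < 20000; i++) {
                    double qi = o + d;                 // input quantity
                    double p = quantize(qi, bits);     // staircase perceptual signal
                    double e = r - p;
                    o += slowing * (gain * e - o);     // slowed output damps the step-induced jumps
                    if (i > 10000)                     // ignore the initial transient
                        maxDev = Math.max(maxDev, Math.abs(qi - r));
                }
                double step = 100.0 / (1 << bits);
                System.out.printf("%d bits: max |qi - r| after settling = %5.2f (step size = %5.2f)%n",
                        bits, maxDev, step);
            }
        }
    }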
So that kind of uncertainty is a matter of resolution, meaning
measurement resolution or control resolution. The more
steps there are, the greater will be the control resolution of a properly
stabilized control system. And of course the information measure of the
perceptual signal will be the log to the base 2 of the number of steps in
the possible range of values, I think.

What other causes of uncertainty remain? Only statistical ones. Since
random variations can alter a signal or variable in the loop
independently of any other loop variable, they have to be classified as
disturbances. Given superposition, we have the equivalent of a noise-free
system affected by indeterminacy and control resolution if such are
significant problems, with a random disturbance added to one or more
variables in the loop.

We already know how to handle disturbances in the model. As matters are
now in TrackAnalyze, you can pick a difficulty number from 1 to 5 which
determines the bandwidth of a random variable that disturbs the
controlled quantity. Components of the disturbance at frequencies that
lie within the frequency band of good control have little effect on the
controlled variable or the perception; the “incessant
fluctuations” that Ashby spoke of are mostly canceled. Frequency
components well outside that bandwidth of good control simply disturb the
controlled variable and there are no output fluctuations to oppose them.
What happens at those higher frequencies to other variables inside the
system depends on the dynamic characteristics of the system functions. In
the TrackAnalyze model, the perceptual signal and error signal would
reflect high-frequency disturbances, but the output function, where the
slowing factor is located, would not respond to them.

So the possible causes of psychological uncertainty about the behavior of
a control system depend on determinacy, control resolution, and
unpredictable disturbances. I omit causes that amount to changes in
system properties or properties of the environment, and our understanding
of them, though the list could easily be expanded to include such
changes.

We have systematic ways of finding out about these three features of a
control model or a control system, and I assume there are systematic ways
of calculating information measures that correspond. Does this get us
somewhere?

Best,

Bill P.

···
[From Rick Marken
(2009.04.15.2200)]

I’m
adding [MT] and [RM] on behalf of David Goldstein. I may have missed
some, but at least I tried :-)

Martin Taylor (2009.04.13.23.01) --

[MT] I think you still don't understand about information as a
measure. It does
not depend in the slightest on any model. All it depends on is having two
or
more data sources.
[RM] I think the idea that perception is based on transmitting data from
one place (source) to another (receiver) is a model. My model of
perception has nothing to do with transmitting information; it's about
constructing representations.

So is mine. But “constructing representations” (in the form of
scalar variable values) can’t happen unless information is transmitted
(in the sense of “transmitted” that I have described before)
between the environment and the perception. So your comment is
self-contradictory.


[MT] The exact
same environmental situation can be perceived and controlled in a
literally
infinite number of different ways.
[RM] That seems to rule out the idea that perception is a process of
communicating to the mind what is actually out there in the
environment. If the same environmental situation can be perceived in
an infinite number of ways, then there is no information to be
transmitted about it.

That’s a mind-boggling comment. I have no idea where such a notion might
come from, or what part of my statement causes you a problem.
It’s a simple fact that the same environmental situation can be perceived
and controlled in a literally infinite number of ways.
It appears to be a fact that perception relates in some way to “what
is out there”, if only because we seem able to control at least some
of our perceptions some of the time.
So what is your problem? Do you not think that all those functions
equally could represent something about the environment, or is the
problem just that you don’t understand what is meant by transmitted
information?
If the nth of these infinitely many possible perceptual functions is Pn =
pn(environment), then an analyst could measure the “information
transmitted” from the environment to Pn by looking at how variations
in Pn relate to variations in the environment. The analyst could do
the same for Pi, Pk, … and all the others at the same time (if he had
infinite analytic capacity :-)). Why would you say “there is no
information to be transmitted about it”?

[RM] Information theory assumes that there is a
message to be transmitted and received.
...

That is a
major misunderstanding about information theory. You are talking about
the original application environment that interested Shannon. But even he
in his original writing on “A Mathematical Theory of Communication” was
careful not to restrict it in that way. “Information theory” is
not a very good name, really. I think “Uncertainty Analysis”
would be better, but “Information Theory” has been around for a
long time, so we have to stick with it despite its tendency to
mislead.

Uncertainty is the basic measure, and information the differential
measure that refers to how the uncertainty about X changes as a
consequence of knowing something about Y. Y could be a message from X,
but there’s no reason it has to be. Sometimes we might call Y an
“observation”, if the situation makes that a sensible name.
Knowing the value of Y (which could be a perceptual signal) may tell
something about the value of X (which may be a function of some
environmental variables) beyond what could have been known without
considering the value of Y. If so, Y can provide information about X, and
vice-versa. But only if X is causally related to Y would we say that X
transmits information to Y, and then we would NOT say Y transmits
information to X.
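In standard Shannon terms (a textbook restatement, not Martin's
wording): if H(X) is the uncertainty about X before anything else is
known, and H(X|Y) is the uncertainty that remains, on average, once
the value of Y is known, then the information Y provides about X is

I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

The measure is symmetric in X and Y; only the causal reading of
"transmitted" picks out a direction, which is the point made in the
last sentence above.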

[RM] This is not the way I think perception works. Perception is not a
channel that brings a message about the "true" state of the
environment into the brain. The "true" state of the environment
could
be represented as a binary "message" like 1011010. But I don't
think
of this as a real message; it is just the state of a set of physical
variables. If what is perceived is, say, some linear combination of a
subset of the elements of this
"message",

As well it might be … So far, so good, though using the label
“message” seems calculated to lead you astray if you aren’t
careful. As it seems to have done.

[RM continued] then it makes no sense (it
seems to me) to ask how much information about the state of the
environment is communicated by the perceptual signal. It's just
doesn't seem like a relevant question.

There’s the nub. You don’t think it is a relevant question.

I think it’s highly relevant to the quality of control. If the perceptual
signal does not fluctuate in some reasonably consistent way when the
environmental variable influenced by the control output changes, control
will not be very good.

At one extreme, fluctuations in the perceptual signal might be unrelated
to changes in the real environment. (Remember that the real environment
is unknowable other than by way of perceptual signals, but it is assumed
to exist and to be that on which the output necessarily acts). Under
those conditions, the perceptual signal will be uncontrollable.

At the other extreme, the perceptual signal mirrors exactly any changes
in some function of variables in the real environment, and, assuming a
similar consistency in the effect of the output on those environmental
variables, the ability to control against any specified disturbance or to
any specified reference waveform is limited by properties of the feedback
loop such as gain and transport lag. Between these two extremes, the
latter of which is physically unattainable, lies a range in which there
is a balance between (a) constraints based on the speed and precision of
the relation between the perceptual signal and the real environment, and
(b) constraints based on the properties of the control loop. As the
precision and speed of perception increase, the properties of the rest of
the loop come to dominate the constraints; as the speed and precision
(channel capacity) of the perceptual functions decrease, these come to
dominate the accuracy of control.

Let’s ask about a hypothetical experiment, which shouldn’t be too hard to
write and test. It’s a simple pursuit tracking based on Bill’s
“TrackAnalyze” (I might even be able to do it myself – I think
I’m getting a better understanding of Delphi programming – but not until
after I get back from a NATO meeting in the first week of May for which I
am spending most of my time preparing, and perhaps not until after I manage
to complete a different tracking task I am trying to program). The
Environment is the same as in TrackAnalyze: a target moves smoothly up
and down, and the subject influences a cursor to move up and down along a
vertical track close to the track of the target. The subject is
asked to control a perception of target and cursor being at the same
level. What differs from the version of TrackAnalyze in LCS III is the
display resolution.

This figure represents the normal TrackAnalyze experiment as well as the
suggested revision. It deliberately omits description of whatever control
loops are involved in the subject’s operations, and subsumes all the
perceptual levels between the display and the controlled perception into
a single small arrow, one marked red for the target, the other in black
for the cursor manipulated by the subject. The red pathway is the one of
interest. I specifically include the “Display” function in the
diagram, since the experiment uses it to emulate variations in the
channel capacity of the perceptual pathways up to the perceptual function
that creates the controlled perception. The objective is to determine how
variations in the channel capacity of the path Target->Controlled
perception affect the parameters of the modelled control loop, and the
accuracy of control.

[Figure: the Real Target and Real Cursor pass through the Display function; small arrows (red for the target path, black for the cursor path) stand for the perceptual levels between the display and the controlled perception.]

The “Real Target” and “Real Cursor” are exactly what
they are in the standard “TrackAnalyze” – objects whose
vertical location in the display space varies smoothly (within the limits
of display resolution). What is different between the suggested
experiment and the original is the role of the “Display”
function.

I suggest two variants, both of which use the channel capacity of the
“Display” function as an experimental variable (IV in Rick’s
terminology). In both variants the “Real Target” and “Real
Cursor” are influenced smoothly in the usual way by the disturbance
program and the Mouse respectively. The display resolution is varied in
space and/or time. At one extreme is the standard TrackAnalyze, and at
the other extreme the displayed cursor does not move at all when the
mouse moves.

1 (spatial information constraint) Between these extremes, the spatial
resolution of the display is varied. The displayed cursor is not a
pointed horizontal bar, but a rectangle that fills some vertical space
on the screen. At 1 bit, the cursor is represented by a rectangular block
that extends from the mid-line to either the top or the bottom of the
display, depending on whether the Real Cursor is at that moment above or
below the mid-line. At 2 bits, the subject can see in which of four
regions of the display the cursor is – the rectangle fills the vertical
extent of whichever region contains the Real Cursor (the regions should
really be chosen so that the target spends an equal amount of time in
each of the four, but this probably doesn’t matter, numerically ). At 3
bits, the subject can see in which of 8 regions the Real Cursor is, and
so forth.

2 (temporal information constraint). Between these extremes, the temporal
resolution of the display is varied. The displayed cursor is the normal
TrackAnalyze horizontal bar. It moves stepwise, the steps being separated
by regular time intervals. At each interval, the subject could gain a
certain amount of information, the same at each step until the steps
became close enough that the successive locations showed some
correlation. This type of constraint actually provides a rough measure of
the rate of the temporal degrees of freedom for control in this
situation, since at some point, the data will be indistinguishable from
the data using the standard (fully displayed) TrackAnalyze.
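For concreteness, the two Display constraints might be sketched
roughly as follows (illustrative code only, not from TrackAnalyze or
from any planned implementation; the screen is assumed to run from 0
at the bottom to height at the top, and holdSteps is a made-up
parameter naming the update interval for the temporal variant).

    public class DisplayConstraints {
        // Spatial constraint: return the bottom and top of the rectangle to draw,
        // i.e. the vertical region (one of 2^bits) that contains the Real Cursor.
        static double[] quantizedRegion(double realCursor, double height, int bits) {
            int regions = 1 << bits;
            double regionHeight = height / regions;
            int k = (int) (realCursor / regionHeight);
            if (k >= regions) k = regions - 1;
            if (k < 0) k = 0;
            return new double[] { k * regionHeight, (k + 1) * regionHeight };
        }

        // Temporal constraint: the displayed position is sampled and held,
        // changing only every holdSteps iterations.
        static double held = 0;
        static double sampleAndHold(double realCursor, int step, int holdSteps) {
            if (step % holdSteps == 0) held = realCursor;
            return held;
        }

        public static void main(String[] args) {
            double[] rect = quantizedRegion(37.0, 100.0, 2);   // a 2-bit display
            System.out.printf("Real Cursor at 37.0 shown as a block from %.1f to %.1f%n",
                    rect[0], rect[1]);
            for (int step = 0; step < 8; step++)
                System.out.printf("step %d: displayed position = %.1f%n",
                        step, sampleAndHold(step * 3.3, step, 4));
        }
    }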

Using both techniques, the actual information rate of the perceptual path
from the standard TrackAnalyse cursor to the controlled perception could
be measured reasonably accurately.

By varying the channel capacity of the “Display” function, the
experiment emulates the effect of different channel capacities of the
perceptual pathway that in previous messages we have called D->S2. By
finding the values for spatial and temporal resolution beyond which
improving the resolution makes no difference, we get a measure of the
informational limits of the perceptual pathway beyond the display from
sensors to the controlled perception.

A third variant is possible, in which the Display adds noise to the
location of the Real Target, by making the displayed target appear above
or below the correct location of the Real Target. Since the subject can
only control the perception, and can act only on the Real Cursor, this
added noise, which reduces the channel capacity of the
Target->Controlled Perception pathway, should have the effect of
reducing the subject’s ability to control the perception AS MEASURED BY
the Analysis Program.

[MT] In any case, whether the perceptual fluctuations are or
are not related to fluctuations in the environment, you can still
measure
the information you can get about one by observing the other.
[RM] You might be able to measure it but would it make any sense?

Depends on the circumstances. Sometimes yes, sometimes no. Crudely put,
it would make sense under much the same circumstances in which measuring
correlations or RMS deviations would make sense. On the other hand, if
you can get a significant amount of information about one thing
from observing another that seems totally unrelated, you can be fairly
sure that either there is some communication between them or they are
both influenced by some common variable that might be worth seeking out.
The same is true, of course, for correlation.

[RM] Suppose
the environment is a 7 bit sequence, 1011010 being one possible state
of that environment. Also suppose that a perception of that
environment is p = a1b1+a2b2+...a7b7, where the a's are constants and
the b's are the bits in the environmental "message". I presume
you
could measure whether the fluctuations in p are related to variations
in the 7 bit environmental message. But I don't think that would tell
you much about what is being perceived or how perception works.

Quite true. A version of “The Test” is needed in order to
estimate roughly what is being perceived, out of the infinity of
possibilities of what might be being perceived. You would need to model
it using different values of the “a” parameters. Of course,
even if you did that, you wouldn’t really know that what was being
perceived was actually a linear combination of the “b”
variables, which is the assumption you make above. Even using The Test
with meticulous accuracy and even assuming that the true perceptual
function really is a linear sum, I think you would be hard put to it to
find accurate values for the relative magnitudes of the “a”
parameters. So you really don’t have any way of telling what is being
perceived using ordinary PCT techniques, whether or not you include
information-theoretic tools in your toolkit.

[RM continued] It
would be misleading, actually, because it would give the impression
that perception is about carrying information about the state of the
environment to the brain when that's not what's happening.

Is it truly not what’s happening? I have your authoritative word for
that?

What experimental data do, or could even in principle, allow you to make
that assertion? If you have a thought-experiment or a real one that
supports this claim, I’d be delighted to know of it. I propose a
thought-experiment above, which I expect would show that perception is
carrying information about the state of the environment to the brain, and
which would moreover allow an estimate of how much information that is in
respect of a specific perceptual control function (even though the
perceptual input function might not be well specified); but you
apparently believe that manipulating the channel capacity of the Display
function in the channel will have no effect on the ability of the subject
to control in this pursuit tracking task. Or do I misunderstand the
implications of “that’s not happening”?

[MT] If there's a causal link, one would be likely to call it the
"information transmitted" from the environment to the perceptual signal,
at any and all levels of perception. Remember that "information
transmitted" does not mean information actually flows. It's just a
measure that relates how much you can tell about variation in the
perceptual signal given the environmental variation, or vice-versa. The
"transmission" is a reference to the causal path that is assumed to
connect the environment to the perceptual signal.
[RM] Yes, and it makes it seem that perception is a particular kind of
causal path: a communication channel. I think perception is more
properly modeled as functions that map aspects of the environment into
perceptual signals that vary with variations in the degree to which
these aspects can be constructed given the state of the environment.

Why “more properly”? You make it sound as though there were
some kind of contradiction between the two ways of labelling the same
thing, whereas your second sentence is actually just a description of a
particular kind of communication channel.

[MT] What information is transmitted from environmental variation to the
perceptual signal with any of these infinitely many functions depends
partly on the function and partly on the noise.

[RM] OK, I think I can agree that if you _know_ the perceptual function
then you can measure the extent to which variations in that signal are
due to noise rather than environmental variations. I think I can do
that pretty well without using any information theory at all; I do
have to use control theory though.
I just don't see any value in information theory for the study of
purposeful (control) behavior. It may be of value in communication
engineering but I don't see how it could help me in my work on living
control systems. But if you like it, go for it.

Thank you for your permission.

It would be more welcome if I felt that it came from some understanding
of the issues.

Martin

[From Rick Marken (2009.04.20.1210)]

Bill Powers (2009.04.18.1323 MDT)--

All in all, I think Rick has it right: we can't simply speak of the
transmitted and received messages, when all we ever can observe is the
received messages.

I am so sorry, Martin. I have no idea what's gotten into Bill;-)

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.04.20.16.39]

[From Rick Marken (2009.04.20.1210)]
Bill Powers (2009.04.18.1323 MDT)--

All in all, I think Rick has it right: we can't simply speak of the
transmitted and received messages, when all we ever can observe is the
received messages.
I am so sorry, Martin. I have no idea what's gotten into Bill;-)

Nobody ever does, since all we ever can observe is the received
message, as I have tried to point out on many occasions, going back, as
I have now rediscovered from the archives, to at least 1992. In the
development of Layered Protocol Theory, at least in its published
forms, my emphasis on that point goes back 25 years. In fact, the core
of Layered Protocol Theory is to suggest how we manage to seem
to communicate moderately reliably in the face of this apparently
damning fact. It’s really a good part of the reason LPT was started in
the first place. So Bill isn’t saying anything that he hasn’t heard
from me. Maybe what’s got into him (“gotten” must be a US idiom) is an
awareness of that.

Even the analyst looking at the physical representations of the input
and output cannot know what message was sent or received, or whether
the message received is related to the message sent in the way the
sender intended. To determine that, the analyst would have to be able
to perceive the sender’s reference values and the actual form of the
sender’s perceptual function relating to the intended effects on the
recipient of the message, and since the analyst is not in a feedback
relationship with either sender or receiver, that’s pretty difficult to
do. The analyst COULD tell, however, the maximum amount of information
that the receiver could have obtained, given appropriate probability
distributions for the channel in question.
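As an example of the kind of upper bound meant here, the capacity of a binary symmetric channel with a known crossover probability can be computed without knowing anything about what was actually sent; the crossover probability below is, of course, an invented number, not one taken from any experiment.

    # Hedged example of the kind of upper bound an analyst could compute
    # without knowing what message was sent: capacity of a binary symmetric
    # channel with an assumed crossover probability e (bits per symbol).
    import math

    def bsc_capacity(e):
        if e in (0.0, 1.0):
            return 1.0
        h = -e * math.log2(e) - (1 - e) * math.log2(1 - e)   # binary entropy
        return 1.0 - h

    print(bsc_capacity(0.1))   # about 0.53 bits per symbol, whatever was sent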

I’m not ignoring your request to provide a predictive information
analysis of a control loop, but as I had mentioned in the previous
message, it’s not something I’ve ever done, and I’d like to do it
properly. It might be worth a new page on the PCT portion of my Web
site, which is looking a bit moribund :-( . The basic points as I see
them right now, however, are to determine where the information
bottleneck lies under different circumstances, and to work out the
relation between the transport lag and the rate and precision of the
equivalent degrees of freedom at the bottleneck. I suspect the
bottleneck is more often at the output than in the perceptual input
region, which could be why Bill so often harps on the fact that
subjectively, conscious perceptions are clear and sharp, even when they
are uncertain or ambiguous. It’s an interesting possibility, worth
pursuing, I think.

Keith Hendy actually was using information-theory-based PCT in his work
on air accident analysis and on rethinking cockpit team roles, but I
didn’t follow his work (about ten or fifteen years ago). I may go back
and look it up when I get home from Paris. I remember thinking at the
time that his view of PCT was not entirely orthodox, but that was not
because of his use of information theory. I’d like to know what I would
think of his approach now. (He is no longer doing Science, but is the
kind of administrator being rousted on the “Control in a Company”
thread – inventing so much organization that the so-called working
scientists now complain that some weeks they are able to do nothing but
paperwork for the bureaucrats).

I’m not sure how much more reading of CSGnet I will do between now and
May 1. I probably won’t do much more writing.

Martin

[From Rick Marken (2009.04.20.2145)]

Martin Taylor 2009.04.20.16.39]

Rick Marken (2009.04.20.1210)]

Bill Powers (2009.04.18.1323 MDT)--

All in all, I think Rick has it right: we can't simply speak of the
transmitted and received messages, when all we ever can observe is the
received messages.

I am so sorry, Martin. I have no idea what's gotten into Bill;-)

Nobody ever does, since all we ever can observe is the received message, as
I have tried to point out on many occasions

I guess I was being too clever by half again. What had "gotten into
Bill" that surprised me was his agreeing with me over you. Bill will
usually goes out of his way to find agreement with almost anyone I
disagree with. He knows he doesn't have to be careful with me to keep
me in the fold;-)

But your post did suggest a possible reason why you think of
perception as being a process of communicating messages to the mind.
That is actually what is happening when one person tries to
communicate with another. The message being sent is the perception
that the sender intends to communicate; actually, it's probably
several levels of perceptual message. When you say "The Schouten
experiment tells us X" you intend for me to see the words you typed
(p1), understand those words (p2), see the principle you are trying to
explain (p3) and so on. To the extent that I have the perceptions you
intended to communicate (call them p1', p2' and p3') you could say
that information has been transmitted from you to me. And we could
measure this information as H(p,p'). But we can do this because we
know the messages to be transmitted (p1, p2 and p3) and the messages
received (p1', p2' and p3').

I think things get confusing when you apply this model to a perceptual
experiment. In such an experiment you are presenting stimuli which you
(the experimenter) see as perceptions -- like p1, p2 and p3 -- and
then you see how well the subject identifies these "messages". It
looks (to the experimenter) like the environment is a set of messages
and that perception is a process of communicating these messages to
the subject. But this is not how it works from the subject's point of
view. The subject's perceptions -- p1', p2' and p3' -- are not
necessarily functions of the environmental variables that result in
what you perceive as p1, p2 and p3. While p1 for you might be
a1x1+a2x2, for example, p1' for me might be b1x1+b2x2-b3x3. This could
be the case even if my answers to your questions about p1 are what you
would expect if p1' = p1.
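Here is a small numeric sketch of that last point, with invented coefficients: if the experimenter's stimuli happen to vary along a restricted line in the environment (here x2 = 2*x1 with x3 held fixed), a quite different p1' gives exactly the ordinal answers expected if p1' = p1.

    # Numeric illustration: over the stimuli the experimenter happens to
    # present, a different perceptual function gives exactly the answers
    # expected if p1' == p1. Coefficients and stimulus set are invented.
    import numpy as np

    rng = np.random.default_rng(3)

    def p1(x):        # the experimenter's assumed perception
        return 1.0 * x[:, 0] + 0.5 * x[:, 1]

    def p1_prime(x):  # a quite different subject perception
        return 0.4 * x[:, 0] + 0.8 * x[:, 1] - 0.3 * x[:, 2]

    # Experimental stimuli that happen to covary: x2 = 2*x1, x3 held at 1.
    x1 = rng.uniform(0, 10, 500)
    stim = np.column_stack([x1, 2 * x1, np.ones_like(x1)])

    # "Which member of a pair is larger?" -- the answers agree on every
    # presented pair, even though p1 and p1' are different functions.
    print(np.array_equal(np.argsort(p1(stim)), np.argsort(p1_prime(stim))))  # True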

I'm not ignoring your request to provide a predictive information analysis
of a control loop, but as I had mentioned in the previous message, it's not
something I've ever done, and I'd like to do it properly.

OK, though I find it very puzzling that you could confidently claim to
be able to predict the slowing and gain of a control system based on
information measures when you have never done it. You're not shining
me on, are you? ;-)

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.04.21.0.52]

OK, this one really will be quick :-)

[From Rick Marken (2009.04.20.2145)]
Martin Taylor 2009.04.20.16.39]
I'm not ignoring your request to provide a predictive information analysis
of a control loop, but as I had mentioned in the previous message, it's not
something I've ever done, and I'd like to do it properly.
OK, though I find it very puzzling that you could confidently claim to
be able to predict the slowing and gain of a control system based on
information measures when you have never done it.

You confuse two things. One is the predictive description that you
requested, in response to a message in which I said I had never done
that. What I said [Martin Taylor 2009.04.18.13.53] was: *“You can,
of course, use
PCT methods to adjust those parameters optimally, without any analysis,
but informational analysis may show you why, and perhaps where the
sensitive points are. I haven’t done such an analysis, so here I’m only
speculating.”*

The other is the kind of qualitative description of the directions
of effects to be expected when the information-theoretic properties of
sections of the control loop are changed. That was the kind of thing I
said in the first part of the same paragraph. In response to your:

It turns out that even when I inject fairly
substantial levels of noise, control can still be quite good (and
similar to the subject's performance) if system gain and slowing is
adjusted properly.

I said: "
[MT] There’s the point, exactly! …“if system gain and slowing is
adjusted properly”. That’s precisely where estimates of channel
capacity come in handy. You can’t control to a precision better than
the precision with which the perception is influenced by variations in
the elements of the environment that are affected by the output. If
there is noise, perceptual precision takes time to develop, so you can
keep good control either by gain reduction (avoiding the effects of
correcting for too much noise) and/or by reducing the output bandwidth
(avoiding the effects of high-frequency noise).
"

The two things are rather different.

There. Six minutes. Not quite my 5-minute target, but near enough. Of
course, I didn’t respond to most of your message. If it’s still
relevant in a couple of weeks, maybe I will then.

Martin