redundancy; question re chaos

[Martin Taylor 960318 11:00]

Bill Powers (960316.0300 MST)

Martin Taylor 960315 11:30 --

    What the wasp-waist does is to take advantage of redundancy in the
    input patterns. If the redundancy is insufficient, your comment is
    correct, as I believe I noted when describing the system. But if
    the input patterns are sufficiently redundant, there is zero loss
    at the wasp waist.

What is "redundancy?"

Redundancy is a question of probability distributions. If you are talking
about a system with discrete states, the system is redundant if the states
don't all have the same probability. If you are talking about a continuous
system, it is redundant if the probability distribution over it is non-uniform.
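
The discrete case can be put in numbers. What follows is a minimal sketch, assuming the usual Shannon measure (redundancy = 1 - H/H_max, where H_max is the entropy of the equiprobable case). The function name and the two example distributions are illustrative only and were not part of the original discussion.

```python
import math

def redundancy(probs):
    """Shannon redundancy 1 - H/H_max of a discrete distribution (assumed measure)."""
    if len(probs) < 2:
        raise ValueError("need at least two states")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Entropy in bits; states with zero probability contribute nothing.
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    # Maximum entropy is reached when all states are equally probable.
    max_entropy = math.log2(len(probs))
    return 1.0 - entropy / max_entropy

uniform = [0.25, 0.25, 0.25, 0.25]   # all states equally probable
skewed = [0.7, 0.1, 0.1, 0.1]        # one state dominates

print("uniform:", redundancy(uniform))
print("skewed: ", redundancy(skewed))
```

The equiprobable system comes out with zero redundancy, and any departure from equal probabilities gives a positive value.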

What is "redundant" depends on what function is being used to extract a
perceptual signal.

No. What is redundant (in the postings on which you comment) is the input
to a set of perceptual functions. In other words, the output patterns from
sensory receptors may be redundant, and that redundancy can be used to
reduce the number of distinct signals at the first ("intensity") layer.
That's what allows:

Later in your post, you illustrate it this way:

    It gives rise to the convergence (wasp-waistedness) of our sensory
    systems--10^8 retinal cones converging to 10^6 optical nerve
    fibres, for example.

If one retinal cone is showing "bright", it is far more probable that at least
some of its neighbours will do so than that they are putting out erratically
different levels. If a sensor at x,y is "bright" and one at x+1,y is "dark"
then it is more probable than not that nearby (such as at x, y+1, x+1, y+1)
there will be another "bright-dark" pair, and it is highly unlikely, though
possible, that the neighbour pair will be "dark-bright." That's why the
peripheral visual system seems to be loaded with things the physiologists
call "edge detectors," "orientation-specific cells," and so forth.

Presumably, the world is more
detailed than the set of all sensors. Therefore each sensor's state is
potentially independent of all other sensors' states.

Potentially, yes. In practice, no. Redundancy is determined by what happens
in practice. It is the "potential" that provides the basis against which the
actual redundancy is measured in any given case. In what we are talking
about, the lowest-level "pattern" is the set of values being output at any
moment by the set of sensors.

This means that
the set of all possibly discriminable inputs is the same as the set of
all sensors, at the level of intensities. No true redundancy, therefore,
can exist at the level of intensities.

On the contrary, the redundancy is most extreme at the level of intensities,
simply because the set of possible patterns of sensor intensities is so
hugely greater than the set of patterns that actually occur. The world seems
to be made of objects, not of fluctuating points of light that change
independently and erratically at a kilohertz rate. Objects seem to have
somewhat coherent edges--if an "object" seems to be lighter than its
background at one point on its left-hand edge, it is likely to be lighter
than its background at a neighbouring point on the same edge. Not guaranteed,
but likely.

What is "redundant" depends on what function is being used to extract a
perceptual signal. For example, an "A" presented anywhere within the
retinal area of recognition will lead to the signal representing "A".

True. If this were the only function observing the input, then its output,
being the pattern to be observed by the next level, would be restricted to
"A" and "not A". That pattern would be redundant only if the probabilities
of the two outputs were different (or if the output were a continuous
variable of "A-ness" it would be redundant if all values within its range
were not equiprobable).

However, what is lost is the information about _where_ the "A" is seen,
how it is oriented, and what its size, color, or typeface are.

Those are aspects of the input pattern. If they are not constant, then
the "A-seeing" pattern level that loses them is not matched to the redundancy
of its input. It is losing information. You have defined an information-
losing perceptual level.

However, what is lost is the information about _where_ the "A" is seen,
how it is oriented, and what its size, color, or typeface are.

Of course. That's the situation you defined.

This is true of any
situation involving "redundant" inputs. The redundancy is always defined
in terms of the level of perception involved, and there is no redundancy
at lower levels.

That's a most extraordinary assertion. How could you back it up? Are you
asserting that at the lower levels EVERY relationship between individual
inputs occurs equally often? Your words say so.

The catch here is in speaking of input _patterns_. A _pattern_ may
repeat (a higher-level perception) without having the same _instance_ of
the pattern repeating (the lower-level components of that pattern).

The repetition of a pattern is the repetition of the identical set of
values for all components of the pattern. For example:

    But if the input patterns are sufficiently redundant, there is zero
    loss at the wasp waist. What is lost is the ability to describe
    patterns that never occur.

The patterns being discussed, as should have been crystal clear, are the
sets of values "applied" to the inputs of the MLP, or "appearing" at its

different perceptual function given the same lower-level components
would extract a different invariant,

Sure, and for any space of more than one dimension there are an infinite
variety of sets of perceptual functions that span the space, allowing
loss-free encoding of input patterns. The issue is how small a number
of perceptual functions can be used without loss, given the sets of
input patterns that actually occur.

Trying to trade exactness for clarity, I had talked about "patterns that
never occur" rather than the fact that a redundant set of patterns can
be represented _exactly_ by a smaller set of independent variables, even
though all of the input patterns might possibly occur. But it is so. I
mention it here to ward off the anticipated comment "but such a pattern
_might_ occur, even very improbably."

treating a different aspect of the
input set as the variable to be recognized and treating all other
aspects, including for example the alphabetic identity of the inputs, as
irrelevant redundancies. If the second perceptual function is looking
for a pattern of a given size, to that function an "A" of a given size
is redundant with a "B" of the same size.

The "size" output and the "A vs B" output are elements of the output
pattern. So would be outputs for "orientation," "location," and whatever
other aspects you assert are variable at the input. But now let's suppose
that in the input it so happens that large letters always occur near
the top, grading down to small letters at the bottom. If the input patterns
had that characteristic, there would be no need to represent both size
and height internally. The identical size and location could be reproduced
at the output if size alone were to be represented internally.

A more sophisticated version of this is provided by principal components
analysis, which gives the optimum _linear_ encoding of a redundant input
set of patterns.
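
The size-and-height example can be sketched with a two-variable principal components analysis. This is only an illustration under stated assumptions: the data are made up, height is assumed to be an exact linear function of size, and the helper `pca_2d` is a hypothetical closed-form routine for the 2x2 case, not anything used in the MLP work.

```python
import math

def pca_2d(points):
    """Return mean, first principal axis (unit vector), and both eigenvalues."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Eigenvector belonging to the larger eigenvalue.
    if b != 0:
        vx, vy = b, lam1 - a
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (mx, my), (vx / norm, vy / norm), (lam1, lam2)

# Assumed input: large letters near the top, small ones near the bottom,
# with height an exact linear function of size.
patterns = [(s, 3.0 * s + 2.0) for s in (1.0, 2.0, 3.0, 4.0, 5.0)]

mean, axis, (lam1, lam2) = pca_2d(patterns)

def encode(p):
    # One internal variable: the position along the first principal axis.
    return (p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1]

def decode(score):
    # Rebuild both size and height from the single internal variable.
    return (mean[0] + score * axis[0], mean[1] + score * axis[1])

max_error = max(
    math.hypot(p[0] - q[0], p[1] - q[1])
    for p, q in ((p, decode(encode(p))) for p in patterns)
)
print("eigenvalues:", lam1, lam2)
print("largest reconstruction error:", max_error)
```

With these inputs the second eigenvalue is zero to within rounding, so a single internal variable reproduces both size and height at the output. If the relation between size and height were only approximate, the second eigenvalue would measure what a one-variable waist loses.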



We also use experiments. Sorry to keep hammering at this, but the
experimental approach gives us a probe into reality which we do not get
merely from abstracting. When we emit an act into the real world, what
we get back depends only and exactly on what is actually Out There.

Yes, that's the same as my statement, with which you agreed:

                                             if our control actions have
     reasonably reliable effects on some of the infinitely many possible
     perceptions, those are the perceptions that are likely to be
     retained. We are likely to lose perceptual functions that define
     CEVs on which our actions have no consistent effect. And we are
     likely to lose perceptual functions for which the control has no
     consistent effect on our intrinsic variables (specifically, those
     variables that affect the propagation of our genetic structure to
     future generations).

I don't really see much difference between the Scientist producing
abstractions called theories and any organism producing perceptual functions
that determine how it interacts with the world.

On chaos, you ask:

Has anyone tried to explain why chaos occurs? In the case of nonlinear
oscillators, one possible explanation would be that the nonlinearities
create different modes of oscillation at different frequencies. If the
different frequencies are harmonically commensurate, we get regular
oscillations of some irregular waveform. But if they are incommensurate,
we get nonrepeating superimposed oscillations which look quite random
but still contain overall orderliness. As the drive to a nonlinear
oscillator is increased, the various modes of oscillation will be
differently affected, and the commensurateness will also be affected. So
we would expect (perhaps) regions of regular oscillation with "chaotic"
oscillations between them, as the drive is gradually increased.

It can be quite difficult to distinguish between sets of non-harmonic
oscillators and chaotic systems, in practice. But they are quite different.
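
One way to see the difference is to look at how a tiny change in starting conditions propagates. The sketch below is an illustration under assumptions of my choosing: two sinusoids with frequency ratio 1 : sqrt(2) stand in for the incommensurate oscillators, and the logistic map with parameter 3.9 stands in for a chaotic system. All function names and constants are hypothetical.

```python
import math

def quasi_periodic(t, phase=0.0):
    # Sum of two oscillators whose frequency ratio (1 : sqrt(2)) is irrational.
    return math.sin(t + phase) + math.sin(math.sqrt(2.0) * t + phase)

def logistic_step(x, r=3.9):
    # One iteration of the logistic map, a square-law feedback.
    return r * x * (1.0 - x)

def max_separation_quasi(delta, steps=2000, dt=0.1):
    # Largest difference between the signal and a copy with its phase shifted by delta.
    return max(
        abs(quasi_periodic(k * dt, delta) - quasi_periodic(k * dt))
        for k in range(steps)
    )

def max_separation_chaotic(delta, steps=2000, x0=0.3):
    # Largest difference between two orbits that start delta apart.
    a, b = x0, x0 + delta
    worst = 0.0
    for _ in range(steps):
        a, b = logistic_step(a), logistic_step(b)
        worst = max(worst, abs(a - b))
    return worst

delta = 1e-9
print("incommensurate oscillators:", max_separation_quasi(delta))
print("chaotic map:               ", max_separation_chaotic(delta))
```

For the superimposed oscillators the difference stays of the order of the initial perturbation, however irregular the waveform looks. For the chaotic map it grows until it is comparable to the full range of the variable.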

The reason I ask is that chaos seems to be an object of study in itself,
with no bridge to normal physical principles. It would be nice to know
what the bridge is.

Not at all. All chaos relates to feedback in non-linear systems. Perhaps
the simplest everyday example is a dripping tap. If the tap is leaking
ever so slowly, a bulge of water forms at the spout, held in by surface
tension. As the bulge grows, eventually it gets so big that surface
tension can no longer hold it. It pinches off, and a drop falls. The remaining
water surface bounces back and oscillates like a little damped (!) spring.
While it is doing this, more water is coming in from the leak. If the
leak is slow enough, the oscillation will have essentially stopped before
the bulge has become dangerously big. In that case, the drip will be at
regular intervals. But if the leak is a little faster, some bouncing will
still be happening when the bulge is about to break off. If the remanent
bounce is accelerating the water surface upward, it acts to aid the
surface tension and delays the next drip, but if it is accelerating downward
it may advance the drip. If you increase the leak rate ever so slowly,
you find that the original drip.....drip.....drip gives way to
drip..drip....drip..drip....drip (a period-two oscillation), and then
to a pattern of four intervals, and very quickly to 8, 16, and so on. There is
a well-defined point at which (to be mathematically incorrect) the number
of different intervals goes to infinity, and beyond that point, the drip
interval is chaotic, except for some leak rates where there again are seen
regular patterns of drip interval.

This is called the "period-doubling" route to chaos, and it happens in many
physical systems in which some feedback parameter has a non-linearity at
least as great as a square-law. There are other routes to chaos, but they
all (so far as I know) involve non-linear feedback.
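
The period-doubling sequence is easy to reproduce numerically. The sketch below does not model the dripping tap. It assumes the logistic map x -> r*x*(1-x), a standard example of square-law feedback, with r playing the role of the leak rate. The function names, the tolerance, and the particular values of r are choices made for illustration.

```python
def logistic_orbit(r, x0=0.5, transient=5000, keep=256):
    """Iterate x -> r*x*(1-x), discard the transient, and return the next values."""
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

def detect_period(orbit, max_period=64, tol=1e-6):
    """Smallest p with orbit[i + p] ~ orbit[i] for every i, or None if none is found."""
    for p in range(1, max_period + 1):
        if all(abs(orbit[i + p] - orbit[i]) < tol for i in range(len(orbit) - p)):
            return p
    return None

# Increasing r is the analogue of slowly opening the tap.
for r in (2.8, 3.2, 3.5, 3.55, 3.9):
    period = detect_period(logistic_orbit(r))
    label = "no short period found" if period is None else "period %d" % period
    print(r, label)
```

As r is raised the settled behaviour goes from a single repeated value to cycles of 2, 4, and 8 values, and at r = 3.9 no short cycle is found.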

Some long time ago, I mentioned the chaotic system of two ideal balls
bouncing in an ideal box. Here, the angle of bounce is non-linearly related
to the distance off the travel axis of the point of contact between the
balls. In the chaotic three-body gravitational interaction, the forces
among the objects relate non-linearly to their distances. The problem
doesn't arise in a two-body system because of the conservation of momentum
and energy that limits the degrees of freedom for motion of the two