redundancy

[From Bill Powers (960318.2030 MST)]

Martin Taylor 960318 11:00 --

     What is redundant (in the postings on which you comment) is the
     input to a set of perceptual functions. In other words, the output
     patterns from sensory receptors may be redundant, and that
     redundancy can be used to reduce the number of distinct signals at
     the first ("intensity") layer.
....
     On the contrary, the redundancy is most extreme at the level of
     intensities, simply because the set of possible patterns of sensor
     intensities is so hugely greater than the set of patterns that
     actually occur. The world seems to be made of objects, not of
     fluctuating points of light that change independently and
     erratically at a kilohertz rate.

I don't think we're on the same wavelength here. At the level of
intensity perception, the world _is_ made of fluctuating points of
light, bandwidth-limited by the response of the retina and optic nerve.
There are no objects at that level. There can be many intensity
distributions that are never recognized as objects. But when those
inputs get passed to the configuration level, where far fewer objects
can be recognized than independent intensities, different sets of
intensity signals lead to perception of the same object. So relative to
the object level, these equivalent sets of intensities are redundant
because they lead to the same perception at the higher level. That's an
amateur's version of redundancy, but it's the best I can do.

I really don't get a sense of what you mean by redundancy, in terms of
probability distributions. And I'm spending far too much time trying to
grasp something that is probably over my head. Let's let it go.


------------------------------
Thanks for the explication of the physical meaning of chaos.
-----------------------------------------------------------------------
Best,

Bill P.

[Martin Taylor 960319 10:00]

Bill Powers (960318.2030 MST)

I don't think we're on the same wavelength here.

Agreed. You describe as "redundancy" what I take to be a consequence
of using redundancy.

I really don't get a sense of what you mean by redundancy, in terms of
probability distributions. And I'm spending far too much time trying to
grasp something that is probably over my head. Let's let it go.

I don't think that's a good idea. Most of the unresolved arguments we have
had over the years, as well as your argument with Hans, have this issue
as an element. If I could put it in the form of an aphorism: "Not everything
that can happen does happen." And it's a good idea for a control system to
take care of what does happen. If it can also take care of what could happen,
at no cost, that's gravy.

Let's at least give it a bit more of a shot. I realize you are going "on tour"
and won't be able to respond immediately, but here goes, anyway.

Firstly, let's consider your most recent comments.

At the level of
intensity perception, the world _is_ made of fluctuating points of
light, bandwidth-limited by the response of the retina and optic nerve.

Is it? I'd say, rather, "the world _could be_ made of ..."

Let's go even more peripheral than that, and talk about the retinal cones
and rods. Each is _capable_ of responding independently of what any of the
others is doing (or so I shall assume here, because I'm not sure it is
physiologically correct). A super-physiologist with a super recording
device could record the outputs from every rod and cone, and compare
them. This super-physiologist would very soon find that the outputs of the
different rods and cones were _not_ independent. If he looked at two
cones near each other--call them A and B--he would find that sometimes
they gave wildly different outputs, but most of the time their outputs
were quite similar. He would find this to be true for every pair of cones,
the more so the closer the pair. (Let's forget about rods from here on,
since the argument is the same with or without them.)

So our super physiologist would soon come to the conclusion that although
the cone outputs were capable of taking on mutually independent values,
the world is arranged so that they don't. That is another way of saying
that the patterns of outputs from the retinal cones are "redundant."

There are no objects at that level.

Correct. Our super-physiologist doesn't say there are. But what he does
say is that neighbour cones A, B, C, D,... X often have much the same
value, and that when they don't, the differences are almost always between
a small number of subgroups of differing near-uniform values. Since he
observes this to be true, he recognizes that he can save on storage for his
data by not recording the output of every cone individually. Instead, he can
record the overall average output for all the cones and rods, describe the
changing shapes of the regions of more or less uniform output level, and
record the deviation from average of the level of each subregion. To get
more detail, he might divide each subregion further and repeat the strategy,
or he might note that there are gradations of output across a subregion
and note the spatial derivative within the subregions, or....or...
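
The super-physiologist's strategy above can be sketched in a few lines. This is a hypothetical illustration, not anything from the posting: a one-dimensional "retina" whose sensors fall into patches of near-uniform level (the redundancy), stored as one overall average plus one deviation per patch instead of every raw value.

```python
import numpy as np

# Hypothetical illustration: a 1-D "retina" of 1000 sensors whose outputs
# form patches of near-uniform level (redundant), plus small fluctuations.
rng = np.random.default_rng(0)
n_sensors, n_regions = 1000, 10
region_levels = rng.uniform(0, 100, n_regions)           # near-uniform patches
sensors = np.repeat(region_levels, n_sensors // n_regions)
sensors += rng.normal(0, 0.5, n_sensors)                 # small fluctuations

# The super-physiologist's strategy: store the overall average plus one
# deviation per region, instead of all 1000 raw values.
overall_mean = sensors.mean()
region_means = sensors.reshape(n_regions, -1).mean(axis=1)
deviations = region_means - overall_mean                 # 10 numbers + 1

# Reconstruction from the compressed description:
reconstructed = np.repeat(overall_mean + deviations, n_sensors // n_regions)
rms_error = np.sqrt(np.mean((sensors - reconstructed) ** 2))
print(f"stored {1 + n_regions} numbers instead of {n_sensors}; "
      f"RMS error = {rms_error:.2f}")
```

As the posting notes, the saving comes entirely from the redundancy: if the sensors fluctuated independently, the same 11 stored numbers would reconstruct almost nothing of the original pattern.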

The result of the super-physiologist's strategy is that his storage
requirements are dramatically reduced. He has taken advantage of the
redundancy he has observed in the sensor output. What he has lost is the
ability to detect a change in the character of the world being looked at.
If the person whose eye was being studied suddenly started seeing a world
in which the cone outputs did fluctuate independently and rapidly, our
super-physiologist would never know it, once he changed his method of
recording the data.

Now let's shift viewpoint, and instead of a "super-physiologist" think of
the optic nerve. The optic nerve has about a million fibres for each eye,
but there are about 100 million rods and cones. The optic nerve is serving
the same purpose as the super-physiologist's data-collection machinery. And
it does not transmit the independent output of each rod and cone. (Near
the centre of the visual field it probably does transmit each cone output
independently, but overall it falls short by a factor of 100). We could
not properly see a world in which the rod and cone outputs fluctuated
rapidly and independently. Does the "real world" behave this way? We
can't know by looking.

We can't know by looking, but we can say that if the world did fluctuate
in this way, it does not do so to an extent that threatens our survival,
or the survival of any other species with eyes (whether they be fly's eyes
or monolenticular eyes like ours). With reasonable confidence, we can
assert that the world--the real world that impacts us, not the world
of perception--is constructed so that our visual inputs are redundant.
(Visual inputs = photon rates at the sensors, to use our current metaphor).

But when those
inputs get passed to the configuration level, where far fewer objects
can be recognized than independent intensities, different sets of
intensity signals lead to perception of the same object.

Yes. This is possible in a useful way only when the intensity signal patterns
are redundant. If they were not, there would be no _useful_ selection of
configurations to perceive, out of all the configurations of intensities
that are conceivable. Any configuration would occur as often as any other
in the actual intensity patterns, and if the configuration level failed
to see any one of them, that failure might at some point prove fatal. Over
evolutionary time, however, organisms that tended often to miss configurations
that "should" have been controlled, were less likely to produce offspring
than those that were able to control the important configuration. And here
I mean specifically the real-world correlates of the configuration perceptions,
since it is what happens in the real world that kills.

Some configurations, if they occur at all, are very rare, whereas others
are very common. You see far more blue patches with sharp edges, for example,
than you see specific combinations of blue, green, purple, red, and white
speckles covering the visual field. If you are going to develop (over
evolutionary time) perceptual functions for a limited number of configurations,
you will be better off if that limited number is in the set of configurations
that occur. Just as our super-physiologist will store his data better if he
notes that some configurations occur and others don't, and records what
happens by storing the name of each configuration from his "it happens" list.
He will be wrong in his reconstruction of the original sensor data if
the world changes so that a "never happens" configuration actually occurs.
And so will we, if the world gets into a kind of configuration our ancestors
never experienced.

So relative to
the object level, these equivalent sets of intensities are redundant
because they lead to the same perception at the higher level.

It's the other way round--rather like the difference between PCT and S-R.
Because the intensity patterns are redundant, therefore it is possible
_usefully_ to perceive objects and configurations. The capabilities of
the system at each level are to respond to a wider variety of patterns
than the world actually provides. The next level takes advantage of this
to discard what almost never happens.

Redundancy is always the relation between what could happen and what does
happen in a given system. If some possibilities don't happen as often as
others, the system is redundant.
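
In Shannon's terms this relation is one minus the ratio of the observed entropy to the maximum entropy the system could have. A minimal sketch (the symbol strings and alphabet size are illustrative, not from the posting):

```python
import math
from collections import Counter

def redundancy(observed, alphabet_size):
    """Shannon redundancy: 1 - H(observed) / log2(alphabet size).

    Zero when everything that could happen does happen, equally often;
    it approaches one as the observations concentrate on a few symbols
    out of the many that are possible.
    """
    counts = Counter(observed)
    n = len(observed)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1 - h / math.log2(alphabet_size)

# A source that could emit 16 symbols but in practice emits only 2:
print(redundancy("ababababab", 16))        # -> 0.75
# A source in which every possible symbol occurs equally often:
print(redundancy("abcdefghijklmnop", 16))  # -> 0.0
```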

Only in unusual situations does everything that could happen, happen with
equal probability. In any of those unusual situations, one cannot talk of
"structure." If you have a white noise signal, you are quite at liberty
to pass it through a narrow-band filter and produce a reasonably steady
tone, but you can do the same with any filter you choose. It says nothing
about what's in the world (the signal source).

It is similar with the complex world in which we live. If all configurations
were equiprobable, we could be built with any configuration perceptual
functions whatever, but we would die if we had the wrong ones. But we survive,
as did our ancestors, with equipment that limits us to seeing certain
configurations, and see them we do; which suggests that those configurations
actually occur in the real world more often and more importantly
than do the configurations we are not equipped to see (given our sensor
equipment).

I really don't get a sense of what you mean by redundancy, in terms of
probability distributions.

I haven't used probability distributions in the above. I've tried to give
a sense of why redundancy is considered at the input, and the consequence
of redundancy is the ability to combine many possibilities into one
actuality. In the wasp-waisted perceptron that started all this off,
the point was that the wasp waist prohibits the perceptron from reproducing
at its output all possible input patterns, but it allows the perceptron
to reproduce all the input patterns that actually occur (unless the world
to which it is exposed changes character after it has been fully trained).

In the example I used, the perceptron was trained on black-and-white stripes
of various widths and orientations. Its code at the waist was assumed to
record the width and orientation of the stripes and nothing else (two
units having continuous output values would do it). At the output, it
will produce stripes and nothing else, no matter what the input. If the
input continues to be stripes, the output will be exact, no matter how
many input sensors there are. But any other pattern will also be interpreted
as stripes, wrongly. The "new" world would then have a different redundancy
from the old, and the perceptron would have to be retrained for this new
world.

There's an interesting article on this "wasp-waist" problem, taking a
slightly different viewpoint, in Science, 29 Sept 1995, pp. 1860-63;
"Replicator Neural Networks for Universal Optimal Source Encoding" by
Robert Hecht-Nielsen.

Have a good trip.

Martin

[From Bruce Gregory 960319.1605 EST]

Martin Taylor 960319.1000

   It is similar with the complex world in which we live. If all configurations
   were equiprobable, we could be built with any configuration perceptual
   functions whatever, but we would die if we had the wrong ones. But we survive,
   as did our ancestors, with equipment that limits us to seeing certain
   configurations, and see them we do; which suggests that those configurations
   actually occur in the real world more often and more importantly
   than do the configurations we are not equipped to see (given our sensor
   equipment).

Is it not true that all we can really say is that creatures equipped to see
certain configurations survive better than those not so equipped? What do
we gain by adding, "that those configurations actually occur in the real
world more often and more importantly than do the configurations we are not
equipped to see..." I survive because I am able to discriminate traffic
light colors, but this fact tells me nothing about the reality that those
lights allow me to avoid, or about its spectral characterization.

Bruce Gregory

[Martin Taylor 960319 16:30]

Bruce Gregory 960319.1605

I guess my attempt to describe redundancy in simple terms related to PCT
must have backfired, for reasons I don't immediately see.

Is it not true that all we can really say is that creatures equipped to see
certain configurations survive better than those not so equipped?

If there were no redundancy in the inputs to the configuration level, there
would be no particular configurations to see. Any arrangement of the inputs
would be as likely to occur as any other. But as the world happens to
affect us, some happen and some don't--at least not often enough to have
mattered over evolutionary time.

What do
we gain by adding, "that those configurations actually occur in the real
world more often and more importantly than do the configurations we are not
equipped to see..."

We gain the point that the redundancy of the lower level allows us to
avoid committing resources to the control of configurations that don't
happen, and to concentrate on those that would affect us adversely if
we didn't control them.

I survive because I am able to discriminate traffic
light colors, but this fact tells me nothing about the reality that those
lights allow me to avoid, or about its spectral characterization.

Nothing we perceive tells us about reality. What we perceive relates to
what we have found it (evolutionarily or personally) effective to control.

Spectral characterization is something not available from our sensors,
apart from the three degrees of freedom provided by our three kinds of
retinal cone, and at very low spatial resolution for the blue degree of
freedom, at that.
It's only the spatial redundancy that allows us to use the blue in seeing
the colours of objects. If there were not enough spatial redundancy in
the input from the real world to allow us to perceive objects as such,
we couldn't use blue very well in seeing the qualities of parts of the
world. I have been talking about redundancy at the sensors. What they
don't discriminate is unavailable to us except through the use of
manufactured instruments.

All I'm trying to get across is that if not everything that can happen
does happen, then it is possible to describe what does happen more
succinctly than by simply enumerating and evaluating the outputs of
every sensor independently. Because there are common types of correlations
among the values of neighbouring sensor outputs, it is possible to see
patches, edges, objects, relationships... None of that would be possible
if all sensor patterns happened with equal probability, unpredictably from
moment to moment. The practical success of the "wasp-waist" perceptron
testifies to the redundancy in toy systems such as speech waveforms or
static imagery.

I introduced the wasp-waist as a first step, which should have been accepted
immediately because it has been practically demonstrated (usually the
criterion on CSGnet). It was denounced for reasons I don't understand, which
has prevented the second step from being taken. The second step is the
short-circuiting of parts of the MLP, resulting in a standard HPCT
hierarchy. Using the wasp-waist allows us to see the standard HPCT structure,
Shannon's "neural net module" control system, Hans Blom's model-based
control, perhaps Bill Powers' Artificial Cerebellum, and probably other
things, as views on the same underlying structure, the scalar HPCT structure.
If one view applies more usefully in one situation, it doesn't mean another
view is wrong. The other view might be better under other circumstances.

The different structures might be differentiated by physical dissection,
but not, I think, by examining their behaviour. Once the behaviour of
the wasp-waist perceptron is understood, this line of discussion can be
resumed. But the understanding of "redundancy" must come first, or we run
into all sorts of specious arguments about the wasp-waist being unable to
reproduce its inputs, whereas it is demonstrably capable of reproducing its
inputs _when the redundancy of the input patterns matches its structure_.

It is of no interest to the discussion that there is an infinity of spectra
that provide the same output from red, green, and blue cones. It is of
interest whether those output values say anything about the output values
of nearby red, blue and green cones. If the inputs are redundant, they do.
And higher levels of the perceptual system have the possibility of existing.
And those higher levels allow you to see that there exists a "traffic light"
showing a certain kind of "colour signal."

Martin

[From Bruce Gregory 960319.1735]

Martin Taylor 960319 16:30

That was very clear. My confusion arose because I thought you were saying
something other than

  Nothing we perceive tells us about reality. What we perceive relates to
  what we have found it (evolutionarily or personally) effective to control.

Thanks for taking the extra step to clarify that.

Bruce Gregory