[Martin Taylor 960319 10:00]
Bill Powers (960318.2030 MST)
> I don't think we're on the same wavelength here.
Agreed. You describe as "redundancy" what I take to be a consequence
of using redundancy.
> I really don't get a sense of what you mean by redundancy, in terms of
> probability distributions. And I'm spending far too much time trying to
> grasp something that is probably over my head. Let's let it go.
I don't think that's a good idea. Most of the unresolved arguments we have
had over the years, as well as your argument with Hans, have this issue
as an element. If I could put it in the form of an aphorism: "Not everything
that can happen does happen." And it's a good idea for a control system to
take care of what does happen. If it can also take care of what could happen,
at no cost, that's gravy.
Let's at least give it a bit more of a shot. I realize you are going "on tour"
and won't be able to respond immediately, but here goes, anyway.
Firstly, let's consider your most recent comments.
> At the level of
> intensity perception, the world _is_ made of fluctuating points of
> light, bandwidth-limited by the response of the retina and optic nerve.
Is it? I'd say, rather, "the world _could be_ made of ..."
Let's go even more peripheral than that, and talk about the retinal cones
and rods. Each is _capable_ of responding independently of what any of the
others is doing (or so I shall assume here, because I'm not sure it is
physiologically correct). A super-physiologist with a super recording
device could record the outputs from every rod and cone, and compare
them. This super-physiologist would very soon find that the outputs of the
different rods and cones were _not_ independent. If he looked at two
cones near each other--call them A and B--he would find that sometimes
they gave wildly different outputs, but most of the time their outputs
were quite similar. He would find this to be true for every pair of cones,
the more so the closer the pair. (Let's forget about rods from here on,
since the argument is the same with or without them.)
So our super physiologist would soon come to the conclusion that although
the cone outputs were capable of taking on mutually independent values,
the world is arranged so that they don't. That is another way of saying
that the patterns of outputs from the retinal cones are "redundant."
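To put a number on what the super-physiologist would see (a toy sketch of my own, not real physiology): simulate two neighbouring "cones" A and B whose outputs usually agree. If they were independent, the joint entropy H(A,B) would equal H(A) + H(B); the shortfall is exactly the redundancy he observes.

```python
# Toy sketch (mine, not physiological data): cone B copies cone A most of
# the time, so the pair carries fewer bits than two independent cones would.
import math
import random
from collections import Counter

random.seed(1)

def entropy(samples):
    """Shannon entropy, in bits, of a list of discrete symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A takes four output levels uniformly; B copies A 90% of the time.
a = [random.randrange(4) for _ in range(20000)]
b = [x if random.random() < 0.9 else random.randrange(4) for x in a]

h_a, h_b = entropy(a), entropy(b)
h_ab = entropy(list(zip(a, b)))
shared = h_a + h_b - h_ab   # mutual information: the redundant bits
```

With these made-up numbers the pair shares on the order of 1.5 of its 4 possible bits; the closer the cones, the larger that shared fraction would be.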
> There are no objects at that level.
Correct. Our super-physiologist doesn't say there are. But what he does
say is that neighbour cones A, B, C, D,... X often have much the same
value, and that when they don't, the differences are almost always between
a small number of subgroups of differing near-uniform values. Since he
observes this to be true, he recognizes that he can save on storage for his
data by not recording the output of every cone individually. Instead, he can
record the overall average output for all the cones, describe the
changing shapes of the regions of more or less uniform output level, and
record the deviation from average of the level of each subregion. To get
more detail, he might divide each subregion further and repeat the strategy,
or he might note that there are gradations of output across a subregion
and note the spatial derivative within the subregions, or....or...
The result of the super-physiologist's strategy is that his storage
requirements are dramatically reduced. He has taken advantage of the
redundancy he has observed in the sensor output. What he has lost is the
ability to detect a change in the character of the world being looked at.
If the person whose eye was being studied suddenly started seeing a world
in which the cone outputs did fluctuate independently and rapidly, our
super-physiologist would never know it, once he changed his method of
recording the data.
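The physiologist's bookkeeping can be caricatured in a few lines (the numbers are mine, purely for illustration): a one-dimensional "retina" whose output is near-uniform within regions is stored as one overall mean plus one deviation per region, instead of one value per sensor.

```python
# Toy version of the storage-saving strategy (my numbers, not data):
# 100 sensors falling into five near-uniform regions are recorded as
# 1 mean + 5 regional deviations = 6 numbers, losing only the small
# within-region fluctuations.
import random

random.seed(0)

levels = [0.2, 0.8, 0.5, 0.9, 0.3]           # five regions of near-uniform output
region = 20                                   # sensors per region
sensors = [lv + random.uniform(-0.02, 0.02)   # small within-region fluctuation
           for lv in levels for _ in range(region)]

mean = sum(sensors) / len(sensors)
# one deviation-from-mean per region, instead of 100 raw values
offsets = [sum(sensors[i * region:(i + 1) * region]) / region - mean
           for i in range(len(levels))]

recon = [mean + off for off in offsets for _ in range(region)]
stored = 1 + len(offsets)                     # numbers actually kept
error = max(abs(s - r) for s, r in zip(sensors, recon))
```

And this is also where the loss shows up: if the world changed so that the sensors within a "region" began fluctuating independently, `error` would blow up and the six stored numbers would say nothing about it.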
Now let's shift viewpoint, and instead of a "super-physiologist" think of
the optic nerve. The optic nerve has about a million fibres for each eye,
but there are about 100 million rods and cones. The optic nerve is serving
the same purpose as the super-physiologist's data-collection machinery. And
it does not transmit the independent output of each rod and cone. (Near
the centre of the visual field it probably does transmit each cone output
independently, but overall it falls short by a factor of 100). We could
not see properly a world in which the rod and cone outputs fluctuated
rapidly and independently. Does the "real world" behave this way? We
can't know by looking.
We can't know by looking, but we can say that if the world did fluctuate
in this way, it does not do so to an extent that threatens our survival,
or the survival of any other species with eyes (whether they be fly's eyes
or monolenticular eyes like ours). With reasonable confidence, we can
assert that the world--the real world that impacts us, not the world
of perception--is constructed so that our visual inputs are redundant.
(Visual inputs = photon rates at the sensors, to use our current metaphor).
> But when those
> inputs get passed to the configuration level, where far fewer objects
> can be recognized than independent intensities, different sets of
> intensity signals lead to perception of the same object.
Yes. This is possible in a useful way only when the intensity signal patterns
are redundant. If they were not, there would be no _useful_ selection of
configurations to perceive, out of all the configurations of intensities
that are conceivable. Any configuration would occur as often as any other
in the actual intensity patterns, and if the configuration level failed
to see any one of them, that failure might at some point prove fatal. Over
evolutionary time, however, organisms that often missed configurations
that "should" have been controlled were less likely to produce offspring
than those that could control the important configurations. And here
I mean specifically the real-world correlates of the configuration perceptions,
since it is what happens in the real world that kills.
Some configurations, if they occur at all, are very rare, whereas others
are very common. You see far more blue patches with sharp edges, for example,
than you see specific combinations of blue, green, purple, red, and white
speckles covering the visual field. If you are going to develop (over
evolutionary time) perceptual functions for a limited number of configurations,
you will be better off if that limited number is in the set of configurations
that occur. Our super-physiologist does the same when he notes that some
configurations occur and others don't, and records what happens by storing
the name of each configuration from his "it happens" list.
He will be wrong in his reconstruction of the original sensor data if
the world changes so that a "never happens" configuration actually occurs.
And so will we, if the world gets into a kind of configuration our ancestors
never experienced.
> So relative to
> the object level, these equivalent sets of intensities are redundant
> because they lead to the same perception at the higher level.
It's the other way round--rather like the difference between PCT and S-R.
Because the intensity patterns are redundant, therefore it is possible
_usefully_ to perceive objects and configurations. The capabilities of
the system at each level are to respond to a wider variety of patterns
than the world actually provides. The next level takes advantage of this
to discard what almost never happens.
> Redundancy always is the relation between what could happen and what does
> happen in a given system. If some possibilities don't happen as often as
> others, the system is redundant.
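That relation between "could happen" and "does happen" is exactly what Shannon's measure captures (this is my gloss, not a quotation): redundancy is how far the actual distribution of outcomes falls short of the "everything happens equally often" extreme, R = 1 - H(actual)/H(max).

```python
# Shannon redundancy of a discrete distribution (my illustration):
# R = 0 when every outcome is equally likely, R -> 1 as the distribution
# concentrates on a few outcomes.
import math

def redundancy(probs):
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return 1 - h / math.log2(len(probs))

uniform = redundancy([0.25, 0.25, 0.25, 0.25])   # nothing redundant
skewed = redundancy([0.9, 0.05, 0.03, 0.02])     # most outcomes rare
```

The uniform case gives R = 0, and the skewed case (most possibilities rarely happening) gives an R well above one half.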
Only in unusual situations does everything that could happen, happen with
equal probability. In any of those unusual situations, one cannot talk of
"structure." If you have a white noise signal, you are quite at liberty
to pass it through a narrow-band filter and produce a reasonably steady
tone, but you can do the same with any filter you choose. It says nothing
about what's in the world (the signal source).
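To make the white-noise point concrete (a sketch of mine; the filter and its tunings are arbitrary): feed the same white noise through two different narrow-band resonators, and each produces a quasi-steady tone at its own frequency. The tone tells you about the filter, not about the source.

```python
# My sketch of the white-noise remark: a second-order resonator driven by
# white noise rings at the resonator's frequency, whichever one we pick.
import math
import random

random.seed(2)

def resonator(noise, freq, r=0.99):
    """Narrow-band IIR filter with poles at r * exp(+-2*pi*i*freq)."""
    a1, a2 = 2 * r * math.cos(2 * math.pi * freq), -r * r
    y = [0.0, 0.0]
    for x in noise:
        y.append(x + a1 * y[-1] + a2 * y[-2])
    return y[2:]

noise = [random.gauss(0, 1) for _ in range(5000)]

corrs = {}
for freq in (0.05, 0.20):                 # two arbitrary tunings (cycles/sample)
    out = resonator(noise, freq)
    period = round(1 / freq)
    n = len(out) - period
    # autocorrelation one full period apart: near 1 for a steady tone
    corrs[freq] = (sum(out[i] * out[i + period] for i in range(n))
                   / sum(v * v for v in out[:n]))
```

Both outputs correlate strongly with themselves one period later, whatever tuning was chosen; the structure in the output came from the filter, since the input had none.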
It is similar with the complex world in which we live. If all configurations
were equiprobable, we could be built with any configuration perceptual
functions whatever, but we would die if we had the wrong ones. But we survive,
as did our ancestors, with equipment that limits us to seeing certain
configurations, and see them we do; which suggests that those configurations
actually occur in the real world more often and more importantly
than do the configurations we are not equipped to see (given our sensor
equipment).
> I really don't get a sense of what you mean by redundancy, in terms of
> probability distributions.
I haven't used probability distributions in the above. I've tried to give
a sense of why redundancy is considered at the input, and the consequence
of redundancy is the ability to combine many possibilities into one
actuality. In the wasp-waisted perceptron that started all this off,
the point was that the wasp waist prohibits the perceptron from reproducing
at its output all possible input patterns, but it allows the perceptron
to reproduce all the input patterns that actually occur (unless the world
to which it is exposed changes character after it has been fully trained).
In the example I used, the perceptron was trained on black-and-white stripes
of various widths and orientations. Its code at the waist was assumed to
record the width and orientation of the stripes and nothing else (two
units having continuous output values would do it). At the output, it
will produce stripes and nothing else, no matter what the input. If the
input continues to be stripes, the output will be exact, no matter how
many input sensors there are. But any other pattern will also be interpreted
as stripes, wrongly. The "new" world would then have a different redundancy
from the old, and the perceptron would have to be retrained for this new
world.
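For concreteness, here is a hand-built caricature of that stripe-trained perceptron (my construction; instead of training, the two-unit waist is simply hard-wired to hold the width and phase of a one-dimensional black-and-white stripe):

```python
# Caricature of the wasp-waisted perceptron (my construction, not a trained
# network): the waist holds only (width, phase) of a 1-D stripe, so the
# output layer can produce stripes and nothing else.
N = 32   # number of input "sensors"

def stripe(width, phase):
    """A 1-D stripe: alternating runs of `width` zeros and ones, shifted by `phase`."""
    return [((i + phase) // width) % 2 for i in range(N)]

def encode(pattern):
    """The waist: the (width, phase) of the nearest stripe pattern."""
    candidates = ((w, p) for w in range(1, N) for p in range(2 * w))
    return min(candidates,
               key=lambda wp: sum(a != b for a, b in zip(pattern, stripe(*wp))))

def decode(code):
    """The output layer: reconstructs a stripe, whatever the input was."""
    return stripe(*code)

s = stripe(4, 1)
assert decode(encode(s)) == s              # any genuine stripe comes back exactly

speckle = [1, 0, 0, 1, 1, 1, 0, 1] * 4     # not a stripe pattern
assert decode(encode(speckle)) != speckle  # reproduced as a stripe, wrongly
```

The two asserts are the whole story: as long as the world supplies stripes, the two-number waist reproduces the input exactly no matter how many sensors there are; the moment the world supplies anything else, it is forced into the nearest stripe and the reconstruction is wrong.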
There's an interesting article on this "wasp-waist" problem, taking a
slightly different viewpoint, in Science, 29 Sept 1995, p 1860-63;
"Replicator Neural Networks for Universal Optimal Source Encoding" by
Robert Hecht-Nielsen.
Have a good trip.
Martin