[From Bill Powers (960515.1210 MDT)]
Peter Cariani (960515.1100) --
... multiplexing of periodicities is everywhere in the auditory
nerve once one thinks to look at the spike time patterns as signal
vehicles.
This isn't surprising, since the basic phenomenon of sound is periodic.
What are you saying that isn't also covered by saying that spike
patterns can be analyzed into superposed frequencies (i.e., Fourier
analysis)? Your subjective distinctions between "frequency" and
"interval" or "period" don't have any mathematical significance. The
only basis for the distinction is which representation yields the most
tractable computations. It's certainly not surprising that if you view
neural signals in terms of periodicities, you see periodicities
everywhere. You see what you're prepared to perceive.
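To make the point concrete: the "interval" and "frequency" descriptions of a periodic spike train are two readings of the same fact. In this sketch (a synthetic spike train with invented parameters, not a model of any real neuron), the mean inter-spike interval and the peak of the spectrum determine each other exactly:

```python
import numpy as np

# A spike train with a fixed firing period T: spikes at t = 0, T, 2T, ...
T = 0.004                      # 4 ms interval, i.e. 250 Hz (invented)
fs = 10000                     # sampling rate for the binary spike vector
dur = 1.0
spike_times = np.arange(0.0, dur, T)
signal = np.zeros(int(dur * fs))
signal[np.round(spike_times * fs).astype(int)] = 1.0

# "Interval" view: the mean inter-spike interval is T.
isi = np.diff(spike_times)
interval_estimate = isi.mean()

# "Frequency" view: the spectrum of the spike vector peaks at 1/T.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
spectrum[0] = 0.0              # ignore the DC component
frequency_estimate = freqs[np.argmax(spectrum)]

print(interval_estimate)       # ~0.004 s
print(frequency_estimate)      # 250.0 Hz
```

Either number can be computed from the other; which one you extract is a matter of computational convenience, not of mathematical substance.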
You say that multiplexing is an idea from theoretical biology. I believe
that you're mistaken. As far as I know, the term originally referred to
carrying several independent signals over a single channel -- for
example, on different radio frequencies. It's a term from electronics,
picked up by life scientists to use as a metaphor.
I think in each sensory system information pertaining to that
modality is combined in some way or another. Temporal multiplexing
is automatically set up by either 1) arrays of receptors that
follow the temporal structure of the signal (e.g. cochlear hair
cells, mechanical receptors in the skin) or 2) early lateral
inhibition systems that set up particular time patterns/sequences
of discharges depending upon the stimulus.
This seems an unnecessarily elaborate way of saying that (a) neural
signals generated by receptors have (if you like) repetition periods
that follow the periodicity of the local stimulus, and (b) a layer of
neural functions applied to a set of input signals creates a new set of
signals that depend on the input signals. I don't see anything in the
latter statement that indicates multiplexing. To show that multiplexing
occurs you would have to show that the input information was originally
separated into individual channels, and was then combined into a single
signal in which the original information channels are preserved. I don't
believe that you can do this. The original channels you imagine don't
have separate existence.
These lateral inhibition systems have a common organization (e.g.
olfactory bulb, retina; Szentágothai, Shepherd), and one sees them
for stimuli that do not have temporal structure that the
receptors can follow in their fine structure (e.g. chemical stimuli,
light). Even in vision, if you drift an image across a retina, the
retinal ganglion cells (and many cells in primary visual cortex)
will "lock" to the temporal structure of the contrast gradients
(edges) as they are presented at their corresponding places on the
retina. So, the idea of temporal multiplexing in sensory systems is
very natural if one thinks of the time patterns as a very
phylogenetically-primitive way of encoding stimuli.
This is far too complicated for my taste. What you're describing are
simply what I call input functions. The outputs of these circuits create
signals with measures that depend continuously on some aspect of the set
of raw sensory signals that enter them. These second-order signals
naturally vary with time, representing changes in those aspects. An
"aspect" is defined by the form of the input function.
When you say that sensory information is "combined in some way or
another," you are tacitly asserting that it was originally separated. If
it wasn't separated, it would not need to be combined. But if it wasn't
originally separated, then when hair cells generate temporally-varying
signals that follow the variations of the input stimulus, the
information in the stimulus for one hair-cell is still not separated; it
is still combined in a neural representation of the stimulus (this is
what you seem to mean by "multiplexing"). Of course the acoustics of the
cochlea have already created separate stimuli out of the single unitary
stimulus that entered the ear; a specific structure has been imposed on
the original data by the physical properties of the spiral organ. But
each neuron's signal is simply a continuous function of the stimulus
acting on it.
In the auditory nerve, every stimulus component below 5kHz produces
its own periodicities in the neural output, so these do start out
as "separate signals".
Nonsense, Peter; you know better than that. How do you get "stimulus
components" out of the stimulus? You apply a Fourier analysis, or
something equivalent, to some measure of the stimulus. You _create_ the
components by putting the original raw measures through a mathematical
input function. Then you apply the same analysis to the signal
representing the stimulus, and of course (if the signal is a
quantitative analog of the stimulus) you come out with the same
components. The components are in the eye of the beholder. If you
applied a different analysis to the stimulus, you would get different
components, both in the stimulus and in the signal that results.
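This is easy to demonstrate. In the sketch below (the signal and the two bases are arbitrary choices made for illustration), the same waveform decomposed in a sinusoidal (Fourier) basis yields one component, while decomposed in a square-wave (Walsh) basis it yields several:

```python
import numpy as np

def hadamard(n):
    """Walsh-Hadamard matrix of order n (n a power of two): a basis of
    square waves, as an alternative to the sinusoidal Fourier basis."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)       # orthonormal rows

n = 8
t = np.arange(n)
signal = np.sin(2 * np.pi * t / n)      # one cycle of a sine wave

# "Components" under two different analyses of the same signal:
fourier_components = np.abs(np.fft.rfft(signal)) / n
walsh_components = hadamard(n) @ signal

# The Fourier analysis reports a single nonzero component; the Walsh
# analysis spreads the same signal over several square-wave components.
```

Which set of components is "really there"? Both and neither: each set exists only relative to the analysis that produced it.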
I've thought much more about temporal codes and about possible
neural architectures for processing them than the vast majority of
computational neuroscientists ...
This could mean that you know more about this subject than anyone else,
or it could mean that you're obsessed with the idea of temporal codes
and insist on applying this idea even when it's not appropriate. Logic
is slave to premises, and I don't think you have developed your premises
to the point where you can justify this single-minded approach.
One can also argue the other way, that many, many stimulus
properties affect the discharge rate of each neuron, so that
different kinds of stimulus information (e.g. frequency, intensity,
location in auditory space, amplitude dynamics, etc.) are in effect
"multiplexed" in the firing rate of each neuron (this is what we
observe in the auditory system), and that the higher centers face a
very complicated (and perhaps impossible) task of simultaneously
disambiguating all of this multiplexed information. Add to this
multiple perceptual objects (sounds, visual forms) and the problem
gets much, much harder (how does one decide which neural rates
should be included with each object?)
I agree that the way you present this view makes the problem
intractable. If frequency, location in auditory space, amplitude
dynamics, visual forms, objects, and all other experiential attributes
(like "sonority" and "beauty") really exist separately in the
environment, and are multiplexed together into a single signal, the
"disambiguation" problem is immense. This is why I gave up, 40 years
ago, on the idea that neural signals could carry markers indicating what
they mean: that somehow information about "objects", for example,
entered the nervous system at the first level and was somehow preserved
to be identified and plucked out at a later stage of processing. This is
not a viable view of perception, in my opinion.
I prefer to start very simply and not to try to impose general
principles before their time. Sound waves cause hair cells to swing back
and forth. As they move, they stimulate sensory endings which produce
trains of impulses. The simplest direct conversion from the pressure
generated by the hair cell on the sensor is from magnitude of force to
frequency of firing. You could also say that as the pressure increases,
the interval between spikes decreases, with the interval going to
infinity at zero sound level. This gives us the first level of
perception, an "intensity" or "magnitude" level in which the neural
signal indicates only the amount of stimulation. The set of all such
sensory signals constitutes the only world that any higher systems can
experience.
Since the hair-cell swings periodically, the basic neural signal
representing sound (at least for the lower frequencies) varies
periodically in its spike rate or interval. Other sensory signals
originating in other modalities do not vary periodically under ordinary
circumstances; they simply represent the intensity of the stimulus,
however the stimulus varies. This representation can have dynamic
aspects (rate-of-change emphasis), but these factors simply define what
amounts to "the stimulus."
I see the problem facing the higher parts of the brain (and in a larger
view, the species) as that of constructing a consistent world out of the
information present at the intensity level of representation. The world
that is constructed depends on the way the intensity signals are
combined in neural computing functions to produce new signals. Each new
layer imposes a new kind of order on information received from existing
lower layers. There is no question of simply "recognizing" information
that is already separately defined in the neural signals. Nothing is
defined until there is a neural input function to define it. And there
is no single best way to define a new signal; there are infinitely many
ways, and it is quite possible that no two organisms employ exactly the
same input functions. One of the age-old philosophical questions has
been, "How do we know that my experience of red is the same as your
experience of red?" The answer is most likely that my experience of
red differs from yours.
In your approach, you tacitly assume the reality of each aspect of the
auditory world that you, personally, experience. You speak as if such
attributes as period or frequency had an objective existence outside the
nervous system (as, of course, it seems to a human being that it does).
This, of course, raises the question as to how the attribute of
frequency or interval could become known to the nervous system; somehow
the information representing it must exist in the primary auditory
signal, where it is carried along from station to station until it
finally reaches a place where a station is prepared to respond to it in
terms of pitch. But the same auditory signal must also contain
information about loudness, timbre, phonetic forms, strings of phonetic
forms, musical intervals, harmony, and all other aspects of auditory
experience. This means that a neural signal must be coded to carry many
channels of information simultaneously, as many channels as there are
different kinds of information to be extracted from the signal. We
aren't talking about just two or three channels; we're talking about
hundreds, perhaps thousands. How many different spoken words must be
carried in those channels? How many different melodies? How many
different qualities of sounds from musical instruments played in
different ways?
I think it is far more reasonable to say that in the varying primary
auditory signal (however you measure it) there is a representation of
the varying intensity of stimulation of the receptors -- and that is
all. In the set of all such signals, there is apparently a potential for
order that can be realized by neural networks in many different ways --
not only the ways that we know about, but an infinity of other ways as
well. Of course we find it perfectly natural -- indeed, inescapable --
that one of the attributes to be found in the basic signal is something
called frequency, or inversely, interval. But before that aspect of the
signal can become real in the nervous system, it must be represented by
a signal. There must be something that can respond in a way that
distinguishes frequencies, and the most likely response is that of
generating a new signal representing frequency (since not every sensory
signal is directly connected to a muscle). And the simplest, most direct
way of doing that is with a tuned neural circuit.
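As a sketch of what such a tuned circuit accomplishes (the correlation with a reference sinusoid below is a computational stand-in, not a claim about actual neural wiring; the frequencies and durations are invented), each unit in a small bank responds most strongly when the input matches its preferred frequency, and its output is a new signal representing "frequency":

```python
import numpy as np

def tuned_unit_output(signal, fs, f0):
    """Output of one 'tuned' unit: magnitude of the signal's correlation
    with a sinusoid at the unit's preferred frequency f0 (a stand-in for
    a tuned neural circuit)."""
    t = np.arange(signal.size) / fs
    return abs(np.sum(signal * np.exp(-2j * np.pi * f0 * t))) / signal.size

fs = 8000
t = np.arange(int(0.1 * fs)) / fs
tone = np.sin(2 * np.pi * 440.0 * t)          # a 440 Hz input

# A small bank of tuned units; the one matching the input frequency
# emits the strongest new signal -- and that signal, not the raw
# waveform, is the representation of "frequency" at the next level.
bank = [220.0, 440.0, 880.0]
outputs = [tuned_unit_output(tone, fs, f) for f in bank]
best = bank[int(np.argmax(outputs))]          # 440.0
```

Until some such circuit exists and emits its signal, "frequency" is not represented anywhere in the system.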
I'm reluctant to always be "passing the buck to higher centers"
that are capable of arbitrarily-complex pattern recognitions.
There is no need for arbitrarily-complex pattern recognition. I could
present patterns to you that you would never recognize in a million
years. How about a simple vertical grid with spacings that correspond to
the successive digits of pi, or even the digits in my Visa card number?
We have to account only for those patterns that we DO recognize.
I can see simple ways of using time-structure to obviate the need
to do these complicated operations (interval representations for
pitch do this and there are simple ways of doing "auditory scene
analysis" in the time domain). This simplifies the problems that
higher centers face, and I think it brings the problem of form
perception back into the realm of tractability.
My hierarchical scheme has exactly the same aim: that of breaking down
the problem of high-level perception into stages, such that each stage
has to deal only with a simplified world. The "analyses" of which you
speak correspond to what I call perceptual functions.
There may be very elegant ways of extracting perceptual order in
the time domain (we need to investigate them).
And once that extraction is done, how is the result represented? In what
physical form does the result exist?
The percepts are informational distinctions, discriminations that
the perceiver makes.
Yes. What is the mechanism for making these distinctions, and when they
are made, in what physical form do the distinctions exist?
For this reason I don't talk in terms of "perceptual signals", but
rather signals underlying a particular percept. But I know what you
mean.
I'm not sure you do. In what physical form does a "percept" exist? I
agree that there are signals underlying a percept; I call them
perceptual signals of lower order. In my system, the percept is simply
the signal of higher order emitted by the perceptual function that
receives the underlying signals and creates a new signal that is a
function of them. What does "the percept" mean in your system?
[The Cariani system] is similar to your system, except that all
subsequent recipients have access to <all> aspects of the signal.
This means that a lower level processor need not determine in
advance what might be relevant to a higher level one.
This prevents the simplifications that make higher-level pattern
recognition feasible. Think of it this way: the higher systems construct
patterns out of the signals that the lower systems present to them. The
higher patterns do not exist first, requiring a search for information
to support them. They are simply whatever patterns can be made from the
signals that are available. If certain signals are not available -- you
can't detect straight lines -- then certain patterns are unobservable --
you can't detect quadrilaterals.
I do not assume that the world as represented at any level was sorted
out into "messages" before entering the nervous system.
I don't assume that either. Why would you think this?
Because you keep talking about "multiplexing" as if there were separate
channels of information being carried in the signal, and as if each
channel corresponded to some separate attribute of the stimulus.
No partitioning of the world exists prior to perception (except by
some other external observer).
Then why do you speak of "frequency" or "interval" or "loudness" or
"pitch" or "timbre" as if they were separate aspects of the stimulus?
"Multiplexing" entails nothing of the sort; it just means that
different aspects of the external world can be embedded in
different aspects of the neural signals that are caused by the
interaction of receptors with that world. Nothing more, nothing
less.
You see? You assume separate aspects of the external world that "become
embedded in" the neural signal, as if they had separate existence prior
to and independently of their neural representation. This is a specific
epistemological position, and I am arguing against it.
----------------------------------------------------------------------
Best,
Bill P.