Memory in the model

[From Bill Powers (921218.1030)]

Allan Randall (921217.1614) --

Sorry, Allan. This network seems to have the property of putting
everyone into a state of permanent overload while trying to get
other things done, too.

The memory model in BCP was intended only to explore the role
that memory might have in behavior. The initial impetus was
simply the question, "How do we 'do the same thing again?'" This
becomes a problem when you realize that the world is never the
same twice in a row, so that to "do the same thing" we must
actually produce different actions. Also, how do you know you're
doing (accomplishing) the "same" thing? The only answer I can see
is that you remember what happened last time. It must be your
memory of the past perception that serves as the reference signal
for this time. So how does that memory get selected? And the
rest, I think, follows.

The idea of mental models must entail memory, as you suggest.
There is one version of the HPCT model that we have tossed around
for some time without really deciding about it, in which the
world we experience is ALWAYS the modeled world, PLUS whatever
error comes from lower systems. It works like this:

                            |
                         ref sig
                            |
             --percept-->COMPARE-->error--
            |                             |
            |                             |
       INPUT FUNC                    OUTPUT FUNC
            |                             |
            +<--------- MODEL <-----------+
            |             ^               |
            |             |               |
            |         correction          |
            |             |               |
            |             |            ref sig
            |             |               |
             \--error<----+----<--------COMP<--percept<----

The bottom system is mirror-reversed to avoid crossing signals.

The higher system controls a perception made of the sum of a
MODEL output and the lower-level error signal.

The basic idea is that when the higher system acts to control the
model, its output enters both the model and the lower systems (as
reference signals). If the lower systems succeed in bringing
their individual perceptual signals to the demanded reference level, they will
send no error signals to the higher system. That
means that the model works; the outputs that control it have the
correct effects when also used as reference signals for the lower
systems.

If the lower systems can't produce the perceptual signals
demanded of them, their error signals will enter the higher
system, being added to the sensed behavior of the model. So the
higher system will experience a perception of the model's
behavior, plus the error signals, and the result will be that the
perception is controlled correctly even though the model is not
correct.

However, the presence of the error signal shows that the model
must be revised. I have only indicated a "correction" based on
the lower error signal, without saying how it's brought about.
This correcting process will gradually alter the model until no
error signals are being sent upward from the lower systems. Then
the model will again be a sufficient representation of the lower-
level world.
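To pin down the arithmetic, here is a minimal sketch in Python, with
everything in it assumed purely for illustration: scalar signals, the
lower loop and its environment collapsed into a single gain, the model
reduced to one adjustable gain, and a delta-rule-style correction,
none of which the proposal itself specifies.

  # Minimal sketch of the revised loop. The higher system controls a
  # perception built from the MODEL's output plus the signal sent up
  # from below, and that same upgoing signal slowly revises the model.
  # Gains, names, and sign conventions are all illustrative assumptions.

  def run(steps, env_gain, model_gain, learn_rate=0.2, blank_after=None):
      # env_gain:    what the lower system and environment actually
      #              deliver per unit of reference sent down
      # model_gain:  the model's current estimate of that relationship
      # blank_after: step at which the upgoing signal is lost (None: never)
      ref = 10.0    # higher-level reference signal
      out = 0.0     # higher-level output, reused as the lower reference
      dt = 0.01
      for step in range(steps):
          percept_below = env_gain * out    # what the lower world reports
          model_out = model_gain * out      # what the model predicts
          # Upgoing signal: the part of the report the model missed.
          upgoing = percept_below - model_out
          if blank_after is not None and step >= blank_after:
              upgoing = 0.0                 # lower input unavailable
          higher_percept = model_out + upgoing   # perception = model + error
          error = ref - higher_percept
          out += 5.0 * error * dt           # integrating output function
          # Slow correction: a persistent upgoing signal revises the model.
          model_gain += learn_rate * upgoing * out * dt
      return out, model_gain, higher_percept

  # Wrong model (gain 1.0 against a true gain of 2.0): the perception is
  # still controlled correctly, and the model gain drifts toward 2.0.
  print(run(5000, env_gain=2.0, model_gain=1.0))

With these numbers the controlled perception settles on its reference
almost at once even though the model starts out wrong, while model_gain
creeps from 1.0 toward the true 2.0, the gradual alteration of the
model in miniature.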

Algebraically, this revised model is exactly equivalent to the
standard model in its performance. But there is at least one
important difference. If some of the higher-level inputs from
lower systems are lost, the higher system can continue to run on
the basis of the model alone. This is like a tracking experiment
in which the display is momentarily blanked out, or like a baby
following a toy train with its eyes as the train disappears into
a tunnel and reappears at the other end (as in Piaget's object-
permanence experiments). The model supplies perceptions that for the moment
are not available from the environment. Now, in this version,
that is because the model is providing the perceptions _all of
the time_, with error signals from lower systems being used for a
continual, but slow, update of the model.
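The tunnel case falls out of the same sketch: cut the upgoing signal
partway through the run, reusing the hypothetical run() above. With a
converged model nothing visible happens at the higher level; with a
wrong model whose correction is switched off, the higher system goes
on controlling a purely imagined perception and its output settles at
the wrong level.

  # Blank-out halfway through. With a correct model the higher loop is
  # unaffected. With a wrong, uncorrected model the imagined perception
  # still reads 10.0, but the output doubles, so the actual lower-level
  # consequence (env_gain * out = 20.0) is far from the intended 10.0.
  print(run(5000, env_gain=2.0, model_gain=2.0, blank_after=2500))
  print(run(5000, env_gain=2.0, model_gain=1.0, learn_rate=0.0,
            blank_after=2500))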

This is an intriguing possibility, because it says that the world
of experience is even less directly related to the environment
than under the old model. It says, in effect, that we're
imagining EVERYTHING, but that what we imagine is slowly being
corrected all the time to eliminate errors between our way of
imagining and the way the world actually works. The world we
experience would then literally be a model of the environment.

The model, of course, would be composed mainly of memories.

I'm still not seriously proposing this revision, because it has
to be checked out against experiment to see if it really adds
anything or is necessary. But it's good to keep in mind as a way
of handling some phenomena that the current model would have some
difficulty with.

As to the RNA model of memory, I don't have any investment in it.
Whatever the actual mechanisms, they have to account for the
things memory is required to do. Defining what memory is FOR can
be done without knowing how memory works.

-------------------------------------------------------------
Best,

Bill P.

[Martin Taylor 921218 15:30]
(Bill Powers 921218.1030)

I have a problem with the specific details of Bill's proposal, not
with the principle.

There is one version of the HPCT model that we have tossed around
for some time without really deciding about it, in which the
world we experience is ALWAYS the modeled world, PLUS whatever
error comes from lower systems. It works like this:

                            |
                         ref sig
                            |
             --percept-->COMPARE-->error--
            |                             |
            |                             |
       INPUT FUNC                    OUTPUT FUNC
            |                             |
            +<--------- MODEL <-----------+
            |             ^               |
            |             |               |
            |         correction          |
            |             |               |
            |             |            ref sig **R**
            |             |               |
             \--error<----+----<--------COMP<--percept<----
                                                **P**
                                   (P and R added by MMT)

The bottom system is mirror-reversed to avoid crossing signals.

The higher system controls a perception made of the sum of a
MODEL output and the lower-level error signal.

The basic idea is that when the higher system acts to control the
model, its output enters both the model and the lower systems (as
reference signals). If the lower systems succeed in bringing
their individual perceptual signals to the demanded reference level,
they will send no error signals to the higher system. That
means that the model works; the outputs that control it have the
correct effects when also used as reference signals for the lower
systems.

This seems to work when there is a one-to-one relationship between lower
and higher systems, but will it work with many-to-many? The reference
signal in the lower ECS is not the output from the higher, but a function
of a vector of such outputs from higher ECSs. So when the lower perceptual
signal matches its reference, the perceptual signal returned may well be
quite different from the output that is sent (Points R and P that I added
to Bill's diagram). Likewise, the perceptual signal in the higher ECS is
a function of a vector of signals like P from many lower ECSs.

If all the lower error signals are zero, then the higher one will presumably
also be zero, but the reverse is not true. The higher ECS can have a zero
error when none of the lower ones are satisfied. I'm not sure what this
would do to the correction aspect of the memory module. Should there be
a correction or not?
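The cancellation is easy to exhibit with two lower systems and a higher
perceptual function that simply sums them (the numbers are invented for
illustration):

  # Two lower ECSs, neither satisfied, yet their errors cancel in the
  # higher system's perceptual function (here just a sum).
  lower_refs = [6.0, 4.0]        # references handed down to the lower ECSs
  lower_percepts = [7.0, 3.0]    # what each lower ECS actually achieves
  lower_errors = [r - p for r, p in zip(lower_refs, lower_percepts)]
  higher_percept = sum(lower_percepts)
  print(lower_errors)    # [-1.0, 1.0]: both lower loops are in error
  print(higher_percept)  # 10.0: equal to a higher reference of 10, zero error

A correction driven by the higher error would never fire here, while
one driven by the lower errors would fire even though the higher system
is satisfied, which is exactly the ambiguity in question.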

And what evokes a specific memory? In the BCP diagram, we mentally inserted
"reference vector" as an addressing code for a scalar memory output--a
table lookup function. That seems reasonable, but we were not at all clear
how it would or could work. And in the context of the planning exercise
that I described the other day (shopping robots in the supermarket), we
could not see how to incorporate the real-world constraints presented
verbally and never before experienced by any ECS operating through a
real-world CEV. That question, also, still hangs in the air.
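For concreteness, the table-lookup reading might be sketched as below;
every name, and especially the quantization that turns a reference
vector into an address, is a hypothetical choice, not something BCP
commits to.

  # Hypothetical "reference vector as addressing code": quantize the
  # vector of reference signals into a discrete address, and store or
  # recall a remembered perceptual signal under that address.
  memory = {}

  def address(ref_vector, step=0.5):
      # Coarse quantization, so nearby reference vectors share a cell.
      return tuple(round(r / step) for r in ref_vector)

  def store(ref_vector, percept):
      memory[address(ref_vector)] = percept

  def recall(ref_vector):
      return memory.get(address(ref_vector))  # None if never experienced

  store([1.0, 2.0, 0.5], percept=3.7)
  print(recall([1.1, 1.9, 0.4]))  # 3.7: a nearby reference re-evokes it
  print(recall([5.0, 0.0, 0.0]))  # None: nothing stored at that address

A table like this also makes the difficulty with verbal constraints
visible: a constraint never before experienced corresponds to no
previously written cell, so a pure lookup has nothing to return.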

Basically, I like the concept that the ECS deals mostly with model predictions,
corrected by error. But I'm not sure what the error is, or what
the memory is. Is it a vector of sensory inputs, to be corrected
by the vector of incoming error signals (or, more probably, the
incoming perceptual signals, which would be more robust)? And
what evokes one memory rather than another?

Remember my point about the reduction of information rate as we go up the
hierarchy. It is this that gives play to predictive memory models.

Martin