top-level

Here's a new topic, related to re-organisation.

  I started wondering recently how a top-level Elementary
Control System (ECS) can remain connected to reality.

  To explain: let us take a high-level ECS in a Control Net.
(A high-level node is many levels from the raw input and output.)
Assume the net starts out untrained (or only partially trained)
for its environment. Finally, we assume that random re-organisation
is a major part of its training.
  Now the high-level ECS doesn't know what its inputs or
its reference mean. All it must do is act so that they match.
It may initially be set up with an input of (target-position -
finger-position) and a reference of (0). However, after a few random
re-organisations the input weight for "target-position" may have been
set to zero, and the input weight for "elbow-angle" raised to a
positive value.
  This leaves the ECS trained to control "elbow-angle +
finger-position" = 0. There is no way for the ECS (or for the random
re-organisation) to know that this new function is nonsense.
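  To make the drift concrete, here is a minimal sketch (in Python;
the signal names and values are illustrative only) of an ECS whose
perceptual function is a weighted sum of its inputs, before and after
one random re-organisation step:

```python
def perception(weights, inputs):
    # Perceptual function of an ECS: a weighted sum of its inputs.
    return sum(weights[name] * inputs[name] for name in weights)

# Hypothetical signals (names and values are illustrative only).
inputs = {"target_position": 3.0, "finger_position": 1.0, "elbow_angle": 0.7}

# Intended set-up: perceive (target_position - finger_position), reference 0.
weights = {"target_position": 1.0, "finger_position": -1.0, "elbow_angle": 0.0}
error = 0.0 - perception(weights, inputs)  # the ECS acts to drive this to 0

# One random re-organisation step: zero one weight, raise another.
weights["target_position"] = 0.0
weights["elbow_angle"] = 1.0

# The ECS still dutifully controls "perception = 0", but the perception
# is now a meaningless mix of elbow_angle and finger_position.
drifted_error = 0.0 - perception(weights, inputs)
```

Nothing inside the loop can tell the two cases apart; the error signal
looks equally legitimate either way.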

  In general it seems impossible to keep the input "relevant"
to the reference without forcing it in some fashion (and thus adding
another set of properties to the ECS).

  One approach to "forcing it" is found in our Little Baby (a
learning version of the Little Man). As in the Little Man the high-level
references involve the distance of the finger from the target
(as perceived in the right and left retinas).
  The Baby has one (or more) layers of ECSs attached to the outputs
of its high-level ECSs. However, the high-level ECSs' inputs are
connected directly to the Baby's raw inputs, bypassing those lower
layers.

                    R*  <- top-level reference
                    |
                 --------
                 | ECSs |  <- top-level ECS
                 --------
                  /    \
                 |      \
     direct  -->|     --------
     input      |     | ECSs |  <- untrained ECSs
     to top     |     --------
                |      /    \
     ...................................
                 \    /      \
                Input        Output  <- environment

  The Little Baby is forced to learn to follow the target by
being provided with a fixed input function.
The Complex Environmental Variable (CEV) that the Baby is controlling
cannot be unlearned; by the same token, it can never be learned either.
This is a reasonable hack while we experiment
with re-organisation, but in the long run we can't always
hand-code/hard-code the inputs.
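  For reference, the hand-coded input function amounts to something
like the sketch below (the 2-D retinal-coordinate details are my
assumption, not the Baby's actual wiring):

```python
def fixed_distance_perception(finger_xy, target_xy):
    # Hard-coded top-level perception for the Little Baby: the distance
    # between finger and target as projected on a retina. Because this
    # function is wired in, re-organisation cannot corrupt it --
    # but, equally, it can never be learned.
    dx = finger_xy[0] - target_xy[0]
    dy = finger_xy[1] - target_xy[1]
    return (dx * dx + dy * dy) ** 0.5
```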

  Bill seems to have seen the problem too, since he has suggested
that the learning mechanism should not be completely blind. He wants
it to contain some simple CEVs (an oxymoron :?) which guide the
re-organisation. This may be necessary, but it also feels like a hack
to have a separate control hierarchy just for learning.

  I have a partial solution that does not add new variables or
structure to the existing hierarchy. Unfortunately the CEV in the
example is different from "finger on target".

  Suppose we have Little Baby (Mark MCXLI) that can successfully
learn to control (i.e. we have solved some of the re-organisation
problem).
  Now we wish to teach it to avoid a spot in its environment (say
the exact center of its cube).

  We add an extra input (called Pain). We change the environment
so that Pain becomes large if the finger is close to the center of
the cube, but is very small elsewhere. (We now have a hot-spot.)
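  As a sketch, the changed environment might compute Pain like this
(the radius and magnitudes are arbitrary choices of mine):

```python
def pain(finger_xyz, center=(0.0, 0.0, 0.0), radius=0.1, peak=10.0):
    # Pain is large when the finger is within `radius` of the cube's
    # center (the hot-spot) and very small everywhere else.
    dist = sum((f - c) ** 2 for f, c in zip(finger_xyz, center)) ** 0.5
    return peak if dist < radius else 0.01
```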

  We also add a simple ECS that has Pain as input, zero as
reference, a large gain, and outputs to the arm muscles. This does
nothing while the finger is outside the hot-spot. If the Baby moves
the finger into the hot-spot this ECS will quickly yank it out, and will
then resume doing nothing. We have given Little Baby a pain reflex.
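  The reflex itself is just a one-line proportional controller. A
sketch (the gain value is an assumption for illustration):

```python
def pain_reflex_output(pain, reference=0.0, gain=50.0):
    # Pain-reflex ECS: input = Pain, reference = 0, large gain.
    # Its output drives the arm muscles away from the hot-spot.
    return gain * (reference - pain)

# Outside the hot-spot Pain is near zero, so the output is near zero
# and the reflex does nothing. Inside it, the large error times the
# large gain yanks the arm out.
```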

  The Baby now avoids the hot-spot very effectively; however,
it will have trouble moving the finger to the target in some cases
(assume for the moment that we don't move the target into the
hot-spot). If a trajectory passes through the hot-spot the arm will
jump. Some target locations will even leave the Baby caught in a cycle.

  The rest of the Baby will presumably eventually re-organise
to avoid approaching the hot-spot. There are several strategies that
will succeed, and the one chosen depends on the learning mechanism.

                 R*  <- top-level reference
                 |
              --------
              | ECSs |  <- multi-level CS (Control System)
              --------
               /    \
              /      \        R*  <- another top-level reference
             /        \       |
        --------   --------   -------
        | ECSs |   | ECSs |   | ECS |  <- pain reflex ECS
        --------   --------   -------
         /    \     /    \     /    \
      .......................................
          \     \   /      \  Pain   /
           \     \ /        \  |    /        Environment
         Inputs -------------- Outputs

  So why don't I consider this a cheat too? After all, we have
hand-coded an ECS to perform a function. Well, we haven't had to
add a separate learning hierarchy (as per Bill), or had to wire across
levels (as in the current Little Baby).

  Below are the reasons I think we don't have to add any
new features to "force" the Baby to learn the task.

Simplicity:
  The pain-reflex is easy to learn by simple means (such as
genetic algorithms or random search). We shouldn't need to hand-code
such control functions.

Effectiveness:
  The pain-reflex is very effective at avoiding the hot-spot.
This is accomplished solely by setting the gain high on a simple task.

Stability:
  The pain reflex is stable against random re-organisation.
Since it is "effective" it very seldom has a non-zero error.
(Persistent high local error should probably trigger re-organisation.)
Since it is "simple" it has very few weights. This makes it a small
target for a random mutation (compared to the rest of the net).
  Lastly, it is high gain. If there is a random change to one of its
inputs or outputs the Baby will thrash wildly. The strong accumulation
of local error should quickly cause a benign mutation.
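  The "small target" point is just arithmetic. A back-of-envelope
sketch (all the counts here are invented for illustration):

```python
# If re-organisation mutates one weight chosen uniformly at random,
# a reflex with 2 weights in a net of 2000 is disturbed only rarely.
reflex_weights = 2
total_weights = 2000
p_hit = reflex_weights / total_weights  # chance a mutation lands on the reflex
```

With these (made-up) numbers the reflex absorbs one mutation in a
thousand; the rest of the net takes the remainder.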

  Since the new top-level goal is quite stable the rest of the
Little Baby's brain is forced to re-learn.

  Now for the proverb. I have realized that one of my original
assumptions was wrong. When I first learned PCT I assumed that
all the top-level ECSs (ones with fixed references) were also
high-level ECSs (far from the environment).
  I now suspect that *most* of the top-level goals of an
organism are fairly close to the I/O level, and that most of the
high-level ECSs are just used to add efficiency to the satisfaction of
these low-level goals.

  Top-level goals need not be high-level goals.

          ... Jeff
--
De apibus semper dubitandum est - Winni Ille Pu