Little Baby & Reorganization

[From Bill Powers (920818.2300)]

Jeff (910818) --

You are running into the same conceptual problems that eventually led me to
propose the reorganizing system as I did.

In this diagram:

                      R*  <- top-level reference
                       |
                    --------
                    | ECSs |  <- top-level ECS
                    --------
                     /    \
                    |    --------
       direct ->    |    | ECSs |  <- untrained ECSs
       input        |    --------
       to top       |     /    \
                    |    /      \
       ...................................
                     \  /        \
                    Input      Output  <- environment

... you have both levels of system controlling the same input variable. If
you simply connected the output of the higher system to the output below
the line, you'd have a control system.

In effect, you're trying to introduce a teacher that already knows how to
control the input, and it's teaching another system to do it the same way.
This is the problem that all modelers of self-organizing systems eventually
have to face. The problem is that reorganization is goal-directed: it must
aim to build an organization that is good for the organism. Its goals,
therefore, must already take into account the "value" (to the organism) of
the variables it senses, and its actions must build behaving systems that
control the environment in a way that satisfies these values. But the
untrained ECS is not yet a functioning control system, so it doesn't know
what to sense, or how
any particular output will affect what it senses. It doesn't know what a
"good" target setting of the sensed variable would be. So it seems that the
reorganizing system already has to be able to control the same sensed
variable that the untrained system is going to be controlling, and it must
know the reference level that is best, and it must know what outputs are
needed to effect control. If that's true, why doesn't the higher system
simply actuate the outputs to control the variable? If it already knows
what to do, why doesn't it just do it?
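
If the wiring is made explicit, the point is easy to see. Here is a minimal
sketch (the gains, the integrating outputs, and the one-variable environment
are placeholder assumptions of mine): once the higher system's output is
simply wired in as the lower system's reference, the pair already controls
the variable, and there is nothing left for a teacher to teach.

    def two_level(steps=500, dt=0.1, r_top=10.0, g_top=2.0, g_low=5.0):
        x = 0.0       # the environmental variable both levels sense
        r_low = 0.0   # lower reference, set by the higher system's output
        o_low = 0.0   # lower output, acting on the environment
        for _ in range(steps):
            r_low += dt * g_top * (r_top - x)  # top ECS integrates its error
            o_low += dt * g_low * (r_low - x)  # lower ECS integrates its error
            x = o_low                          # environment: output -> input
        return x

    print(two_level())  # converges on r_top: the "teacher" just controls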

In the second version, you have a control system that avoids touching a hot
place. This system is built in, and its reference signal is fixed. You say

>The rest of the Baby will presumably eventually re-organise
>to avoid approaching the hot-spot. There are several strategies that
>will succeed, and the one chosen depends on the learning mechanism.

The fixed control system is an isolated parasite in the hierarchy: it
opposes any attempt by the rest of the system to use the same position
control system to touch the hot spot. It does so by creating a conflict,
which arises from the fact that it acts by using the same muscles that all
other control systems must use. Its reference signal, however, does not
come from the hierarchy; it is fixed. This system is therefore like a
separate organism living inside the hierarchy. Its actions, being unrelated
to activities in the hierarchy, constitute a disturbance whose strength
depends on its loop gain. The rest of the hierarchy will not learn to
avoid the hot spot; it will (if it can) learn to overcome the disturbance.
You have given the rest of the hierarchy no reason to avoid touching this
spot; that is the business of the built-in circuit. So there is no basis
for the rest of the hierarchy to avoid trying to touch the hot spot. You
have put the recognition of the value of touching the hot spot into one
low-level system; to the rest of the hierarchy, pain is then just another
sensory signal, and there's no _a priori_ reason to hold it at any
particular reference level.

You still haven't explained reorganization: the parasitic system doesn't
reorganize. You have only created a situation in which reorganization in
the hierarchy would be necessary if the parasitic system interferes with
control by the rest of the hierarchy. So we are still left with the
question of how the rest of the hierarchy reorganizes, and what the basis
for and mechanism of such reorganization would be.

The problem with any built-in system or teacher is that you end up having
to put all the intelligence needed for control into the built-in system.
For instance, if Baby's finger reaches out and touches a hot spot, the
"high-level" system has to understand that the remedy is to pull the arm
back. If the arm pulls too far back and the elbow touches another hot spot,
the same system has to understand that now the remedy is to move the arm
forward. If the arm is immobilized by an obstacle, the system has to know
that the other hand must be used to knock the hot object away. So this
system has to know about relationships between painful sensations on all
parts of the body and the effects of tensing various muscles attached to
the skeleton. Before you're through designing this high-level system, you
will have built in the entire control hierarchy -- which then, presumably,
copies itself into the untrained part of the CNS. But in that case, why
bother with the untrained part? You're describing a system whose behavior
is entirely instinctive from birth.

There may be some built-in control systems low in the human hierarchy, at
birth. There probably are. But few of the "instinctive" behaviors we are
born with last long; most of them are reorganized away very early in life,
or turn into low-level control systems whose reference signals are set from
above. It's been known for decades that the ENTIRE nervous system is highly
plastic, even at the level of the spinal cord, the so-called "reflexes." We
are not a mass of built-in reflexes with higher systems trying to behave
around them. As we learn the higher levels of control, we lose the built-in
behaviors, or their reference signals become variable.

Low-level systems with fixed references are disconnected from the
hierarchy. The more of them there are, the less freedom there is for the
higher systems to behave. Remember that ALL systems have to act by using
the same set of about 600-800 spinal control systems. Those control systems
must be accessible, via their reference signals, to ALL systems that need
to use them to generate actions. And that means all systems that produce
behavior.

Another problem with considering things like pain a "high-level" controlled
variable is that you then have to consider pain to be similar to things
like morals or patriotism, which can lead a person to suffer pain
deliberately, like G. Gordon Liddy. But if you delegate the control of pain
to a hierarchically low level of control, it becomes impossible to explain
why pain has a negative value at higher levels or how the higher levels can
deliberately accept pain.

Neither of the approaches you outline above can explain real behavior. If
there's a built-in control system for avoiding hot spots, how hot does the
spot have to be in order to prevent touching it? If approach to the hot
spot, and withdrawal from it, are under control of an automatic system, how
is it that a person can hold the hands at just the right distance from a
fire to keep them nicely warm? How is it that a person can take hold of a
hot object and quickly move it to a safer place, suffering pain for an
instant in order to avoid some higher-level problem like melting the handle
of a pot?

Even more important, how is it that a person can learn not to touch things
that are ARBITRARILY hot? The burner of an electric stove that has been
turned off a few minutes ago. A light bulb that looks perfectly ordinary
but was just turned off. A soldering iron that doesn't glow. A dark rock or
a sandy beach out in the sunshine. There's no way that a built-in system
can know what aspects of the environment are likely to give rise to pain.
It doesn't have enough complexity in its sensors, and even if it did,
those sensors couldn't have evolved to understand an environment that is
infinitely variable, and in which relationships exist which have never
existed before in evolutionary time. The real test that a reorganizing
principle must pass is that it must produce an organization that is good
for the organism in an environment where there is no _a priori_ reason for
any particular sensed state of affairs being either good or bad for the
organism. The built-in value of things like pain must guide reorganization
at ALL levels, including the highest.

Reorganization has to be based on the control of variables that have _a
priori_ value to the organism. Nothing in the external world, as sensed,
can have any such value, not for a human being. The external world is
simply too variable. So reorganization can't be based on the state of the
external world. It must be based on the state of the internal world.

The reaction to pain must be based not just on the pain signals but on the
REASON FOR WHICH pain signals are not good for the organism. That is, the
value of pain does not lie in the sensory signal; it lies in the state of
the organism that gives rise to the sensory signal. It is possible, in
short, that there are _a priori_ aspects of the state of the organism that
serve as indicators of viability. These are the things that a reorganizing
system must sense. In effect, the reorganizing system must sense and
control for _viability itself_ as indicated by certain critical variables
that it monitors.

In keeping with the same principles that apply in the hierarchy, the output
effects of the reorganizing system do not have to employ outputs that are
the same as its inputs. We control sensed force not by acting on the tendon
receptors, but by sending signals to the muscles. We control the appearance
of objects in the world not by somehow creating a direct effect on the
objects, but by setting reference signals for control systems that vary the
angles at the joints of the arm, wrist, and fingers. We cause one controlled
variable to change as a means of controlling a different one that is
dependent on, but physically different from, the one we use as output.

This is my principle of reorganization. If the pH of the blood stream is
too low, the reorganizing system does not react to the drop in pH by
increasing pH directly; it doesn't know how. Instead, it alters systems
that control for sensory variables. Eventually it will alter control of
just those sensory variables that indicate states of the environment that
bear on the state of blood pH. The organism as a whole learns not to hold
its breath too long, and to avoid doing things, like diving too deep and too
long under water, that have an adverse effect on blood pH. The reorganizing
system itself knows no more about HOW various behaviors affect blood pH
than it understands the physics of the environment that make one object's
behavior depend on another object's behavior. It doesn't care about those
external processes. All it cares about is blood pH. It will accept any
behavioral organization, however bizarre, that proves to bring the pH back
to its proper reference level. By "accept" I mean simply that it will stop
reorganizing the behaving systems.

An example. If you were building a photoelectrically recharged battery-
powered robot to roam around free indefinitely, one of the internal aspects
its reorganizing system would have to sense would be the state of the
battery voltage. Another might be internal temperature of its circuit
boards. It would not have an intrinsic reference level for being in
sunlight or even for avoiding places where sunlight is unavailable. To know
about such things, the reorganizing system would have to have exteroceptors
and complex enough perceptions to be able to know where it is in the
environment, and what sorts of places would tend to be shadowed. You'd end
up putting most of the behavioral hierarchy and its intelligence into the
reorganizing system.

By sensing only battery voltage and circuit board temperature, the
reorganizing system could institute random changes in the behavioral
hierarchy (on the circuit board) whenever the sensed states of these
variables departed from their reference states in either direction. This
reorganization would cease only when the behaving system had learned to
seek a perceptual state of affairs that, as a side-effect, provides enough
sunlight to keep the battery optimally charged but not so much as to make
the circuit board overheat.
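
As a sketch of that robot (everything here is a stand-in of mine: the
behavioral hierarchy is collapsed to a parameter vector, and the rule
linking behavior to sunlight is a black box that the reorganizing system
never inspects), random changes gated on intrinsic error look like this:

    import random

    def environment(params):
        # hidden physics: behavior -> fraction of time spent in sunlight
        return max(0.0, min(1.0, sum(params) / len(params)))

    def reorganize(steps=5000, mutation=0.3):
        params = [random.uniform(-1.0, 1.0) for _ in range(8)]
        v_ref, t_ref = 12.0, 45.0          # intrinsic reference levels
        for _ in range(steps):
            sun = environment(params)
            volts = 8.0 + 8.0 * sun        # more sun -> more charge
            temp = 25.0 + 40.0 * sun       # more sun -> hotter board
            err = abs(volts - v_ref) / 4.0 + abs(temp - t_ref) / 20.0
            if err > 0.1:                  # intrinsic error: random change
                params = [p + random.gauss(0.0, mutation * err)
                          for p in params]
        return round(volts, 2), round(temp, 1)

    print(reorganize())  # settles where both intrinsic errors are small

Note that reorganize() never looks inside environment(); it senses only the
two intrinsic variables, and it stops changing things as soon as both are
near their reference levels.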

Suppose that the environment contained signal lights: steady light means
high temperature in a particular place, blinking light means a shadowed and
cool place. But the control systems don't have to know about those
meanings. Through reorganization, they would end up controlling for the
blinking light being close some of the time, the steady light being close
another proportion of the time, and other things the rest of the time. What
the control systems end up controlling for has nothing _a priori_ to do
with battery voltage or internal temperature. Neither the control systems
nor the reorganizing system knows about the rule that says "steady means
hot, blinking means cool and no light." The reorganizing system knows only
about battery voltage and circuit temperature. The behaving systems know
only about blinking and steady lights. The link between the lights and the
internal variables is a secret of the physical environment, which neither
the reorganizing nor the behaving systems have to discover. Ever.

The power of this arrangement is that the reorganizing system does not
depend on the environment to contain any particular causal rules. If
someone comes along and switches the lights, so that blinking means hot and
steady means cool shade, the behaving system will initially seek the wrong
places. But the effect will be lowering of the battery voltage or heating
of the circuit board, and reorganization will start. It will end when the
behaving system has switched which light it visits under what circumstances
(or succumbs). The reorganizing system, set up as I visualize it, can
survive under ANY ARBITRARY set of environmental rules connecting behavior
to internal state, at least enough of the time to matter at the species
level.

Having said all that, I should also say that reorganization will work
better if there are sensory signals available to the behavioral hierarchy
that indicate at least roughly some of the most critical intrinsic
variables. It would be easier to design that robot for reorganizability if
there were sensory signals available, entering the behavioral systems like
any other sensory signal: one indicating battery voltage and another
indicating circuit temperature. These sensory signals don't have to have
any initial significance in the behavioral hierarchy. But if they're
available, like "emotion" signals, the behavioral hierarchy can come to
perceive relationships between them and exteroceptive signals, and it can
learn to control for certain states of these relationships. Perceiving low
battery voltage by the behavioral systems just means perceiving low battery
voltage; that has no particular value to the behaving systems. That could
be a good thing or a bad thing. The reorganizing system, however, which has
not only its own perception of battery voltage but a built-in reference
level for it, will cause reorganization of the behavioral systems until
they learn to treat a low battery voltage signal as an error, and until
they link this error to the adjustment of some sensed state of the external
world that does, however indirectly, result in raising the battery voltage.
After a while, this behavioral system will quickly sense low battery
voltage and do what it has learned to do to increase that signal to its
reference value, and it will do this before the reorganizing system sees
enough error to cause reorganization to begin. That's why we learn to eat
at mealtimes instead of when we're hungry. We've learned that if we eat
regularly, a certain signal will be prevented from appearing. That signal
is one which, if it gets large enough, results in an episode of
reorganization and disorganization. So we say to ourselves that that is a
"bad" signal. The feeling of hunger is a "bad" feeling.

There are lots of problems yet to be solved before a real test of my scheme
can be devised. I'm plugging away with tests of various kinds of
reorganizing in simple control systems just to find out what happens when
you do it in different ways. Some ways are terribly slow; some work so fast
that I'm suspicious of them. I've already ruled out one: reorganizing by
switching the polarity of the output effects from 1 to minus 1 doesn't
work, because it causes tremendous transients that can flip the system into
unrecoverable states. Maybe if I stick a 0 in the middle as a third
possibility it will work better. Or maybe output reorganization has to
occur on a continuum between 1 and -1. We have to find these things out
before we can come up with a workable system.
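
For what it's worth, the contrast can be sketched in a single loop with a
limited actuator (none of this is the actual test code; the dynamics, gains,
and reorganization rates are assumptions): flipping the output weight
between +1 and -1 produces a violent excursion when the flip lands, while
nudging the weight along the continuum crosses zero gently.

    import random

    def run(continuous, steps=20000, dt=0.05, g=3.0, r=5.0):
        x = o = 0.0
        w = -1.0                   # start with the wrong output polarity
        peak = 0.0                 # largest excursion past the reference
        for _ in range(steps):
            e = r - x
            o = max(-100.0, min(100.0, o + dt * g * e))  # limited actuator
            x += dt * (w * o - x)  # environment; its sign is set by w
            peak = max(peak, x)
            if abs(e) > 0.5:       # persistent error: keep reorganizing
                if continuous:     # small step along the continuum
                    w = max(-1.0, min(1.0, w + random.gauss(0.0, 0.05)))
                elif random.random() < 0.001:
                    w = -w         # all-or-nothing polarity flip
        return round(peak, 1), round(e, 2)

    print("flip:     ", run(False))  # peak excursion far above the reference
    print("continuum:", run(True))   # peak only slightly above it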

My main point here is that reorganization has to be based on sensing
variables with _a priori_ meaning to the organism. This rules out all, or
essentially all, environmental variables -- all of the variables that the
behavioral control systems are most concerned with controlling.
-----------------------------------------------------------------------
Best,

Bill P.

[Martin Taylor 920820 19:15]
(Bill Powers 920818.2300)

Before your long tutorial on reorganization got here, Jeff, Allan, and I had a
discussion on Jeff's proposal. I hadn't fully understood it from the written
description, and I don't think you did, either. But your tutorial helped us
to refine our understanding of your approach to reorganization. Let me see
if I can rephrase Jeff's as I understand it. (If Jeff doesn't accept this as
his, then treat it as a derived version that I think is worth considering.)

I'm not saying I buy Jeff's system, but I don't find it incredible, either.
It sounds workable, and has a good evolutionary rationale based on real
long-term consistencies in the environment.

Jeff starts with the proposition that I use in my version of reorganization,
that persistent and growing error in any ECS will cause THAT ECS to reorganize
its local environment. There are all sorts of ways it could reorganize, none
of which imply any knowledge about what it is doing within the hierarchy or in
the outer world. We all agree on the need for that kind of ignorance (although
I have, as we say, a "flag up" on that, for later discussion in connection
with symbolic AI).

Jeff incorporates a separate control system, which could be a small hierarchy,
that you call a "parasite." I'd prefer to call it a "symbiote." It is a very
primitive thing, in evolutionary terms. It takes advantage of regularities
that do occur in the environment, such as that too much thermal energy disrupts
biologically active molecules, and that therefore it is a good idea for ANY
organism to move away from sources of intense thermal energy. The symbiote
doesn't know how to make this move happen, so what it does is to execute
(possibly random) actions in the motion effectors. It knows where they are,
relative to the source of the unwanted energy, because it has evolved as part
of the organism's ancestors for long enough that successful symbiotes have
selectively survived. But it doesn't know any more than to create some strong
action in, say, an arm or a leg. It is not controlling in the sense we usually
use, with precision, but it is controlling in the sense that flailing around
quite often moves one away from a localized source of trouble, and that brings
the symbiote's percept into a range near its reference level. (Aside: control
to be away from some central point is a quite different problem from control
to be near the central point. It's not just a question of reversing the
sign of the output. It's the difference between "No" and "Yes" in language.
"No" means "Anything except what was proposed," a wide range of possibilities.)

What happens when the symbiote induces this relatively violent activity?
ECSs experience large and momentarily growing error if they have as part of
their feedback loop the actuators affected by the symbiote. This (in my
view) is a condition that lasts for a moment, in which any affected ECS
might experience a reorganization event. That might involve a change in
their perceptual input function or in their output connections, or it might
only affect their gains, or conceivably their reference input connections.
But it would be a local change. These ECSs know nothing of the symbiote.
But so long as occasions arise in which the symbiote is activated, they will
reorganize, and thus will (if their perceptual input connections have the
information) become less likely to act in such a way as to activate the
symbiote.
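
A sketch of just that triggering pathway (names, numbers, and dynamics are
all mine): a symbiote flails the shared actuator whenever its percept, a
"heat" signal peaking near x = 4, leaves its reference band; the flailing
shows up as bursts of error in an ordinary ECS using the same actuator, and
each burst is an occasion for a local reorganization event. Whether those
events actually produce avoidance is not modeled here.

    import random

    def simulate(steps=2000, dt=0.05):
        x, w, events = 0.0, 1.0, 0
        for _ in range(steps):
            heat = max(0.0, 3.0 - abs(x - 4.0))     # hot spot near x = 4
            flail = random.uniform(-30.0, 30.0) if heat > 1.0 else 0.0
            e = 3.0 - x                             # the ECS's reference: 3
            x += dt * (5.0 * w * e + flail - x)     # shared actuator path
            if abs(e) > 1.2:                        # momentary large error
                w = max(0.1, min(3.0, w + random.gauss(0.0, 0.05)))
                events += 1                         # a local reorg event
        return events

    print(simulate(), "local reorganization events")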

The symbiotes are, as you point out, already set up to respond to particular
environmental states, or, more properly, to internal states that in many
cases have simple environmental correlates. All the symbiotes can do is to
cause muscular activity in the world, or to cause the organism to emit
chemicals or to change its appearance (the scent of fear, facial flushing
or the colour change of a chameleon, for example). They do not, as you
suggest, "teach another system to do it the same way." They have primitive
stereotyped behaviour that is often immediately effective. But the evolved
main hierarchy can learn much better ways of not getting into the bad
situation, or of getting out of it once there.

Gordon Liddy can burn his hand, because he has a high-gain (high insistence)
ECS for something of which that act forms a part. High-gain systems can
over-ride the primitive symbiotes, but it is hard. Most people couldn't
do it, except under extreme duress, where there are large deviations from
many conflicting reference levels (excuse the shorthand).

You are quite right that an acting symbiote reduces the degrees of freedom
available for the main hierarchy to act, possibly drastically. But that
loss seems validated by experience. We do not maintain our flexibility of
action in a "primitive stress" situation. We tend to focus on removing the
stressful condition, which (with luck) the main hierarchy has learned to do.
Adults don't kick and scream when hunger gets extreme. They try to find
food, with some determination and ignoring other tempting distractions, at
least until their systems lack the energy sources to keep going. Babies
kick and scream, which, if Jeff is right, is the effect of the symbiotes.
Adults focus on control that works in a much more complex environment than
just the proximity of Mother.

The symbiote's job, as I see it, is not to signal "Pain" but to create error
in the hierarchy, especially in a part whose reorganization is likely to
result in useful control of whatever the symbiote does not like. "Pain" is
an interpretation of stimulus conditions accessible to the main hierarchy,
stimulus conditions that have been associated with action of the symbiote
in earlier days (and perhaps still). There are many kinds of pain, and if
Jeff's notion is right, maybe they are associated with different symbiotes.

···

========================

I think we have three candidate reorganization systems that are plausible, and
I am happy to hear that this is what you are working on right now. We may
actually be able to help with real (simulation) tests before too long (not
me personally, but my faithful contractors).

The three, in cartoon form:

1) Powers: Intrinsic variables (such as body chemical states) have reference
levels fixed by evolution. Errors in them induce a tendency to reorganize
in the CNS hierarchy. The actions of the CNS hierarchy affect the levels of
the intrinsic variables only as side-effects of whatever happens in the outer
environment. There may possibly be some localization of the reorganization
to the upper levels of a growing hierarchy, leaving lower levels relatively
immune. As the hierarchy grows, new, labile ECSs are added on top of
existing ones, but layers can grow laterally as well.

2) Taylor: Intrinsic variables participate in a single hierarchy along with
the CNS. The ECSs that have intrinsic variable levels as (part of) their
perceptual input functions are always at the top level of a building
hierarchy. Persistent and growing error in any ECS induces reorganization,
which may include the generation of a new ECS that receives its reference
signals from the reorganizing one and that gets its perceptual input and
sends its reference output to places connected to the reorganizing ECS.
Other aspects of reorganization include the making, breaking, and reweighting
of output-to-lower-reference connections, and Hebbian modification of
perceptual inputs.

3) Hunter: Intrinsic variables provide inputs to primitive symbiote "control"
systems that have the main function of inducing error in ECSs in the main
hierarchy when the intrinsic variables depart far from their reference
levels. Reorganization in the main hierarchy proceeds roughly according to
the Taylor model, except that the top-level ECSs are not connected with
intrinsic variables, and new ECSs are "grown" on top of the hierarchy, in
the same way as the Powers proposal.
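
Reduced to trigger conditions, the three cartoons differ roughly as follows
(my paraphrase only; "error" means persistent, growing error, and the
threshold is arbitrary):

    THRESHOLD = 1.0

    def powers_trigger(intrinsic_error, ecs_error):
        # reorganization is driven by intrinsic error alone, applied
        # somewhere in the CNS hierarchy (perhaps biased to newer levels)
        return intrinsic_error > THRESHOLD

    def taylor_trigger(intrinsic_error, ecs_error):
        # any ECS with persistent error reorganizes itself locally; the
        # intrinsic variables are just the topmost ECSs' percepts, so
        # intrinsic error is one more case of ECS error
        return ecs_error > THRESHOLD or intrinsic_error > THRESHOLD

    def hunter_trigger(intrinsic_error, ecs_error):
        # as in the Taylor model, local ECS error is the trigger, but
        # intrinsic error reaches the hierarchy only indirectly, through
        # the flailing of a symbiote that inflates ecs_error
        return ecs_error > THRESHOLD

    print(powers_trigger(2.0, 0.0), taylor_trigger(0.0, 2.0))  # True True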

>I'm plugging away with tests of various kinds of
>reorganizing in simple control systems just to find out what happens when
>you do it in different ways. Some ways are terribly slow; some work so fast
>that I'm suspicious of them. I've already ruled out one: reorganizing by
>switching the polarity of the output effects from 1 to minus 1 doesn't
>work, because it causes tremendous transients that can flip the system into
>unrecoverable states. Maybe if I stick a 0 in the middle as a third
>possibility it will work better. Or maybe output reorganization has to
>occur on a continuum between 1 and -1. We have to find these things out
>before we can come up with a workable system.

The Little Baby is intended to provide a reorganization testbed, with a view
to using its findings for developing much more complex networks for dynamic
perceiving systems--speech recognition is the target. Chris Love is building
it, not Jeff and Allan, but they do talk together. I wish you had a Macintosh,
for two reasons: it is so much easier than a PC to work with, and we could
communicate programs. We are using the Object-Oriented visual dataflow
language Prograph. It's a bit hard to convert between that and a text-oriented
language like C, though algorithmic concepts and formulae translate
relatively easily.

We would love to know anything you find out about reorganization from your
experiments.

Maybe the transient when an output changes from -1 to 1 could be avoided if
the gain dropped during the reorganization time for a period close to the
inverse bandwidth of the feedback loop? Or perhaps the problem is that your
hierarchy is too symbolic, in the sense that a given ECS has a function very
distinct from all the others at its level, and therefore has only a very few
connections at a lower level? If an ECS provided references to many lower
ECSs that had highly correlated perceptual input functions, then the flipping
of the sign of one link would not produce a large transient. It would
be much like having a continuum of output weights, and I think it might also
provide for more flexibility in the final (reorganized) system. But it would
be computationally expensive to simulate.
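
To make the first suggestion concrete, here is a sketch (the schedule shape
and its length are guesses of mine): rather than flipping an output weight
instantaneously, carry it through zero over a time comparable to the loop's
settling time, so the stored output is unwound instead of suddenly reversed.
Spreading the same change across many correlated lower-level links would
amount to the continuum-of-weights picture described above.

    def ramped_flip(w0, steps):
        """Effective weight sequence for flipping w0 to -w0 through zero."""
        return [w0 * (1.0 - 2.0 * k / steps) for k in range(steps + 1)]

    # splice this into the flip branch of a simulation loop; for a loop
    # bandwidth of about 1 Hz sampled at dt, ramp over int(1.0 / dt) steps
    for w in ramped_flip(1.0, 10):
        print(round(w, 1))  # 1.0, 0.8, ... 0.0 ... -0.8, -1.0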

Just a couple of suggestions.

Martin