[From Bill Powers (920818.2300)]
Jeff (910818) --
You are running into the same conceptual problems that eventually led me to
propose the reorganizing system as I did.
In this diagram:
                 R*          <- top-level reference
                 |
              --------
              | ECSs |       <- top-level ECS
              --------
               /    \
              |    --------
  direct ->   |    | ECSs |  <- untrained ECSs
  input       |    --------
  to top      |     /    \
              |    /      \
  ...................................
              \   /        \
             Input        Output   <- environment
... you have both levels of system controlling the same input variable. If
you simply connected the output of the higher system to the output below
the line, you'd have a control system.
In effect, you're trying to introduce a teacher that already knows how to
control the input, and it's teaching another system to do it the same way.
This is the problem that all modelers of self-organizing systems eventually
have to face. The problem is that reorganization is goal-directed: it must
aim to build an organization that is good for the organism. Its goals,
therefore, must already take into account the "value" (to the organism) of
the variables it senses, and its actions must build behaving systems that
control the environment in a way that satisfies these values. But the
untrained ECS is not yet an ECS, so it doesn't know what to sense, or how
any particular output will affect what it senses. It doesn't know what a
"good" target setting of the sensed variable would be. So it seems that the
reorganizing system already has to be able to control the same sensed
variable that the untrained system is going to be controlling, and it must
know the reference level that is best, and it must know what outputs are
needed to effect control. If that's true, why doesn't the higher system
simply actuate the outputs to control the variable? If it already knows
what to do, why doesn't it just do it?
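To make the contrast concrete, here is a bare-bones sketch in Python (the gains, the environment law, and the perceptual functions are all illustrative numbers I've assumed, not a worked-out model). In the proper arrangement the higher system never touches the environment; its output becomes the lower system's reference signal, and only the lower system acts. For simplicity both levels perceive the same variable here; in a real hierarchy the higher perception would be a function of many lower ones.

# A minimal two-level hierarchy: the higher ECS acts only by setting
# the lower ECS's reference; only the lower ECS drives the environment.
R_star = 10.0        # fixed top-level reference signal
o_hi = 0.0           # higher output = reference sent down to the lower ECS
o_lo = 0.0           # lower output = the only action on the environment
for t in range(500):
    qi = o_lo                     # environment: input quantity tracks the action
    o_hi += 0.05 * (R_star - qi)  # higher ECS integrates its own error, slowly
    o_lo += 0.5 * (o_hi - qi)     # lower ECS: its reference comes from above
print(round(qi, 2))               # settles at 10.0 -- the top-level reference

Notice that the higher system needs no knowledge of how the lower one works or of what its outputs do; it just varies a reference signal until its own perception matches R*. No teacher, and no duplicated control of the same variable by two outputs.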
In the second version, you have a control system that avoids touching a hot
place. This system is built in, and its reference signal is fixed. You say:

     The rest of the Baby will presumably eventually re-organise
     to avoid approaching the hot-spot. There are several strategies that
     will succeed, and the one chosen depends on the learning mechanism.
The fixed control system is an isolated parasite in the hierarchy: it
opposes any attempt by the rest of the system to use the same position
control system to touch the hot spot. It does so by creating a conflict,
which arises from the fact that it acts by using the same muscles that all
other control systems must use. Its reference signal, however, does not
come from the hierarchy; it is fixed. This system is therefore like a
separate organism living inside the hierarchy. Its actions, being unrelated
to activities in the hierarchy, constitute a disturbance whose strength
depends on its loop gain. The rest of the hierarchy will not learn to
avoid the hot spot; it will (if it can) learn to overcome the disturbance.
You have given the rest of the hierarchy no reason to avoid touching this
spot; that is the business of the built-in circuit. So there is no basis
for the rest of the hierarchy to avoid trying to touch the hot spot. You
have put the recognition of the value of touching the hot spot into one
low-level system; to the rest of the hierarchy, pain is then just another
sensory signal, and there's no _a priori_ reason to hold it at any
particular reference level.
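It is easy to see the conflict in a toy simulation (Python again, with numbers I have simply assumed). Two systems, one of them the parasite with its fixed reference, act through the same "muscle"; their outputs simply add:

# The hierarchy wants the hand AT the hot spot (r = 1.0); the built-in
# parasite holds it AWAY (fixed r = 0.0). Both act through one output.
r_hier, r_fixed = 1.0, 0.0
g_hier, g_fixed = 0.1, 0.1    # integration factors; try making them unequal
o_hier = o_fixed = 0.0
for t in range(2000):
    x = o_hier + o_fixed                # hand position = sum of both efforts
    o_hier  += g_hier  * (r_hier  - x)  # each system integrates its own error
    o_fixed += g_fixed * (r_fixed - x)
print(round(x, 2), round(o_hier, 1), round(o_fixed, 1))
# x sticks at the gain-weighted compromise (0.5 here) while the two
# outputs escalate without bound -- the signature of conflict. In a real
# organism the escalation stops only when the muscles saturate.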
You still haven't explained reorganization: the parasitic system doesn't
reorganize. You have only created a situation in which reorganization in
the hierarchy would be necessary if the parasitic system interferes with
control by the rest of the hierarchy. So we are still left with the
question of how the rest of the hierarchy reorganizes, and what the basis
for and mechanism of such reorganization would be.
The problem with any built-in system or teacher is that you end up having
to put all the intelligence needed for control into the built-in system.
For instance, if Baby's finger reaches out and touches a hot spot, the
"high-level" system has to understand that the remedy is to pull the arm
back. If the arm pulls too far back and the elbow touches another hot spot,
the same system has to understand that now the remedy is to move the arm
forward. If the arm is immobilized by an obstacle, the system has to know
that the other hand must be used to knock the hot object away. So this
system has to know about relationships between painful sensations on all
parts of the body and the effects of tensing various muscles attached to
the skeleton. Before you're through designing this high-level system, you
will have built in the entire control hierarchy -- which then, presumably,
copies itself into the untrained part of the CNS. But in that case, why
bother with the untrained part? You're describing a system whose behavior
is entirely instinctive from birth.
There may be some built-in control systems low in the human hierarchy, at
birth. There probably are. But few of the "instinctive" behaviors we are
born with last long; most of them are reorganized away very early in life,
or turn into low-level control systems whose reference signals are set from
above. It's been known for decades that the ENTIRE nervous system is highly
plastic, even at the level of the spinal cord, the so-called "reflexes." We
are not a mass of built-in reflexes with higher systems trying to behave
around them. As we learn the higher levels of control, we lose the built-in
behaviors, or their reference signals become variable.
Low-level systems with fixed references are disconnected from the
hierarchy. The more of them there are, the less freedom there is for the
higher systems to behave. Remember that ALL systems have to act by using
the same set of about 600-800 spinal control systems. Those control systems
must be accessible, via their reference signals, to ALL systems that need
to use them to generate actions. And that means all systems that produce
behavior.
Another problem with considering things like pain a "high-level" controlled
variable is that you then have to consider pain to be similar to things
like morals or patriotism, which can lead a person to suffer pain
deliberately, like G. Gordon Liddy. But if you delegate the control of pain
to a hierarchically low level of control, it becomes impossible to explain
why pain has a negative value at higher levels or how the higher levels can
deliberately accept pain.
Neither of the approaches you outline above can explain real behavior. If
there's a built-in control system for avoiding hot spots, how hot does the
spot have to be in order to prevent touching it? If approach to the hot
spot, and withdrawal from it, are under control of an automatic system, how
is it that a person can hold the hands at just the right distance from a
fire to keep them nicely warm? How is it that a person can take hold of a
hot object and quickly move it to a safer place, suffering pain for an
instant in order to avoid some higher-level problem like melting the handle
of a pot?
Even more important, how is it that a person can learn not to touch things
that are ARBITRARILY hot? The burner of an electric stove that has been
turned off a few minutes ago. A light bulb that looks perfectly ordinary
but was just turned off. A soldering iron that doesn't glow. A dark rock or
a sandy beach out in the sunshine. There's no way that a built-in system
can know what aspects of the environment are likely to give rise to pain.
It doesn't have enough complexity in its sensors, and even if it did,
those sensors couldn't have evolved to understand an environment that is
infinitely variable, and in which relationships exist which have never
existed before in evolutionary time. The real test that a reorganizing
principle must pass is that it must produce an organization that is good
for the organism in an environment where there is no _a priori_ reason for
any particular sensed state of affairs being either good or bad for the
organism. The built-in value of things like pain must guide reorganization
at ALL levels, including the highest.
Reorganization has to be based on the control of variables that have _a
priori_ value to the organism. Nothing in the external world, as sensed,
can have any such value, not for a human being. The external world is
simply too variable. So reorganization can't be based on the state of the
external world. It must be based on the state of the internal world.
The reaction to pain must be based not just on the pain signals but on the
REASON FOR WHICH pain signals are not good for the organism. That is, the
value of pain does not lie in the sensory signal; it lies in the state of
the organism that gives rise to the sensory signal. It is possible, in
short, that there are _a priori_ aspects of the state of the organism that
serve as indicators of viability. These are the things that a reorganizing
system must sense. In effect, the reorganizing system must sense and
control for _viability itself_ as indicated by certain critical variables
that it monitors.
In keeping with the same principles that apply in the hierarchy, the output
effects of the reorganizing system do not have to employ outputs that are
the same as its inputs. We control sensed force not by acting on the tendon
receptors, but by sending signals to the muscles. We control the appearance
of objects in the world not by somehow creating a direct effect on the
objects, but by setting reference signals for control systems that vary the
angles at the joints of the arm, wrist, and fingers. We cause one controlled
variable to change as a means of controlling a different one that is
dependent on, but physically different from, the one we use as output.
This is my principle of reorganization. If the pH of the blood stream is
too low, the reorganizing system does not react to the drop in pH by
raising pH directly; it doesn't know how. It does so by altering systems
that control for sensory variables. Eventually it will alter control of
just those sensory variables that indicate states of the environment that
bear on the state of blood pH. The organism as a whole learns not to hold
its breath too long, or to avoid doing things, like diving too deep and too
long under water, that have an adverse effect on blood pH. The reorganizing
system itself knows no more about HOW various behaviors affect blood pH
than it understands the physics of the environment that make one object's
behavior depend on another object's behavior. It doesn't care about those
external processes. All it cares about is blood pH. It will accept any
behavioral organization, however bizarre, that proves to bring the pH back
to its proper reference level. By "accept" I mean simply that it will stop
reorganizing the behaving systems.
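The principle is simple enough to put in a dozen lines of Python (every name and number below is mine, a toy, not a model of real chemistry). The reorganizing system senses only "pH"; while pH is in error it keeps making blind random changes to a parameter of the behaving system, and it stops when pH is right:

import random

PH_REF = 7.4
r_behave = 0.0                    # behaving system's reference: reorganizable
for trial in range(10000):
    s = r_behave                  # behaving system controls s to its reference
    ph = 7.4 + 0.1 * (s - 3.0)    # the environment's secret link from s to pH
    err = abs(ph - PH_REF)        # intrinsic error: all the reorganizer senses
    if err < 0.005:               # small enough: stop reorganizing ("accept")
        break
    r_behave += random.gauss(0.0, 2.0 * err)  # blind change, scaled by error
print(trial, round(r_behave, 2))  # r_behave wanders until it lands near 3.0

Nothing in the reorganizing part of this program contains the 0.1 or the 3.0; it would converge just as well on any other secret rule that leaves some setting of r_behave at which pH comes out right.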
An example. If you were building a photoelectrically recharged battery-
powered robot to roam around free indefinitely, one of the internal aspects
its reorganizing system would have to sense would be the state of the
battery voltage. Another might be internal temperature of its circuit
boards. It would not have an intrinsic reference level for being in
sunlight or even for avoiding places where sunlight is unavailable. To know
about such things, the reorganizing system would have to have exteroceptors
and complex enough perceptions to be able to know where it is in the
environment, and what sorts of places would tend to be shadowed. You'd end
up putting most of the behavioral hierarchy and its intelligence into the
reorganizing system.
By sensing only battery voltage and circuit board temperature, the
reorganizing system could institute random changes in the behavioral
hierarchy (on the circuit board) whenever the sensed states of these
variables departed from their reference states in either direction. This
reorganization would cease only when the behaving system had learned to
seek a perceptual state of affairs that, as a side-effect, provides enough
sunlight to keep the battery optimally charged but not so much as to make
the circuit board overheat.
Suppose that the environment contained signal lights: steady light means
high temperature in a particular place, blinking light means a shadowed and
cool place. But the control systems don't have to know about those
meanings. Through reorganization, they would end up controlling for the
blinking light being close some of the time, the steady light being close
another proportion of the time, and other things the rest of the time. What
the control systems end up controlling for has nothing _a priori_ to do
with battery voltage or internal temperature. Neither the control systems
nor the reorganizing system knows about the rule that says "steady means
hot, blinking means cool and no light." The reorganizing system knows only
about battery voltage and circuit temperature. The behaving systems know
only about blinking and steady lights. The link between the lights and the
internal variables is a secret of the physical environment, which neither
the reorganizing nor the behaving systems have to discover. Ever.
The power of this arrangement is that the reorganizing system does not
depend on the environment to contain any particular causal rules. If
someone comes along and switches the lights, so that blinking means hot and
steady means cool shade, the behaving system will initially seek the wrong
places. But the effect will be lowering of the battery voltage or heating
of the circuit board, and reorganization will start. It will end when the
behaving system has switched which light it visits under what circumstances
(or succumbs). The reorganizing system, set up as I visualize it, can
survive under ANY ARBITRARY set of environmental rules connecting behavior
to internal state, at least enough of the time to matter at the species
level.
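The same sort of toy program shows the robot surviving the switch (again, all numbers assumed). Let the behaving system control one reorganizable parameter, the fraction f of its time spent near the STEADY light. Which light marks the sunny place is the environment's secret, and halfway through I reverse it:

import random

SUN_NEEDED = 0.4        # exposure that keeps the battery up, the board cool
f = random.random()     # fraction of time spent near the steady light
for trial in range(20000):
    steady_means_sun = trial < 10000           # the rule, then its reversal
    sun = f if steady_means_sun else 1.0 - f   # secret of the environment
    err = abs(sun - SUN_NEEDED)                # net intrinsic error
    if err > 0.01:                             # reorganize only while in error
        f = min(1.0, max(0.0, f + random.gauss(0.0, 0.5 * err)))
print(round(f, 2))  # settled near 0.4 before the switch; re-learned ~0.6 after

When the rule flips, intrinsic error reappears, reorganization resumes, and f migrates to a new value; at no point does any part of the program represent the fact that light and sun have changed their relationship.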
Having said all that, I should also say that reorganization will work
better if there are sensory signals available to the behavioral hierarchy
that indicate at least roughly some of the most critical intrinsic
variables. It would be easier to design that robot for reorganizability if
there were available a sensory signal entering the behavioral systems like
any other sensory signal, one indicating battery voltage and another
indicating circuit temperature. These sensory signals don't have to have
any initial significance in the behavioral hierarchy. But if they're
available, like "emotion" signals, the behavioral hierarchy can come to
perceive relationships between them and exteroceptive signals, and it can
learn to control for certain states of these relationships. To the
behavioral systems, perceiving low battery voltage just means perceiving
low battery voltage; it has no particular value to them. That could
be a good thing or a bad thing. The reorganizing system, however, which has
not only its own perception of battery voltage but a built-in reference
level for it, will cause reorganization of the behavioral systems until
they learn to treat a low battery voltage signal as an error, and until
they link this error to the adjustment of some sensed state of the external
world that does, however indirectly, result in raising the battery voltage.
After a while, this behavioral system will quickly sense low battery
voltage and do what it has learned to do to increase that signal to its
reference value, and it will do this before the reorganizing system sees
enough error to cause reorganization to begin. That's why we learn to eat
at mealtimes instead of when we're hungry. We've learned that if we eat
regularly, a certain signal will be prevented from appearing. That signal
is one which, if it gets large enough, results in an episode of
reorganization and disorganization. So we say to ourselves that that is a
"bad" signal. The feeling of hunger is a "bad" feeling.
There are lots of problems yet to be solved before a real test of my scheme
can be devised. I'm plugging away with tests of various kinds of
reorganizing in simple control systems just to find out what happens when
you do it in different ways. Some ways are terribly slow; some work so fast
that I'm suspicious of them. I've already ruled out one: reorganizing by
switching the polarity of the output effects from 1 to minus 1 doesn't
work, because it causes tremendous transients that can flip the system into
unrecoverable states. Maybe if I stick a 0 in the middle as a third
possibility it will work better. Or maybe output reorganization has to
occur on a continuum between 1 and -1. We have to find these things out
before we can come up with a workable system.
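For anyone who wants to try this sort of experiment, here is a crude version of the continuum variant (my sketch for illustration, not the code I am actually running): the output weight w random-walks between -1 and 1 while error persists. The environment's feedback sign is deliberately "wrong" at the start, so w must find its way across zero. Replace the random nudge with w = -w and you will see the transients I mean.

import random

w = 1.0               # output weight; this environment needs w near -1
o, r = 0.0, 5.0
for t in range(20000):
    qi = -o                         # environment INVERTS the action's effect
    e = r - qi                      # error in the behaving control system
    o += 0.5 * w * e                # output stage, scaled by the weight
    o = max(-100.0, min(100.0, o))  # crude saturation, as real outputs have
    if abs(e) > 0.5:                # persistent error drives reorganization
        w = max(-1.0, min(1.0, w + random.gauss(0.0, 0.1)))
print(round(w, 2), round(qi, 2))    # typically ends w < 0, qi near r = 5.0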
My main point here is that reorganization has to be based on sensing
variables with _a priori_ meaning to the organism. This rules out all, or
essentially all, environmental variables -- all of the variables that the
behavioral control systems are most concerned with controlling.
-----------------------------------------------------------------------
Best,
Bill P.