# The Monster

[From Bill Powers (970620.1322 MDT)]

I have created a monster. Gary Cziko asked for a program illustrating
reorganization, so after looking over some of my previous attempts I
decided to start over and write another one, based on my experience with
the others. While I was at it, I included something that had come up in
previous discussions, an arbitrary environment in which lots of variables
interact with each other, and where the effects of the outputs of the
control systems enter this collection at random, and the variables to be
sensed are also selected at random.

In the present version, there are 15 control systems operating, which
perceive variables made up of weighted sums from 12 sensor inputs each, and
which produce (single) output signals. These control systems act on 75
environmental variables.

The environment is structured as follows. There is an output vector with a
length equal to the number of environmental variables, loaded with randomly
selected pointers to the control system outputs; thus each output variable
is affected by a randomly-selected output from the control systems. There
is also a square matrix of randomly-selected numbers between 1 and -1,
which establishes a contribution from each environmental variable to each
other environmental variable.

On each iteration, all the environmental variables are set to zero. Then
the outputs of the control systems are added to each variable according to
the output vector, and finally the square matrix is scanned, each
environmental variable in turn having the weighted value of all the other
environmental variables added to it. The order of scanning, of course,
makes a difference in the final result. This makes the values of the
environmental variables depend on the control system outputs in a very
strange way, which I haven't bothered to characterize yet. The idea was
just to provide a very complicated environmental feedback function, with
all kinds of internal relations that are beyond my immediate grasp.
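The update just described can be sketched in Python (the program itself is in Turbo Pascal; the names here are mine, and the sketch mirrors only the environment step, not the whole program):

```python
import random

NSYS, NVAR = 15, 75

# Stand-ins for the program's state (randomly initialized, as in the post):
out = [random.uniform(-100.0, 100.0) for _ in range(NSYS)]  # control-system outputs
out_ptr = [random.randrange(NSYS) for _ in range(NVAR)]     # which output feeds each variable
env_matrix = [[1.99 * (random.random() - 0.5) for _ in range(NVAR)]
              for _ in range(NVAR)]                         # variable-to-variable weights

def environment_step():
    """One iteration: zero the variables, add the pointed-to outputs,
    then scan the square matrix in place (so scan order matters)."""
    v = [0.0] * NVAR
    for i in range(NVAR):
        v[i] += out[out_ptr[i]]
    for i in range(NVAR):          # later rows see already-updated earlier rows
        for j in range(NVAR):
            v[i] += env_matrix[i][j] * v[j]
    return v
```

Because the inner scan updates each variable in place, the result depends on the order in which the rows are visited, exactly as noted above.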

The connections from the environment to the sensors of the control systems
are also selected at random. Each of the multiple sensors at the input to
each control system is either connected or not connected to each of the
environmental variables, in a three-dimensional matrix of randomly-selected
boolean variables, each with about a 30% chance of being TRUE.

Once these environmental vectors and matrices have been loaded (part of the
initialization), they can be saved in a file so the same environmental
setup can be used again and again. Right now, however, a new environmental
setup is created each time the program is run.

Also as part of the initialization, a set of reference signal values is
chosen at random in the range from -100 to 100.

The control systems are all integrating control systems with an output
integration factor of 0.001. The perceptual signal in each system is a
weighted sum of 12 sensor inputs, derived through the input Boolean matrix
from the environmental variables. The weights are what get reorganized.

In each input function there are 12 sensors and 12 weights, and also 12
"DeltaWt" variables that get added to the existing weights on every
iteration. Reorganization consists of selecting a new set of 12 DeltaWt
variables at random, with values between -1 and 1.

The square of the error signal in each control system is compared with its
value from the past iteration. If the squared error is increasing, a
reorganization takes place: a new set of DeltaWt variables is calculated.
Each DeltaWt variable is multiplied by the absolute value of the error
signal and by a general SPEED constant (set to 1e-6), and whether or not
there has been a reorganization, the product is added to the existing
weight. The weights given to the sensor inputs are thus always changing,
but they change more slowly as the absolute error signal becomes smaller.
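Put as a Python sketch (names mine; SPEED and the update rule are as stated above), one reorganization step for a single control system looks like:

```python
import random

NIN = 12       # sensors (and weights) per control system
SPEED = 1e-6   # general speed constant from the post

def reorganize_step(weights, delta, err, err_sq_old):
    """If the squared error rose since the last iteration, pick a new
    random DeltaWt vector (an E. coli 'tumble'); either way, drift every
    weight by delta * |error| * SPEED."""
    err_sq_new = err * err
    if err_sq_new > err_sq_old:
        delta = [2.0 * (random.random() - 0.5) for _ in range(NIN)]
    weights = [w + d * abs(err) * SPEED for w, d in zip(weights, delta)]
    return weights, delta, err_sq_new
```

As the error shrinks, the drift slows in proportion, which is why the weights settle when control succeeds.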

Believe it or not, this set of control systems usually converges to the
point where the errors in all the systems are nearly zero -- sometimes zero
to three decimal places. During convergence there are often episodes where
the errors get larger again and fluctuate, but they almost always manage to
get smaller again. In the present version, the reference signals remain
constant and there are no disturbances -- one step at a time!

When the program is run, a new environmental structure is set up, and
iterations then proceed until a key is struck. On the screen at the left,
the reference signal, perceptual signal, and error signal from each control
system are displayed. On the right, all 75 environmental variables are
shown in three columns. And that's it -- that's all the program does right
now. There is a certain hypnotic fascination in watching the numbers change.

I don't know what the final input weights are, or how the outputs are connected to
the environmental variables. I don't know if the input functions have
achieved orthogonality at the end. What we have here is a monster that
needs further study. I welcome all efforts by others to find out what's
going on. All I know at the moment is that we have here a set of 15
individually-reorganizing control systems which are capable of making their
inputs match a particular reference value independently of what the others
are doing. They are reorganizing their perceptual input functions to do
this, by the E. coli method. The environment comes to some steady state,
the significance of which I do not know. But it seems to work.

The source code is appended. I will post the executable code on my ftp
site, and make a link to it on my Web page.

Best,

Bill P.


================================================================
program reorg2;

{
NSYS control systems act on an environment by putting out NSYS output
signals, o. The environment contains NVAR environmental variables, v,
each of which is affected by one randomly selected output signal,
according to the pointer vector OutMatrix. The environmental variables
also interact with one another through the weights in EnvMatrix, which
stand in for the physical laws that define the properties of the
environment.

Each of the control systems contains an input function with NIN sensors,
each sensor summing a randomly chosen subset of the environmental
variables. The weighting given to each sensor signal, InWt, is adjustable
by a reorganization system associated with each control system. The
weighted sum of the sensor signals produces a perceptual signal p, which
is compared with a reference signal r to produce an error signal e. The
error signal in each control system is multiplied by an output gain
factor g and integrated to produce the output signal.

Reorganization is driven by the error signal in each control system.
The rate of reorganization is determined by the rate of change of
the squared error, ErSqNew - ErSqOld. Each input weight has associated
with it a DeltaWt, which is multiplied by the absolute error (and by
SPEED) and added to the input weight on every iteration. Thus the input
weights continually change. Every time a reorganization occurs, a new
set of DeltaWt values is selected randomly from the range between -1
and 1.
}

uses dos, crt;

const
  NSYS   = 15;     { number of control systems }
  NVAR   = 75;     { number of environmental variables }
  NIN    = 12;     { sensors per control system }
  SPEED  = 1e-6;   { general reorganization speed }
  MAXERR = 0.0;    { threshold on the change in squared error }

type
  csystype = record
    p, r, e, g, o: single;
    Sensor:  array[1..NIN] of single;
    InWt:    array[1..NIN] of single;
    DeltaWt: array[1..NIN] of single;
    ErSqNew, ErSqOld: single;
  end;

  csysptr = ^csystype;

  envtype = record
    envvar:      array[1..NVAR] of single;
    EnvMatrix:   array[1..NVAR,1..NVAR] of single;
    OutMatrix:   array[1..NVAR] of integer;
    SenseMatrix: array[1..NSYS,1..NIN,1..NVAR] of Boolean;
  end;

var
  csys: array[1..NSYS] of csystype;
  env: envtype;
  envfile: file of envtype;
  ch: char;

procedure initializesys;
var i, j, k: integer;
begin
  for j := 1 to NSYS do
    with csys[j] do
    begin
      for i := 1 to NIN do InWt[i] := 0.0;
      o := 0.0;
      r := 200.0*(random - 0.5);   { reference values in -100..100 }
      g := 0.001;
    end;
  assign(envfile, 'environ.mat');
end;

procedure setenvironment;
var i, j, k: integer;
begin
  i := 1;   { set i to -1 instead to re-load a saved environment from disk }
  {$i-}
  reset(envfile);
  {$i+}
  if (IOResult = 0) and (i = -1) then read(envfile, env)
  else
  begin
    with env do
      for i := 1 to NVAR do
      begin
        for j := 1 to NVAR do EnvMatrix[i,j] := 1.99*(random - 0.5);
        OutMatrix[i] := random(NSYS) + 1;   { random(NSYS) yields 0..NSYS-1 }
      end;
    for i := 1 to NSYS do
      for j := 1 to NIN do
        for k := 1 to NVAR do
          env.SenseMatrix[i,j,k] := random > 0.7;   { about 30% chance of TRUE }
    rewrite(envfile);
    write(envfile, env);
  end;
  close(envfile);
end;

procedure control;
var i, j, k: integer;
begin
  for i := 1 to NSYS do
    with csys[i] do
    begin
      p := 0.0;
      for j := 1 to NIN do
      begin
        Sensor[j] := 0;
        for k := 1 to NVAR do
          with env do
            if SenseMatrix[i,j,k] then   { if connected to env variable }
              Sensor[j] := Sensor[j] + envvar[k];
        p := p + InWt[j]*Sensor[j];
      end;
      e := r - p;
      ErSqOld := ErSqNew;
      ErSqNew := e*e;
      o := o + g*e;   { integrating output }
    end;
end;

procedure environment;
var i, j: integer;
begin
  with env do
  begin
    for i := 1 to NVAR do          { zero env variables }
      envvar[i] := 0.0;
    for i := 1 to NVAR do          { output effects }
      envvar[i] := envvar[i] + csys[OutMatrix[i]].o;
    for i := 1 to NVAR do          { internal effects; scan order matters }
      for j := 1 to NVAR do
        envvar[i] := envvar[i] + EnvMatrix[i,j] * envvar[j];
  end;
end;

procedure reorganize;
var i, j: integer;
begin
  for j := 1 to NSYS do
    with csys[j] do
    begin
      if (ErSqNew - ErSqOld) > MAXERR then   { squared error increasing: tumble }
        for i := 1 to NIN do DeltaWt[i] := 2.0*(random - 0.5);
      for i := 1 to NIN do
        InWt[i] := InWt[i] + DeltaWt[i]*abs(e)*SPEED;
    end;
end;

procedure showresults;
var i, j, k: integer;
begin
  window(1, 1, 40, NSYS + 1);
  gotoxy(1, 1); clreol;
  for i := 1 to NSYS do
    with csys[i] do
      writeln('r=', r:8:3, ' p=', p:8:3, ' e=', e:8:3);
  window(41, 1, 80, 25);
  clrscr;
  for i := 0 to NVAR - 1 do
  begin
    gotoxy((i div 25)*9 + 1, (i mod 25) + 1);
    with env do write(envvar[i+1]:6:0);
  end;
end;

begin
  clrscr;
  randomize;
  initializesys;
  setenvironment;
  while not keypressed do
  begin
    environment;
    control;
    reorganize;
    showresults;
  end;
  ch := readkey;   { consume the keystroke that ended the run }
end.

[Hans Blom, 970701c]

(Bill Powers (970620.1322 MDT))

I have created a monster.

You sure have ;-). And it is, as you note, impossible to "understand"
the creature that you built. Reminds me of Frankenstein. And of all
those ALife demos where "creators" intuitively grab a number of
"laws" to describe their creatures and their environment, do a
simulation, present the simulation results, and proceed to make grand
generalizations based on their single example "world".

This approach does not attract me at all. Usually, very complex
behavior results from the few very simple "laws" that govern the
artificial "world". I see two major problems. The first is that the
situation is normally far too complex to allow any meaningful
analysis at all ("why do we observe exactly this behavior given these
laws?"). The second is that a single simulated world is hardly a good
enough "data base" for the frequently far-reaching conclusions: a
sample of size one does not allow statistically meaningful
generalizations.

... an arbitrary environment in which lots of variables interact
with each other, and where the effects of the outputs of the control
systems enter this collection at random, and the variables to be
sensed are also selected at random.

In the present version, there are 15 control systems operating,
which perceive variables made up of weighted sums from 12 sensor
inputs each, and which produce (single) output signals. These
control systems act on 75 environmental variables.

Sure, you could spend the rest of your life studying the behavior of
your Creature and come up with some answers why this Creature behaves
as it does. But why _this_ Creature, other than the reason that you
have invented it, or that it happens to be there? An in-depth study
should focus on something that might lead to a more general theory
rather than a mere explanation of some random Creature.

On the other hand, creating a Monster and proving that you are able
to control it is the stuff of which many myths and tales are woven:
it's a heroic challenge...

Greetings,

Hans

[From Bill Powers (970701.0713 MDT)]

Hans Blom, 970701c--

I have created a monster.

You sure have ;-). And it is, as you note, impossible to "understand" the
creature that you built. Reminds me of Frankenstein. And of all those
ALife demos where "creators" intuitively grab a number of "laws" to
describe their creatures and their environment, do a simulation, present
the simulation results, and proceed to make grand generalizations based on
their single example "world".

Right, I agree with you. My pseudo-model of the environment was
really just a desperation move made because I was truly stuck. I hoped to
get lucky, and didn't. I already knew that by using an orderly environment
I could get the reorganization thing to work, but I wanted to see if it
would still work without my deliberately putting a known kind of order into
the environment. Well, I still don't know.

Behind this apparently senseless approach there is really an important
question: is there really order in the world, or do we simply create an
orderly world by the way we perceive it and act on it? This is a tough
epistemological question, and all the answers people have offered turn out
to assume something that can't be shown to be true. So I thought: can a
reorganization process sort out the effects of randomly-connected outputs
on a set of randomly-connected input signals? Will the set of control
systems find independent dimensions of control, in spite of the fact that
there aren't any?

In one sense the answer was "yes." As long as I gave the control systems a
randomly-selected set of CONSTANT reference signals, all the control
systems would eventually end up with zero error. The 15 different ways of
producing perceptual signals and generating outputs ended up with the
perceptual signals all matching 15 arbitrarily-selected reference signals.
All the weights given to the sensor signals came to nearly steady values,
all of them changing more and more slowly.

I've done this again, this time with a less blindly-created environment.
What I did was to make each environmental variable depend on each control
system output (summed) through a weighting factor ranging randomly from -1
to 1, without the fancy internal interactions that I didn't understand. I
still get the same result: the input weights adjust until each perceptual
signal matches an arbitrarily set reference signal. This process takes 10
minutes or so to be sure convergence will happen. So obviously, if I
changed the reference signal _very slowly_ (like one new value every hour),
the set of 15 systems would continue to keep the errors very close to zero
in all systems.

Unfortunately, the equilibrium thus reached is unstable. If I turn off the
reorganization process, the errors, very gradually at first, start to creep
up again, and then increase faster and faster until the whole thing blows
up with a floating point error. Apparently, this arrangement of control
systems finds a point where all the perceptions match all the reference
signals, but instead of being like finding the bottom of a valley, the
process is like finding the top of a hill. As soon as you stop
hill-climbing, you fall off.

This would be interesting in itself if the control systems were really
controlling independently. But as soon as I make the reference signals
start to vary in a smoothed random way, the convergence goes away. The
reorganizing system can't tell the difference between errors caused by
incorrect input weightings and errors caused by changes in the reference
signals. Maybe if I let the program run for several months, there would be
a statistical bias in the right directions, but if there's any tendency of
that kind I can't see it over a period of an hour or two. Actually, I think
that the instability we see even with constant reference signals simply
prevents convergence when the reference signals are changing -- the
tendency of the control process is _away_ from the apparent convergence
condition. There is weak but unavoidable positive feedback in the whole
arrangement -- somewhere.

All this would be even more interesting if I could be sure that I wasn't
studying some stupid programming error. I can't find any, but that doesn't
prove there aren't any. It would be nice if someone else got interested in
this; at least a second person wouldn't make the SAME stupid programming
errors (unless God REALLY doesn't want us prying into this secret of
Creation).

I know that this sort of made-up problem is quite likely a waste of time.
But suppose it proved to be either true or false that a set of
self-reorganizing control systems could find order and get control in an
environment where the connections between output and input were really
random -- in other words, to create order out of total disorder. Either
result would be important, epistemologically.

Suppose it could be shown that in order to get stable control over an
environment, relative to _variable_ reference signals, there must be a
corresponding kind of order in that environment -- really, truly, existing
in that environment, independently of the observer/actor. Suppose this
instability I am seeing is really an inevitable implication of having an
environment with no inherent internal order. I think this would be a very
strong abstract proof of something that philosophers can only assume, and
scientists can only hope for.

The opposite conclusions would be equally interesting, if not so
encouraging with respect to "knowledge of nature." It would say that the
apparent orderliness of the world -- and of our own control systems -- is
only the result of some endless process of reorganization, that maintains
apparent order in a world that would soon degenerate into chaos if the
reorganizing process ever stopped.

All this is a hell of a lot to try to get out of a 200 or 300 line program,
but I don't seem to have a lot of control over what interests me. Right now
this seems to interest me, so I'll give it a little more effort. The next
thing I want to try is to put some order into the environment -- attributes
that are independently affected by the outputs of the systems -- and see if
we get stable control out of this thing even after reorganization is
turned off.

Best,

Bill P.

[Hans Blom, 970701e]

(Bill Powers (970701.0713 MDT))

I already knew that by using an orderly environment I could get the
reorganization thing to work, but I wanted to see if it would still
work without my deliberately putting a known kind of order into the
environment. Well, I still don't know.

Is your environment something like x = A * u, where x and u are
vectors and A is a constant matrix
with random coefficients? If so, I might have something useful to
contribute. I guess that is what you mean, because you say:

What I did was to make each environmental variable depend on
each control system output [u] (summed) through a weighting factor
ranging randomly from -1 to 1 [so -1 < A(i,j) < 1], without the
fancy internal interactions that I didn't understand.

If my guess is right, you have the "standard" environment equation,
but now in vector/matrix form. The control goal is, but in vector
notation, quite standard as well: minimize |x - s|, where s is the
setpoint vector and |x - s| is the sum of squares of the
individual components of x - s, i.e. the square of the Euclidean
distance between x and s. If all this is so, we seem to have a
standard multi-input multi-output control problem of a constant
"plant" with unknown but constant parameters -- and no dynamics.

I still get the same result: the input weights adjust until each
perceptual signal matches an arbitrarily set reference signal. This
process takes 10 minutes or so to be sure convergence will happen.

The problem might be the speed of convergence. How do you adjust the
input weights? Randomly or using some sort of hill-climbing?

Greetings,

Hans

[From Bill Powers (970701.1300 MDT)]

Hans Blom, 970701e--

If my guess is right, you have the "standard" environment equation,
but now in vector/matrix form. The control goal is, but in vector
notation, quite standard as well: minimize |x - s|, where s is the
setpoint vector and |x - s| is the sum of squares of the
individual components of x - s, i.e. the square of the Euclidean
distance between x and s. If all this is so, we seem to have a
standard multi-input multi-output control problem of a constant
"plant" with unknown but constant parameters -- and no dynamics.

I'm not sure this is exactly what I have (although you're right that there
are no dynamics). Let v[i] be the ith environmental variable, and o[j] the
output of the jth control system. We then have

v[i] = SUM{A[i,j]*o[j]},
j

where A[i,j] is a matrix with entries randomly selected between -1 and 1.
Is that the "standard" setup?

At the inputs to the jth control system, there are k sensor signals. Each
of the sensor signals comes from _one_ of the i environmental variables,
selected at random at initialization time. Each sensor for the jth control
system is associated with a weight, w[j,k], and the perceptual signal for
the jth control system is

p[j] = SUM{w[j,k]*s[k]}
k

where s[k] is the signal from sensor k.

To give an example, if there are 15 sensors per control system, and 75
environmental variables, then sensors for a given control system are
connected to, at most, 1 environmental variable out of 5.
The sensors for each control system are connected to environmental
variables at random, by associating an index number with each sensor. If
NIN is the number of input sensors per control system, and NVAR is the
number of environmental variables, then the connections are established

for k := 1 to NIN do
  index[k] := random(NVAR) + 1;   { random(NVAR) yields 0..NVAR-1 }

Duplicates are permitted.

A sensor signal is identical to the environmental variable v it is
connected to, so the equation for the perceptual signal is really

p[j] = SUM{w[j,k]*v[index[k]]}
k
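In Python terms (names mine), the perceptual signal for one control system under the connection scheme just described is simply:

```python
import random

NIN, NVAR = 12, 75

v = [random.uniform(-10.0, 10.0) for _ in range(NVAR)]  # environmental variables
index = [random.randrange(NVAR) for _ in range(NIN)]    # sensor connections (duplicates allowed)
w = [random.uniform(-1.0, 1.0) for _ in range(NIN)]     # reorganizable input weights

# p = SUM_k w[k] * v[index[k]] for this one control system
p = sum(w[k] * v[index[k]] for k in range(NIN))
```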

The overall objective is to minimize the scalar (r - p) in each control
system separately (not the global |r - p|), where r is the
reference-signal. This is done by reorganizing the weights applied to the
sensor inputs for each control system: the set of w[k] vectors. In
principle, each perceptual signal could come from a different combination
of environmental variables, through a different weighting vector, so that
negative feedback is maintained for each of the control systems
independently of the others. The hope is that the reorganizing process will
find sets of input weights for every control system that permit this
independent control.

I still get the same result: the input weights adjust until each
perceptual signal matches an arbitrarily set reference signal. This
process takes 10 minutes or so to be sure convergence will happen.

The problem might be the speed of convergence. How do you adjust the
input weights? Randomly or using some sort of hill-climbing?

Each control system reorganizes independently of the others. The basis for
reorganization is the rate of change of the squared error signal (the
absolute value also works). If the present squared error signal is greater
than the previous value, a reorganization takes place. I sample the error
on every iteration; it would also be possible to sample it less frequently,
to detect smaller rates of change of the squared error.

Reorganization is done by associating a weight-change vector dw with each
weight vector w. On each iteration, whether there is a reorganization or
not, the weight is changed according to

for k := 1 to NIN do
  w[j,k] := w[j,k] + dw[j,k]*SPEED*abs(error).

The value of SPEED is set to anywhere from 1E-5 to 1E-8, in a CONST
declaration. When a reorganization occurs, all that happens is that the
values dw[j,k] are replaced by new random numbers between -1 and 1. This
determines a new direction and speed with which the weight vector will
change on successive iterations. If the directions of change are all or
mostly correct, the error will be continually decreasing and there will be
no reorganizations. Only when the weights change enough to cause errors to
start increasing again will another reorganization take place. So this is
the E. coli principle. There will be periods of rapid reorganization when
wrong directions of change in the weights occur, with occasional periods of
no reorganization while the weights are changing in the right direction.
The weights will slowly change, on the average, in the direction that
results in fewer reorganizations.

For a single isolated control system, this is guaranteed to bring the error
to an arbitrarily small value. Notice that the amount of weight change
depends on the amount of error, so as the error gets smaller, the rate of
change of the weights also gets smaller, in proportion.

When many control systems share the same environment, affecting not only
their own inputs but the inputs of the other systems as well, each control
system is being disturbed by the control actions of the others. It seems
that when the reference signals remain constant, each system can find a set
of input weights that will minimize its own error: this has yet to fail.
But as I noted, when the reference signals are varying, there are
considerable problems with convergence -- it doesn't happen on any human
time scale. Of course there are lots of parameters to play with, and I may
just have picked the wrong ballpark. I don't know.

This model obviously has applications to social systems as well as multiple
systems inside a single organism.

Best,

Bill P.

[Martin Taylor 970701 17:50]

Bill Powers (970701.1300 MDT) to Hans Blom, 970701e--

I'm not sure this is exactly what I have (although you're right that there
are no dynamics). Let v[i] be the ith environmental variable, and o[j] the
output of the jth control system. We then have

v[i] = SUM{A[i,j]*o[j]},
j

where A[i,j] is a matrix with entries randomly selected between -1 and 1.

etc.

Bill, are there no disturbances in your environment? If there aren't, are
you really solving the control problem (i.e. finding weights such that
control errors are maintained low against variation in disturbance), or
are you really finding meta-stable balance points among the control
systems?

I note that you use

The basis for
reorganization is the rate of change of the squared error signal (the
absolute value also works).

In that part of the Little Baby project that worked, we found it better
to use both the magnitude and the derivative of the squared error:

a*e*e + b*e*de/dt (or c*e*(e + k*de/dt)).

If you just use the fact that error is increasing fast, or just the fact
that error is large, you may be induced to change a well-constructed
control unit that is currently subject to an uncorrected disturbance
(from the other control units, in your case). But if the error is
large and increasing, it's probably a good idea to modify the control
unit:-)
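A minimal sketch of how such a combined criterion might gate reorganization (the coefficients and threshold here are illustrative choices of mine, not taken from the Little Baby code):

```python
def should_reorganize(e, de_dt, a=1.0, b=1.0, threshold=1.0):
    """a*e*e weighs how large the error is; b*e*de_dt is positive when
    the error is moving away from zero. Reorganize only when the
    combined score exceeds the threshold."""
    return a * e * e + b * e * de_dt > threshold
```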

None of this may be relevant to your actual problem, but they are notions
that occur to me.

Martin

[From Bill Powers (970701.2051 MDT)]

Martin Taylor 970701 17:50 --

Bill, are there no disturbances in your environment? If there aren't, are
you really solving the control problem (i.e. finding weights such that
control errors are maintained low against variation in disturbance), or
are you really finding meta-stable balance points among the control systems?

The only disturbances acting on any one control system are the outputs of
the others. It will take some analysis to find out why the system behaves
as it does -- it's possible that there are some positive feedback loops
embedded among the negative ones. I can find out by letting the system come
to equilibrium, and then stop it, break the loop, and look at how each
output signal alone affects the corresponding perceptual signal. All the
negative loops will automatically resist disturbances, but of course any
positive loops would amplify their effects.

... the basis for reorganization is the rate of change of the squared
error signal (the absolute value also works).

In that part of the Little Baby project that worked, we found it better
to use both the magnitude and the derivative of the squared error:

a*e*e + b*e*de/dt (or c*e*(e + k*de/dt)).

The magnitude (unsigned) of the error enters when I add the deltas to the
weights. I don't know how your system used the value of the above
expression, but in my system it wouldn't work: there is no complicated
logic in it. If d/dt(e^2) > 0, reorganize. Otherwise, don't.

If you just use the fact that error is increasing fast, or just the fact
that error is large, you may be induced to change a well-constructed
control unit that is currently subject to an uncorrected disturbance (from
the other control units, in your case). But if the error is large and
increasing, it's probably a good idea to modify the control unit:-)

It's not interference from the other control systems that causes a problem;
the errors all converge to zero quite nicely, as long as I don't start
varying the reference signals. I would probably have the same problem if I
introduced variable external disturbances. One step at a time.

Best,

Bill P.

[Hans Blom, 970702]

(Bill Powers (970701.1300 MDT))

I'm trying to make sense of the Monster problem. It still isn't fully
clear to me. Let me check with you where I go wrong.

Let v[i] be the ith environmental variable, and o[j] the output of
the jth control system. We then have

v[i] = SUM{A[i,j]*o[j]},
j

where A[i,j] is a matrix with entries randomly selected between -1
and 1. Is that the "standard" setup?

Yes. In vector/matrix notation, we simply write v = A * o. I assume
that you're not very familiar with it, but this notation v = A * o
simply translates into the equivalent

v[i] = SUM{A[i,j]*o[j]},
j

or, in words: every component of v is the weighted sum of all the
components of o. Weights A[i,j] can, of course, be zero. What is
essential when using this notation is to give the dimensions of the
vectors (and thus of the matrices): how many components do v and o
have?

At the inputs to the jth control system, there are k sensor signals.
Each of the sensor signals comes from _one_ of the i environmental
variables, selected at random at initialization time.

A vector/matrix notation might be s = R * v, where R is a randomly
chosen matrix with the property that it selects one component of v to
be passed to s. That is, most of R's entries are 0 and some are 1.

Combining v = A * o with s = R * v gives s = R * A * o.

Each sensor for the jth control system is associated with a weight,
w[j,k], and the perceptual signal for the jth control system is

p[j] = SUM{w[j,k]*s[k]}
k

where s[k] is the signal from sensor k.

In vector/matrix notation, this would be p = W * s.

To give an example, if there are 15 sensors per control system, and
75 environmental variables, then sensors for a given control system
are connected to, at most, 1 environmental variable out of 5.

Let me try to summarize the above in a block diagram. The numbers in
parentheses are the dimensions of the vectors. Given your example, I
assume that there are 15 control systems with 15 reference levels r
and 15 perceptual inputs p. The environment is thus controlled by the
15 outputs of the controllers, and it generates 75 environmental
variables v, where each component of vector v is a weighted sum of
the 15 control outputs and the weights A[i,j] are random and drawn
from a uniform distribution between -1 and 1. Of these environmental
variables, 15 are selected (repetitions allowed), resulting in 15
signals s. The controller is W, and the control task is to adjust the
matrix W in such a way that |r-p| is minimized (on average and over a
sufficiently long time period).

    r   ------  o = r-p (15)
  ----->| +  |------------------------------------.
  (15)  |  - |                                    |
        ---^--                                    |
      p    |           (15)        (75)           |
           |     -----   s   -----   v   -----    |
   (15) ---'     | W |<------| R |<------| A |<---'
                 -----       -----       -----

This is how I translate your problem statement into a block diagram.
Note one advantage of the vector-matrix notation: it is still
possible to use the same old block diagrams...

I find that your problem statement results in a fairly unorthodox
controller compared to standard models, in which it is usually the
output function that is adjusted rather than the input function. So
please correct me where I misunderstand you.

Note that, if I understand you correctly, the "effective" environment
is A and R combined, i.e. the (15*15) matrix A*R, which you could
have generated directly rather than A and R independently. One
immediate conclusion is that matrix A is not _observable_: because
only some components of v can be observed, it will be impossible for
any system, however clever, to identify all entries of A. This is
unimportant for control purposes, however, because it is "only" the
matrix A*R that generates the perceived signals.

The sensors for each control system are connected to environmental
variables at random... Duplicates are permitted.

In the worst case, this means that all sensors will observe the same
single environmental signal. Very much like "walking in the dark"...
The chance that this will happen is very small, though. It is pretty
likely, however, that some sensor signals will give the exact same
information. That is equivalent to removing some of the sensors
(observations are noise-free; if they were noisy, redundancy would
help). In that case, the matrix A*R will be singular and the
overall control error most likely cannot go to zero.

The overall objective is to minimize the scalar (r - p) in each
control system separately (not the global |r - p|), where r is the
reference-signal.

A scalar r-p has no minimum; its square or its absolute value does. I
assume from your post that you want to minimize all the scalars
(r[i]-p[i])^2, for every i. If so, this would be equivalent to
minimizing their sum, which is the global error

global error = SUM (r[i]-p[i])^2
i

Note that this global, overall error can go to zero only if every
individual error term can be zero simultaneously. Global optimization
is mathematically equivalent to the individual optimization of each
individual term, at least if the overall error (and thus all
individual errors simultaneously) _can_ go to zero. Or do you mean
something different?

This is done by reorganizing the weights applied to the sensor
inputs for each control system: the set of w[k] vectors.

Yes, this is the overall goal: to choose the entries of matrix W in
such a way that a "good" or even "best" (optimal) controller results.

Each control system reorganizes independently of the others.

Why did you make this choice?

Reorganization is done by associating a weight-change vector dw with
each weight vector w.

An alternative could be: Reorganization is done by associating a
weight-change matrix dW with weight matrix W, i.e.

W := W + dW

Would doing so be reprehensible in any way? Why can an element of a
vector know of the other elements of that vector, but would it be
forbidden for an element of a matrix to know of the other elements of
that matrix? Are there neuro-anatomic reasons for your assumption?

For the moment, I'm just trying to understand your problem. A first
step is to formulate it in such a way that the problem formulation
cannot be misunderstood (by me! ;-).

Greetings,

Hans

[From Bill Powers (970702.1607 MDT)]

Hans Blom, 970702--

Yes. In vector/matrix notation, we simply write v = A * o. I assume
that you're not very familiar with it, but this notation v = A * o
simply translates into the equivalent

v[i] = SUM{A[i,j]*o(j)},
j

or, in words: every component of v is the weighted sum of all the
components of o. Weights A[i,j] can, of course, be zero. What is
essential when using this notation is to give the dimensions of the
vectors (and thus of the matrices): how many components do v and o
have?

OK, we're together this far.

Each sensor for the jth control system is associated with a weight,
w[j,k], and the perceptual signal for the jth control system is

p[j] = SUM{w[j,k]*s[k]}
k

where s[k] is the signal from sensor k.

In vector/matrix notation, this would be p = W * s.

I probably didn't write this correctly. _Each_ control system is equipped
with NIN ("number of inputs") sensors, which are randomly connected to the
environmental variables. If there are 15 control systems with 12 sensors
each, there is a total of 180 sensors. So I probably should have written
s[j,k] instead of just s[k] -- j is the index indicating which control
system is involved, and k indicates which of that control system's sensors
is meant. So I guess it should be

p[j] := SUM{w[j,k]*s[j,k]}.
k

Since k is an index of NIN pointers to environmental variables v, and the
sensor signal is just the value of the environmental variable pointed to,
the final form would be

p[j] := SUM{w[j,k]*v[j,index[k]]}
k
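In array terms (my translation of Bill's pointer form; the names are
assumed), all the perceptual signals can be computed at once:

```python
import numpy as np

rng = np.random.default_rng(1)
nsys, nin, nvar = 15, 12, 75
v = rng.uniform(-1.0, 1.0, size=nvar)        # environmental variables
index = rng.integers(0, nvar, (nsys, nin))   # index[j,k]: pointer for sensor k of system j
w = rng.uniform(-1.0, 1.0, (nsys, nin))      # input weights w[j,k]

# p[j] = SUM_k w[j,k] * v[index[j,k]] -- each system reads its own sensors
p = (w * v[index]).sum(axis=1)
```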

Let me try to summarize the above in a block diagram. ...

  r   -----  o = r-p (15)
 --->|+   |---------------------------------
 (15)| -  |                                |
      --^--                                |
        |        (15)         (75)         |
      p |  -----   s   -----    v   -----  |
 (15)---+-| W  |<-----| R  |<------| A  |<--
           -----       -----        -----

This is how I translate your problem statement into a block diagram.
Note one advantage of the vector-matrix notation: it is still
possible to use the same old block diagrams...

The problem here is that there are really 15 independent (supposedly)
controllers that share the same environment; this notation makes it look as
if there are only 15 "s" signals, whereas there are actually 180 of them,
12 for each controller. The only problem is in the part of the diagram
running right to left from A to W. We really need 15 R matrices feeding 15
W matrices, with each R matrix giving a different sampling of the variables
in v. The outputs of ONE R matrix are summed according to the corresponding
W matrix to form ONE of the p signals, p[j]. I can change your diagram
above to show how this would look for ONE of the 15 control systems; maybe
you can see how to write the compact matrix expression that includes all of
them (if there is one).

  r   -----  o = r-p (1)
 --->|+   |---------------------------------
  (1)| -  |                                |
      --^--                                |
        |        (15)         (75)         |
      p |  -----   s   -----    v   -----  |
  (1)---+-| W  |<-----| R  |<------| A  |<--
           -----       -----        -----

You'll have to look closely to see what I did: I just changed the
dimensionality of p, r, and o to (1) instead of (15).

I find that your problem statement results in a fairly unorthodox
controller compared to standard models, in which it is usually the
output function that is adjusted rather than the input function. So
please correct me where I misunderstand you.

Maybe what I'm doing would be easier to see if we change the meaning of the
diagram. Instead of thinking of 15 control systems inside one overall
system, think of 15 different people trying to control their own
perceptions by acting on an environment that is common to them. The
combination of the R and W matrices would, ideally, allow each person to
perceive a different aspect of the common environment of v's. If the
outputs were carefully arranged for each control system to affect only the
perceptual signals for the corresponding system, and if the R*W matrices
were all orthogonal, then each system could control its own perception
quite independently of the others. One might be controlling position,
another orientation, and a third brightness.

The situation I'm dealing with is different. No person knows what the
actual effects of outputs on the v's really are; the only information comes
in through the perceptual signal p. So the question is, given an unknown
(random) set of output effects on the v's, can each control system find a
way to perceive the environment, so that all the control systems still can
exert independent control of their own perceptual signals?

Obviously there is one case where this is impossible: namely, when the
equations expressing the effects of the outputs on the v's have no solution
-- that is, when the effects are linearly dependent and the Jacobian of the
result is zero. I think that's right. But to achieve that situation, the
randomly-assigned weights given to the outputs on each v would have to be
just that one unique set of weights that would create the singularity, and
the chances of that are vanishingly small. More likely is that there would
be some set of input weights for each control system that would still allow
independent control -- with some interaction with the other systems, but
not enough to prevent control. Anyway, that's the idea, whether there's
something wrong with it or not.

The sensors for each control system are connected to environmental
variables at random... Duplicates are permitted.

In the worst case, this means that all sensors will observe the same
single environmental signal. Very much like "walking in the dark"...
The chance that this will happen is very small, though. It is pretty
likely, however, that some sensor signals will give the exact same
information. That is equivalent to removing some of the sensors
(observations are noise-free; if they were noisy, redundancy would
help). In that case, the matrix A*R will be singular and the
overall control error most likely cannot go to zero.

Having explained that there are 15 sensors _per system_, not altogether, I
hope this makes the chances of the situation you mention negligible. All
that duplicates really do is multiply the weight of a given v.

The overall objective is to minimize the scalar (r - p) in each
control system separately (not the global |r - p|), where r is the
reference-signal.

A scalar r-p has no minimum; its square or its absolute value does.

Right. Sorry. abs(r-p) or (r-p)^2 is what I use.

I assume from your post that you want to minimize all the scalars
(r[i]-p[i])^2, for every i. If so, this would be equivalent to
minimizing their sum, which is the global error

global error = SUM (r[i]-p[i])^2
                i

The fact that there are 180 weights to adjust makes this approach less
feasible. I have actually tried the global-error approach using the E. coli
method, and when the environment is more orderly it does work -- very
slowly. In the present project, each control system reorganizes
individually -- as would be the case if each one were a different person.

Note that this global, overall error can go to zero only if every
individual error term can be zero simultaneously. Global optimization
is mathematically equivalent to the individual optimization of each
individual term, at least if the overall error (and thus all
individual errors simultaneously) _can_ go to zero. Or do you mean
something different?

No, I realize this. In fact the display shows a running average of the
total RMS error, to show progress. This works the other way, too: if the
individual squared errors go to zero, so does the sum.

Each control system reorganizes independently of the others.

Why did you make this choice?

Because the E. coli method becomes slower very rapidly as the number of
weights to be optimized increases.

Reorganization is done by associating a weight-change vector dw with
each weight vector w.

An alternative could be: Reorganization is done by associating a
weight-change matrix dW with weight matrix W, i.e.

W := W + dW

I guess that's what I'm doing, except that the dW matrix gets multiplied by
a scalar Speed constant and the absolute amount of error (to make the
corrections smaller as the error approaches zero). If you think of dW as
indicating a direction of change in hyperspace, reorganization amounts to
making random changes in the direction, and delaying the next
reorganization if the current direction of change causes the squared error
to decrease.
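A stripped-down sketch of this reorganization rule, applied to a toy
quadratic error surface rather than the full control loop (the toy
target, constants, and names are my assumptions, not the program's):

```python
import numpy as np

rng = np.random.default_rng(2)
nin = 12                      # weights for one control system
speed = 0.01
target = rng.uniform(-1.0, 1.0, size=nin)   # stand-in for the unknown optimum

def sq_error(w):
    return float(np.sum((w - target) ** 2))

w = np.zeros(nin)
dw = rng.uniform(-1.0, 1.0, size=nin)       # current direction of change
err_old = sq_error(w)

for _ in range(20000):
    err = sq_error(w)
    if err - err_old > 0.0:                 # error rising: "tumble" --
        dw = rng.uniform(-1.0, 1.0, size=nin)   # pick a new random direction
    err_old = err
    w += dw * abs(err) * speed              # keep drifting; the step shrinks
                                            # as the error approaches zero
```

The E. coli character is in the two rules: keep moving in the current
direction while error is falling, and tumble to a random new direction
whenever it rises.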

Would doing so be reprehensible in any way? Why can an element of a
vector know of the other elements of that vector, but would it be
forbidden for an element of a matrix to know of the other elements of
that matrix? Are there neuro-anatomic reasons for your assumption?

E. coli does this in three dimensions without, I presume, any computing
capabilities to speak of. Anyway, I don't see how an element of a matrix
can "know" anything. In neural terms, it's just the weighting of a synaptic
connection.

Several people seem interested in this. Appended is the source code of the
latest version I've been working on (I've checked the box that says "Put
text attachments in body of message" -- hope it works).

Best,

Bill P.

program reorg3;

{
NSYS control systems act on an environment by putting out NSYS output
signals, o. The environment contains NVAR environmental variables, v,
each of which is affected by one of the NSYS output signals, selected
at random (OutMatrix). The environmental variables interact
according to physical laws that define the properties of the environ-
ment.

Each of the control systems contains an input function with NIN sensors,
each sensor summing the states of a randomly chosen subset of the
environmental variables (SenseMatrix). The weighting given to each
sensor signal, InWt, is adjustable
by a reorganization system associated with each control system. The
weighted sum of the sensor signals produces a perceptual signal p, which
is compared with a reference signal r to produce an error signal e. The
error signal in each control system is multiplied by an output gain
factor g and integrated to produce the output signal.

Reorganization is driven by the error signal in each control system.
The rate of reorganization is determined by the rate of change of
the squared error, ErSqNew - ErSqOld. Each input weight has associated
with it a DeltaWt, which is multiplied by SPEED times the absolute error
to the input weight on every iteration. Thus the input weights
continually change. Every time a reorganization occurs, the size of
DeltaWt is selected randomly from the range between plus and minus
MAXDELTA.

}

uses dos, crt, graph, grutils;

const NSYS = 8;
NIN = 12;
NVAR = 75;
SPEED = 1e-7;
GAIN = 10.0;
MAXERR = 0.0;

type csystype = record
p,r,e,g,o: real;
r1,r2: real;
Sensor: array[1..NIN] of real;
InWt: array[1..NIN] of real;
DeltaWt: array[1..NIN] of real;
ErSqNew, ErSqOld: real;
end;

csysptr = ^csystype;

envtype = record
envvar: array[1..NVAR] of real;
EnvMatrix: array[1..NVAR,1..NVAR] of real;
OutMatrix: array[1..NVAR] of integer;
SenseMatrix: array[1..NSYS,1..NIN,1..NVAR] of Boolean;
end;

var csys: array[1..NSYS] of csystype;
env: envtype;
envfile: file of envtype;
ch: char;
reorg,changeref,showp : boolean;
i,countdown: integer;
sumsq, iterations: real;

procedure initializesys;
var i,j,k: integer;
begin

for j := 1 to NSYS do
with csys[j] do
begin
for i := 1 to NIN do InWt[i] := 0.0;
o := 0.0;
r := 200.0*(random - 0.5);
r1 := 0.0; r2 := 0.0;
g := GAIN;
end;
assign(envfile,'environ.mat');
end;

procedure setenvironment;
var i,j,k: integer;
begin
i := 1; { set i to -1 to load a previously saved environment }
{$i-}
reset(envfile);
{$i+}
if (IOResult = 0) and (i = -1) then read(envfile,env)
else
begin
with env do
for i := 1 to NVAR do
begin
for j := 1 to NVAR do envmatrix[i,j] := 0.5*(random - 0.5);
OutMatrix[i] := random(NSYS) + 1; { pick any one of the NSYS outputs }
end;
for i := 1 to NSYS do
for j := 1 to NIN do
for k := 1 to NVAR do
env.SenseMatrix[i,j,k] := random > 0.8;
rewrite(envfile);
write(envfile,env);
end;
close(envfile);
end;

procedure legend1;
begin
window(1,NSYS+4,50,25);
clrscr;
writeln;
writeln('(Above) Control system variables');
writeln('(Right) Environmental variables x 1E-6');
writeln('Show input weights: p');
writeln('Variable reference signals: v (toggle)');
writeln('Zero reference signals: z');
write('Quit program: q');
end;

procedure legend2;
begin
window(1,NSYS+2,80,25);
clrscr;
writeln;
writeln;
writeln('Show control system variables and env. variables: p');
writeln('Quit program: q');
end;

procedure showresults;
var i,j,k: integer;
begin
dec(countdown);
if countdown > 0 then exit;
countdown := 10;
window(1,1,50,NSYS+3);
gotoxy(1,1);
for i := 1 to NSYS do
with csys[i] do
writeln('SYS ',i:2,' r=',r:6:1,' p=',p:6:1,' e=',e:6:1, ' o=',o:8:1);
writeln(' RMS error = ',sqrt(sumsq/NSYS):8:1);
writeln(' Iteration = ',iterations:8:0);
window(51,1,80,25);
for i := 0 to NVAR-1 do
begin
gotoxy((i div 25)*9 + 1,(i MOD 25) +1);
with env do write(1E-6*envvar[i+1]:7:4);
end;
end;

procedure showwt;
var i,j: integer;
begin
dec(countdown);
if countdown > 0 then exit;
countdown := 10;
window(1,1,80,NSYS+2);
clreol;
writeln(' INPUT WEIGHTS x1E4 -->');
for i := 1 to NSYS do
begin
write('SYS ',i:2,' ');
clreol;
with csys[i] do
begin
for j := 1 to NIN do
write(1e4*InWt[j]:6:1);
writeln;
end;
end;
end;

procedure varyref;
var i: integer;
begin
for i := 1 to NSYS do
with csys[i] do
begin
r1 := 5000*(random - 0.5);
r2 := r2 + 0.005*(r1 - r2);
r := r + 0.005*(r2 - r);
end;
end;

procedure control;
var i,j,k: integer;
begin
sumsq := 0.0;
for i := 1 to NSYS do
with csys[i] do
begin
p := 0.0;
for j := 1 to NIN do
begin
sensor[j] := 0;
for k := 1 to NVAR do
with env do
if SenseMatrix[i,j,k] then {if connection to env variable}
sensor[j] := sensor[j] + envvar[k];
p := p + InWt[j]*Sensor[j];
end;
e := r - p;
ErSqOld := ErSqNew;
sumsq := sumsq + e*e;
ErSqNew := {abs(e);} e*e;
o := o + g*e - 1e-3*o;
end;
end;

procedure environment;
var i,j: integer;
begin
with env do
begin
for i := 1 to NVAR do {zero env variables}
envvar[i] := 0.0;
for i := 1 to NVAR do {output effects}
envvar[i] := envvar[i] + csys[OutMatrix[i]].o;
for i := 1 to NVAR do
for j := 1 to NVAR do {internal effects}
if i <> j then
envvar[i] := envvar[i] + EnvMatrix[i,j] * envvar[j];
end;
end;

procedure reorganize;
var i,j: integer;
begin
for j := 1 to NSYS do
with csys[j] do
begin
if (ErsqNew - ErSqOld) > MAXERR then
for i := 1 to NIN do DeltaWt[i] := 2.0*(random - 0.5);
for i := 1 to NIN do
InWt[i] := InWt[i] + DeltaWt[i]*abs(e)*SPEED;
end;
end;

begin
clrscr;
randomize;
initgraphics;
restorecrtmode;
clrscr;
initializesys;
setenvironment;
ch := chr(0);
reorg := true;
changeref := true;
showp := false;
iterations := 0.0;
countdown := 10;
legend1;
while ch <> 'q' do
begin
environment;
control;
if reorg then reorganize;
if showp then showwt else showresults;
if changeref then varyref;
if keypressed then
begin
ch := readkey; { fetch the key so the case below and the loop test see it }
case ch of
'r': reorg := not reorg;
'v': changeref := not changeref;
'p': begin
clrscr; showp := not showp;
if showp then legend2 else legend1;
end;
'z': for i := 1 to NSYS do csys[i].r := 0;
end;
end;
iterations := iterations + 1.0;
end;
closegraph;
end.

Date: Wed, 2 Jul 1997 18:31:32 -0600
From: Bill Powers <powers_w@FRONTIER.NET>
Subject: Re: Flip-flops
To: CSGnet <CSGNET@POSTOFFICE.CSO.UIUC.EDU>

[From Bill Powers (970702.1747 MDT)]

Martin Taylor 970702 15:30--

Particularly at the category
level and above, but also at the lower levels, we have to conceive of a
Reference Input Function that combines the contributions of the
higher-level outputs. Whenever you have dealt with this explicitly, you
have always taken this Reference Input Function to be a simple summation,
and this can often work. That's the same kind of thing I do when I say
that all the Perceptual Input Functions together act like a multi-layer
perceptron--each PIF is seen as a sum-and-squash node.

Much of what you say, I think, stems from your concept of the levels in HPCT
as a big perceptron. But the problem with the perceptron approach as an
explanation of _everything_ is that there are kinds of perceptions that
"sum-and-squash" doesn't explain. A perceptron seems to be good for one
kind of thing: categorizing. So it's no wonder that you want to see
categories at every level. However, a perceptron wouldn't be much good at
perceiving, for example, an event, which is simply a space-time package of
lower-level signals. Nor would it be much good at doing logic, or grasping
principles.

Consider the proposal I made in B:CP for how a sequence detector might
work. It starts on p. 141, in the chapter on Control of Sequence (which at
that time was at the level now called "events"). The final arrangement for
recognizing a 3-element sequence is drawn on p. 144. It uses a series of
three single-neuron set-reset flip-flops connected so that an output occurs
only if the three input signals occur in the right order. My point is not
that this is the best or only design possible, but that it bears no
resemblance at all to a perceptron. It uses a grand total of 3 neurons, one
for each element of the sequence. I point out in the book that this way of
recognizing phoneme-sequences as words would require only an estimated 1/6
cubic centimeter of brain tissue for a vocabulary of 150,000 words -- with
the most wasteful possible design.
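The B:CP figure itself isn't reproduced here, but the logic of a chain of
set-reset flip-flops that fires only when three inputs arrive in the right
order can be sketched as follows (a toy rendering of the idea, not
Powers's actual circuit):

```python
def sequence_detector(inputs, pattern=("A", "B", "C")):
    """Each stage 'sets' only if the previous stage is already set;
    a pattern element arriving out of order resets the chain."""
    stage = 0                              # number of flip-flops currently set
    for sym in inputs:
        if stage < len(pattern) and sym == pattern[stage]:
            stage += 1                     # set the next flip-flop
        elif sym in pattern:               # out-of-order element: reset
            stage = 1 if sym == pattern[0] else 0
    return stage == len(pattern)           # fires only after A, B, C in order
```

Symbols not in the pattern are simply ignored, so the detector tolerates
intervening unrelated signals, as a neural version presumably would.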

One of the concepts behind HPCT is that each level is characterized by a
kind of input transformation typical of that level, across all modalities.
Going from intensities to sensations seems to require nothing more than
simple weighted summations. But going from sensations to configurations
involves extracting invariances with respect to orientation, size, color,
and so forth -- while preserving _continuous_ relationships. The kinds of
computations required at this level must include things approximating
trigonometric functions, and other processes nobody has even guessed at
yet. I have never heard of a perceptron that comes even close to
reproducing the human ability to recognize objects in any orientation, and
to allow us to perceive the attributes of configurations as _continuously
variable_. I don't know of any perceptron that can put out a signal
representing the orientation of a hand, a signal that varies smoothly from
one value to another as the hand is rotated from front to back.

The perceptron is clever, it's mathematically understood, it's been
demonstrated to work in very simple situations, but I can't accept it as
the general model of all the levels of perception. I don't think there is
ANY principle of computation that is common to all the levels. I think that
each new level introduces principles of computation that don't exist at any
lower level. The performance of perceptrons has been given a very generous
evaluation by those who are eager to see even slight hints of complex
capabilities in it, but anyone who looks coldly on what has been
accomplished in this way would be far less impressed. About the only thing
a perceptron IS good at, as far as I know, is categorization. But that is
just one level of perception among many.

I really don't want to get into discussions of detailed models and what is
wrong with them. If you can pull your various ideas together into a
convincing, internally consistent, and workable model, by all means do so.
But I think you have a way to go.

Best,

Bill P.

[Hans Blom, 970703]

(Bill Powers (970702.1607 MDT))

Thanks for the feedback, which made clear that I misunderstood your
problem. I thought I was to understand that all control systems
operated within a single organism. Only in that case would it be
allowable to have an overall (reorganizing and/or control) system that
adjusts all the weights W[i,j]. Now, however, it seems that you discuss
"social control":

Instead of thinking of 15 control systems inside one overall system,
think of 15 different people trying to control their own perceptions
by acting on an environment that is common to them.

That changes the problem: now there can be no _overall_ controller,
and we have to be satisfied with designing individual ones.

Now a system can become a good controller only if it somehow adjusts
its characteristics to the environment. This can be done with or
without building an explicit model of the environment. It frequently
helps to consider what kind of model _could_ be built, even if one
doesn't actually want to implement one. There are, in theory, two
approaches: 1) attempt to build a complete model of the environment,
in this case including models of the other actors, or 2) build a
partial model of the environment, which will therefore include
uncertainty ("noise", "disturbances"). It appears (but I'm not sure,
because this might depend on the values of NIN versus NSYS) that in
your Monster insufficient information is available for building a
complete model, even though the other actors have extremely simple
characteristics. The best approach is thus, I believe (for now), to
model the actions of the other controllers as noise.

Let's do so, and modify the problem statement and its diagram
accordingly. Each individual controller gets the following block
diagram:

  r   -----  o = r-p (1)
 --->|+   |---------------------------------
  (1)| -  |                                |
      --^--                                |
        |        (12)         (75)         |
      p |  -----   s   -----    v   -----  |
  (1)---+-| W  |<-----| R  |<------| A  |<--
          |    |      |    |       |    |<---- d (14)
           -----       -----        -----

Note my changes to your diagram: I explicitly added the 14 other
controller outputs as the disturbance vector d. I also changed the
dimension of s from 15 to 12, assuming that each controller has 12
sensors, since you say:

_Each_ control system is equipped with NIN ("number of inputs")
sensors, which are randomly connected to the environmental
variables. If there are 15 control systems with 12 sensors each,
there is a total of 180 sensors.

I see another problem.

Instead of thinking of 15 control systems inside one overall system,
think of 15 different people trying to control their own perceptions
by acting on an environment that is common to them. The combination
of the R and W matrices would, ideally, allow each person to
perceive a different aspect of the common environment of v's.

The environment (A in the block diagram) has 15 inputs and 75
outputs. Since the environment is linear (simply weighted summation),
constant (matrix A does not change over time) and noise-free (A has
no random components), those 75 outputs do not span a 75-dimensional
space but only a 15-dimensional sub-space: there is a great deal of
redundancy in those 75 outputs; all of them are some linear
combination of only 15 inputs. Thus, in principle, it would be
sufficient to have A generate 15 (independent!) outputs; the other 60
outputs do not provide extra information.
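Hans's subspace claim is easy to verify numerically (the matrix sizes are
from the discussion; the uniform weights are an assumption):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.uniform(-1.0, 1.0, (75, 15))   # 15 controller outputs -> 75 variables

# Collect many environment states v = A*o for random output vectors o.
O = rng.uniform(-1.0, 1.0, (15, 500))
V = A @ O                              # 75 x 500 matrix of v columns

# Despite having 75 components, the v's span only a 15-dimensional subspace.
rank = np.linalg.matrix_rank(V)
```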

Also, let me restate the goal of the parameter adjustment of vector
W. You say:

No person [individual controller] knows what the actual effects of
outputs on the v's really are; the only information comes in through
the perceptual signal p. So the question is, given an unknown
(random) set of output effects on the v's, can each control system
find a way to perceive the environment, so that all the control
systems still can exert independent control of their own perceptual
signals?

I restate the Monster's problem as follows: The goal of learning (the
adjustment of the components of W) is to maximally pass the signal o
to p, while maximally suppressing the influence of d on p. In other
words, maximize the signal (o) to noise (d) ratio in p. Sounds like a
filtering problem, when expressed this way...

Your Monster problem also reminds me of the "learning" that takes
place in Rick's "mind reading" demo, if you change vector W into a
selector (which passes one component of signal s) with adjustable
gain, rather than the weighted summator that it is now.

Another consideration is each controller's loop gain A*R*W. Have you
analyzed within which limits it can vary, given your assumptions, and
assuming perfect disturbance suppression (d=0)?

Having explained that there are 15 sensors _per system_, not
altogether, ...

Are there 12 or 15 sensors per system? What may (or may not, I'm not
sure yet) be important is whether the number of sensors per system is
(effectively, copies excluded) as large as or larger than the number
of controllers. Only then might one hope that W can be adjusted so
as to suppress _all_ the disturbances caused by the control actions
of the other controllers. But whether this is a correct supposition
requires more analysis. Alas ;-).

Note that in your program code this problem does not arise: the
number of controllers NSYS is only 8 rather than the 15 that I
assumed above, whereas the number of sensors NIN is 12. The chance
that fewer than 8 independent components of s are available to W is
thus negligible.

I suggest one change to your Monster program. To make the problem
more tractable and have well-defined dimensions (particularly NIN
compared to NSYS), I suggest constructing selector R in such a way
that sensory copies cannot occur.

Do I understand you so far? And am I making sense?

Greetings,

Hans

[Martin Taylor 970703 09:50]

[Hans Blom, 970703] to (Bill Powers (970702.1607 MDT))
The best approach is thus, I believe (for now), to
model the actions of the other controllers as noise.

I agree with this.

Let's do so, and modify the problem statement and its diagram
accordingly. Each individual controller gets the following block
diagram:

  r   -----  o = r-p (1)
 --->|+   |---------------------------------
  (1)| -  |                                |
      --^--                                |
        |        (12)         (75)         |
      p |  -----   s   -----    v   -----  |
  (1)---+-| W  |<-----| R  |<------| A  |<--
          |    |      |    |       |    |<---- d (14)
           -----       -----        -----

The environment (A in the block diagram) has 15 inputs and 75
outputs. Since the environment is linear (simply weighted summation),
constant (matrix A does not change over time) and noise-free (A has
no random components), those 75 outputs do not span a 75-dimensional
space but only a 15-dimensional sub-space: there is a great deal of
redundancy in those 75 outputs; all of them are some linear
combination of only 15 inputs.

This is correct.

Thus, in principle, it would be
sufficient to have A generate 15 (independent!) outputs; the other 60
outputs do not provide extra information.

But here we have a problem. Within the 75-dimensional environmental
space there are two 15-dimensional subspaces, not one. (I take the
dimensionality of s to be 15, not 12, since Bill P has used 15 in most
of his messages). The subspace of s is not necessarily the same as the
subspace of o, and indeed is likely to have a very small projection on o.
This means that with most random weight choices, almost all of o is
side-effect, very little of it affecting s directly. I don't know what
the distribution of effect would be for 15-dimensional subspaces in a
75-dimensional space, but we can look at averages. If I'm right, the
expected direction cosine of the projection of each output dimension
onto a single basis direction of the output space would be sqrt(1/75)
or about 0.12, and the total projection onto the 15-dimensional subspace
would be sqrt(15/75), or about 0.45 (I haven't verified the computation,
but it sounds right).
output goes to disturbing the other control units, and more than half
of the energy of each of the other outputs goes to disturbing this one,
on average.
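Martin's figures can be spot-checked by Monte Carlo. For a random
direction in 75 dimensions, the RMS cosine with any one axis is
sqrt(1/75) and the RMS projection onto a fixed 15-dimensional subspace is
sqrt(15/75); taking the subspace to be the first 15 coordinates loses no
generality for random directions (the sketch below is mine):

```python
import numpy as np

rng = np.random.default_rng(4)
dim, sub, n = 75, 15, 100000
u = rng.normal(size=(n, dim))
u /= np.linalg.norm(u, axis=1, keepdims=True)    # random unit vectors

axis_cos = np.sqrt(np.mean(u[:, 0] ** 2))        # ~ sqrt(1/75), about 0.115
sub_proj = np.sqrt(np.mean(np.sum(u[:, :sub] ** 2, axis=1)))  # ~ sqrt(0.2)

# On average only 15/75 = 20% of an output's "energy" lands in its own
# 15-dimensional subspace; the rest goes to disturbing the other units.
energy_in = np.mean(np.sum(u[:, :sub] ** 2, axis=1))
```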

Now take a slightly different look at the probabilities.

There are 15 different control units. Considering only the output of
control unit k, we can ask what is the probability that the projection
of ok (output subspace of control unit k) onto sk (input subspace of
control unit k) is larger than its projection onto some one of sj (j != k).
That probability is presumably 1/14. In other words, it is highly
probable that the actions of unit k have more effect on the perception
of some unit j (pj) than on its own perception (pk). And it is, by
the same logic, highly likely that the greatest influence on pk comes
from some unit j other than k.

Following this notion further, for any unit k, on average there will be
7 other units whose outputs have a greater effect on pk than does ok.
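The symmetry argument can be spot-checked with a short simulation (a sketch of my own, not part of the monster program): if the 15 projection magnitudes are exchangeable, unit k's own output should on average be exceeded by 7 of the other 14.

```python
# A spot-check (my own sketch, not Bill's program) of the symmetry argument:
# if the 15 projection magnitudes are exchangeable, unit k's own output is
# exceeded, on average, by 7 of the other 14 outputs.
import random

random.seed(4)

TRIALS, UNITS = 4000, 15
exceed_count = 0
for _ in range(TRIALS):
    # exchangeable stand-ins for the 15 projection magnitudes onto pk
    mags = [abs(random.gauss(0.0, 1.0)) for _ in range(UNITS)]
    own = mags[0]  # unit k's own output
    exceed_count += sum(1 for m in mags[1:] if m > own)

print(exceed_count / TRIALS)  # close to 7.0
```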

In principle, this may not be a problem, since pk and pj are as likely
to be nearly orthogonal as are pk and ok, so that the _average_ magnitude
of the effect of ok on pj is the same as on pk, and ok is affected
by its influence on pk but not on pj.

Now reorganization enters the picture, but it does not affect the 15
dimensional subspace of s within the 75-dimensional environment. That's
fixed by the environmental feedback function. It affects only the
projection from that subspace onto the one-dimensional pk. So each
control unit can find the maximum projection of its output vector
onto the one-dimensional perception vector _within the fixed subspace sk_.
And it can also minimize the projection of oj onto pk summed over all j.
Best control is where the ratio of the projection of ok onto pk to
the projection of the summed oj onto pk is maximized.

What is the probability that this projection ratio exceeds unity? I
don't know the answer to this question, but I suspect that the probability
is fairly low, and will depend on the ratio of the dimensionality of
the environmental space to the output and sensor dimensionalities. The
current ratio is 5 in each case. It may be too large to allow the
control units _ever_ to reorganize to have more effect on their own
perceptions than the combined influence of the other control outputs.
Perhaps the threshold comes where the projection of the output subspace onto
the input subspace is greater than 0.5. If that's so, then Bill's reorganized
systems should not explode when the dimensionality of the environment
is less than 60 (for 15 sensors and 15 outputs per control unit). I
guess that's a tentative prediction that is easily tested, to see if these
notions are on a reasonable track.

------------------

This whole analysis suggests that the reorganization procedure needs a
second string to its bow.

Reorganization can crudely be described as changing "what to perceive"
and "what to do about it". Bill's procedure assumes a fixed sensor
array for each control unit, within which reorganization affects what
is to be perceived. That's a naturally realistic and valid thing to do.
But on the output side it is not necessarily realistic that the
reorganization be restricted to acting only on pre-fixed variables in
the environment. That would be like requiring a person to control the
flow of water from a faucet while restricting the control actions to
flicking light switches.

The "monster" might converge better if there were a slow process that
allowed the 15-dimensional outputs to change their subspaces within the
environmental space--in other words, to disconnect from some variable
and reconnect to another at random. Presumably this would be done by
reducing the weight on some variable that the e-coli measures suggest
is having an unfavourable effect and when any weight gets to zero
bringing in a new variable at zero weight. (The "unfavourable" effect
presumably would occur because the variable has little direct influence
on the controlled perception and therefore is more strongly influencing
other units' perceptions in a way that causes their outputs to counter the
induced disturbance, back-creating disturbances for the control unit
being reorganized).
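The disconnect-and-reconnect idea can be sketched as follows (my own construction with made-up step sizes and names, not Bill's program): weights drift under reorganization, and any weight that passes through zero hands its connection to a different, randomly chosen environmental variable.

```python
# A minimal sketch (my construction, not Bill's actual program) of the slow
# output-reconnection process: each output weight drifts, and any weight
# that reaches zero hands its connection to a new random variable. In a
# real run the deltas would themselves come from e-coli reorganization.
import random

random.seed(2)

N_ENV = 75   # environmental variables
N_OUT = 15   # output connections per control unit

def reorganize_step(connections, weights, deltas):
    """Drift each weight; reconnect (at zero weight) any that crosses zero."""
    for i in range(len(weights)):
        new_w = weights[i] + deltas[i]
        if new_w * weights[i] <= 0.0:   # passed through (or sits at) zero
            weights[i] = 0.0
            connections[i] = random.randrange(N_ENV)
        else:
            weights[i] = new_w

conns = [random.randrange(N_ENV) for _ in range(N_OUT)]
wts = [random.uniform(0.2, 1.0) for _ in range(N_OUT)]
deltas = [-0.1] * N_OUT     # "unfavourable effect": reduce every weight
for _ in range(12):
    reorganize_step(conns, wts, deltas)
print(all(w == 0.0 for w in wts))  # True: every weight was driven to zero
```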

I say a _slow_ process above, to avoid getting into the kind of vicious
cycle that Bill observed when he reorganized both inputs and outputs
at the same rate in an earlier experiment. The feedback dynamics of
that process (changes in output affect optimum perceptual weighting
and vice-versa) led to unstable dynamics, which should not happen if
the spectra of the two reorganizing processes are significantly
different.

Of course, one has to allow also for sign-flipping reorganization as well
as dimension-changing, and I'm not clear how to do this without
inducing large transients if it can't be done by passing the weight
smoothly through zero. There's probably a suitable trick. Perhaps
the old output variable remains a candidate for choice.

Hope this helps.

Martin

[Martin Taylor 970703 23:59]

Bill Powers (970703.1018 MDT) to Hans Blom, 970703--

Commenting on my message about the 15-dimensional subspace of output
having small probable overlap with the 15-dimensional subspace of
perceptual input.

>[Martin made a] similar comment and offered an estimate. However, Martin's
>estimate was based on a misinterpretation (easily fixed) of the situation.
>In fact, EACH system's output affects ALL 75 environmental variables,
>through weights chosen at random from -1 to 1. So the chances of a
>system's output affecting its own input, or the inputs of other systems,
>are not at all small. This extreme diffusion of output effects is probably
>causing some of my problems.

Yes, I did make a bad mistake, but not the one you claim. As you say,
I was thinking that the output connected to 15 variables, rather than
all 75. But that wasn't the _bad_ mistake. The bad mistake was to forget
that the output is _one_ dimensional, immovable within the 75-D
environmental space. That makes the situation much worse than my
analysis--or it would do so if I hadn't made a compensating mistake
of _forgetting_ that I thought the output was 15-dimensional, and
actually treating it as unidimensional:-)

So although the conception was wrong, the statement was wrong, and
the analysis didn't agree with the statement of problem, the stupidities
all seem to have cancelled out, leaving a valid analysis--I hope.

The point is that with a fixed set of weights connecting output k to
the 75 environmental variables, the projection of that one dimension
into the 15-dimensional subspace of the sensor system is quite small
on average.

There's a common situation that may account completely for your finding:
the electric blanket where the two controls have inadvertently been
swapped to the wrong sides of the bed. He finds it too cold, and turns
up the heat. She finds it too hot and turns down what she thinks is her
side, making him colder. Positive feedback. In the case of the monster,
something similar probably is happening. The control unit that can
flip the light switch affects the lights perceived by the guy that
handles the faucet, so the light-switch guy _can_ control the tap
after reorganization has flipped the signs appropriately, and perhaps
the light-switch guy can also control the lights.

But there's a likelihood of conflict here, in that the loop feedback
is either positive or negative and the loop is stable only if the
feedback is negative. If it's negative, it seems as if the light
and the tap will tend to stabilize somewhere on a one-dimensional arc
through the 2-D space of light-level and water rate. Probably the
two reference levels will not be on this arc, and there will be conflict.

The situation in the monster isn't as bad as that, but it is still serious
if I'm right (I'm not at all sure I am) that with an environment of 75
variables of which a random 15 are sensed by each control system, only 0.44
of the output goes to the "own" perceptual signal and 0.56 goes to
disturbing the others. Why I'm not at all sure I'm right is that 0.44 seems
likely to be the total influence on the 15-dimensional subspace
(unless I've taken a square root when I shouldn't, in which case
0.44 should become 0.2). I don't think that the optimally reorganized
one-dimensional perceptual vector can collect all of that 0.44.
So the ratio is probably a lot worse than I suggested, and the
destabilizing effects stronger.

If we think again of light-switches and faucets, it's as if the
control unit perceiving the water flow could move a handle that affects
the water flow weakly and the light strongly, and the one perceiving
the light could move a handle affecting the light weakly and the
water strongly. There is less tendency to instability than if all
the control goes through the other party, but the instability is
still there (the "other"s force can overwhelm the "own" force if
there is a conflict).

I think it is still the case that the output weights should be
reorganizeable if the "monster" is to be stably controlled. The
output vector in the 75-D space must project more strongly onto
its own perceptual vector than the outputs of the other control
units do, at least if its own side effects are influencing the
others as strongly as they are influencing it. One cannot alter
one's sensor systems, but one can alter what to hit, pull, push
or tweak in the environment when one is trying to affect what
one senses.

Martin

[From Bill Powers (970704.0621 MDT, Autonomy Day)]

Martin Taylor 970703 23:59 --

>I was thinking that the output connected to 15 variables, rather than all
>75. But that wasn't the _bad_ mistake. The bad mistake was to forget that
>the output is _one_ dimensional, immovable within the 75-D environmental
>space. That makes the situation much worse than my analysis--or it would
>do so if I hadn't made a compensating mistake of _forgetting_ that I
>thought the output was 15-dimensional, and actually treating it as
>unidimensional:-)

>The point is that with a fixed set of weights connecting output k to
>the 75 environmental variables, the projection of that one dimension
>into the 15-dimensional subspace of the sensor system is quite small
>on average.

Since the output weights range from -1 to 1, the average projection should
be zero save for the limited number of random variables being averaged.

Your treatment suggests another way to look at it. Any control system's
output affects all 75 environmental variables. The input function samples
as many as 12 of them (there could be some duplicates), meaning that each
sample is affected by that system's own output. If all 12 input samples had
the same sign and magnitude of weighting, the expected effect of output on
input would be zero, because the output effects are symmetrically
distributed around zero.

However, the input weights are adapting, so if the adaptation were ideal,
each negative weight Wout from an output would go with a negative input
weight Win, each positive weight with a positive weight, and the net gain
through the environment would be the sum of the 12 absolute values of
Wout*Win. So the environment function would be a positive nonzero constant
as it needs to be (given error = (r-p)).
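The sign-matching argument can be illustrated numerically (a sketch with illustrative numbers of my own, not code from the monster program):

```python
# A toy illustration (numbers mine) of the sign-matching point: if each
# adapted input weight w_i copies the sign of its output weight a_i, every
# term a_i*w_i is non-negative, so the net environmental gain SUM(a_i*w_i)
# is a positive constant, as a negative-feedback loop requires.
import random

random.seed(3)

a = [random.uniform(-1.0, 1.0) for _ in range(12)]  # fixed output weights

# "ideally adapted" input weights: random magnitude, sign copied from a_i
w = [random.uniform(0.0, 1.0) * (1.0 if ai >= 0.0 else -1.0) for ai in a]

gain = sum(ai * wi for ai, wi in zip(a, w))
print(gain > 0.0)  # True: the gain is the sum of the |a_i * w_i|
```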

The "disturbance" that Hans showed in his environment function,
representing the effects of the other 14 systems, is the sum of 14 random
variables with an expected value of zero, multiplied by weights with an
expected value of zero. I say "random" variables because the reference
signals of the other 14 systems are varying randomly around zero (through a
low-pass filter) and the weightings given to the other 14 output signals
are randomly distributed with a mean of zero, for each environmental
variable. In principle, if there were an infinite number of other systems,
the average disturbance applied to any given control system would be zero,
I think.

The main problem in the E. coli style reorganizing method is that when the
reference signal is allowed to vary, the error signal varies because of
changes in the reference signal instead of because the parameters being
reorganized are changing. What should be a continual series of changes
toward the right values becomes a series of reorganizations that happen
when they shouldn't happen, or don't happen when they should. If the error
is diminishing because the reference signal is changing toward the value of
the perceptual signal, reorganizations that should happen don't happen, and
so on. This is probably why the reorganizing process works when the
reference signals are constant, but not when they are changing.

In order to make the reorganizing effects dominate the effects of changing
the reference signals, it would be necessary, I think, to make the
reference signals change only very slowly. The rate at which the weights
are changing should have a greater effect on the rate of change of error
signal than the changes in reference signal have. I'm not sure this is a
practical requirement for a behavioral system. For an intrinsic system,
where the reference signals do change very slowly, it might work.

I've been trying to think of a way to detect the state of the control
system that would provide a better basis for reorganization without being
disturbed by variations in the reference signal. Try this on:

The effective environmental function at any time (including the input
weights) is a slowly-changing constant, the changes being due to
reorganizing effects. If the constant is positive as it should be, the
perceptual signal will vary proportionately to the reference signal. It
will be, in fact, a constant proportion of the reference signal. What is
desired is for that proportion to be positive and close to 1. So if the
reorganizing processes sensed the ratio p/r, and compared it with a
reference value of 1, the error signal in the reorganizing system would be
constant even if the reference signal were changing. The rate of change of
the error signal would indicate whether the system were approaching better
control or getting worse.

There is a problem in sensing a ratio when the denominator goes through
zero. So our ratio computations would be made only for significantly
nonzero values of the perceptual signal. But it would be better to avoid
the range near zero altogether, if possible. This suggests (and I'm sure
this has already occurred to you) that we should deal with one-way control
systems. And if we're doing that neurologically-correct thing, the next
idea that suggests itself is that it would be even more neurologically
correct to make the perceptual signal represent the logarithm of the sum of
sensor inputs. In that case, the reference signal would specify the desired
log value, and the error signal would then become the difference between
two logarithms -- which is the log of their ratio!

That seems worth trying.
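The log-perception suggestion can be sketched in a few lines (a hypothetical illustration of the idea, not code from the program; the function name and numbers are mine):

```python
# Sketch of the suggestion above: represent the perception as the log of
# the (one-way, positive) sensor sum, so the error becomes log r - log p =
# log(r/p). If the effective environmental gain is constant, p is a fixed
# proportion of r, and the error stays constant while r varies.
import math

def log_error(r, p):
    """Reorganizing error: log(r/p); both quantities must be positive,
    which is why one-way control systems are assumed."""
    return math.log(r) - math.log(p)

# Here p = 0.5 * r in both cases; the error depends only on the ratio:
e_small = log_error(10.0, 5.0)
e_large = log_error(200.0, 100.0)
print(abs(e_small - e_large) < 1e-12)  # True: only r/p matters
```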

-------------------------------
With all that said, I have to report an uncomfortable discovery. When I
change the constants in the program so there is just 1 control system with
1 input, reorganization proceeds very slowly if at all, even with a
constant reference signal. This is what I was afraid of -- there is
something wrong with the program. When I watched the changes in the SINGLE
weight being reorganized, it would increase slowly as it should, but when
it reached some value like 350 (I've been playing with the other
parameters), it would suddenly drop to 50 or -20 and then start rising
again from there -- or it would go so far negative that the system ran away
and halted with a floating point error. This should be impossible, because
the reorganizing steps are very small. I'm showing only every 10th
iteration, but even so, it should not be possible for a weight increasing
at a rate of 1 or 2 per display period (at most) to suddenly decrease by
200. Once, just before this sudden decrease, I saw that the weight had
increased in the space of 10 iterations from 300 to 2500.

This strongly suggests some sort of arithmetic error such as an overflow or
underflow, or else that the control loop has become unstable. The latter
makes no sense, since we routinely model control systems of this kind and
the _supposed_ parameters in the loop don't come anywhere near the level
where instability could result. It is possible that, since this program
involves multiplying very large numbers by very small ones, and adding the
resulting small number to a largish one, there isn't enough dynamic
range in the floating-point calculations for them to be performed correctly.

Well, there's no point in speculating; I just have to start over with a
simple control loop and before introducing reorganization, make sure it
works over a wide range of parameters, and then try to track down the
problem. It's really embarrassing to go through all these analyses of the
results when the program itself has a glitch in it.

Best,

Bill P.

[Martin Taylor 970705 10:55]

Bill Powers (970704.0621 MDT, Autonomy Day) --

Martin Taylor 970703 23:59 --

>>The point is that with a fixed set of weights connecting output k to
>>the 75 environmental variables, the projection of that one dimension
>>into the 15-dimensional subspace of the sensor system is quite small
>>on average.

>Since the output weights range from -1 to 1, the average projection should
>be zero save for the limited number of random variables being averaged.

The _arithmetic_ average projection of the output vector onto any one
sensor is zero. But that's not measuring the expected magnitude of the
effect of the output vector on a randomly weighted sensor. The effect is
there whether it is in the positive or the negative direction. What you
want is the RMS value, not the arithmetic average. It's an energy
consideration--how much of the energy of the output is likely to affect
the sensor. And then, since the 15 sensors are orthogonal in the
space of the 75 environmental variables, the total energy is 15 times
the energy expected to be projected onto one sensor. The amplitude
of the effect is the square root of the total energy, and that gives
the effective gain of the environmental feedback function. I computed
that as sqrt(15/75), or 0.44 on average.

One can look at it geometrically. The output vector defines a direction
in 75-dimensional environment space. The sensor is linked to one variable
of the environment, so it defines one of the 75 axes. By Pythagoras'
theorem in 75 dimensions, the square of the length of the output vector
is the sum of the squares of its projections onto the 75 axes. Since the
weights of the output vector onto the 75 axes are randomly chosen,
the expected length of the projection onto one axis is the same as
onto any other axis. So the expected projection onto any one axis is
sqrt(1/75) of the output. Some will be higher, some lower, but that's
the average.

Now take the space defined by 15 randomly chosen sensors. The projections
onto these 15 axes define a vector in this 15-dimensional space. It has
a length whose square is the sum of the squared lengths of the individual
projections. That sum is 15/75, so the length of the output vector
projected into the 15-dimensional subspace is expected to be sqrt(15/75).
or 0.44. That's the maximum proportion of the output that can be detected
by the best optimized set of sensors connected to those particular
variables. Again, some sets of 15 will do better than this, some worse.

In a 15-D sensor space randomly placed in a 75-D environmental space,
no more than 0.44 of the output (on average) will be seen by the sensors,
and 0.56 of the output will be side-effect. On any specific "other"
15-D subspace, there will also be an effect averaging 0.44 of the output.
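These numbers can be cross-checked with a quick Monte Carlo run (my own sketch; the monster itself is not simulated here):

```python
# Monte Carlo cross-check (my own sketch, not part of the monster program)
# of the geometric claim above: a random direction in 75-D space has an RMS
# projection of sqrt(1/75) ~ 0.115 onto any single axis, and sqrt(15/75)
# ~ 0.447 onto a randomly chosen 15-dimensional coordinate subspace.
import math
import random

random.seed(1)

N, K, TRIALS = 75, 15, 2000

def random_direction(n):
    """Uniformly random unit vector: Gaussian components, normalized."""
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

axis_sq = 0.0  # accumulated squared projection onto one axis
sub_sq = 0.0   # accumulated squared projection onto a 15-D subspace
for _ in range(TRIALS):
    v = random_direction(N)
    axis_sq += v[0] ** 2
    sub_sq += sum(x * x for x in v[:K])

print(math.sqrt(axis_sq / TRIALS))  # close to sqrt(1/75)  ~ 0.115
print(math.sqrt(sub_sq / TRIALS))   # close to sqrt(15/75) ~ 0.447
```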

I think this gives a clue also to the probable magnitude of the
"disturbance" effects of the other control systems. If the effect of
one's own output on the other systems is expected to be 0.44, then
its effect on oneself is also going to average 0.44 regardless of
how its sensor weights adapt during reorganization. One's own effect,
feeding back through the action of the other, will have an expected
value of 0.44^2, or 0.20, within the 15-D subspace. This "disturbance"
vector will not be in the same direction within the 15-D space as
the projection of one's own output. It has a projection onto the
"optimum" vector that can be calculated by the same argument:
sqrt(0.2^2/15) or about .051. The 14 "other disturbances" are
independent, so the total loop-back disturbance is about 0.2*sqrt(14/15).

This is not noise, but part of the "own" environmental feedback
function that involves some extra delay. There are presumably
also second, third,... order loops that also should be considered,
but I haven't thought about them. I guess each probably is less
by a similar ratio, and it would be important to know whether the
infinite sum converges or diverges.
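The arithmetic behind the figures in the last few paragraphs can be restated directly (my transcription of the formulas as given in the text; the small differences from 0.44 and .051 are just rounding):

```python
# The quoted figures restated as arithmetic, using the formulas as given:
import math

direct = math.sqrt(15.0 / 75.0)                  # own projection, ~0.447
loop_back = direct ** 2                          # one round trip, 15/75 = 0.2
onto_optimum = math.sqrt(loop_back ** 2 / 15.0)  # ~0.052 (quoted as .051)
total = loop_back * math.sqrt(14.0 / 15.0)       # 14 independent paths, ~0.193

print(round(direct, 3), round(loop_back, 2), round(onto_optimum, 3),
      round(total, 3))
```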

One cannot assign an overall loop gain like this, though, because it
doesn't factor in the gains within the other control systems. However,
since the whole structure is symmetrical, the "own" gain should be
expected to have the same value on average as the gain of each of
the other control units (unless the reorganization process induces
some kind of positive feedback that splits them apart). If this
is the case, the effect of one's own output fed back delayed through
the outputs of the "other" control units can be treated as a
contribution to one's own loop gain that is a bit less than half
the gain through one's direct environmental feedback.

If this approach is right, the infinite sum converges to something
near a factor of 2: the "other" control units have nearly twice as
much effect on one's own perception as one does directly. This means
that their adaptations affect one's performance more than does one's
own reorganization, possibly leading to a "crossed electric blanket"
syndrome, with runaway parameters. But that's only intuition, and a rather uncertain
intuition at that.

To the loop-back through the other 14 control units must be added
the noise-like disturbance produced by those other units, each of
which on average produces the same magnitude of output, giving a noise
again equal to about 0.44*sqrt(14/15)--almost equal to the influence
of the "own" output. It's still a difficult situation, even when
the programming gets fixed.

>-------------------------------
>With all that said, I have to report an uncomfortable discovery.
>...

>Well, there's no point in speculating; I just have to start over with a
>simple control loop and before introducing reorganization, make sure it
>works over a wide range of parameters, and then try to track down the
>problem. It's really embarrassing to go through all these analyses of the
>results when the program itself has a glitch in it.

I sympathize; I can't say how often similar things have happened when I
try programming. It's a good part of the reason I don't try it much
any more.

Martin

[From Bill Powers (970705.1353 MDT)]

Martin Taylor 970705 10:55--

>The _arithmetic_ average projection of the output vector onto any one
>sensor is zero. But that's not measuring the expected magnitude of the
>effect of the output vector on a randomly weighted sensor. The effect is
>there whether it is in the positive or the negative direction. What you
>want is the RMS value, not the arithmetic average. It's an energy
>consideration--how much of the energy of the output is likely to affect
>the sensor. And then, since the 15 sensors are orthogonal in the space of
>the 75 environmental variables, the total energy is 15 times the energy
>expected to be projected onto one sensor. The amplitude of the effect is
>the square root of the total energy, and that gives the effective gain of
>the environmental feedback function. I computed that as sqrt(15/75), or
>0.44 on average.

If you start with the outputs of a given control system, each environmental
variable will be connected to the output of the system with a specific
weight a, and the sensor signal will be weighted by another specific number
w prior to the summation. Once these weights have been selected, the
overall effect is simply that of a multiplier constant converting the
output into a value of the perceptual signal.

       output             sensors    input
       weights                       weights
      |--a1-----v1 --------->S1 ----w1---->|
      |--a2-----v2 --------->S2 ----w2---->|
o --->|    .     .            .     .      |---SUM ---> p
      |--a3-----vn---------->Sn ----wn---->|

Note that the variables not directly involved in the transfer function from
o to p simply do not appear.

>One can look at it geometrically. The output vector defines a direction
>in 75-dimensional environment space. The sensor is linked to one variable
>of the environment, so it defines one of the 75 axes. By Pythagoras'
>theorem in 75 dimensions ...

I think you're approaching the problem in much too complex a way. What we
have here is a simple summation:

     p = SUM(an*wn)*o
          n

Your foray into 75-dimensional space may be appropriate when trying to
analyze the reorganization process, but to determine the average loop gain
of a given control system it's not necessary.

>One cannot assign an overall loop gain like this, though, because it
>doesn't factor in the gains within the other control systems.

This is true. However, the more "other" systems there are, the more likely
it is that the disturbance applied to them will result in returned
disturbances of all magnitudes and both signs, with an average effect of
zero. Keep in mind that the input weights can be positive or negative;
much of what you say seems to imply that the weights are all positive.

Best,

Bill P.

[Martin Taylor 970705 17:30]

Bill Powers (970705.1353 MDT) --

>If you start with the outputs of a given control system, each environmental
>variable will be connected to the output of the system with a specific
>weight a, and the sensor signal will be weighted by another specific number
>w prior to the summation. Once these weights have been selected, the
>overall effect is simply that of a multiplier constant converting the
>output into a value of the perceptual signal.

>       output             sensors    input
>       weights                       weights
>      |--a1-----v1 --------->S1 ----w1---->|
>      |--a2-----v2 --------->S2 ----w2---->|
>o --->|    .     .            .     .      |---SUM ---> p
>      |--a3-----vn---------->Sn ----wn---->|
>
>Note that the variables not directly involved in the transfer function from
>o to p simply do not appear.

Precisely. But they do appear in the effect o has on the other
control units!!!!!!!

>>One can look at it geometrically. The output vector defines a direction
>>in 75-dimensional environment space. The sensor is linked to one variable
>>of the environment, so it defines one of the 75 axes. By Pythagoras'
>>theorem in 75 dimensions ...

>I think you're approaching the problem in much too complex a way. What we
>have here is a simple summation:
>
>     p = SUM(an*wn)*o
>          n
>
>Your foray into 75-dimensional space may be appropriate when trying to
>analyze the reorganization process, but to determine the average loop gain
>of a given control system it's not necessary.

The loop gain of a given control system is controlled by the
output function as well as those weights. The diagram describes
the environmental feedback gain.

>>One cannot assign an overall loop gain like this, though, because it
>>doesn't factor in the gains within the other control systems.

>This is true. However, the more "other" systems there are, the more likely
>it is that the disturbance applied to them will result in returned
>disturbances of all magnitudes and both signs, with an average effect of
>zero.

The average effect is proportional to the square root of the number
of "other" systems, and does not approach zero.

>Keep in mind that the input weights can be positive or negative;
>much of what you say seems to imply that the weights are all positive.

NOTHING I've said has that implication. NOTHING WHATEVER.

I guess I give up. You said originally you didn't understand about
basis spaces, and I had hoped that my simple descriptions would
help. I was wrong. Sorry to have wasted your time.

Martin

[Hans Blom, 970707b]

(Martin Taylor 970703 09:50)

>Reorganization can crudely be described as changing "what to
>perceive" and "what to do about it".

Great! That's one I will remember!

>Bill's procedure assumes a fixed sensor array for each control unit,
>within which reorganization affects what is to be perceived. That's
>a naturally realistic and valid thing to do. But on the output side
>it is not necessarily realistic that the reorganization be
>restricted to acting only on pre-fixed variables in the environment.
>That would be like requiring a person to control the flow of water
>from a faucet while restricting the control actions to flicking
>light switches.

Yes, I have some problems with Bill's model, too. The most important
one is that the Monster's sensory equipment is utilized very badly:
although Bill _says_ that it has 12 (or 15?) sensors, it "really" has
only one, because it must combine all sensory information into only
one one-dimensional "perception". That's bad; it "wastes" all of the
sensors but one. Things are only slightly improved in that it can
pick a "best" perception. Control would be far better if the Monster
could utilize a full 12-dimensional observation. Even in Bill's model
this would be possible, e.g. by varying the input weights ("time-
multiplexing"), but not if the weights are to converge to constant
values.

The other deficiency is, as you note, that there is no multi-
dimensional output function. Actually, there is no output function at
all: the loop gain is currently determined by the input function.
That may create problems: a higher loop gain will create tighter
control, even though this is a process with diminishing returns at
high gains. But since _all_ controllers will discover this, there
will most likely be a run-away process where every controller will
increase its gain without limit. Very much like an evolutionary race
where, if predators become better, the prey has to become better as
well.

I find Bill's problem fascinating, though, in several respects. It
poses a number of important questions. First, how can a controller
function well (if at all) if it perceives only a small part of all
the dimensions of the environment in which it lives? We all have that
problem. For instance, we see only 180 degrees of a possible visual
field of 360 degrees; rabbits do much better :-). Are there ways in
which we can add the missing information, if only imperfectly? Or
ways in which the missing information becomes less important? We
usually hardly notice this problem, I guess, but in spy novels the
hero -- who lives in a very unpredictable environment -- routinely
sits down with his back to the wall in that restaurant where he knows
all the rear exits.

Second, if the currently proposed Monster cannot stably remain in
control (my current guess), could it be a good controller if other
Monsters imparted information to it? If so (my current
guess), _overall_ ("social") control would become possible -- and
essential. Humans have the same problem. Due to their limited visual
field, for instance, soldiers in battle often (even spontaneously)
seem to assume back-to-back positions.

This also seems a problem where information theory might come in
handy. The Monster will not be able to control if 1) it perceives too
little and 2) can act in too few dimensions. What must it perceive at
the very least and how should it be able to act at least if control
is to succeed? There are a number of "levels" of consideration. In
one, the assumption is that each Monster may vary its reference level
arbitrarily at any time. In another, each Monster has the additional
information that the other Monsters have a goal (actually, the same
goal), and thus that they will _not_ vary their reference levels
arbitrarily.

The Monster is a Monster indeed, and a full analysis -- of its
behavior as it is now and how its behavior could be given extra
assumptions -- might well take a lifetime...

Greetings,

Hans

[Hans Blom, 970707c]

(Bill Powers (970703.1018 MDT))

>I see the control systems inside an organism as operating
>autonomously, save for the settings of their reference signals, so
>if we posit reference signal inputs, we have the same situation as
>in 15 different people.

I doubt, even though it's a good start, whether that is a good
comparison. I routinely observe that others tell me a great deal, both
about themselves and about their goals. This information, too, has to be transmitted
through the environment, of course. Strictly seen, I can only observe
others' _actions_, words included. But it seems that a great deal of
those "actions" tells me something about others' higher levels. It
may even be that without this type of information no control will be
possible. Since both reference levels and weights may change
continually, for perfect control to be possible it appears necessary
for each Monster to track (or at least be capable of describing) the
other Monsters' goals and the weights of their
perceptual input functions. I'm aware that you do not strive for
perfect control, but a certain -- as yet unestablished -- minimum of
knowledge is required in each Monster for it to be a "good-enough"
controller.

The actual dimensionality of the environment, as far as any one
control system is concerned, is the number of input sensors the
system has (and even those are boiled down to one dimension as far
as the perception, reference, error, and output are concerned).

The actual dimensionality of each Monster's environment is one as
soon as the weights have settled down to constant values. As long as
the Monster can "manipulate" the weights, it sees more. If seeing
more helps, why would the Monster even want to freeze the weights?
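
Hans's point can be sketched in a few lines of Python (the names and
sizes are assumptions for illustration, not the program's actual
code): however many sensors feed the input function, the weighted sum
collapses them to a single scalar, so each loop perceives in only one
dimension.

```python
import random

# Assumed setup: one control system with 12 sensors, as in Bill's Monster.
NUM_SENSORS = 12
weights = [random.uniform(-1.0, 1.0) for _ in range(NUM_SENSORS)]
sensors = [random.uniform(-1.0, 1.0) for _ in range(NUM_SENSORS)]

# The perceptual input function: a weighted sum (a dot product).
# Twelve sensor values go in; one scalar perception comes out.
perception = sum(w * s for w, s in zip(weights, sensors))
```

As long as the weights are free to vary, the system can in effect look
along different directions of its sensor space, which is the extra
"seeing" Hans refers to.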

My original idea in using 75 dimensions was to provide an
environment with an approximation to an _infinite_ number of
dimensions -- at least lots more dimensions than there are sensors
in any one control system.

Yes, that's what makes the problem so interesting to me: is it
possible to be in control even though we perceive only very little of
the environment?
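
The environmental feedback function described at the top of this
thread can be sketched roughly as follows (the names are assumptions;
the structure follows Bill's description: variables zeroed, outputs
added via a random pointer vector, then a sequential in-place scan of
the square matrix, so scan order matters):

```python
import random

N_ENV, N_SYS = 75, 15  # environmental variables and control systems
random.seed(2)

# Output vector: each environmental variable is driven by one
# randomly selected control-system output.
pointers = [random.randrange(N_SYS) for _ in range(N_ENV)]
# Square matrix of random weights between -1 and 1.
matrix = [[random.uniform(-1, 1) for _ in range(N_ENV)] for _ in range(N_ENV)]

def environment(outputs):
    # Variables zeroed, then outputs added according to the pointer vector.
    v = [outputs[pointers[i]] for i in range(N_ENV)]
    # Sequential scan: each variable in turn receives the weighted values
    # of all the others -- updated in place, so order affects the result.
    for i in range(N_ENV):
        v[i] += sum(matrix[i][j] * v[j] for j in range(N_ENV) if j != i)
    return v

env = environment([random.uniform(-1, 1) for _ in range(N_SYS)])
```

Because later rows see the already-updated values of earlier rows,
this is exactly the "very strange" dependence Bill mentions.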

In other words, maximize the signal (o) to noise (d) ratio in p.

That will be the _effect_ achieved, I agree.

Sounds like a filtering problem, when expressed this way...

Yes, but the disturbance occupies exactly the same bandwidth that
the perceptual signal occupies (it is produced by identical
systems).

I wasn't thinking of bandwidths but of the "cocktail party effect",
where we have this uncanny ability to selectively hear the voice of
one speaker while relegating everything else to background "noise".

Another consideration is each controller's loop gain A*R*W. Have
you analyzed within which limits it can vary, given your
assumptions, and assuming perfect disturbance suppression (d=0)?

Since the values of W are incremented and decremented on every
iteration, in principle there is no limit to the possible loop gain,
unless raising the gain starts to increase the squared error and
sets reorganization going.
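
A minimal sketch of the E. coli-style adjustment being described (the
names, step size, and exact rule are assumptions, not the program's
actual code): each weight drifts in a fixed direction, and when the
squared error increases, a new random direction is "tumbled" to.

```python
import random

def reorganize(weights, squared_error, prev_error, directions, step=0.01):
    """One reorganization step: tumble on worsening error, then drift."""
    if squared_error > prev_error:
        # Error got worse: pick new random directions of change.
        directions[:] = [random.choice((-1.0, 1.0)) for _ in weights]
    # Speed of change proportional to the error, as in Bill's scheme.
    rate = step * squared_error
    return [w + d * rate for w, d in zip(weights, directions)]

weights = [0.0] * 12
directions = [1.0] * 12
weights = reorganize(weights, squared_error=4.0, prev_error=1.0,
                     directions=directions)
```

Nothing in this rule bounds the weights themselves, which is why the
loop gain can in principle keep growing until the error turns upward.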

In a different Monster, the input weights could be chosen such that
the input gain would remain 1, whereas the (one-dimensional) output
gain could be varied. That would be a very similar Monster, but the
reference level would have a different meaning.

I'm sure you're right. Say, maybe we could learn something by giving
each controller just 1 sensor! I'll try it.

That could make things only worse.

I'm trying to avoid doing anything too systematic, for fear of
imposing my own order on the environment. At least I'm trying to
leave the system as unordered as possible. I'm sort of pretending to
be an amoeba who doesn't understand anything at all Out There.

But note that the environment is not unordered at all: if each
Monster knew every other Monster's reference level and weights,
each Monster would have full knowledge of everything! In
fact, I suppose that if the Monsters were to conspire and somehow
develop "social control" by exchanging private information, they'd be
much better controllers!

Greetings,

Hans

[Hans Blom, 970707f]

(Bill Powers (970705.1353 MDT)) to (Martin Taylor 970705 10:55)

Martin:

One can look at it geometrically. The output vector defines a
direction in 75-dimensional environment space. The sensor is linked
to one variable of the environment, so it defines one of the 75
axes. By Pythagoras' theorem in 75 dimensions ...

One cannot assign an overall loop gain like this, though, because it
doesn't factor in the gains within the other control systems.

Bill:

This is true. However, the more "other" systems there are, the more
likely it is that the disturbance applied to them will result in
returned disturbances of all magnitudes and both signs, with an
average effect of zero. Keep in mind that the input weights can be
positive or negative; much of what you say seems to imply that the
weights are all positive.

Martin is right here, I believe. Bill, think of a particle in
Brownian motion. It is quite a paradoxical phenomenon. The best
_prediction_ -- and maybe the naive intuition -- is that the particle
will not move, since a very great many water molecules hit it
uniformly from all sides. But that is true only on average, and it
neglects the effects of statistical fluctuations in the number of hits.
Even though there is an "average effect of zero", there is still
macroscopic movement.
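
A small numerical illustration of this point (the numbers are chosen
only for illustration): many zero-mean hits do average to zero, yet
the typical *net* displacement does not stay at zero -- its magnitude
grows like the square root of the number of hits.

```python
import random

random.seed(1)
N_HITS = 10_000   # "molecule hits" per trial, each +1 or -1
N_TRIALS = 200    # independent particles

displacements = []
for _ in range(N_TRIALS):
    # Each hit pushes the particle one unit left or right, equally likely.
    displacements.append(sum(random.choice((-1, 1)) for _ in range(N_HITS)))

mean = sum(displacements) / N_TRIALS
rms = (sum(d * d for d in displacements) / N_TRIALS) ** 0.5
# The mean is near 0, but the RMS displacement is near sqrt(10000) = 100:
# an "average effect of zero" with plenty of macroscopic movement.
```

So canceling *on average* is not the same as canceling in any given
run, which is Hans's objection to relying on the returned
disturbances summing to zero.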

Greetings,

Hans

[From Bill Powers (970707.2127 MDT)]

Hans Blom, 970707b--

Yes, I have some problems with Bill's model, too. The most important
one is that the Monster's sensory equipment is utilized very badly:
although Bill _says_ that it has 12 (or 15?) sensors, it "really" has
only one, because it must combine all sensory information into only
one one-dimensional "perception". That's bad; it "wastes" all of the
sensors but one. Things are only slightly improved in that it can
pick a "best" perception. Control would be far better if the Monster
could utilize a full 12-dimensional observation. Even in Bill's model
this would be possible, e.g. by varying the input weights
("time-multiplexing"), but not if the weights are to converge to
constant values.

I think you and Martin are misconstruing my intentions with the
"monster" investigations. I am not proposing a model of learning; I am
merely exploring some properties of this kind of learning model. At the
moment, I am trying to see what happens when there is a fixed set of sensors
and actuators connected in an unknown way to an environment with a lot of
degrees of freedom. The current investigation assumes fixed output
weightings and variable input weightings. The two other cases remain to be
looked at: fixed input weightings and variable output weightings, and
variable input and output weightings.

In principle, if each environmental variable is affected by all the outputs
of the control systems through randomly selected weights from -1 to 1, a
single control system could adjust its own input weights so they were
exactly complementary to the output weights: that is, negative output
weights would lie in paths where the input weight was also negative for the
same control system. This would maximize the effect of a system's output on
its own input, giving the greatest degree of control (in comparison with
weightings that did not match in sign).
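
The sign-matching argument can be checked directly (all names here are
assumptions for illustration): the effect of a system's own output on
its own perception is the dot product of the output weights and the
input weights, and by the triangle inequality that dot product is
largest in magnitude when the input weights match the output weights
in sign.

```python
import random

random.seed(0)
# Randomly selected output weights from -1 to 1, as in the model.
output_weights = [random.uniform(-1.0, 1.0) for _ in range(12)]

def own_loop_gain(input_weights):
    """Effect of this system's output on its own perception."""
    return sum(a * w for a, w in zip(output_weights, input_weights))

# Complementary weights: each input weight takes the sign of the
# corresponding output weight.
matched = [1.0 if a >= 0 else -1.0 for a in output_weights]
# Same magnitudes, random signs, for comparison.
rand_signs = [random.choice((-1.0, 1.0)) for _ in output_weights]

# The matched gain equals the sum of |output weight|, the maximum
# attainable, so it always beats (or ties) any other sign pattern.
```
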

The other deficiency is, as you note, that there is no
multi-dimensional output function. Actually, there is no output function at
all: the loop gain is currently determined by the input function.
That may create problems: a higher loop gain will create tighter
control, even though this is a process with diminishing returns at
high gains. But since _all_ controllers will discover this, there
will most likely be a run-away process where every controller will
increase its gain without limit. Very much like an evolutionary race
where, if predators become better, the prey has to become better as
well.

Each control system is given a basic output gain of 100 or 1000, but this
is only to put the numbers in a convenient range for display. The loop gain
is also affected by the gain through the environmental part of the loop,
but that is irrelevant here; what is controlled is a perceptual signal, not
an objective state of the environment.

There need be no competition among the various control systems. Each one
affects the environment through a different pattern of weightings, and if
reorganization succeeds each control system perceives it through a
different pattern of weightings. I think that when a solution exists, the
reorganizing process will end up making the dimensions of control as
orthogonal as possible.

Where output patterns are far from orthogonal, we do indeed see large
outputs developing which are _almost_ in opposition to other large outputs.
But the "almost" is important; the fact that the outputs are not _exactly_
opposing means that it is still possible to arrive at independent control
of the various perceptual signals.
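
One simple way to quantify how "almost orthogonal" or "almost
opposing" two systems' weight patterns are (an illustrative measure,
not something the program computes) is the cosine of the angle
between their weight vectors: 0 means fully orthogonal, -1 exactly
opposing.

```python
def cosine(u, v):
    """Cosine of the angle between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)

print(cosine([1, 0, 0], [0, 1, 0]))   # 0.0: orthogonal, no interference
print(cosine([1, 0, 0], [-1, 0, 0]))  # -1.0: exactly opposing outputs
```

Any cosine strictly between -1 and 1 leaves a component along which
the two systems can act independently, which is the "almost" doing
the work in the paragraph above.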

The reorganizing process occurs at a speed proportional to the absolute
error signal. Thus as the error decreases, the amount of change of the
parameters per iteration also decreases. However, as you say, it is possible
that the gain might eventually become large enough to make the system
unstable (although I haven't seen that yet). The cure is simple: let the
parameter values being reorganized gradually decay toward zero, or require
that the error be above some threshold for the parameters to be changed at
all. I have had these fixes in mind for some time, but since the basic
problem hasn't yet appeared I haven't bothered to incorporate either of
them into the program.
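
A minimal sketch of the two fixes just mentioned (the names, decay
rate, and threshold are assumptions, not values from the program):
(1) let the reorganized parameters decay gradually toward zero, and
(2) change them only when the error exceeds a threshold.

```python
def update_weights(weights, deltas, error, decay=0.999, threshold=0.05):
    """Apply a reorganization step with both stabilizing fixes."""
    if abs(error) > threshold:                # fix 2: dead zone around zero
        weights = [w + d for w, d in zip(weights, deltas)]
    return [w * decay for w in weights]       # fix 1: slow decay toward zero

# Above threshold: weights change, then decay slightly.
w_above = update_weights([1.0, -2.0], [0.1, 0.1], error=0.5)
# Below threshold: no reorganization, decay only.
w_below = update_weights([1.0, -2.0], [0.1, 0.1], error=0.01)
```

Either fix caps the runaway-gain scenario: the decay opposes unbounded
growth, and the threshold stops reorganization once control is good
enough.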

The problem right now is that something is basically wrong with the
program, and I have to find time to track it down and fix it before the
rest of the investigation can mean anything. I will be rather busy this
week, so will probably not make any progress until next weekend.

Best,

Bill P.