[From Bill Powers (970124.0500 MST)]
I suppose I will go to my grave still trying to figure out this problem, but
I can't seem to let go of it. I know there is something very wrong with the
"modern control theory" approach, _as a model of how living systems work_,
but each time I try to see what it is I come up against my mathematical
handicap. This leads to disdainful comments such as "Oh, well, that's a
pity, because you will never grasp the subtleties of this concept unless you
can follow the mathematics." It's a bit like being told by a Frenchman that
there are some concepts which only the French language can express, and that
we English speakers can't even imagine.
Somehow I can't bring myself to believe either the mathematician or the
Frenchman. Why is that? Is it just a matter of protecting my ego, so I can
go on pretending to have had a few good ideas despite my mental retardation?
Am I really an _idiot savant_ who can tell you instantly what day of the
week January 5th fell on in the year 199,307 BC, but can't figure out how a
farmer is going to divide his livestock among his sons? Am I simply a child
among adults, pretending to knowledge while the grown-ups smile at each
other and pat my head condescendingly? Oh, I tell you, some awful thoughts
occur to me as I grapple with this problem, interfering seriously with my
ability to think clearly about it. The poor ego tries to protect itself from
invalidation, all the while fearing that every horrible criticism is the truth.
This morning I almost have the answer, although as usual the idea in my head
as I woke up is changing shape before my eyes. Before it slips entirely
away, let me put it into inadequate words.
It is possible to devise methods of control using mathematical calculations.
For example, let's consider Bruce Abbott's model of the E. coli system, in
which a complicated logical process was proposed and actually implemented as
a computer program.
In this program, the system computed four logical conditions in which
either an increase or a decrease of the reinforcer occurred at a time when
the previous behavioral act of tumbling had either improved the situation or
made it worse. According to the outcome, the probability density of the next
tumble was increased or decreased. Bruce repeatedly affirmed that this was
not meant to be a model of the real E. coli, but only an exercise in
applying reinforcement theory.
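Just to give the flavor of that logic, here is a reconstruction of the kind
of rule involved, in program form. This is emphatically not Bruce's program;
the condition names, the adjustment sizes, and which way each condition
pushes the probability are placeholders of my own.

    # A reconstruction of the KIND of rule described above, not Bruce
    # Abbott's actual program.  The four conditions combine (reinforcer
    # increased / decreased) with (previous tumble helped / hurt); the
    # sign and size of each adjustment are placeholders.
    ADJUSTMENT = {
        ("reinforcer_up",   "last_tumble_helped"): +0.05,
        ("reinforcer_up",   "last_tumble_hurt"):   -0.05,
        ("reinforcer_down", "last_tumble_helped"): -0.05,
        ("reinforcer_down", "last_tumble_hurt"):   +0.05,
    }

    def next_tumble_probability(p, reinforcer_change, last_tumble_effect):
        p = p + ADJUSTMENT[(reinforcer_change, last_tumble_effect)]
        return max(0.0, min(1.0, p))     # keep the probability in [0, 1]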
Yet I was unconvinced that this program was or even could be a simulation of
any organism doing anything. Not knowing how to justify my lack of
conviction, I kept (relatively) silent. I had to admit that Bruce's model
"worked." It produced calculated behavior that would indeed have the right
kind of effect.
This morning, I would put my objection this way. The calculations performed
in Bruce's program would indeed produce the right kind of result, but that
result could be obtained only by an organism that could have written Bruce's
program. BRUCE could obtain that result, but the organism he was supposedly
modeling couldn't, unless it could also write programs and translate their
results into actions. In a way that I still can't put clearly into words,
Bruce's program is a description of the behavior of the organism, but not a
simulation of the means by which that behavior is produced. In other words,
Bruce's program would apply just as well to an E. coli organism that
actually creates the observed behavior by a means that involves neither
logic nor probability densities. It may be true that these logical
conditions and probability densities can be seen to hold in the observed
behavior, but while they _describe_ what we observe, they don't _make the
observed behavior occur_.
Now let's get to the harder problem: modern control theory. The basic
principle behind modern control theory is a way of transforming a reference
input into an action that will bring the state of an external plant to a
specified condition matching the reference specification. If r is the
reference signal, u the output, and x the external variable to be
controlled, then
x = f(u),
u = f^-1(r),
and therefore, substituting the second equation into the first,
x = f(f^-1(r)) = r
So if the output u is transformed into a state of x by the environmental
function f, we can make x = r simply by inserting the inverse of the
environmental function between r and u. The rest of the model, the part
including a forward world-model of the environment and the Extended Kalman
Filter method of adjusting the model, is just a way of adjusting the
parameters in the function f^-1(r) so they produce the right result. This
makes the model adaptive, which is important, but let's just focus on the
final result, after adaptation is complete. The above equations describe the
_point_ of using the EKF method. The means of achieving the adaptation is
secondary to the question of what it is that results from successful
adaptation, however it is achieved.
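To see what those equations amount to in the simplest possible case, suppose
the environment is just the linear function x = a*u + b. The following
sketch (the constants are invented, and in the real scheme they would have
to be discovered by the adaptive part of the model) shows that once the
inverse is in hand, x ends up equal to r by pure calculation, with no
feedback from x at all:

    # Sketch of control by inverting the environment function.  The
    # "plant" here is just x = a*u + b; a and b are invented constants.
    a, b = 2.0, 5.0

    def f(u):                    # environment: x = f(u)
        return a * u + b

    def f_inverse(r):            # controller: u = f^-1(r)
        return (r - b) / a

    r = 37.0                     # the reference: the state we want x to have
    u = f_inverse(r)             # output computed by inverting f
    x = f(u)                     # what the environment then does
    print(x)                     # 37.0 -- x = r, without ever sensing x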
As Hans Blom has mentioned once or twice in passing, this basic method
depends on the existence of an inverse of the function f over at least some
limited region. Nothing is said about how the inverse is calculated. While
we can perhaps imagine an actual simulation of the environment existing in
the forward form, it is hard to imagine how the inverse of that same form
could also be calculated. Since, after adaptation, it is only the existence
of the inverse that is critical, we have to ask how the correct inverse
could be obtained.
The answer given to this question is simply an exposition on matrix algebra.
The reciprocal of a matrix is the adjoint of the matrix divided by its
determinant, or so says my old textbook. If you look up how this is
calculated in detail, you find a large array of calculations done in compact
form with multiple indices and conventional geometric ways of laying out the
underlying equations.
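For the smallest case, the two-by-two matrix, that prescription comes down
to this:

    A = | a  b |          A^-1 = 1/(ad - bc) * |  d  -b |
        | c  d |                               | -c   a |

provided the determinant ad - bc is not zero. With more rows and columns the
array of cofactors grows very rapidly, which is where the "large array of
calculations" comes from.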
Now obviously, if someone is designing a control system and is able to
compute the inverse of the environmental function using matrix algebra, that
person will be able to compute the output u that will make x = r. It will
help a lot if those computations can be done automatically, because
computing the inverse with pencil and paper will be slow, and doing the
matrix multiplications to calculate the required output at each moment will
also be slow, too slow to keep up with all but the most glacial control
processes. A fast digital computer would be essential.
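In schematic form, the whole arrangement looks like the following sketch,
with an invented two-by-two environment matrix and numpy doing the inversion
and the multiplications; nothing here is anyone's actual controller:

    # Sketch: the environment is taken to be a constant linear map
    # x = A u, with A an invented 2x2 matrix.  At each moment the
    # controller computes u = A^-1 r -- the "inverse of the environmental
    # function" spoken of above.
    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.5, 3.0]])           # environment: x = A @ u
    A_inv = np.linalg.inv(A)             # its inverse, found numerically

    for r in (np.array([1.0, 0.0]),      # a few successive reference states
              np.array([0.0, 1.0]),
              np.array([4.0, -2.0])):
        u = A_inv @ r                    # output computed from the inverse
        x = A @ u                        # what the environment then does
        print(np.allclose(x, r))         # True each time: x = r, by calculation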
This is a perfectly valid model of an organism controlling something. But it
is a valid model only if the organism is a human being who understands
matrix algebra and knows how to program a computer to carry out the
mathematical calculations. It could not possibly be, for example, a model of
the way a dog or a chimpanzee, or a person who does not understand matrix
algebra, controls something. It's a particular way of controlling that
depends on a particular brain's ability to handle symbols according to the
relevant rules.
Of course some people do understand matrix algebra and matrices of Laplace
transforms and all that, and they know how to program computers and they
have machine shops and electronics shops that can build the required
interface components, so it is quite possible that they can build successful
control systems that work exactly in this way. It's even conceivable that
they could build systems surpassing the capabilities of natural systems,
although I might reserve judgement on that.
However, sooner or later the engineer-mathematicians who use this approach
are going to come up against exactly the same problem that faced the
old-time control engineers who used analog methods: finding the inverse of
the environmental equation, or a close enough approximation to it to allow
good stable control. The real environment is subject to physical dynamics,
so its continuous representation must be cast in terms of matrices of
(nonlinear) differential equations. Finding the inverse is the equivalent of
solving these equations, and unfortunately very few real environments have
forms for which any solutions are obtainable by analytical means. The
student gets an entirely wrong impression of the relation of mathematics to
reality, because the problems that are presented always HAVE solutions; the
writer of textbooks works backward from the solutions to the problems, and
so knows that there is an answer. But most problems encountered in the real
world lead to intractable equations; that is why engineering graduates find
that they know very little when they actually get to their jobs in industry.
They find that they have to limit their designs to fit the kinds of analyses
that do lead to answers, or resort to cut-and-try. And they might find that
stubborn old-timers who don't have the same sophisticated backgrounds
sometimes manage to produce designs that work even better than the
rationally-designed mechanisms do.
But that's the ego sticking up for itself. Let's try to get back to the main
thought, William.
What's the difference between a simulation and a calculation? At first
glance there doesn't seem to be any difference; a simulation is just a bunch
of calculations carried out iteratively in simulated time. But consider
Little Man Version 2.
Inside this model, we have a representation of the way a jointed arm with
three degrees of freedom responds to torques applied at the joints. As the
torques are varied, the arm flails around in space, going through all sorts
of contortions.
The program contains the differential equations that describe all the
accelerations and the interactions due to inertial effects of moving masses.
The accelerations are integrated to yield angular velocities, and the
angular velocities are integrated to yield joint angles. From the lengths of
the limb segments and the joint angles, the program calculates where the tip
of the arm and the elbow will be at all times. The little man drawn on the
screen shows the arm positions graphically.
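In one-joint caricature, the dynamical part of that program looks something
like the sketch below. The real program handles three degrees of freedom and
all the inertial cross-coupling between segments; every number here is
invented, and only the bare chain from torque to acceleration to velocity to
angle to position is kept:

    import math

    I  = 0.05          # moment of inertia of the segment (invented, kg*m^2)
    L  = 0.35          # segment length (invented, m)
    dt = 0.001         # integration step (s)

    angle, velocity = 0.0, 0.0
    for step in range(1000):                  # one simulated second
        torque   = 0.2 * math.sin(step * dt)  # some arbitrary applied torque
        accel    = torque / I                 # the "differential equation"
        velocity = velocity + accel * dt      # integrate acceleration -> velocity
        angle    = angle + velocity * dt      # integrate velocity -> angle
        tip_x    = L * math.cos(angle)        # where the tip is, from geometry
        tip_y    = L * math.sin(angle)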
This calculation involves some fairly complicated program steps which are
iterated over and over. But by making this calculation, are we asserting
that the arm moves by means of making calculations? Absolutely not. We are
not asserting by this model that any calculations at all take place in the
real arm. There are no multiplications and divisions, no sines and cosines,
no summations of numbers at small intervals. Instead there are physical
forces acting on physical masses in particular configurations, and we are
only trying to describe the relationships by using mathematical forms.
What we are simulating is a physical system, not a set of calculations. In
the program there are places where expressions that occur several times are
assigned to dummy variables, which are then used here and there in the
ensuing program steps. Is there any implication that these
dummy variables correspond to anything in the real system? Not at all. There
are, in fact, many ways in which this program could have been written, ways
that conserve computing time or make the steps easier for the programmer to
understand. If the program itself were considered a simulation, then each
different way of writing the program would amount to a different proposition
about the real system. But that isn't the meaning of the program steps. What
the program does is to make certain variables dependent on others in
particular ways that correspond to properties of the physical arm. How that
is done is irrelevant. Behind the calculations is a conception of the
physical properties of the arm, which are actually responsible for its behavior.
The point is that the arm itself does not perform the calculations by which
we represent its response to applied torques.
Now consider a model of the nervous system that drives the arm. Here again,
we have some calculations. For example, the calculation of the output signal
that leaves the second level of control converts the error signal e into an
output signal o by
o := o + gain*e
What does this calculation represent? It certainly does not assert that the
output signal is obtained by this program step. But it does assert that the
output signal increases as a cumulative function of the magnitude of the
error signal. In other words, an integrator is proposed. A neural integrator,
by my way of modeling it, is a neuron in which the effects of
neurotransmitters accumulate inside the cell as each input impulse arrives,
while being reduced at the same time by diffusion, chemical reactions, and
the production of output impulses. If the losses are small, the rate of
production of output impulses will increase steadily with time while the
rate of incoming impulses is constant. To make the output impulse rate
decline rapidly, inhibitory input impulses are needed, the equivalent of a
negative input signal. The overall conversion of input to output is
approximately in the form of an integration, and we represent an integration
by the above program step. If we want to make the integrator leaky (taking
into account the natural diffusion of reaction products), we would write
o := o + gain*e - k*o
where k governs the decay rate.
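If you iterate that leaky form with a constant error signal, you can see the
kind of behavior it stands for: the output rises and then levels off at
gain*e/k. (The particular values of gain and k below are arbitrary.)

    # Sketch of what the leaky-integrator step does over time; e is held
    # constant so the output rises toward gain*e/k.
    gain, k = 0.1, 0.02
    o, e = 0.0, 1.0
    for i in range(400):
        o = o + gain*e - k*o      # the program step from the text
    print(round(o, 2))            # about 5.0, which is gain*e/k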
But this is not a proposal that the neuron performs these program steps, or
that it computes integrals in a mathematical way, using symbols and rules.
By writing the simulation as a series of program steps, we are not by any
means proposing that the brain contains programmable digital computers that
execute program steps.
When we're modeling the nervous system, however, our calculations are coming
much closer to what the nervous system actually does. We expect that most of
the programming steps, those not done for convenience or speed, represent
components of the real nervous system. As above, a program step that does an
integration is supposed to correspond to a component of the nervous system
that does something very similar. The comparator that we model as an
algebraic subtraction is really supposed to correspond to something like a
neuron with both excitatory and inhibitory inputs. We don't for a moment
think that the neurons are using the mathematical rules of subtraction, but
we are using those rules to approximate what the neuron really does through
the medium of biochemical reactions.
Basically, every component of a simulation of the nervous system is supposed
to be a description of some component of the real nervous system. And this
creates a sticking point for me in my acceptance of the modern control
theory approach. I don't see how the mathematical calculations in that model
can correspond to any functions I can imagine a real nervous system performing.
... with one exception. The real human nervous system, we know for certain,
contains a level of function that permits it to deal with symbols according
to rules. I've already mentioned that kind of control system. At that level
we can have axioms and theorems and lemmas and derivations and proofs and
any mathematical entities of any degree of complexity. That's the kind of
thing that level DOES. But that's the ONLY level where we can expect a
mathematical model to be a literal representation of the behavior of the brain.
Day has come and the Muse has fled. I'll try again another day.
Best to all,
Bill P.