Simulations versus calculations, or something.

[From Bill Powers (970124.0500 MST)]

I suppose I will go to my grave still trying to figure out this problem, but
I can't seem to let go of it. I know there is something very wrong with the
"modern control theory" approach, _as a model of how living systems work_,
but each time I try to see what it is I come up against my mathematical
handicap. This leads to disdainful comments such as "Oh, well, that's a
pity, because you will never grasp the subtleties of this concept unless you
can follow the mathematics." It's a bit like being told by a Frenchman that
there are some concepts which only the French language can express, and that
we English speakers can't even imagine.

Somehow I can't bring myself to believe either the mathematician or the
Frenchman. Why is that? Is it just a matter of protecting my ego, so I can
go on pretending to have had a few good ideas despite my mental retardation?
Am I really an _idiot savant_ who can tell you instantly what day of the
week fell on January 5th in the year 199,307 BC, but can't figure out how a
farmer is going to divide his livestock among his sons? Am I simply a child
among adults, pretending to knowledge while the grown-ups smile at each
other and pat my head condescendingly? Oh, I tell you, some awful thoughts
occur to me as I grapple with this problem, interfering seriously with my
ability to think clearly about it. The poor ego tries to protect itself from
invalidation, all the while fearing that every horrible criticism is the truth.

This morning I almost have the answer, although as usual the idea in my head
as I woke up is changing shape before my eyes. Before it slips entirely
away, let me put it into inadequate words.

It is possible to devise methods of control using mathematical calculations.
For example, let's consider Bruce Abbott's model of the E. coli system, in
which a complicated logical process was proposed and actually implemented as
a computer program.

In this program, the system computed four logical conditions in which either
an increase or a decrease of the reinforcer occurred at a time when
the previous behavioral act of tumbling had either improved the situation or
made it worse. According to the outcome, the probability density of the next
tumble was increased or decreased. Bruce repeatedly affirmed that this was
not meant to be a model of the real E. coli, but only an exercise in
applying reinforcement theory.
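For concreteness, here is a minimal sketch (in Python) of the general kind of
program being described. It is not Bruce's actual code; the gradient, the step
sizes, and the particular four-branch update rule are my own assumptions, made
only to show what "adjusting the probability of the next tumble according to
the outcome" can look like as a program.

    import random

    def nutrient(x):
        # Assumed nutrient gradient: concentration rises toward the origin.
        return -abs(x)

    def run(steps=5000):
        x, heading = 20.0, 1.0
        p_tumble = 0.2            # probability of tumbling on any step
        last_delta = 0.0          # most recent change in the reinforcer
        level = nutrient(x)
        for _ in range(steps):
            if random.random() < p_tumble:
                heading = random.choice([-1.0, 1.0])    # a tumble: new random heading
            x += heading * 0.1                          # swim a short distance
            delta, level = nutrient(x) - level, nutrient(x)
            # The four logical conditions: the reinforcer went up or down, and the
            # previous change had been an improvement or a worsening.
            if delta > 0 and last_delta > 0:
                p_tumble *= 0.8    # improving, and improving before: tumble less
            elif delta > 0:
                p_tumble *= 0.9    # improving now: tumble somewhat less
            elif last_delta > 0:
                p_tumble *= 1.1    # just got worse: tumble somewhat more
            else:
                p_tumble *= 1.25   # getting worse and worse: tumble much more
            p_tumble = min(max(p_tumble, 0.01), 1.0)
            last_delta = delta
        return x

    print(run())   # usually ends up far closer to the origin than it started

Like the program Bill describes, this produces the right kind of result; the
question raised below is whether anything in the bacterium corresponds to
these program steps.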

Yet I was unconvinced that this program was or even could be a simulation of
any organism doing anything. Not knowing how to justify my lack of
conviction I kept (relatively) silent. I had to admit that Bruce's model
"worked." It produced calculated behavior that would indeed have the right
kind of effect.

This morning, I would put my objection this way. The calculations performed
in Bruce's program would indeed produce the right kind of result, but that
result could be obtained only by an organism that could have written Bruce's
program. BRUCE could obtain that result, but the organism he was supposedly
modeling couldn't, unless it could also write programs and translate their
results into actions. In a way that I still can't put clearly into words,
Bruce's program is a description of the behavior of the organism, but not a
simulation of the means by which that behavior is produced. In other words,
Bruce's program would apply just as well to an E. coli organism that
actually creates the observed behavior by a means that involves neither
logic nor probability densities. It may be true that these logical
conditions and probability densities can be seen to hold in the observed
behavior, but while they _describe_ what we observe, they don't _make the
observed behavior occur_.

Now let's get to the harder problem: modern control theory. The basic
principle behind modern control theory is a way of transforming a reference
input into an action that will bring the state of an external plant to a
specified condition matching the reference specification. If r is the
reference signal, u the output, and x the external variable to be
controlled, then

x = f(u),

u = f^-1(r)

and therefore,

x = r

So if the output u is transformed into a state of x by the environmental
function f, we can make x = r simply by inserting the inverse of the
environmental function between r and u. The rest of the model, the part
including a forward world-model of the environment and the Extended Kalman
Filter method of adjusting the model, is just a way of adjusting the
parameters in the function f^-1(r) so they produce the right result. This
makes the model adaptive, which is important, but let's just focus on the
final result, after adaptation is complete. The above equations describe the
_point_ of using the EKF method. The means of achieving the adaptation is
secondary to the question of what it is that results from successful
adaptation, however it is achieved.
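To see the point in the simplest possible terms, here is a scalar sketch with
made-up numbers (a minimal illustration, not anything from the MCT
literature): if the controller's output function is the exact inverse of the
environment function, the controlled variable tracks the reference exactly,
with no error signal anywhere in the picture.

    def f(u):
        # Assumed environment function: x = 3*u + 2.
        return 3.0 * u + 2.0

    def f_inverse(r):
        # The controller: the algebraic inverse of f, worked out in advance.
        return (r - 2.0) / 3.0

    for r in [0.0, 5.0, -7.5]:
        u = f_inverse(r)    # output computed from the reference alone
        x = f(u)            # the environment turns the output into x
        print(r, x)         # x equals r exactly -- as long as the inverse is exact

Everything, of course, hinges on knowing f well enough to invert it.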

As Hans Blom has mentioned once or twice in passing, this basic method
depends on the existence of an inverse of the function f over at least some
limited region. Nothing is said about how the inverse is calculated. While
we can perhaps imagine an actual simulation of the environment existing in
the forward form, it is hard to imagine how the inverse of that same form
could also be calculated. Since, after adaptation, it is only the existence
of the inverse that is critical, we have to ask how the correct inverse
could be obtained.

The answer given to this question is simply an exposition on matrix algebra.
The reciprocal of a matrix is the adjoint of the matrix divided by its
determinant, or so says my old textbook. If you look up how this is
calculated in detail, you find a large array of calculations done in compact
form with multiple indices and conventional geometric ways of laying out the
underlying equations.
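Just to show how much arithmetic hides inside that compact notation, here is
the adjoint-over-determinant recipe spelled out for a made-up 2x2 case (a
sketch only; real plants involve much larger matrices and symbolic entries):

    # Inverse of a 2x2 matrix A by the classical adjoint-over-determinant formula.
    A = [[4.0, 7.0],
         [2.0, 6.0]]

    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]      # determinant
    adj = [[ A[1][1], -A[0][1]],                     # adjoint (transposed cofactors)
           [-A[1][0],  A[0][0]]]
    A_inv = [[adj[i][j] / det for j in range(2)] for i in range(2)]

    # Check: A times its inverse gives the identity matrix.
    for i in range(2):
        print([sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)])

Even here a determinant, four cofactors, and four divisions are needed; for an
n-by-n matrix the work grows very rapidly.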

Now obviously, if someone is designing a control system and is able to
compute the inverse of the environmental function using matrix algebra, that
person will be able to compute the output u that will make x = r. It will
help a lot if those computations can be done automatically, because
computing the inverse with pencil and paper will be slow, and doing the
matrix multiplications to calculate the required output at each moment will
also be slow, too slow to keep up with all but the most glacial control
processes. A fast digital computer would be essential.

This is a perfectly valid model of an organism controlling something. But it
is a valid model only if the organism is a human being who understands
matrix algebra and knows how to program a computer to carry out the
mathematical calculations. It could not possibly be, for example, a model of
the way a dog or a chimpanzee, or a person who does not understand matrix
algebra, controls something. It's a particular way of controlling that
depends on a particular brain's ability to handle symbols according to the
relevant rules.

Of course some people do understand matrix algebra and matrices of Laplace
transforms and all that, and they know how to program computers and they
have machine shops and electronics shops that can build the required
interface components, so it is quite possible that they can build successful
control systems that work exactly in this way. It's even conceivable that
they could build systems surpassing the capabilities of natural systems,
although I might reserve judgement on that.

However, sooner or later the engineer-mathematicians who use this approach
are going to come up against exactly the same problem that faced the
old-time control engineers who used analog methods: finding the inverse of
the environmental equation, or a close enough approximation to it to allow
good stable control. The real environment is subject to physical dynamics,
so its continuous representation must be cast in terms of matrices of
(nonlinear) differential equations. Finding the inverse is the equivalent of
solving these equations, and unfortunately very few real environments have
forms for which any solutions are obtainable by analytical means. The
student gets an entirely wrong impression of the relation of mathematics to
reality, because the problems that are presented always HAVE solutions; the
writer of textbooks works backward from the solutions to the problems, and
so knows that there is an answer. But most problems encountered in the real
world lead to intractable equations; that is why engineering graduates find
that they know very little when they actually get to their jobs in industry.
They find that they have to limit their designs to fit the kinds of analyses
that do lead to answers, or resort to cut-and-try. And they might find that
stubborn old-timers who don't have the same sophisticated backgrounds
sometimes manage to produce designs that work even better than the
rationally-designed mechanisms.

But that's the ego sticking up for itself. Let's try to get back to the main
thought, William.

What's the difference between a simulation and a calculation? At first
glance there doesn't seem to be any difference; a simulation is just a bunch
of calculations carried out iteratively in simulated time. But consider
Little Man Version 2.

Inside this model, we have a representation of the way a jointed arm with
three degrees of freedom responds to torques applied at the joints. As the
torques are varied, the arm flails around in space, going through all sorts
of contortions.

The program contains the differential equations that describe all the
accelerations and the interactions due to inertial effects of moving masses.
The accelerations are integrated to yield angular velocities, and the
angular velocities are integrated to yield joint angles. From the lengths of
the limb segments and the joint angles, the program calculates where the tip
of the arm and the elbow will be at all times. The little man drawn on the
screen shows the arm positions graphically.
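A one-joint caricature of that part of the program may make the structure of
the calculation plain. The link parameters and the torque pattern below are
assumed for illustration; the real Little Man handles three coupled joints,
but the chain is the same: torque to angular acceleration, acceleration
integrated to velocity, velocity integrated to angle, angle to tip position.

    import math

    # Forward dynamics of a single rigid link driven by a torque at its joint.
    inertia, mass, length, g = 0.5, 1.0, 0.4, 9.8     # assumed link parameters
    dt = 0.001                                        # physical seconds per iteration

    angle, velocity = 0.0, 0.0       # joint angle (rad), angular velocity (rad/s)
    for step in range(2000):         # two simulated seconds
        torque = 2.0 * math.sin(step * dt)                    # some applied torque pattern
        gravity_torque = -mass * g * (length / 2) * math.sin(angle)
        acceleration = (torque + gravity_torque) / inertia    # Newton's law for rotation
        velocity += acceleration * dt                         # integrate acceleration -> velocity
        angle += velocity * dt                                # integrate velocity -> angle

    # Tip position from joint angle and limb length, as the program does for drawing.
    tip_x, tip_y = length * math.sin(angle), -length * math.cos(angle)
    print(angle, tip_x, tip_y)

None of these statements is a claim about what the arm itself does, which is
the point of the paragraphs that follow.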

This calculation involves some fairly complicated program steps which are
iterated over and over. But by making this calculation, are we asserting
that the arm moves by means of making calculations? Absolutely not. We are
not asserting by this model that any calculations at all take place in the
real arm. There are no multiplications and divisions, no sines and cosines,
no summations of numbers at small intervals. Instead there are physical
forces acting on physical masses in particular configurations, and we are
only trying to describe the relationships by using mathematical forms.

What we are simulating is a physical system, not a set of calculations. In
the program there are places where some expressions that are used several
times are used to give values to dummy variables which are then used here
and there in the ensuing program steps. Is there any implication that these
dummy variables correspond to anything in the real system? Not at all. There
are, in fact, many ways in which this program could have been written, ways
that conserve computing time or make the steps easier for the programmer to
understand. If the program itself were considered a simulation, then each
different way of writing the program would amount to a different proposition
about the real system. But that isn't the meaning of the program steps. What
the program does is to make certain variables dependent on others in
particular ways that correspond to properties of the physical arm. How that
is done is irrelevant. Behind the calculations is a conception of the
physical properties of the arm, which are actually responsible for its behavior.

The point is that the arm itself does not perform the calculations by which
we represent its response to applied torques.

Now consider a model of the nervous system that drives the arm. Here again,
we have some calculations. For example, the calculation of the output signal
that leaves the second level of control converts the error signal e into an
output signal o by

o := o + gain*e

What does this calculation represent? It certainly does not assert that the
output signal is obtained by this program step. But it does assert that the
output signal increases as a cumulative function of the magnitude of the
error signal. In other words, an integrator is proposed. A neural integrator,
by my way of modeling it, is a neuron in which the effects of
neurotransmitters accumulate inside the cell as each input impulse arrives,
while being reduced at the same time by diffusion, chemical reactions, and
the production of output impulses. If the losses are small, the rate of
production of output impulses will increase steadily with time while the
rate of incoming impulses is constant. To make the output impulse rate
decline rapidly, inhibitory input impulses are needed, the equivalent of a
negative input signal. The overall conversion of input to output is
approximately in the form of an integration, and we represent an integration
by the above program step. If we want to make the integrator leaky (taking
into account the natural diffusion of reaction products), we would write

o := o + gain*e - k*o

where k governs the decay rate.
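A small numerical sketch of what those two program steps do over time, with
arbitrary numbers: under a constant error signal the pure integrator's output
climbs without limit, while the leaky integrator's output settles at gain*e/k.

    gain, k = 0.1, 0.05
    e = 2.0                      # a constant error signal
    o_pure, o_leaky = 0.0, 0.0
    for _ in range(1000):
        o_pure = o_pure + gain * e                    # o := o + gain*e
        o_leaky = o_leaky + gain * e - k * o_leaky    # o := o + gain*e - k*o
    print(o_pure, o_leaky)       # about 200.0, and about 4.0 (= gain*e/k)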

But this is not a proposal that the neuron performs these program steps, or
that it computes integrals in a mathematical way, using symbols and rules.
By writing the simulation as a series of program steps, we are not by any
means proposing that the brain contains programmable digital computers that
execute program steps.

When we're modeling the nervous system, however, our calculations are coming
much closer to what the nervous system actually does. We expect that most of
the programming steps, those not done for convenience or speed, represent
components of the real nervous system. As above, a program step that does an
integration is supposed to correspond to a component of the nervous system
that does something very similar. The comparator that we model as an
algebraic subtraction is really supposed to correspond to something like a
neuron with both excitatory and inhibitory inputs. We don't for a moment
think that the neurons are using the mathematical rules of subtraction, but
we are using those rules to approximate what the neuron really does through
the medium of biochemical reactions.

Basically, every component of a simulation of the nervous system is supposed
to be a description of some component of the real nervous system. And this
creates a sticking point for me in my acceptance of the modern control
theory approach. I don't see how the mathematical calculations in that model
can correspond to any functions I can imagine a real nervous system performing.

... with one exception. The real human nervous system, we know for certain,
contains a level of function that permits it to deal with symbols according
to rules. I've already mentioned that kind of control system. At that level
we can have axioms and theorems and lemmas and derivations and proofs and
any mathematical entities of any degree of complexity. That's the kind of
thing that level DOES. But that's the ONLY level where we can expect a
mathematical model to be a literal representation of the behavior of the brain.

Day has come and the Muse has fled. I'll try again another day.

Best to all,

Bill P.

[From Oded Maler (970124)]

[From Bill Powers (970124.0500 MST)]

I suppose I will go to my grave still trying to figure out this problem, but
I can't seem to let go of it. I know there is something very wrong with the
"modern control theory" approach, _as a model of how living systems work_,
but each time I try to see what it is I come up against my mathematical
handicap. This leads to disdainful comments such as "Oh, well, that's a
pity, because you will never grasp the subtleties of this concept unless you
can follow the mathematics." It's a bit like being told by a Frenchman that
there are some concepts which only the French language can express, and that
we English speakers can't even imagine.

Well, I don't know whether you were aware of it, but it coincides with
a famous quote:


----------------------------------------------------------------
Mathematicians are like Frenchmen: whatever you say to them they
translate into their own language, and henceforth it is something
entirely different.
                -- Johann Wolfgang von Goethe
----------------------------------------------------------------

Somehow I can't bring myself to believe either the mathematician or the
Frenchman. Why is that?

Really unexplainable. If you apply your own PCT principles, you will
surely see how perceptual hierarchies of speakers of different languages
differ.

For the rest of your post, you said something rather obvious (to me).
In the simulation, the algorithm for calculating the state of a
falling stone ( x(t+1)=x(t)+v(t); v(t+1)=v(t)+g or something like
that) is not supposed to represent "mental" activities of the objects
involved: the stone does not "calculate" its state consciously and
there is a universal physical mechanism responsible for its dynamics.
For human sensory-motor activities (not just inert movements) you want
the underlying control mechanism to be realizable by neuronal
computing machinery. You claim that humans at that level cannot
compute inverses of matrices in real-time. This is probably true,
however what they do (by hierarchical servoing as you suggest) is a
good enough approximation of these calculations.
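In code, with arbitrary numbers and an explicit time step added, the
falling-stone update looks like this; the point stands either way -- the lines
describe the stone's dynamics without attributing any computation to the stone.

    g, dt = -9.8, 0.01           # gravity and an assumed physical time per step
    x, v = 100.0, 0.0            # height and velocity of the stone (assumed start)
    while x > 0.0:
        x = x + v * dt           # x(t+dt) = x(t) + v(t)*dt
        v = v + g * dt           # v(t+dt) = v(t) + g*dt
    print(v)                     # roughly -44 m/s at the ground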

--Oded

[From Bruce Gregory (970124.1030 EST)]

(Bill Powers (970124.0500 MST)

This morning, I would put my objection this way. The calculations performed
in Bruce's program would indeed produce the right kind of result, but that
result could be obtained only by an organism that could have written Bruce's
program. BRUCE could obtain that result, but the organism he was supposedly
modeling couldn't, unless it could also write programs and translate their
results into actions. In a way that I still can't put clearly into words,
Bruce's program is a description of the behavior of the organism, but not a
simulation of the means by which that behavior is produced. In other words,
Bruce's program would apply just as well to an E. coli organism that
actually creates the observed behavior by a means that involves neither
logic nor probability densities. It may be true that these logical
conditions and probability densities can be seen to hold in the observed
behavior, but while they _describe_ what we observe, they don't _make the
observed behavior occur_.

The same problem arises in interpreting the mathematics of
quantum mechanics. Physical systems can be represented by
vectors in Hilbert space, but very few people believe that
physical systems _are_ vectors in Hilbert space, or that
electrons perform matrix multiplications to determine where they
"should" go. The situation is much better with PCT because we
can picture and employ models that plausibly reflect what
organisms actually do. For me, the strongest argument in
favor of PCT is its conceptual simplicity and testability.
Lacking a compelling reason in the form of a breakdown of PCT, I
can see no reason to worry about more complex models that
_might_ be able to explain the same phenomena.

Bruce Gregory

[From Bill Powers (970124.1610 MST)]

(Bruce Gregory (970124.1030 EST) --

For me, the strongest argument in
favor of PCT is its conceptual simplicity and testability.
Lacking a compelling reason in the form of a breakdown of PCT, I
can see no reason to worry about more complex models that
_might_ be able to explain the same phenomena.

Thank you.

Bill P.

[From Bill Powers (970124.1600 MST)]

Oded Maler (970124) --

You claim that humans at that level cannot
compute inverses of matrices in real-time. This is probably true,
however what they do (by hierarchical servoing as you suggest) is a
good enough approximation of these calculations.

I'd put it this way: these calculations are not the same as what people do
by hierarchical servoing. The output function in a PCT control system is NOT
the inverse of the environmental feedback function, so the PCT model is NOT
equivalent to the calculations that use inverses.

Best,

Bill P.

[From Oded Maler 970121]

Bill P. said:

I'd put it this way: these calculations are not the same as what people do
by hierarchical servoing. The output function in a PCT control system is NOT
the inverse of the environmental feedback function, so the PCT model is NOT
equivalent to the calculations that use inverses.

As Hans also said, you *must* calculate something that resembles the inverse
if you want to control the thing. At least if you accept the premises of
control theory as applicable to PCT.

And these are:

1) There is a dynamical system whose evolution depends on its state,
on controller actions and on disturbances.
2) The controller can observe the state of the system (or some
perceptual function of it).
3) By applying a feed-back function the controller can push the observable
values to some reference state.

In order for this to work, the feed-back must somehow invert the
drift of the uncontrolled system and steer it in the right direction.

The only objections one can raise are:

1) The physical world is not a dynamical system - so what is it?
2) Complex (but finite) perceptual hierarchies, when applied to the
practically-infinite world, let people control in a different
way, by choosing the perceptual variables so as to have a feeling
of control.

--Oded

[From Rick Marken (970127.0800)]

Hans Blom (970127) --

My thoughts, experiments, calculations and simulations show
(prove may be too strong a word) _that_ an inverse function
(of the environment) is needed in a (any) controller if it
is to "calculate" how to act.

Right. But our thoughts, experiments, calculations and simulations
_prove_ (not too strong a word;-)) that an inverse of the environment
function is most definitely _not_ needed by an ordinary control
system -- one that controls its perception.

So how does the nervous system do it [calculate the inverse]?

It doesn't. The nervous system controls perceptual inputs; it
doesn't calculate actions. You are devoting your life to the
study of something that doesn't happen. But you're young enough;
there is still time to change. All you have to do is do it.

Oded Maler (970121) --

As Hans also said, you *must* calculate something that resembles the
inverse if you want to control the thing.

Just because Hans said it doesn't make it so. In fact, a good case
can be made for the proposition that if Hans said it, it is almost
certainly _not_ so. Hans is just defending ideas to which he has
devoted a great deal of his time and energy. He has probably written
books on this stuff. I think the chances of Hans realizing that
living control systems _don't_ calculate actions (using the inverse
of the environmental feedback function) are rather minute.

At least if you accept the premises of control theory as applicable
to PCT.

If you accept the premises of control theory as applicable to
PCT (viz., that behavior occurs in a negative feedback loop so that
sensory input is _always_ both a cause and a result of action) then
it is unquestionably true that the behaving system _must not_
"calculate something that resembles the inverse" of the environmental
function relating its actions to its inputs.

Best

Rick

[From Bill Powers (970127.0900 MST)]

Oded Maler 970121 --

As Hans also said, you *must* calculate something that resembles the
inverse if you want to control the thing. At least if you accept the
premises of control theory as applicable to PCT.

"Something that resembles an inverse" is a long way from "inverse." See my
post to Hans today.

One of my problems with the complex mathematics some people use (aside from
my mental handicap) is that when you start out proving theorems and building
a big mathematical structure on them, you really have no idea how much
deviation from the idealized assumptions is allowed. As in my example for
Hans (which I hope is valid), you can prove mathematically that the
integral of the derivative of x is x, but in a physical system the
_slightest_ deviation of the actual computations from perfection creates
physically impossible conditions, and introduces mathematical singularities.
It's like taking the limit of a ratio as numerator and denominator go to
zero. If the numerator doesn't _quite_ make it to zero, your calculation of
the limit is rather far from reality.

When mathematical systems of reasoning, based on exact manipulation of
symbols, are used as models of physical systems, just how brittle are the
conclusions? I'd like to know, before going too far out on a mathematical
bridge from one observation to another.

Best,

Bill P.

[From Oded Maler (970121-II)]

[From Rick Marken (970127.0800)]

Oded Maler (970121) --

>As Hans also said, you *must* calculate something that resembles the
>inverse if you want to control the thing.

Just because Hans said it doesn't make it so. In fact, a good case
can be made for the proposition that if Hans said it, it is almost
certainly is _not_ so. Hans is just defending ideas to which he has
devoted a great deal of his time and energy. He has probably written
books on this stuff. I think the chances of Hans realizing that
living control systems _don't_ calculate actions (using the inverse
of the environmental feedback function) are rather minute.

According to my personal rule of thumb, I now have even stronger
evidence.

>At least if you accept the premises of control theory as applicable
>to PCT.

If you accept the premises of control theory as applicable to
PCT (viz., that behavior occurs in a negative feedback loop so that
sensory input is _always_ both a cause and a result of action) then
it is unquestionably true that the behaving system _must not_
"calculate something that resembles the inverse" of the environmental
function relating its actions to its inputs.

I started reading Quantum Psychology by Wilson. In the introduction
he speaks about "semantic noise" related to the meaning of words.

So let me try to restate it once more: there is a system with a state
vector x whose closed loop behavior depends on its internal dynamics
and a feed-back. You can imagine that not all possible feed-back laws
will bring the system state (or a perceptual function of it) to a
reference value. In fact, a small fraction of the possible feedback laws
will do the job.

In order that the system + the feedback will flow to the desired
reference (or if you want, the reference value is the attractor of the
combined system) the system+feedback must have certain properties, that
imply certain properties for the feed-back. This is what I now call
the "inverse" - it is not necessarily what you calculate in linear
control when you have a large matrix - this is just one instance of an
"inverse". The hierarchical structure perhaps lets you calculate
another type of inverse on multiple time-scales. And of course it is
not done consciously by a homunculus sitting on your muscles.

How the above secular description contradicts your particular
faith is beyond me.

Best regards

--Oded

[From Rick Marken (970127.1100)]

Oded Maler (970121-II)

there is a system with a state vector x whose closed loop behavior
depends on its internal dynamics and a feed-back.

You lost me already. What's a state vector?

You can imagine that not all possible feed-back laws will bring
the system state (or a perceptual function of it) to a reference value.

Ah. From context it sounds like "state vector" is "controlled variable".
But feedback laws don't bring controlled variables to reference values.
So I can't imagine what you are talking about here.

In fact, a small fraction of the possible feedback laws will do
the job.

Feedback laws don't "do the job". I think you mean that a particular
control system can't maintain control in many environmental
circumstances (feedback functions). I agree that this is true. I
suppose it is also true that a particular control system can only
maintain control in a small fraction of the environmental circumstances
that one could imagine. That's what makes control engineering
interesting; you have to learn how to build control systems that work
under the range of circumstances (feedback functions) that they might
encounter.

In order that the system + the feedback will flow to the desired
reference...the system+feedback must have certain proerties

True. The control system must have the correct properties or it will
not control.

This is what I now call the "inverse"

Interesting. I thought "compute the inverse of the feedback function"
meant "compute 1/f(), where f() is the feedback function". But what
you mean by "compute the inverse of the feedback function" is "design
the control system so that it controls in a particular environment".
Now why didn't I realize that that was what you meant? I suppose that
control engineers who have been saying that "behavior results from
computation of the actions that produce particular results" have
really meant "behavior is the control of perception".

Fascinating.

How the above secular description contradicts your particular
faith is beyond me.

I guess it doesn't. I was just confused by your terminology. Now I
see that your description is completely consistent with my "faith"
that behavior is the control of perception. You just say it a bit
differently. From now on, when you say "a control system must compute
the inverse of the feedback function" I will understand you to mean
that "control systems do not compute the inverse of the feedback
function".

Thanks for the clarification.

Best

Rick

[Hans Blom]: So how does the nervous system do it [calculate the inverse]?

[Rick Marken]: It doesn't. The nervous system controls perceptual inputs; it
doesn't calculate actions. You are devoting your life to the
study of something that doesn't happen. But you're young enough;
there is still time to change. All you have to do is do it.

living control systems _don't_ calculate actions (using the inverse
of the environmental feedback function) are rather minute.

If you accept the premises of control theory as applicable to
PCT (viz., that behavior occurs in a negative feedback loop so that
sensory input is _always_ both a cause and a result of action) then
it is unquestionably true that the behaving system _must not_
"calculate something that resembles the inverse" of the environmental
function relating its actions to its inputs.

Best

Rick

Dear Rick,

        I am still trying to burn PCT into my brain so that I can readily
apply it to problems I am most interested in, namely, language. Anyone
else working on this one? Any linguists into PCT?

        I reply to this post because it relates to a question and an
intuition I have had for a while. Say I throw a ball to my friend and
she catches it. She obviously had to do _something_ with her brain in
order to catch it. I have wondered what that something would be. I
guess the standard explanation, a la Blom and others, is that some sort
of calculations are made by my friend's brain, which allow her to place
her hand precisely in the ball's trajectory and then close her fingers
around it as soon as it hits her hand. Could you explain a bit more, for
me, how PCT explains this catch? I am thinking that you are saying that
no calculations are necessary. Instead, basically, my friend has a goal
of catching the ball with her hand, and she knows, intuitively (whatever
that means) that if she puts her hand in the right spot and grabs the
ball, the goal will have been achieved. So what she does is move her
hand based on the perceptions she has of where her hand is and where the
ball seems to be headed, until the ball proceeds into her hand.

        So no mathematical calculations are involved, right? Those are
unnecessary. I guess if the ball were actually a point moving through
formal space (not a real ball in real space), and my friend's
neurological view of the world was based on mathematical calculations of
Newtonian physics, then calculations would be necessary. Is this pretty
much how computers work? But since my friend can perceive a gross object
moving through space, can focus on that object, and can move her hand
through an indefinite range of points to catch it, she really only needs
to control her perception of her hand, its movements, and its relation to
the position of the ball. Do I have this right, however sloppily?

        Now, Noam Chomsky and a million linguists are out there trying to
say that there are all kinds of formal rules in our heads which are put
into play when we use language. We are not aware of those rules, but
they are there. Some linguists think this is bunk. I think it is too.
Do you see, without me drawing it out forever, the relationship between
PCT and its power of explanation sans mental-calculator for catching a
ball, and PCT's potential explanatory power for how language really works?

        I know you and others here will say that we need to find out what
is being controlled for in speech production and speech comprehension.
Any ideas what that might be and how we can explain language without
recourse to a million ad hoc formulations for all the languages of the
world?

Scott


______________________________________________________________________________
Scott M. Stirling Email: scstirli@anselm.edu
Saint Anselm College Phone: (603) 668-1101 school
Box 2111 (603) 225-3799 home
100 Saint Anselm Dr.
Manchester, NH 03102-1310

http://www.anselm.edu/student/scstirli/welcome.html

[Avery.Andrews 970128, Eastern Australia]
(Scott Stirling 970127)

       So no mathematical calculations are involved, right? Those are
unnecessary. I guess if the ball were actually a point moving through
formal space (not a real ball in real space), and my friend's
neurological view of the world was based on mathematical calculations of
Newtonian physics, then calculations would be necessary. Is this pretty
much how computers work? But since my friend can perceive a gross object
moving through space, can focus on that object, and can move her hand
through an indefinite range of points to catch it, she really only needs
to control her perception of her hand, its movements, and its relation to
the position of the ball. Do I have this right, however sloppily?

The nature of the calculations required depends on the nature of the
situation. There's a method for catching fly-balls that Bill Powers
told me about that involves, I think, running so as to minimize/eliminate
the perceived movement of the ball - if the ball doesn't seem to be moving
it's gonna land in your face, so you make sure your glove is in the way.
So in this case control of some simple perceptual variables in real time
makes it unnecessary to calculate where the ball is going to be.

Another common topic of discussion in this area is reaching, grasping,
etc.; in industrial robotics it is the normal practice to calculate the
`inverse dynamics', which means computing the muscular forces that will
result in moving an arm to a given position. Bill's `Little Man' demo shows
how this is unnecessary for human behavior; there's been a fair amount
of discussion of this in the past, searching the csg archives on topics
such as `inverse dynamics' and `transpose cartesian blues' will turn up
some relevant discussion.

PCT doesn't say `no calculations', but there is a very strong emphasis on
not assuming complicated and difficult calculations until all the
possibilities for simple and easy ones have been ruled out, a step that
people with conventional cog. sci. & linguistics backgrounds (like me)
tend to skimp on ...

       Now, Noam Chomsky and a million linguists are out there trying to
say that there are all kinds of formal rules in our heads which are put
into play when we use language. We are not aware of those rules, but
they are there. Some linguists think this is bunk. I think it is too.
Do you see, without me drawing it out forever, the relationship between
PCT and its power of explanation sans mental-calculator for catching a
ball, and PCT's potential explanatory power for how language really works?

As one of those millions, I'll note that one difference between language
and, say, reaching for a coffee cup is that the *apparent* need for models
and calculation is much greater in language. Language clearly has a lot
of purposes, but I'd maintain that the fundamental one (responsible for
the major differences between language and other kinds of animal communication)
is remedying perceived asymmetries in knowledge: I perceive that you
don't know something, I want you to know it, so I say it; or I perceive
that you know something that I don't know, I want to know it, so I ask
you about it. But since we can't have continuous and full perception
of each other's knowledge states, a lot of `assumption and refutation'
goes on. So if I say `The butter is in the refrigerator', and you're
standing there, I henceforth assume that you know where the butter
is (hypothesis) *unless* you do something that clearly indicates that
you don't (stand there stupidly with a fresh piece of toast, seeming
to be looking around for something ...).

I think that grammatical rules can be in part understood as a gadget for
enabling people to make assumptions about what people know, based on
what has been said, but alongside the grammatical aspect of language there's
the `interactional' aspect, studied in `Conversation Analysis', which I don't
know much about, but would like to know more. For example people don't
like to shoot their assertions into the void; if you tell someone something
(a `telling', in CA terminology, our local practitioner tells me), they're
not supposed to just sit there, but make some kind of acknowledgement, which
at least indicates that they understood something. Grammatical theory
has nothing in it about interaction, and very little about minimization
of calculations (twice in the last 30 years we have had people proposing
structures with a zillion levels of embedding for simple transitive sentences
like `Floyd broke the glass'); Conversation Analysis is highly
descriptive and rather aggressively a-theoretical (kind of a disincentive
to study it), and current PCT doesn't have a whole lot to say about
the perceptions and errors that drive `communicative activity', so there's
certainly plenty to think about in this area (thinking something useful
and intelligent is maybe not so easy, however).

  Avery.Andrews@anu.edu.au

[From Oded Maler (970128)]

Bill Powers (970127.0900 MST)]

Oded Maler 970121 --

"Something that resembles an inverse" is a long way from "inverse." See my
post to Hans today.

I am still making my first steps in understanding control theory. I'll
share with you some kindergarten stuff I learned. You may have the
state of the system as a vector x in R^n and consider discrete time
models. The dynamics of the system can be given by x(t+1)=Ax(t)+Bu(t)
where A, B are some matrices and u is a vector representing the
controller's actions (matrices are just a compact way to represent
interactions between variables, A specifying how the system variables
interact and B how the controller actions influence the dynamics). A
feed-back function (I think, btw, that we use the term in a different
[but symmetrical] sense: I mean the feed-back that the controller
applies to the environment) is something that determines u by
observing x. In the linear case it can be written as u(t)=Fx(t). (We
assume full observability to simplify.) So now, the dynamics of the
external environment + the controller can be written as

x(t+1)= (A+BF)x(t)

Suppose you want x to become the zero vector, so you have to find a
good feed-back function F such that A+BF is an operator that "shrinks"
every x (absolute value smaller than 1, in the case of scalars).
In general if A and B are ok you can find F that makes you reach zero
in n steps for every x.

x(t+n)=(A+BF)^n x(t) =0

You can thus find the/a good F using linear algebra. Of course, there are
many assumptions here which are not true for living control systems and
even for engineering ones. Systems are not linear, the values of F
are bounded, the parameters of A and B are distributed in time and space,
noise, disturbances, there are communication lags and so on. But what
should be adapted from this theory are not the specific technicalities
but the conceptual ideas. A PCT hierarchical controller implements another
F which is perhaps a dirtier object mathematically (although you use
simplifying assumptions in analyzing it) but more physically realizable.
But it might have a certain relation with the dynamics of the physical
world so that it can "invert" and "control" it.
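A concrete instance of this scheme, with made-up 2x2 numbers (a sketch only;
the matrices and the feedback gains below are assumptions chosen so the
algebra comes out cleanly): F is picked so that A+BF is nilpotent, and any
starting state is driven exactly to zero in two steps.

    # Assumed discrete-time plant: x(t+1) = A x(t) + B u(t), with a scalar action u.
    A = [[1.0, 1.0],
         [0.0, 1.0]]
    B = [0.0, 1.0]
    # State feedback u = F x, with F chosen so that (A+BF)^2 = 0 (a "deadbeat" design).
    F = [-1.0, -2.0]

    x = [5.0, -3.0]                      # an arbitrary initial state
    for t in range(3):
        u = F[0] * x[0] + F[1] * x[1]    # controller action from the observed state
        x = [A[0][0]*x[0] + A[0][1]*x[1] + B[0]*u,
             A[1][0]*x[0] + A[1][1]*x[1] + B[1]*u]
        print(t + 1, x)                  # by the second step the state is exactly [0.0, 0.0]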

I think you might want to look at the first chapter of E.D. Sontag,
Mathematical Control Theory, Springer, 1990.

Best,

--Oded

[From Bill Powers (970128.0600 MST)]

Oded Maler (970128) --

You may have the
state of the system as a vector x in R^n and consider discrete time
models. The dynamics of the system can be given by x(t+1)=Ax(t)+Bu(t)
where A, B are some matrices and u is a vector representing the
controller's actions (matrices are just a compact way to represent
interactions between variables, A specifying how the system variables
interact and B how the controller actions influence the dynamics).

This expression is what we call in PCT the feedback function, because it
turns the output of the control system (u) into the input to the control
system (x). That is, the output of the control system feeds back to affect
the input via the environment. I trust that you can adjust your usage of the
term feedback to fit this convention (rather than requiring everyone else to
change theirs).

The following doesn't address the "inverse" question, which I will leave for
later. It addresses the discrete function you write above. As a
representation of a physical system, this function omits an _essential_
consideration.

One thing I've been pointing out is that even though you're using the symbol
t, this equation does not take _physical_ time into account. What is the
length of time that correponds to "1" in "t + 1"? Is it one second? One
nanosecond?

If the environment is such that the vector x can change instantly from one
set of values to any other set, there is no problem -- it doesn't matter
what the physical time interval is, because the system will maintain one set
of values until you get around to computing the next set, whether the
interval is one second or one year, and then the next set of values will
instantly appear. The intervals don't even need to be equal, because the
index t can be incremented by 1 any time you please, at any physical time.
You can pause to print out a page of results, and when you resume computing
the system will continue from where it left off.

However, suppose that u represents a vector force, and x represents a vector
velocity. In that case, we have delta-x = B*u. Delta-x is x(t+1) - x(t), and
the equation should be written

x(t+dt) = x(t) + B*u*dt,

where B = 1/mass, and
      dt = physical time per iteration

Now it _does_ make a difference how fast the calculations are done. If the
"1" represents 0.01 seconds of physical time, then dt = 0.01, but if it
equals 10 seconds, then dt = 10.0. Notice that we have now created a
relationship between the size of the time step (on the left) and the scaling
factor applied to u on the right. It's not obvious that in your version of
the equation, you are asserting that dt = 1. That 1 is not an ordinal
number; it is a length of time, in some units.

The reason this is important is that the equation might be an acceptable
approximation of the physical process with dt = 0.01, but totally
unacceptable if dt = 10. If the output u is varying at 1 cycle per second,
then obviously if you're computing the state of the system only once every
10 seconds you're going to get a completely false picture of the variations
in x.

If dt is made small enough, then the operation of the system should become
independent of the size of dt. As dt becomes smaller, the calculations
should simply give a more and more accurate picture of the variations in x
as u varies. But in the equation as you wrote it, B is a constant of the
system, and dt is absorbed into it. There is no indication that if B changes
because the dt component changes, the operation of the system should be
essentially the same (with small enough dt), but if it changes because the
actual system constant changes, the operation should be different. This
alone is enough to show that your equation is an invalid representation of a
physical system.
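A quick numerical illustration of this point, with assumed numbers. Write the
update with an explicit dt and the computed trajectory barely depends on the
step size; absorb dt into B, as in the t+1 form, and exactly the same code
silently changes the physics whenever the step size changes.

    import math

    def simulate(dt, total_time=0.25, mass=1.0):
        # x is a velocity driven by a force u that varies at about 1 cycle per second.
        x, t = 0.0, 0.0
        while t < total_time:
            u = math.sin(2 * math.pi * t)     # the output, as a function of physical time
            x += (1.0 / mass) * u * dt        # x(t+dt) = x(t) + B*u*dt, with B = 1/mass
            t += dt
        return x

    # With dt explicit, stepping at 0.01 s and at 0.001 s give nearly the same answer:
    print(simulate(dt=0.01), simulate(dt=0.001))
    # Drop the dt from the update (absorb it into B) and the two step sizes would
    # give answers differing by a factor of ten for the same physical situation.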

There's nothing wrong with representing a continuous system as a series of
discrete states. But you have to remember that the underlying system is
continuous and is described by functions of physical time, not iteration
index. The assumed units of the "1" in "t + 1" make a tremendous difference,
a difference that is simply not apparent when you do your computations as
they are done in the model that Hans presented, and that you echo above.
That simple little "dt" provides the link between the conceptual system
represented in discrete steps and the physical system that operates
continuously.

A
feed-back function (I think, btw, that we use the term in a different
[but symmetrical] sense: I mean the feed-back that the controller
applies to the environment) is something that determines u by
observing x. In the linear case it can be written as u(t)=Fx(t). (we
assume full observability to simplify). So now, the dynamics of the
external environment + the controller can be written as

x(t+1)= (A+BF)x(t)

Suppose you want x to become the zero vector, so you have to find a
good feed-back function F such that A+BF is an operator that "shrinks"
every x (absolute value smaller than 1, in the case of scalars).
In general if A and B are ok you can find F that makes you reach zero
in n steps for every x.

x(t+n)=(A+BF)^n x(t) =0

This is not the situation as Hans presented it. When Hans used the term
"inverse" he meant it literally: the mathematical inverse. If y = Ax, then
the inverse is written as x = (A^-1)y, where (A^-1) is the inversion of the
matrix A: adj(A)/|A| (I've been looking at my old text on matrix algebra).
Suppose you say there is a reference vector r, which is multiplied by the
matrix A^-1 to produce the output u. Then u is multiplied by the matrix A,
representing the environment, to produce x. In that case

x = A(A^-1)r = Ir = r, where I is the unit matrix.

Thus the controlled variable x will follow the variations in r perfectly.
This, apparently, is the big insight of "modern control theory." If you can
somehow compute the inverse matrix, and if multiplying r by the inverse
matrix doesn't generate any physically impossible states such as infinite
velocity or acceleration (the point I addressed in a previous post), then
the controlled variable will EXACTLY follow variations in the reference
signal. In our initial discussions of this model, Hans made a big deal out
of this exactness, pointing out that a PCT model always had to have at least
a little error in it, while the MCT model could achieve _perfect_ control. I
hope it's evident now that there are certain difficulties with this claim.
It is, in fact, mathematically sophisticated but physically naive.

---------------
What you are calling a "sort of inverse" is quite different from what Hans
is describing. In looking for a function that will "shrink" the value of x
toward zero, you are doing something more like what the PCT model does, not
looking for a perfect correction in one jump but simply trying to assure
that the final state will be reached in some reasonable time. However, your
treatment is deficient in that it allows for only one reference condition: x
= 0. What we need is a system that will maintain x = r, where r is an
arbitrary function of time. In the PCT model this is achieved by making u
not a function of x, but of r - x (or x - r, if you want to adjust all the
other signs accordingly). So if the environment function is E and the
organism function is O, our two equations are

x = E(u)

u = O(r - x)

Now what you need to find is a function O that will guarantee the
"shrinkage" of (r - x) to zero in some usefully-short time. This function
will NOT, in general, be the inverse of the function E.

[Incidentally, for bystanders, the inverse of f(x) is NOT 1/f(x). The
notation f^-1(x) is not the same thing as f(x)^-1. The first notation means
the function obtained by solving y = f(x) for x; the second means the
reciprocal of the value of f(x). If y = ax^2, the inverse relationship is
x = sqrt(y/a).]
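A minimal sketch of these two equations, with an assumed nonlinear environment
function E and a simple integrating output function O (the gain, dt, and the
form of E are made up for illustration). Nothing here computes, or even knows,
the inverse of E; the loop simply keeps shrinking r - x.

    def E(u):
        # Assumed environment function: nonlinear, monotonic, never inverted.
        return 0.5 * u + 0.02 * u * abs(u)

    r = 8.0                        # reference value
    u, dt, gain = 0.0, 0.001, 50.0
    for _ in range(5000):          # five simulated seconds
        x = E(u)                   # x = E(u)
        u += gain * (r - x) * dt   # u = O(r - x): an integrator, not E's inverse
    print(x, r - x)                # x ends up very close to r; the error is tiny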
------------------------
Last point. We still need to consider disturbances. A disturbance, in PCT,
is defined as an _arbitrary_ variable, a function of time, which adds its
effects to the value of x. It represents the sum of all effects on x that
are not due to the control system's own output. The value of x is the sum of
the effects of all disturbances plus the effects of u acting through the
feedback path.

When we started the discussion of Hans' model, it was evident to me that
this model contained no provision for a disturbance of this general kind.
There were disturbances, but they were treated as random noise, their
effects being mainly to reduce the speed of convergence of the adaptive part
of the model. Only their variance was considered, and their mean value was
assumed to be zero. Because the individual fluctuations of the disturbance
were simply averaged out, there was never any question of the system
counteracting the effects of the disturbance. Disturbances of x simply
caused x to fluctuate. The idea that the control system could _reduce_ these
fluctuations in x was absent.

To represent these disturbances, your equation should be written

x(t+n)=(A+BF)^n x(t) + C*d(t) = 0

   where d(t) is arbitrary, and in general non-analytic.

Suppose there is an arbitrary disturbance d(t). We can partition the
bandwidth of the variations into two parts: a slow part and a fast part. By
definition, the fast part causes fluctuations in x that the system can't do
anything about. These fluctuations appear as corresponding fluctuations in
x, and have to be treated statistically, as random variables. That was the
only kind of disturbance initially considered in Hans' discussion.

The slow part of the disturbance, however, by definition involves variations
slow enough to be comparable with the speed of operation of the control
system, and so potentially can be counteracted. That is, it is physically
possible for u to vary fast enough to cancel out all the slow variations in
d(t) -- that is what we mean by the "slow" part of the disturbance.

The question is then, how should the system be organized to accomplish this
cancellation of the slow part of the disturbance? This cancellation is
essential, because d(t) is an _arbitrary_ function, and its amplitude can be
as large as the system's output u (no system can maintain control when d(t)
is larger than the maximum possible u(t)). The bandwidth of the slow part of
the disturbance extends to zero, meaning that there can be constant
components and slow components varying on any time scale. The average value
is not guaranteed to be zero.

Obviously, if there were a signal generator inside the control system that
could generate a d' exactly in synchronism with the real disturbance, but in
the opposite direction, then it could cause additional variations in u that
would just cancel the effect of the disturbance. So in principle, just
looking at the problem naively, it would seem that Hans' model could handle
arbitrary disturbances that are within the control bandwidth of the system.

Hans actually produced a model that could derive the correct value of the
modeled disturbance and apply it to cancel the effects of the real
disturbance very closely. This was very impressive. However, it turned out
that in order to do this, it was necessary to make the "adaptation" of that
part of the model occur very fast, so fast that when the input was cut off,
the model lost control about as fast as the PCT model lost control. So the
major advantage of the simulation-based model, the ability to continue
maintaining almost the right outputs after loss of input, was lost. Of
course the disturbance was always of the "slow" kind -- within the
capabilities of the output variations to oppose them.

But there was another problem which was hidden by the simple nature of the
environmental feedback function. Somehow, the disturbance was introduced in
just the right place in the model's internal simulation so that its effects
showed up properly in the output. By "properly" I mean that the output
effects were adjusted so that _after they had passed through the external
system_ the result was equal and opposite to the effects of the disturbance.
This was easy to achieve since only a constant of proportionality was involved.

Disturbances, however, can occur anywhere in the link between u and x. This
means that the internal simulation must contain provisions for simulated
disturbances _everywhere they might occur_. This vastly increases the
complexity of the problem of simulating disturbances. It also makes the
problem of "system identification" far more difficult, because somehow the
controlling system must find out about the existence of arbitrary
disturbances even when (as is usually the case) there is no sensory
information about the cause of the disturbance, and the disturbance occurs
only for short periods at unpredictable intervals.

The main problem is that if the internal simulation doesn't contain a
simulated disturbance in the right place, the real disturbance will simply
affect x and the system will do nothing about it -- even though it could
vary its output fast enough to do so. So this model of behavior relies on
the ability of the controlling system to construct an internal simulation of
_every disturbance that can possibly occur_.

While this is conceivable, at some point the theoretician must pause to
wonder whether the baroque complexities that are being generated are any
longer believable as a model of the real system. If a simpler way to achieve
the same result is available, even if the result is not quite so perfectly
exact, the balance must sooner or later swing to preferring the simpler model.

The PCT model, of course, requires no information about the disturbance. It
can be adjusted for optimum functioning in a disturbance-free environment,
and the first time a disturbance occurs it will be strongly and quite
accurately (if not perfectly) opposed. That means ANY disturbance that is
not too fast or too large (provisos that apply to ALL control system
models). And this is accomplished without any computation of the
mathematical inverse of the environmental feedback function.
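Extending the same sketch as before (same assumed E, gain, and dt): add an
arbitrary slow disturbance directly to x, give the system no information about
it whatever, and the same integrating output function opposes it. The only
assumption is the one stated above -- that the disturbance is slow compared
with the loop.

    import math

    def E(u):
        return 0.5 * u + 0.02 * u * abs(u)        # assumed environment function, as before

    def d(t):
        return 2.0 * math.sin(0.8 * t) + 0.5 * t  # an arbitrary slow disturbance (drift + wave)

    r, u, dt, gain = 8.0, 0.0, 0.001, 50.0
    worst_error = 0.0
    for step in range(6000):                      # six simulated seconds
        t = step * dt
        x = E(u) + d(t)              # the disturbance adds its effects to x
        u += gain * (r - x) * dt     # the system only ever sees the error r - x
        if t > 1.0:                  # ignore the brief start-up transient
            worst_error = max(worst_error, abs(r - x))
    print(x, worst_error)   # x stays within a few hundredths of r while d wanders over several units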

Best,

Bill P.

[From Rick Marken (970128.0900)]

Me:

The nervous system controls perceptual inputs; it doesn't
calculate actions.

Scott Stirling (970127)--

So no mathematical calculations are involved, right?

Excellent questions in your post, Scott. Avery.Andrews (970128, Eastern
Australia), one of our other resident linguists, gave an excellent
answer. But I think this topic is so important that I will try to give
one myself.

Control theory is about how people produce intended results (such as
a caught ball). PCT says this is done by control of a hierarchy of
perceptual variables. Blom et al (the "Modern Control Theory" -- MCT
--crowd) say it is done by calculation of outputs based on a "model" of
the environment.

Here is a way to look at the MCT approach to the production of
intended results. MCT recognizes that intended results are produced
by the outputs of the system and that there is a functional
relationship between outputs and results. This functional relationship
is called "the environment function" or the "feedback function". Here
is an example of an environment function that relates a human output
(neural current) to a result of that output (distance of a ball from
the hand):

Result      + |<---------------------------------*
(Hand/Ball    |                              *   |
 Distance)    |                          *       |
              |                      *           |
    (caught) 0|<---------------- *               |
              |              *   |               |
              |    f(o)   *      |               |
              |        *         |               |
              |     *            |               |
            - |  *               v               v
              +------------------------------------------
                    Output (neural current, spikes/sec)

The stars map out the "environment function", f(o), that relates
what your nervous system does (generates neural currents) to a result
of that neural activity (movement of your hand to a position relative
to the ball). Note that the function is VERY non-linear, though it
is monotonic.

According to MCT, if you intend to produce a particular result (such
as the result "ball zero distance from hand" which I call "caught" on
the graph) then you have to _compute_ the neural currents that you
have to generate to produce this result. In order to do this, you
have to have a model of f(o) in your brain (this is the "model" in
"model based control"). If you have a model of f(o), then you compute
the output required to generate that result by "working back" from the
intended result (the desired value of f(o)) to the output that produces
that result (the corresponding value of o). That is, you find the inverse
of f(o); this gives the value of output that produces the intended result.

Computing the inverse of f(o) is like moving from the intended result
on the y axis above, along the horizontal line until you hit f(o). At
that point you move vertically down to the output axis to find
the value of output that produces that value of f(o) (the intended
result).
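As a sketch of that "working back" (the kind of computation MCT attributes to
the brain, and that PCT argues the brain does not perform), here is one way to
do it numerically: given a model of a monotonic f(o), find by bisection the
output whose result equals the intended one. The particular f and the search
bounds are assumptions for illustration only.

    def f(o):
        # An assumed monotonic, strongly nonlinear environment function:
        # hand/ball distance falls as the neural output rises.
        return 10.0 - 12.0 * (o ** 3) / (1.0 + o ** 3)

    def invert(intended, lo=0.0, hi=10.0, steps=60):
        # Work back from the intended result to the required output by bisection.
        for _ in range(steps):
            mid = (lo + hi) / 2.0
            if f(mid) > intended:   # f is decreasing, so a result that is still too
                lo = mid            # large means the output must be larger
            else:
                hi = mid
        return (lo + hi) / 2.0

    o_needed = invert(0.0)          # output required for "caught": zero distance
    print(o_needed, f(o_needed))    # f(o_needed) is essentially 0

Note that this works only as well as the internal model of f(o) does, and it
assumes no disturbance is acting -- which is just where the objections below
come in.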

If you intend to produce a different result (such as having the ball go
over your hand by a certain amount) then you start from this new result
(which is the value of f(o) at the + sign), move horizontally to the
f(o) curve and then move down to find the output value that produces
this result.

So MCT says that, in order to produce intended results, we have to
have a good "model" of f(o), the environmental function that relates
neural activity (the only output that is completely determined by the
control system itself) to the intended result. PCT questions this
model on two grounds: 1) it seems VERY unlikely that the brain can
do the computations required to estimate f(o) OR to compute its
inverse and 2) this model fails to explain how we produce intended
results in the context of _unpredictable_ and _undetectable_
disturbances. The result on the y axis is not just a function of our
outputs; it is also a function of forces that are independent of our
outputs -- such as gusts of wind -- that are completely unpredictable.
Despite such disturbances, we are able to reliably produce the results
we intend; we can, for example, catch a ball rather reliably on a windy
day.

The MCT approach to the production of intended results is based on
open-loop concepts of behavior that were (unfortunately) borrowed by
control theorists (who had been moving in the correct direction until
then) from psychologists.

A real closed loop control system doesn't compute outputs;
the outputs of a closed loop control system are always proportional
to r-p, the difference between the intended, r, and actual, p, results
of output. PCT explains the behavior that MCT says is computed output
in terms of controlled INPUT. A control system produces intended
results because it can PERCEIVE the state of those results (for
example, it can perceive the distance of the ball from the hand) and
it can vary its outputs appropriately (meaning, with the appropriate
SIGN and GAIN) to keep "pushing" the perception of the result toward the
intended state, r.

The significant "computation" that goes on in a control system occurs
on the INPUT (perceptual) side of the system and in the ORGANIZATION
of the relationship between control systems. In order to catch a ball
you have to design a control system that can _perceive_ the state of
the intended result; that can _continuously_ perceive the distance of
the ball from the hand, for example. You also have to design other
control systems that can control perceptions of acceleration and
velocity of the hand (to move the hand relative to the ball),
perceptions of convergence of the eyes (to keep the ball and hand
"centered" in vision), etc.

All of these aspects of the design of a hierarchical perceptual
control system, one that can produce intended results in a
disturbance-prone environment (and in an environment where even the
environment function, f(o), can change!), are embodied in Bill Powers' Little Man
demo that Avery mentioned. When you understand how the Little Man
works you will have gone a LONG way toward understanding how HPCT
works and (I believe) how humans work, too.

Best

Rick