Predictive control; brain function; loop delay

[From Bill Powers (960119.0830 MST)]

Martin Taylor 960118 14:05 --

     "Inside the control system" _must_ include the environmental
     feedback path, and there the lags can be anything from milliseconds
     to years.

I think that when we say "people are control systems" we don't mean to
include every different environment in which they may find themselves.
At least I don't. I speak of the control system as separate from its
environment, although as you say the whole control loop has to be
considered when analyzing a control process. I wonder what you thought I
meant in my next paragraph, which began "The other use claimed for
prediction is to produce an action whose effects on the controlled
variable are delayed."

     When we make an investment for the purpose of fulfilling a far
     future (for some of us) need for money in retirement, we perceive
     what we imagine to be a future with and a future without that
     investment. Our prediction may be wrong, but our actions are based
     on a predicted perception and a predicted reference value for that
     perception.

From the standpoint of the system at the time the retirement income is
to begin, the action of saving must be far in advance of the state of
the controlled variable. In other words, to start the retirement income
just as the employment income ceases, it is necessary to have begun the
actions of saving and investing for retirement several decades
previously.

As I said to Hans, handling past, present, and future in a predictive
control system gets confusing. We never actually control a future event.
What we control in present time is a presently-imagined future, not the
future that will actually occur. Starting with our current financial
status and our current knowledge of economics, we imagine a future in
which we retire at some standard age, and we make a plan to save for
that future -- NOW -- at some rate. We then calculate what our
retirement income will be, based on an assumed average interest rate
over those years, other sources of income such as (in the U.S.) Social
Security, and so forth. Periodically, as time passes, we recalculate the
retirement income as the current situation changes and our estimates of
future income change, and we may change our current rate of saving to
adjust the calculated income to fit our changing _PRESENT_ conception of
how much income we will want.

This is very much model-based control as Hans described it, including
running the model at high speed to calculate its state at some future
time and adjusting our present actions to control the predicted outcome.
It's the same principle as the instrument-assisted predictive airplane
landing system.
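
To make the arithmetic concrete, here is a small C sketch with invented
numbers (saving rate, interest rate, target) standing in for the
pencil-and-paper recalculation described above: each year the model is run
forward at high speed to predict the balance at retirement, and the present
saving rate is adjusted so that the predicted outcome matches the target.

/* A sketch only: all figures are invented examples. Each "year" the model
   is run forward at high speed to predict the balance at retirement, and
   the PRESENT saving rate is adjusted to control the predicted outcome. */
#include <stdio.h>

/* Predict the balance at retirement from the present balance, a constant
   yearly saving, an assumed average interest rate, and the years left. */
static double predict(double balance, double saving, double rate, int years)
{
    for (int y = 0; y < years; y++)
        balance = (balance + saving) * (1.0 + rate);
    return balance;
}

int main(void)
{
    double balance = 0.0;      /* present savings                         */
    double saving  = 5000.0;   /* present yearly saving rate (the action) */
    double rate    = 0.06;     /* assumed average interest rate           */
    double target  = 1.0e6;    /* desired balance at retirement           */
    int years = 30;

    for (int year = 0; year < years; year++) {
        int left = years - year;
        /* Run the model at high speed: where does the present plan end up? */
        double predicted = predict(balance, saving, rate, left);
        /* How much one extra dollar saved per year adds by retirement.     */
        double per_unit = predict(0.0, 1.0, rate, left);
        /* Adjust the present action so the predicted outcome hits target.  */
        saving += (target - predicted) / per_unit;

        /* One real year passes (no disturbances in this sketch).           */
        balance = (balance + saving) * (1.0 + rate);
    }
    printf("balance at retirement: %.0f  (target %.0f)\n", balance, target);
    return 0;
}

With no disturbances and a correct model, the plan holds after the first
correction; the point of the periodic recalculation is precisely the
disturbances and changing estimates described above.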

     It's clear that a perception NOW has to match a reference NOW, and
     that unless the reference value for a perception is predicted,
     there is no way that a perceptual prediction can be useful.

That's a very good point. The system must imagine its own state in the
future (including reference signals) as well as the state of the
environment.

     But is it not reasonable to suppose that a future reference value
     can be predicted and compared with a predicted perception to
     produce an output NOW that will be delayed through the environment
     so that when the time comes, the perception will match its
     reference apart from the unpredicted aspects of disturbances that
     happened in the interim?

Yes, this is how I think it works, too. However, people seem to have a
lot of difficulty in imagining reference states different from those
they have now. What people predict mostly, I think, is that they will
always want the same things they want now. The twenty-year-old hurtling
down the mountainside on skis can hardly envision a future in which the
thought of doing that would fill him with alarm, in which he is more
concerned for who will take care of his children than with having a
brief downhill thrill. I'm sure it's hard for young people making
$50,000 per year to realize that when they retire they will need
investments worth perhaps three million dollars 30 years from now to
achieve the income they will then need just to live passably well. It's
hard for the young person to predict future attitudes toward fame,
wealth, pursuit of nubile members of the opposite sex, loud music, good
music, drinking, morals, and even cleanliness.

     I don't see how you can do it with a predictive filter based solely
     on the error signal... It seems to me that other inputs have to be
     used, not just the error signal.

You're right; I was simply trying to get the focus onto what actually
has to be predicted, which is the output needed now in order to bring
about the imagined future state. When the time arrives in which the
controlled variable is finally to be matched to its reference state, the
action that accomplished this is far in the past.

It strikes me that the kind of prediction we're talking about now isn't
really a built-in brain function; it's something we learn, a policy we
can adopt or not adopt. When you think of the predictive control process
that you and I applied (such as it was) to prepare for retirement (and
that many others omit), it was something we had to persuade ourselves to
do, and when we did it, we did it with pencil and paper and using
learned ways of thinking about how the world works -- calculating
compound interest and that sort of thing. I don't think that brains just
naturally compute compound interest 20 years into the future. If you
didn't learn arithmetic in school, you couldn't do it even if you wanted
to. It's not a natural brain function. Only the ability to learn such
things is natural.

It also strikes me that there is probably too much emphasis on
prediction in our cultures. Some people seem to be very insecure if they
can't predict exactly what is going to happen next, forgetting that as
events unfold they are still going to be conscious and functioning, and
will be able to deal with most disturbances as they arise. The less
confidence you have in your ability to control in present time, the more
you will want to predict what is going to happen so you can be prepared
for it. There are certainly advantages in predicting some things -- your
example of planting crops is a good one. On the other hand, substituting
open-loop prediction for control to the extent that you forget how to
control in present time is as dangerous as not predicting at all. What
will you do if you have staked everything on getting a good crop, and
the rains fail? If you're a farmer competing in the market, what will
you do if you do get a good crop -- but all the other farmers do, too?

By its very nature, predictive control is much less reliable than
present-time control, and the longer the period over which the
prediction is made (as you have frequently said), the less reliable the
prediction. There's probably some relatively short time over which
prediction can be useful, and beyond which reliance on prediction
becomes a liability. You and I were betting that the banks would remain
functional and the government would manage to keep its promises despite
several complete changes of personnel and policies. Look at the people
retiring right now after 20 or 30 years of service to the same company,
only to find that the company has spent all the retirement funds to pay
off debts from a takeover.

----------------------------------------
There's an even deeper question here. Should we think of prediction and
computation as part of a model of the brain, or simply as evidences of
the sorts of behavior that a properly-equipped brain can learn to do?
"Prediction" and "computation" are words we used to cover an extremely
wide range of phenomena. If there is a phase-advance neural circuit
built into a spinal reflex, we speak of it as being predictive or
anticipatory in function, classifying it exactly the same way as we
would classify calculating the derivatives of a curve, using calculus
and pencil-and-paper symbol manipulations, and extrapolating the curve
to a time t + tau. Are we to consider both kinds of "prediction" to be
basic properties of the physical brain? I see them as very different.

It's hard to find a general way to draw the line between a built-in
brain function and a process that is carried out by _using_ built-in
brain functions. In many cases it's easy to see the difference: speaking
English is certainly not a built-in brain function, and just as
certainly there have to be certain basic built-in functions to permit us
to speak or write English or any other language. Any person can learn to
drive a car, but only because we have abilities to control that are more
basic than the particular things we learn to perceive and control.

It seems to me that in trying to model brain functions, we are
constantly getting confused between the _properties_ of an organism and
the _activities_ of an organism. Is imprinting a property of young
chicks, or is it that imprinting is a mode of behavior that chicks can
learn because of underlying properties of their sensory and motor
systems? We all learn to balance and walk, but are balancing and walking
basic properties of human behavior? Or could it be that the physical
construction of the brain and body simply make it easy for these
particular kinds of activity to be learned and carried out?

What are the basic capacities needed to make cognitive predictions
possible? You have already mentioned one, imagination. In a more
elaborate form, it is model-making. Then we need to be able to run the
model, or perform calculations in some way, to derive a picture of
variables as they will be some time in the future. But what basic
capacities are needed in order to do that? If we could not do
arithmetic, could we make cognitive predictions or extrapolate? Somehow
it seems to me that our models of such things make heavy use of specific
procedures that we have all learned in school; if there were no such
thing as school, if nobody had taught us how to do such calculations,
how would we do these things? Would we do them at all? I don't think
that mathematics is an innate brain function. If we want to model the
brain, we have to ask what capabilities are necessary in order that
mathematics be a doable activity of the brain.
------------------------------------
RE: discrete calculations

Do you really mean "independent?" If the even-numbered values were
independent of the odd-numbered values, you could arbitrarily change
the one without changing the other. Somehow I don't think this would
work.

     What do you mean "work." It's what happens.

If X[n] depends on X[n-1], then an odd value depends on an even value,
even if you then write "X[n-1] depends on X[n-2]." You can show that
X[n] depends on X[n-2], but this does not absolve X[n-1] of a role in
determining the dependency.
--------------------
     ... your comments seem to be an argument for treating the
     biological control loops as analogue ones, a point with which I am
     in full agreement. My comment is purely technical, about what you
     need if your discrete simulations are believably to behave like the
     analogue loops they are supposed to simulate.

Yes, I was expanding on your comment. However, there is another point,
which perhaps I did not get across.

     In doing any discrete simulation of analogue effects, one has to be
     careful to reduce the aliasing that occurs when there is energy
     above the Nyquist limit. The Nyquist frequency is 1/(2T), where T is
     the sampling interval.

Before you can do this calculation, you have to pick a sampling
interval. If you assume that the loop delay is the natural sampling
interval, you will miss the point that any variable in the loop can
change states many times during the loop delay. The only solution is to
treat the loop delay simply as another physical variable, and sample
rapidly enough to pick up changes that occur faster than the interval
set by loop delays.

     Incidentally, how does Simcon deal with this issue? Our Control
     Builder for the Mac dealt with it in a rather heavy-handed way, by
     filtering every signal as it emerged from whatever process computed
     it. There's no need to be so careful, if it is guaranteed that the
     signals are adequately bandlimited. If they are, you don't have to
     filter them at all.

In Simcon, every function is assumed to operate at the same time as all
other functions and to have a delay time of 1 iteration. This is done by
computing the new output of each function on the basis of the old
outputs of other functions which feed into it. Then, before the next
iteration, all the "new" output values are copied into the "old" values.
So in effect, all functions operate simultaneously.
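
A minimal C sketch of that two-phase scheme (illustrative only, not Simcon's
actual code), using a made-up three-function loop:

/* Phase 1 computes every new output from the OLD outputs of the functions
   feeding it; phase 2 copies new values over old ones, so all functions
   act simultaneously, each with a one-iteration delay. */
#include <stdio.h>

#define N_FUNCS 3   /* a hypothetical three-function loop */

int main(void)
{
    double old_out[N_FUNCS] = {0.0, 0.0, 0.0};
    double new_out[N_FUNCS];
    double dt = 0.01;         /* physical time represented by one iteration */
    double reference = 1.0;

    for (int step = 0; step < 2000; step++) {
        /* Phase 1: new outputs from old outputs only.                    */
        new_out[0] = reference - old_out[2];               /* comparator   */
        new_out[1] = old_out[1] + 10.0 * old_out[0] * dt;  /* integrator   */
        new_out[2] = old_out[1];                           /* environment  */

        /* Phase 2: copy new values into old values before the next step. */
        for (int i = 0; i < N_FUNCS; i++)
            old_out[i] = new_out[i];
    }
    printf("perception = %.4f  (reference = %.1f)\n", old_out[2], reference);
    return 0;
}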

The physical time represented by one iteration, dt, is part of every
time-dependent function. The value of dt is chosen to represent the
function with the least delay in the loop. Longer delays are then
represented as an explicit delay function, n delays of duration dt. Of
course dt can be chosen to be as short as desired; every function would
then be followed by an explicit delay function in units of dt.
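
As a sketch of one way such an explicit delay function can be represented
(the names and the circular-buffer implementation here are illustrative,
not taken from Simcon), a transport delay of n iterations of duration dt:

/* A transport delay of n*dt seconds, implemented as a circular buffer.
   An impulse pushed in at t = 0 emerges exactly n iterations later. */
#include <stdio.h>
#include <string.h>

#define MAX_TAPS 1024

typedef struct {
    double buf[MAX_TAPS];
    int    n;      /* delay in iterations: n = delay_seconds / dt */
    int    head;
} Delay;

static void delay_init(Delay *d, double delay_seconds, double dt)
{
    memset(d, 0, sizeof *d);
    d->n = (int)(delay_seconds / dt + 0.5);
    if (d->n < 1) d->n = 1;
    if (d->n > MAX_TAPS) d->n = MAX_TAPS;
}

/* Push the newest sample in, get the sample from n iterations ago out. */
static double delay_step(Delay *d, double input)
{
    double out = d->buf[d->head];
    d->buf[d->head] = input;
    d->head = (d->head + 1) % d->n;
    return out;
}

int main(void)
{
    Delay d;
    double dt = 0.001;            /* chosen much shorter than the delay  */
    delay_init(&d, 0.02, dt);     /* a 20 ms transport delay = 20 steps  */

    for (int step = 0; step < 40; step++) {
        double in = (step == 0) ? 1.0 : 0.0;   /* an impulse at t = 0 */
        double out = delay_step(&d, in);
        if (out != 0.0)
            printf("impulse emerges at step %d (t = %.3f s)\n",
                   step, step * dt);
    }
    return 0;
}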

The only penalty for choosing a very short dt is that the simulation
will take longer than necessary to run. But this is a practical problem,
not anything of fundamental theoretical significance. Generally, to be
on the safe side, I run a simulation with dt set to one value, and then
again with dt set to half that value. If there is no discernible change
in the behavior of the simulation, I use the first value of dt. If there
is a change, I halve dt and try again.
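
The same policy can be written as a small loop. This C sketch uses an
invented first-order control loop; the halving-and-comparison procedure is
the point, not the particular loop being simulated:

/* Run the same simulation with dt, then dt/2, and keep halving until the
   final state stops changing appreciably. */
#include <stdio.h>
#include <math.h>

/* Simulate a simple first-order control loop (Euler integration) for
   0.3 s of model time and return the final value of the perception. */
static double simulate(double dt)
{
    double p = 0.0, r = 1.0;            /* perception and reference */
    int steps = (int)(0.3 / dt + 0.5);
    for (int i = 0; i < steps; i++)
        p += 5.0 * (r - p) * dt;
    return p;
}

int main(void)
{
    double dt = 0.05;
    double prev = simulate(dt);
    for (;;) {
        double next = simulate(dt / 2.0);
        printf("dt = %-10g final perception = %.5f\n", dt, prev);
        if (fabs(next - prev) < 1e-3) {   /* no discernible change */
            printf("accepting dt = %g\n", dt);
            break;
        }
        dt /= 2.0;
        prev = next;
    }
    return 0;
}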

     Yes, but that's a completely different phenomenon, based on
     ensuring that the absolute value of loop gain is sufficiently low
     for frequencies at which the loop delay and phase shift would
     otherwise bring the real part of the gain above +1. The RC filter
     is a low-pass filter with a drop-off of 6dB per octave, so by
     making the cut-off frequency low enough, you can always get the
     real part of the loop gain low enough at the critical frequency for
     your actual loop delay.

This is how the slowing factor works, too. All you have to do is make
the factor in the denominator larger than the minimum value. If dt is
made small enough, the R-C slowing will match that of the real system
while the slowing factor is still far larger than the minimum value
needed to prevent computational oscillations. The expression using the
slowing factor IS a low-pass filter (an approximation to one) with a
dropoff of 6 dB per octave: the amplitude is halved, and the power
quartered, with each doubling of frequency.
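
Assuming the slowing expression has the usual leaky-integrator form,
o = o + (in - o)/s, a short C sketch shows the roll-off directly: driving
it with sine waves well above its cutoff, each doubling of the input
frequency reduces the output amplitude by roughly 6 dB.

/* Measure the steady-state amplitude ratio of the slowing expression
   (assumed leaky-integrator form) at a few frequencies, one octave apart. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double PI = 3.14159265358979;
    double s = 50.0;                             /* slowing factor        */
    double freqs[] = {0.02, 0.04, 0.08, 0.16};   /* cycles per iteration  */

    for (int k = 0; k < 4; k++) {
        double f = freqs[k], o = 0.0, peak = 0.0;
        for (int n = 0; n < 20000; n++) {
            double in = sin(2.0 * PI * f * n);
            o += (in - o) / s;                   /* the slowing expression */
            if (n > 10000 && fabs(o) > peak)     /* steady-state amplitude */
                peak = fabs(o);
        }
        printf("f = %.2f  amplitude ratio = %.4f  (%.1f dB)\n",
               f, peak, 20.0 * log10(peak));
    }
    return 0;
}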
-----------------------------------------------------------------------
Best,

Bill P.

[Martin Taylor 960119 14:14]

Bill Powers (960119.0830 MST)

Technical points about discrete simulations.

     Do you really mean "independent?" If the even-numbered values were
     independent of the odd-numbered values, you could arbitrarily change
     the one without changing the other. Somehow I don't think this would
     work.

          What do you mean "work." It's what happens.

     If X[n] depends on X[n-1], then an odd value depends on an even value,
     even if you then write "X[n-1] depends on X[n-2]." You can show that
     X[n] depends on X[n-2], but this does not absolve X[n-1] of a role in
     determining the dependency.

If X(n) = f(Y(n-1)) and Y(n) = g(X(n-1)) then X(n) is totally unaffected
by the value of X(n-1). The two sets of values for even and odd n are
completely independent. That's the situation we were discussing. You can't
arbitrarily introduce a dependency of X(n) on X(n-1) where none exists
in the original equations.
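
A small C sketch, with arbitrary illustrative functions f and g, shows the
interleaved independence: changing X(0) alters X at even n only, while X at
odd n, which descends from Y(0), is identical in both runs.

/* X(n) = f(Y(n-1)) and Y(n) = g(X(n-1)): the even-indexed and odd-indexed
   samples form two chains that never influence each other. */
#include <stdio.h>

#define STEPS 8

static double f(double y) { return 0.9 * y + 1.0; }
static double g(double x) { return 0.5 * x - 0.2; }

static void run(double x0, double y0, double x_out[STEPS])
{
    double x = x0, y = y0;
    x_out[0] = x;
    for (int n = 1; n < STEPS; n++) {
        double xn = f(y);   /* X(n) depends only on Y(n-1) */
        double yn = g(x);   /* Y(n) depends only on X(n-1) */
        x = xn;
        y = yn;
        x_out[n] = x;
    }
}

int main(void)
{
    double xa[STEPS], xb[STEPS];
    run(0.0, 1.0, xa);      /* one value of X(0)                */
    run(5.0, 1.0, xb);      /* a very different X(0), same Y(0) */

    for (int n = 0; n < STEPS; n++)
        printf("n=%d  X_a=%8.4f  X_b=%8.4f  %s\n", n, xa[n], xb[n],
               n % 2 ? "identical (odd n)" : "differs (even n)");
    return 0;
}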

    In doing any discrete simulation of analogue effects, one has to be
    careful to reduce the aliasing that occurs when there is energy
    above the Nyquist limit. The Nyquist frequency is 1/(2T), where T is
    the sampling interval.

     Before you can do this calculation, you have to pick a sampling
     interval. If you assume that the loop delay is the natural sampling
     interval, you will miss the point that any variable in the loop can
     change states many times during the loop delay. The only solution is to
     treat the loop delay simply as another physical variable, and sample
     rapidly enough to pick up changes that occur faster than the interval
     set by loop delays.

Whatever the sampling interval you pick, you have to ensure that NONE of
the variables of interest change in such a manner that their spectrum
has significant energy above the Nyquist limit. In a loop, it is almost
a sure bet that changes in any variable have components above 1/(2*(loop delay)).
So if you choose to sample at the loop delay (actually twice as slow as
in the X(n) = f(Y(n-1)) example), you are just about guaranteed to get
nonsense as a result. At least it's nonsense if you try to apply it as
a description of what the analogue loop does.
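
A small C sketch of the folding effect, with an invented 9 Hz component and
a 0.1-second loop delay: sampled once per loop delay, the component
masquerades as a slow 1 Hz oscillation, while finer sampling within a single
loop delay shows what is actually happening.

/* A component above 1/(2 * loop delay), sampled once per loop delay, is
   aliased down to a spurious low frequency. Signal and delays are invented. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double PI = 3.14159265358979;
    double loop_delay = 0.10;    /* seconds: the coarse sampling interval   */
    double dt = 0.001;           /* a sampling interval fine enough         */
    double f  = 9.0;             /* Hz: well above 1/(2*loop_delay) = 5 Hz  */

    /* Sampled at 10 Hz, a 9 Hz component looks like |10 - 9| = 1 Hz. */
    printf("coarse samples (every %.2f s):\n", loop_delay);
    for (int k = 0; k < 10; k++)
        printf("  t=%.2f  x=%+.3f\n", k * loop_delay,
               sin(2.0 * PI * f * k * loop_delay));

    printf("fine samples across one loop delay (every %.3f s):\n", 10 * dt);
    for (int i = 0; i <= 10; i++)
        printf("  t=%.3f  x=%+.3f\n", i * 10 * dt,
               sin(2.0 * PI * f * i * 10 * dt));
    return 0;
}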

     In Simcon, every function is assumed to operate at the same time as all
     other functions and to have a delay time of 1 iteration. This is done by
     computing the new output of each function on the basis of the old
     outputs of other functions which feed into it. Then, before the next
     iteration, all the "new" output values are copied into the "old" values.
     So in effect, all functions operate simultaneously.

But this is irrelevant to the question of how Simcon assures that the
spectrum of any variable has no substantial components at frequencies
above 1/(2*dt). And THAT is the critical question if your Simcon simulation
is to tell you anything about the analogue system it is simulating.
As I said, in Control Builder, we ham-handedly filter every signal, but
that's not an optimal approach.

     The only penalty for choosing a very short dt is that the simulation
     will take longer than necessary to run. But this is a practical problem,
     not anything of fundamental theoretical significance. Generally, to be
     on the safe side, I run a simulation with dt set to one value, and then
     again with dt set to half that value. If there is no discernible change
     in the behavior of the simulation, I use the first value of dt. If there
     is a change, I halve dt and try again.

That's an effective policy. It ensures that any aliasing you might have
had with the longer sampling interval is unimportant. The penalties I am
concerned with are those that occur with dt values big enough to lead
to important aliasing, and that's the problem that showed up in your
computation of "optimum slowing factor", as we showed together in an
exchange of messages some years ago. It's quite different from the
RC filter effect that stabilizes the analogue filter. Putting the
slowing factor small enough provides a low-pass filter at one point
in the loop, making sure that the simulation actually does look like
the analogue loop. If the analogue loop has such a filter, then dt will
be short enough that there is no important aliasing. But not having
aliasing affect the simulation is different from having an analogue
filter that is stable because the high-frequency gain is low enough.

All in all, what I'd like to achieve is an understanding on the part
of simulators that they should do what you said you do, in the paragraph
quoted above (or anything that has the equivalent effect of ensuring that
your results are not contaminated by aliasing and can therefore be believed
as a description of the analogue loop being simulated).

Martin

[Bill Leach 960119.23:25 U.S. Eastern Time Zone]

[Bill Powers (960119.0830 MST)]

     ... anything of fundamental theoretical significance. Generally, to be
     on the safe side, I run a simulation with dt set to one value, and then
     again with dt set to half that value. If there is no discernible change
     in the behavior of the simulation, I use the first value of dt. If there
     is a change, I halve dt and try again.

Though it would rarely matter, you would also need to try a second prime
divisor (such as 1/3) for dt to ensure that there is no computational error
related to the synchronization rate. Of course even that attempt could
fail, since there is no such thing as a quotient in a digital computer
that is not ultimately related to a power of two.

-bill