Thoughts on PCT, IT, and adaptive control

[From Bill Powers (940527.0700 MDT)]

Some general thoughts on PCT, IT, and adaptive control:

I think I am becoming clearer on the basic difference in approaches
between PCT and statistical approaches to control.

Suppose we have a good model of a control behavior. Solving the
equations of the model for the output of the control system, o, we
will obtain a deterministic equation

o = f(d,r),
    where d = the disturbance waveform,
          r = the reference signal waveform, and
          f = some expression (in Laplace-transform form, for example).

This equation predicts o given the behavior of d and r. The
objective of PCT modeling is to find the form of f.
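
As a minimal illustration of what "solving the equations of the
model for o" means, here is a sketch for the simplest loop I can
think of. Everything specific in it is an assumption made for
illustration, not a model of any real behavior: a static loop with
p = o + d, e = r - p, o = gain * e, which solves to
o = gain/(1 + gain) * (r - d). The names closed_form_output,
simulated_output, gain and slowing are hypothetical. The sketch
checks the solved equation against a direct iteration of the loop.

```python
def closed_form_output(d, r, gain):
    """o = f(d, r) for the static loop p = o + d, e = r - p, o = gain * e."""
    return gain / (1.0 + gain) * (r - d)


def simulated_output(d, r, gain, slowing=0.01, steps=2000):
    """Iterate the same loop equations until the output settles.

    The slowing factor (an assumption) makes the output relax toward
    gain * e instead of jumping there, which keeps the iteration stable
    as long as slowing < 2 / (1 + gain).
    """
    o = 0.0
    for _ in range(steps):
        p = o + d                      # perceptual signal
        e = r - p                      # error signal
        o += slowing * (gain * e - o)  # output relaxes toward gain * e
    return o


if __name__ == "__main__":
    d, r, gain = 3.0, 1.0, 50.0
    print(closed_form_output(d, r, gain))
    print(simulated_output(d, r, gain))
```

The two numbers agree, which is all the sketch is meant to show:
once f is known, o follows from d and r with nothing left over.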

Because the model provides a deterministic relationship, the extent
to which we can predict o is exactly the extent to which we can
predict d and r. If (as is apparently the case in IT) the objective
is _to predict future values of o_, then the problem reduces to
predicting future values of d and r.

There is in general no _a priori_ way of predicting d and r.
Therefore we must treat d and r as random variables. A random
variable can't be predicted in terms of specific amplitudes or
derivatives at specific times, but if its characteristics are
stationary (that is, if its spectral distribution and mean amplitude
do not change with time) we can predict that it will retain these
same characteristics in the future; the "prediction" is merely a
restatement of the assumption of stationarity.

Given the deterministic dependence of o on d and r, we can then
derive the corresponding stationary characteristics of the behavior
of o, because the form of f will show how spectral and mean
amplitude characteristics of d and r will appear after passage
through the function f to produce o. Thus it becomes possible to
predict the spectral characteristics of o (its bandwidth being one
such characteristic) and the mean amplitude of o in terms of similar
characteristics of d and r.
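
A small sketch of this step, under assumptions that are mine and not
part of any particular model: f is linear, r is held at zero, and
the output function is a pure integrator with rate k in a
unity-feedback loop, so that do/dt = -k * (o + d) and, in Laplace
form, O(s) = -k/(s + k) * D(s). The power spectrum of o is then the
power spectrum of d multiplied by |H(jw)|^2. The function names are
hypothetical.

```python
def output_gain(omega, k):
    """|H(j*omega)| for O(s) = -k / (s + k) * D(s), evaluated at s = j*omega."""
    return abs(-k / complex(k, omega))


def output_spectrum(disturbance_spectrum, k):
    """Map a disturbance power spectrum {omega: power} to that of o."""
    return {w: output_gain(w, k) ** 2 * power
            for w, power in disturbance_spectrum.items()}


if __name__ == "__main__":
    k = 10.0
    # Flat disturbance spectrum over a few frequencies (rad/s), illustrative only.
    flat = {w: 1.0 for w in (0.0, 1.0, 10.0, 100.0)}
    for w, power in output_spectrum(flat, k).items():
        print(w, power)
```

In this assumed loop the power in o falls to half at omega = k, so
k is the bandwidth of o: a stationary property of o obtained from
the form of f and a stationary property of d.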

The same applies to the other variables in the control loop, the
perceptual signal and the error signal, once the system equations
are suitably solved to yield the appropriate deterministic equations.

In the same way it is possible to convert any other stationary
measure of d and r into a corresponding stationary measure of any
variable in the control loop. One such stationary measure is
information or uncertainty, formally defined.

As remarked, the degree to which the future behavior of a loop
variable can be predicted is the degree to which d and r can be
predicted, given the deterministic equations. If there is
uncertainty introduced in the control system itself, by noise
generated in the functions or by random variations in the parameters
of the functions, then the ability to predict the future states of
loop variables is correspondingly reduced. Martin Taylor has shown
in a straightforward way how noise generation simply adds to the
uncertainty in the ability to predict d and r, and thus the
uncertainty in the ability to predict any loop variable. Random
variations in parameters would be considerably harder to represent
analytically, but the general idea is clear. What random noise and
parameter variations mean is that even if d and r were perfectly
predictable, there would still be some uncertainty in a prediction
of any loop variable.

There is one other consideration that seems to be behind the IT
approach (and also the adaptive control approach). Suppose we treat
the problem not as that of explaining the behavior of an existing
system, but as that of designing a system to produce a specific kind
of behavioral characteristic. Now we begin with some stationary
measure of d and r, and some desired stationary measure of a loop
variable such as the error signal. The problem then becomes that of
finding a function f that will yield the required stationary measure
of the loop variable. For example, we might want the error signal to
have a small mean amplitude and a spectral distribution that has no
peaks at any frequency within the bandwidth of the disturbing
variable. In IT terms, we might want the error signal to embody the
least possible information about d and r.

This is a formidable design problem, because the criteria are stated
only in terms of time-independent statistical characteristics (such
as mean amplitude and spectral distribution) of the independent and
dependent variables. It is not, in general, possible to take the
inverse of the desired relationship and express it in terms of the
necessary functional components. So while the required relationship
can be stated in terms of required statistical characteristics of
the variables, finding a specific design or class of designs that
will meet the requirement may not, even in principle, be possible by
analytic methods. This may explain the horrendous mathematical
complexities in the literature of IT and adaptive control that I
have seen. The specific system designs to be found there (when they
can be inferred) seem to have been chosen at random -- which would
seem to be the only way to choose. There is simply no systematic way
to work backward from the general design criteria to a system design
that will meet them.


Much of the above discussion was predicated on the assumption that
the objective is to predict the future behavior of some loop
variable by predicting the future behavior of r and d. But there is
quite a different approach needed when the objective is only to
explain the behavior of a system that already exists.

In explaining the behavior of an existing system, we are given r, d,
and o (or some other loop variable), and are required to create a
system design that will reproduce the observed relationship. The
variables are no longer treated as random variables because we are
not trying to predict what they will do, but only to explain what
they have done. Now we measure these variables as functions of time,
making no use of their stationary statistical characteristics. The
mathematics is very much simplified, because now we need only find
functions which will match the observed input-output
characteristics. Of course we are also constrained by knowledge
about physical characteristics of components in the real system, but
this constraint simplifies the task instead of complicating it, by
ruling out many possible designs and providing hints about a correct
one.

Since we do not have to predict future states of r and d, we can
test models by arranging for known forms of r and d to exist (so
far, r is usually arranged to be constant). We apply a known
disturbance, and compare the response of the model to the response
of the real system to the same disturbance. By systematically
modifying the model, we minimize the difference.
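
The procedure can be sketched as follows. This is an illustration
under stated assumptions, not the fitting code used in any actual
experiment: the model is a single loop with an integrating output
function, the only adjustable parameter is its rate k, the
"observed" record is itself generated by the same kind of loop as a
stand-in for real data, and the search is a plain scan over
candidate values. All names are hypothetical.

```python
import math


def run_loop(k, disturbance, r=0.0, dt=0.01):
    """Simulate o for an integrating output function, do/dt = k * e."""
    o, trace = 0.0, []
    for d in disturbance:
        e = r - (o + d)    # error: reference minus perception (o + d)
        o += k * e * dt
        trace.append(o)
    return trace


def rms_difference(a, b):
    """Root-mean-square difference between two equal-length records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def fit_gain(observed, disturbance, candidates):
    """Return the candidate k whose model output best matches the record."""
    return min(candidates,
               key=lambda k: rms_difference(run_loop(k, disturbance), observed))


if __name__ == "__main__":
    # Known disturbance: a sum of two slow sine waves (illustrative).
    disturbance = [math.sin(0.005 * t) + 0.5 * math.sin(0.017 * t)
                   for t in range(3000)]
    observed = run_loop(8.0, disturbance)  # stand-in for a real record
    candidates = [0.5 * i for i in range(1, 41)]
    print(fit_gain(observed, disturbance, candidates))
```

Note that nothing here treats d or o as a random variable. Both are
simply records as functions of time, and the model is judged by how
closely its record matches the observed one.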

The product of this approach is a deterministic model which can
predict the time-course of any loop variable given the time course
of the disturbance and reference signal: we can give a highly
plausible form to f in the equation o = f(r,d).

As a consequence, we can then proceed to predict the stationary
statistical properties of a loop variable given the stationary
statistical properties of d and r! In short, by taking the problem
as one of explaining past behavior, we automatically get the same
kinds of prediction that statistical approaches are trying to obtain
in what seems a more direct way, but which turns out to be the hard
(and perhaps impossible) way. Given the spectral distribution of r
and d, we can predict the spectral distribution of o. Given the
deterministic dependence of o on r and d, we can compute the
conditional probabilities. Having solved the problem, in short, we
can then go on to derive the necessary consequences in statistical
terms, if there is any reason left to do so.

Of course if one still wants to predict the behavior of any loop
variable, it is still necessary to predict the behavior of r and d,
and the prediction of the loop variable will be no better than the
prediction of r and d. Furthermore, if noise is generated inside the
system, even the quantitative predictions of the model will leave
some variance unaccounted for. So statistics and noise still play a
part in the model -- but now it is a derivative part, not a primary
one.

In fact, looking back, it is clear that before any analysis based on
stationary statistical characteristics can be carried out, there
must already exist a successful model that can represent the
behavior of the system by deterministic equations. That is where we
get the form of f in o = f(r,d). Without knowing the form of f, it
is impossible to state how the statistical characteristics of o will
relate to those of r and d.

This finally explains to me a nagging discontent I have felt with
what I have found in the literature of optimal control and adaptive
control. All these analyses begin with an assumed system design. But
completely missing is any justification for choosing that design
instead of another one. The design chosen fixes the nature of the
computational problem, and in my opinion unnecessarily complicates
it.

In many of these designs, there is a function intervening between
the reference signal and the comparator, and/or a function
intervening between the controlled variable and the perceptual
signal that reaches the same comparator. If I were designing a
control system, the first thing I would do would be to make sure
that the perceptual signal is an accurate representation of the
external variable I want to put under control, and that the
comparison is between that signal and a reference signal that
directly represents the desired state of the controlled variable.
This gives the clearest indication of how the system is performing:
if the perceptual signal remains in a close match with the reference
signal at all times, control is as good as it can get.

In a design where one or both of the signals reaching the comparator
misrepresent the desired or actual state of the variable to be
controlled, a great deal of room is left for the designer to
introduce ad hoc compensations, but the result is to bring the
controlled variable to a state known only to the designer or user of
the system. There is no natural indication, inside the control
system, of how well it is controlling. If the reference signal and
perceptual signal are directly related to the state of the
controlled variable, then the error signal is the natural criterion
for optimality and can be used in relatively simple ways as the
basis for adjusting the system for better control.

But when there is no such natural indicator of good control, as
there is not in most "optimal" designs I have seen, the designer
must introduce extraneous ad-hoc criteria. Because of those
intervening functions, it is no longer true that the best design for
making the controlled variable follow the reference signal is also
the best design for resisting disturbances. This, in fact, is why
there is so much emphasis on predicting disturbances. The kind of
disturbance makes a difference to performance when the reference
signal and the perceptual signal are differently related to the
controlled variable, and when neither one represents the current
state of the controlled variable, actual or intended.

A major lack in system designs assumed for optimal or adaptive
control analyses is the idea of a hierarchy of control. This is
probably because when neither the reference signal nor the
perceptual signal is closely related to the controlled variable,
there is no obvious advantage in the hierarchical approach. But when
the design begins specifically with a perceptual signal that is a
direct analog of a controlled variable, one sees that it is possible
to put the state variables of a system under direct control by low-
level systems. Then, using the fact that the state variables will
now accurately follow the setting of reference signals despite many
changes in the environment and many kinds of disturbances, one can
construct derived perceptions based on the controlled state
variables to put more abstract or general variables, such as plant
output, under control. Disturbances and parameter variations at the
level of state variables are mostly removed by the first level of
control, making the task of higher levels far simpler.

In Little Man Version 2, taking a hint from nature, I made the
lowest level of control _force control_, using a loop that made the
force applied to a tendon accurately follow a reference signal. Then
that loop, in which acceleration of the arm could be exactly set,
became the output of a velocity control loop, which in turn became
the output of a position control loop. This quite automatically, and
with no special design considerations, compensated for the varying
moments of inertia of the jointed arm; the effects of Coriolis
forces simply disappeared. By adjustment of a couple of parameters
for each control system, the entire arm could be turned into a
simple position control system in x, y, and radius with a time-
constant of a tenth of a second or less and a smooth exponential
response to a step-change in either a disturbance (turning gravity
on and off) or a position-setting reference signal.
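
The nesting of the three loops can be sketched in one dimension.
This is not the Little Man code, only a minimal illustration under
assumptions of my own choosing here: a point mass instead of a
jointed arm, proportional velocity and position loops, a force loop
with an integrating output, and made-up gains kp, kv and kf. With
purely proportional outer loops a constant disturbance leaves a
small residual position error, gravity / (kp * kv) in this sketch.

```python
def simulate_cascade(x_ref=1.0, gravity_on_at=1.0, t_end=2.0, dt=0.001,
                     mass=1.0, kp=10.0, kv=40.0, kf=200.0, gravity=-9.8):
    """Position loop sets the velocity reference, velocity loop sets the
    force reference, force loop makes the applied force follow it."""
    x = v = force = 0.0
    trace = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        disturbance = gravity * mass if t >= gravity_on_at else 0.0
        v_ref = kp * (x_ref - x)               # position control
        f_ref = kv * (v_ref - v)               # velocity control
        force += kf * (f_ref - force) * dt     # force control
        v += (force + disturbance) / mass * dt # plant: acceleration
        x += v * dt                            # plant: position
        trace.append((t, x))
    return trace


if __name__ == "__main__":
    trace = simulate_cascade()
    print(trace[999])   # just before gravity is turned on
    print(trace[-1])    # one second after gravity is turned on
```

With these assumed gains the position settles on the reference well
within the first second, and turning the disturbance on moves it
only by the small residual amount noted above.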

Without the concept of hierarchical control, one must try to handle
an entire complex controlled plant in a single step of enormous
computational complexity. Any disturbance affects the entire system.
All the interactions among degrees of freedom, which are mostly
removed by localized lower-level control, must be taken into account
at the highest, and only, level. Because of choosing a single-stage
design, one is led ever further into mathematical complexities -- I
believe quite needlessly.

I'm sure that none of this will persuade the accomplished
mathematicians who are investigating optimal or adaptive control
that they have saddled themselves with an unnecessarily complex
problem by paying too little attention to alternative system
architectures. But I am fairly satisfied that there are simpler
approaches that are just as valid. This is a relief.

Best to all

Bill P.