Hans' model and the adaptive PCT model

[From Bill Powers (950625.0810 MDT)]

RE: Hans' model.

Despite the hiatus in the discussion with Hans Blom, I've been trying to
figure out what strikes me as so strange about his method of setting up
a control system equation. Hans took the equation by which the
controlled variable depends on the output, and solved it for the output
that would correct the error on the next iteration. Not only did this
method correct the error in one iteration, it corrected the error
_perfectly_, leaving no error at all from then on. PCTers who followed
the details of this discussion must have been wondering why OUR control
system models creep toward zero error and never actually get there, when
there seems to be a closed-loop model that could correct error instantly
and perfectly.

Hans starts with an equation purporting to be a model of a physical
system:

x[t+1] = a*x[t] + b*u[t] + disturbance terms.

We can drop the disturbance terms for now, so we just have

(1) x[t+1] = a*x[t] + b*u[t].

u is the driving function at the input to the world-model, and x is
the world-model's output variable.

Hans then sets up the condition

(2) x[t+1] = xopt

and solves (1) and (2) together:

xopt = a*x[t] + b*u[t].

From this we can compute the form of the adaptive model's output
function that produces u:

u[t] = (xopt - a*x[t])/b

Substituting this expression for u[t] into (1), we get

x[t+1] = xopt.

The error is corrected on the next iteration! In fact, there is never
any difference between x and xopt.
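A minimal numerical sketch of this one-step correction (the values of
a, b, and xopt are arbitrary, chosen only for illustration):

```python
# Hans' one-step correction: given the model x[t+1] = a*x[t] + b*u[t],
# the output rule u[t] = (xopt - a*x[t]) / b drives x to xopt in a
# single iteration and holds it there exactly.

a, b = 0.9, 0.5   # arbitrary model parameters (assumed for illustration)
xopt = 10.0       # the target value
x = 0.0           # initial state

for t in range(3):
    u = (xopt - a * x) / b   # solve the model equation for u
    x = a * x + b * u        # apply the model: x[t+1] = xopt exactly

# after the very first iteration, x equals xopt and never deviates
```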


Despite his vehement denials, Hans followed a very simple procedure,
which amounts to taking the inverse of the world-model:

x = f(u), from which we compute

u = (inverse f)(xopt)

Obviously, then, x = xopt for any form of the function f.
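A small sketch of this point, using an arbitrary invertible function
chosen only for illustration:

```python
import math

# For any invertible world-model f, setting u = f_inverse(xopt)
# makes x = f(u) = xopt identically, whatever the form of f.

def f(u):
    return math.exp(u) - 1.0    # an arbitrary invertible "world model"

def f_inverse(x):
    return math.log(x + 1.0)    # its exact inverse

xopt = 7.0
u = f_inverse(xopt)
x = f(u)
# x equals xopt (up to floating-point rounding), regardless of f's form
```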

What he did was simply to compute that value of u that would make x be
exactly equal to the value of xopt at all times. So his world-model
"loop" can be represented as

                    [u = (inverse f)(xopt)]
                         x = xopt

The world-model's output variable x is always identically equal to xopt,
regardless of the values of the parameters in the world model.

Thus when y, the perception of the true controlled variable xt, is
compared against the output of the model x, it is really always being
compared against xopt, because x is always equal to xopt regardless of
the operation of the Kalman filter. If we say that y is a true
representation of xt, then the actual input to the Kalman filter
operation is (xopt - xt) -- the error signal!

The Kalman filter operations vary the values of the world-model
parameters until y = x, which means until xt = xopt. So in a control-
system model format we have
    xopt ---+---->[comparator]----o----> e, error
            |          ^          |          |
            |          | y        |          v
            |     [input func]    |    Kalman filter
            |          ^          |          |  (adjusts
            |          |          v          v   parameters)
            +----------|---->[output function: u = (inv f)(e + xopt)]
                       |                       |
                       |                       | u
                      xt <--------[f(u)]<------+

Note that the world-model itself is a smokescreen; it does not even
appear in the above diagram. In the loop that runs from x back to x
through the
world-model in Hans' diagram, the net effect is always to make x = xopt,
independently of what the parameters are. So the Kalman filter
operations have no effect on the world-model loop or on the value of the
output of the world model, which is always xopt. The world model itself
plays no part in the adaptation.

The real effects of the Kalman filter operations take place in the
_inverse_ of the world-model which is placed in the output part of the
system. By varying the parameters in the inverse function, the adaptive
filter causes xt to approach xopt; the difference between them is the
error signal that keeps the adaptation going. The Kalman filter makes
cumulative corrections, so
the effect is that of a control system with an integral output function.
After enough time has passed (and of course in the absence of
disturbances), the error signal will become exactly zero.
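A minimal sketch of this integral-like behavior, with all values
assumed: the output function uses a wrong estimate of b, and a simple
cumulative correction stands in for the Kalman filter's parameter
adjustments:

```python
# Cumulative correction acts like an integral output function,
# driving the error toward zero even when the output function is
# not the exact inverse of the environment.

b_true = 0.8      # real environmental feedback: xt = b_true * u
b_hat = 0.5       # imperfect inverse used by the output function
xopt = 10.0       # reference value
gain = 0.3        # integration rate (assumed)

u_integral = 0.0
for _ in range(200):
    u = xopt / b_hat + u_integral   # open-loop guess plus accumulated correction
    xt = b_true * u                 # environment responds
    e = xopt - xt                   # error signal
    u_integral += gain * e          # cumulative (integral) correction

# the integrator absorbs the model mismatch; e shrinks toward exactly zero
```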

In the above diagram I have shown the reference signal entering both the
comparator and the output function. If the comparator loop were missing,
and the output function's parameters were adjusted so the output
function was the exact inverse of the real function f, then variations
of the reference signal alone would cause xt to follow xopt, open-loop.
In that case, reconnecting the comparator loop would make no difference
because the error signal would always be zero.

However, if an independent disturbance were to act on xt, the error
signal would become non-zero and add to the effects of xopt in the
output function. This would produce resistance to the effects of the
disturbance even though disturbances are not taken into account in the
output function. What this arrangement does is to set up a basic open-
loop connection between the reference signal and the controlled
variable, with a closed-loop effect that comes into play only when there
are disturbances, or when the output function is not the exact inverse
of the real environmental feedback function -- the two main conditions
that produce a nonzero error signal.
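A minimal discrete-time sketch of this disturbance resistance (all
values assumed; a first-order lag is added to the environment so that
the discrete loop is stable):

```python
# The output function u = (inv f)(e + xopt) is the exact inverse of
# the environment xt = b*u.  With no disturbance the error is zero and
# the system runs effectively open-loop; when a disturbance d appears,
# the error becomes nonzero and opposes part of its effect.

b = 0.8          # environmental feedback function: xt = b*u (+ d)
xopt = 10.0      # reference signal
d = 0.0          # independent disturbance
xt = 0.0

for step in range(600):
    if step == 300:
        d = 4.0                       # disturbance switches on
    e = xopt - xt                     # error signal
    u = (e + xopt) / b                # output function: (inv f)(e + xopt)
    xt += 0.1 * ((b * u + d) - xt)    # environment approaches b*u + d with a lag

# xt settles at xopt + d/2: half the disturbance is resisted,
# versus the full shift to xopt + d that pure open-loop control shows
```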

When such errors occur, there are two results. The first result is that
the effect of the error signal shows up as a variation in output opposed
to the effect of the disturbance, just as if the reference signal had
changed. The second result is that the Kalman filter operations receive
a nonzero input, and begin slowly altering the parameters of the output
function. If the Kalman filter is smart enough, it might even be able to
add regular disturbances to the output function that would oppose the
effects of regular external disturbances, restoring the error signal to
zero even in the presence of regular disturbances and permitting the
equivalent of open-loop operation. How well this would work would depend
on how accurately the external disturbance can be matched by an internal
generator, including synchronization. I would expect some rather severe
limits on the success of such attempts to anticipate disturbances other
than constant ones.

In some earlier explorations of arrangements like that in the above
diagram, I think I showed that the dual function of the reference signal
can be achieved with a normal comparator and a single reference signal,
just by assigning different weights at the two inputs of the comparator.
I would very much like someone else to verify this. If this is true,
then I think we would have something much closer than before to a merger
of the ordinary PCT control system and the adaptive control system
proposed by Hans.
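As a first algebraic check of this conjecture in the present notation:
since e = xopt - y, the quantity e + xopt that enters the output
function equals 2*xopt - y, which is exactly a single comparator with
weight 2 on the reference input and weight 1 on the perceptual input:

```python
# The reference signal's dual connection (to the comparator AND to the
# output function) collapses into one weighted comparator:
#   e + xopt = (xopt - y) + xopt = 2*xopt - y

xopt, y = 10.0, 7.5          # arbitrary values for illustration
e = xopt - y                 # ordinary comparator
dual = e + xopt              # reference entering comparator and output function
weighted = 2.0 * xopt - y    # one comparator with input weights (2, -1)
assert dual == weighted      # identical for any xopt and y
```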

The above diagram is in fact completely equivalent to Hans' model, even
though it doesn't seem to be because of the missing "forward" world-
model. I have shown that the forward world model is actually a dummy
operation having no effects on adaptation. Its only function is to
assure that x is always identically equal to xopt. What does have an
effect is the change in parameters in the _inverse_ of the world model
that is in the output function of the control system. In fact, the above
diagram is exactly the "alternative" use of the Kalman filter which I
had previously proposed to Hans, before realizing that it is
functionally identical to his model. The identity is not exactly self-
evident, so perhaps both Hans and I can be forgiven for not seeing it.

Since the Kalman filter is only one possible method of reorganization, I
propose that the general adaptive control model substitute the label
"reorganization" for "Kalman filter" in the above diagram. The result is
a very nice package (if it really works), amenable to expansion into
multiple control systems and a hierarchy of systems with each system
containing this ability to adjust itself to minimize control errors.
There is even the ability to run for a while open-loop, when the source
of error signals is lost. We have considered models of this form before,
but never realized that they could also take care of limited open-loop
operation.

All that remains is to solve the problem of handling loss of input
signals. As several people have noted, a signal with a value of zero is
not conceptually the same as lack of any signal. But I am mostly
convinced that in real circuits, the two conditions must be treated as
the same. In other words, cutting off sensory inputs results in
perceptual signals having a value of zero. The receiving circuits are
still there, and the significance of a perceptual signal of zero
frequency remains what it was before.

This means that to make the above diagram work right, we need some way
to detect the condition of zero input signal and use it to cut off the
effect of the error signal. Hans' model did that just by letting the
user of the program do it; the user could make error signals ineffective
through manual setting of one system variable to a value that switched
off the error signal's effects. There may be other more automatic ways,
especially if we think in terms of pairs of balanced one-way comparators
(the norm in the nervous system). The absence of _both_ inputs to a
push-pull comparator would be an error signal of zero, which is what we
want.

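A minimal sketch of such a push-pull pair (the particular form of the
one-way comparators is an assumption for illustration):

```python
# A pair of one-way comparators, each carrying only the positive part
# of the difference, so signals stay non-negative as neural firing
# rates must.  With BOTH inputs absent (zero), the net error is zero.

def one_way(a, b):
    return max(0.0, a - b)       # one-way comparator: output cannot go negative

def push_pull_error(r, p):
    return one_way(r, p) - one_way(p, r)   # net error from the pair

assert push_pull_error(0.0, 0.0) == 0.0    # both inputs absent -> zero error
assert push_pull_error(10.0, 7.0) == 3.0   # otherwise reproduces r - p
assert push_pull_error(4.0, 9.0) == -5.0
```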
I hope there are at least one or two listeners out there who have been
following the details of this discussion. I think this analysis is a
major discovery concerning the model Hans presented; I don't know how
general its implications would be for Extended Kalman Filters, but in
this one application we can see that the real organization is very
different from the apparent one. If this is a general truth, it would
obviously have extended implications. Some independent checking of my
reasoning would be welcome.

Bill P.