second generation MCT controller

[Hans Blom, 970321]

Starting tomorrow, I'll be gone for a week -- a much needed Easter
vacation. I'll depart with another MCT controller, which shows much
better control behavior -- in the sense that Bill posed as additional
requirement: smooth operation. I've translated this into prescribing a
constant velocity during the time the theodolite moves, much in line
with Bill's velocity reference. And it works fine...

Note that, although the analysis that follows is more complex than the
earlier one, the _program code_ has changed only where the control law
(the expression that calculates the controller's output) is
formulated. Warning: it is much easier to understand what is done from
reading my mathematical analysis than from reading the program code.
Reverse-engineering the code into the equivalent math might be next to
impossible.

No, I didn't like my MCT controller much, either ;-). Although it
realizes its control objective (x=r) perfectly, as well as its
identification objective (estimating the disturbance), we also observe
persistent oscillations of its output u and the theodolite's
acceleration and velocity. The reason is that a reference for x is
specified but no other objectives. Thus the controller is free to pick
exactly those outputs that achieve its objective with zero error.
Although it does that perfectly, I don't happen to like its choice. It
is _too_ good.

Let's analyze why it should compute such a funny, oscillating output.
Remember the theodolite equations. In their barest form (J=1, dt=1, no
disturbance) they are:

v(t+dt) = v(t) + u(t)
x(t+dt) = x(t) + v(t) + 0.5*u(t)

I happen to know (from knowledge about bang-bang controllers) what the
ideal control sequence is in cases such as what we specified for the
theodolite's movement: give the theodolite a short kick (an "impulse",
during a single control interval dt) to get it moving, and another,
equal kick in the reverse direction to stop it again when it has
reached its desired destination. Let's see what position and velocity
profiles develop when u(1) = 1 and u=0 elsewhere, using the above
formulas:

t   u   v    x
1   1   1   0.5
2   0   1   1.5
3   0   1   2.5
4   0   1   3.5
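This impulse response is easy to check with a short simulation. Here is a minimal sketch (in Python for convenience; the program at the end of this post is in Pascal):

```python
# Bare theodolite equations with J = 1, dt = 1:
#   v(t+dt) = v(t) + u(t)
#   x(t+dt) = x(t) + v(t) + 0.5*u(t)
# Apply a single kick u(1) = 1 and u = 0 elsewhere.
v, x = 0.0, 0.0
rows = []
for t in range(1, 5):
    u = 1.0 if t == 1 else 0.0
    x = x + v + 0.5 * u  # update x first, using the old v
    v = v + u
    rows.append((t, u, v, x))
for row in rows:
    print(row)  # (1, 1.0, 1.0, 0.5), (2, 0.0, 1.0, 1.5), ...
```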

The x-trajectory that I initially prescribed was not 0.5, 1.5, 2.5,
3.5, ..., however, but 1, 2, 3, 4, ... That meant that to get to x=1
at t=1 the correct value for u had to be 2 rather than 1. But u(1)=2
would make v(1)=2 instead of 1 and would, with u(2)=0, make x(2) equal
to 3 instead of the required 2. So at t=2 a counter-kick of size -2 is
required to get x=2 rather than x=3. As a result, we get a repeating
sequence of u(k)=2 and u(k+1)=-2, as well as a repeating sequence of
v(k)=2 and v(k+1)=0.

t   u   v   x
1   2   2   1
2  -2   0   2
3   2   2   3
4  -2   0   4
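The alternating kicks can be reproduced the same way: at each step, pick the u that forces x onto the reference 1, 2, 3, 4 exactly (again a Python sketch of the bare J = 1, dt = 1 equations):

```python
# Solve x(t+dt) = x(t) + v(t) + 0.5*u(t) = t for u(t) at every step:
#   u(t) = 2*(t - x(t) - v(t))
v, x = 0.0, 0.0
rows = []
for t in range(1, 5):
    u = 2.0 * (t - x - v)  # the output that puts x exactly on its reference
    x = x + v + 0.5 * u
    v = v + u
    rows.append((t, u, v, x))
for row in rows:
    print(row)  # (1, 2.0, 2.0, 1.0), (2, -2.0, 0.0, 2.0), ...
```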

Perfect control of x, bad "control" of v and a. But note that our
original formulation did not specify any particular behavior for v, a
or u...

Note that this funny oscillating behavior can be abolished by having a
reference for x that follows the above ideal trajectory, i.e. one that
starts with 0.5 initially and thereafter increases by 1. In that case,
behavior is ideal again ... until the disturbance occurs: then the
oscillations are back. I don't know how to combat this with a simple
change in the program. I'll think about that some more, but I will now
offer a better and more general solution: I'm going to specify a
velocity reference profile -- and control for velocity as well. Here
is the new analysis.

The equations that govern our idealized theodolite's motion are:

a(t) = (u(t) + d(t)) / J (1)
v(t+dt) = v(t) + a(t) * dt (2)
x(t+dt) = x(t) + v(t) * dt + 0.5 * a(t) * dt * dt (3)

The TWO objectives of the MCT controller are now:

v(t+dt) = rv(t+dt) (4)
x(t+dt) = rx(t+dt) (5)

where rx and rv are the references for x and v. I again assume that
both x(t) and v(t) are observed at every iteration. Variable rx will
increase linearly, whereas rv will be constant during the time the
theodolite moves. And I'm going to find a solution that works whether
there is a conflict between the requirements of (4) and (5) or not.

The control objectives are, as usual, obtained by combining (2) and
(4) resp. (3) and (5):

rv(t+dt) = v(t) + a(t) * dt (6)
rx(t+dt) = x(t) + v(t) * dt + 0.5 * a(t) * dt * dt (7)

Note that solving (6) and (7) separately will generally give two
different, conflicting values for a(t) and thus for u(t). It simply is
not possible to specify any arbitrary reference levels for position
and velocity simultaneously. Solving (6) and (7) for a(t) gives

a(t) = (rv(t+dt) - v(t)) / dt (6a)
a(t) = 2.0 * (rx(t+dt) - x(t) - v(t) * dt) / (dt * dt) (7a)

which must be identical. Thus, if there is to be no conflict, the
following relationship must hold:

rx(t+dt) - x(t) = 0.5 * (rv(t+dt) + v(t)) * dt

whose physical interpretation escapes me, I must admit.

The standard MCT approach (both in case of conflicts and if not) is to
combine both requirements into a single expression, which states that
both (6) and (7) must simultaneously be fulfilled "as perfectly as
possible". The standard tool is to minimize the (suitably weighted)
sum of squares of the errors of (6) and (7)

J = (rv(t+dt) - v(t) - a(t) * dt)^2
  + k * (rx(t+dt) - x(t) - v(t) * dt - 0.5 * a(t) * dt * dt)^2 (8)

with respect to a(t). The weighting factor k (0 <= k <= infinity) can
be anything; if there is conflict, we will choose a convenient value
that makes for good control. If there is no conflict, the value of k
does not matter, because the squares can _both_ be brought to zero
simultaneously. Note that, in our case, only the velocity will be
controlled if k = 0, whereas if k goes to infinity only the position
will be controlled. Intermediate values of k stand for compromises
between control of velocity and control of position.

For ease of notation we drop the time indices for now:

J = (rv-v-a*dt)^2 + k * (rx-x-v*dt-0.5*a*dt*dt)^2

dJ/da = -2*dt*(rv-v-a*dt) - k*dt*dt*(rx-x-v*dt-0.5*a*dt*dt) (9)

which is zero where J has its minimum. Solving dJ/da=0 for a gives

    rv-v + 0.5*k*dt*(rx-x-v*dt)
a = --------------------------- (10)
       dt + 0.25*k*dt*dt*dt

For pure velocity control (k=0) this becomes

a = (rv-v)/dt (10a)

and for pure position control (k=infinite) this becomes

a = 2.0*(rx-x-v*dt)/(dt*dt) (10b)

which is completely in line with the far easier to solve expressions
(6a) and (7a).

In (10) we see that the errors in rv-v and in rx-x-v*dt get roughly
equal weights when the value of k is around 2/dt, i.e. k * dt = 2.
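The two limits of (10) can also be verified numerically. A minimal check (the state and reference values below are arbitrary, made up only for this test):

```python
def a_of_k(k, rv, v, rx, x, dt):
    # Equation (10): the compromise acceleration for weighting factor k
    return (rv - v + 0.5 * k * dt * (rx - x - v * dt)) / (dt + 0.25 * k * dt ** 3)

rv, v, rx, x, dt = 1.0, 0.4, 2.0, 1.7, 0.01
a0 = a_of_k(0.0, rv, v, rx, x, dt)       # k = 0: pure velocity control
a_inf = a_of_k(1e12, rv, v, rx, x, dt)   # huge k: essentially position control
print(abs(a0 - (rv - v) / dt))                           # ~0, matches (10a)
print(abs(a_inf - 2.0 * (rx - x - v * dt) / (dt * dt)))  # small, approaches (10b)
```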

Again, we assume for the moment that d(t) is known. Then we can solve
for u(t):

u(t) = J * a(t) - d(t) (11)

where a(t) has been computed in (10).

This is the equation that this new MCT controller uses to compute its
output. Note that we actually have an (infinite) _family_ of
controllers, each with its own k.

Again, given the value of u(t) that is actually output, the next
predicted observation is:

x(t+dt) = x(t) + v(t) * dt + 0.5 * (u(t) + d(t)) * dt * dt / J (12)

where we have combined (1) and (3).

In (12), as soon as we have observed x(t+dt) the only unknown is d(t);
we have used its estimate to predict x(t+dt), not its true value. If
there is a difference between both, (12) will not give a correct
prediction. Let us call the true perception at time t+dt by the name
of x'(t+dt) and the true disturbance at time t by the name of d'(t).
The formula that "explains" the true perception is then (see (3)):

x'(t+dt) = x(t) + v(t) * dt + 0.5 * (u(t) + d'(t)) * dt * dt / J (13)

Subtracting (12) from (13) gives:

x'(t+dt) - x(t+dt) = 0.5 * (d'(t) - d(t)) * dt * dt / J (14)

Now we can reconstruct d'(t):

d'(t) = d(t) + 2.0 * J * (x'(t+dt) - x(t+dt)) / (dt*dt) (15)
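Equations (12)-(15) can be exercised for a single step. In this sketch the state, the output, and the two disturbance values are made-up numbers, chosen only to show that (15) recovers the true disturbance:

```python
J, dt = 1.0, 0.01
x, v, u = 3.0, 1.0, -2.0   # current state and output (arbitrary values)
d_est, d_true = 0.0, 10.0  # current estimate vs. the true disturbance

p = x + v * dt + 0.5 * (u + d_est) * dt * dt / J       # (12): prediction
x_obs = x + v * dt + 0.5 * (u + d_true) * dt * dt / J  # (13): true observation
d_rec = d_est + 2.0 * J * (x_obs - p) / (dt * dt)      # (15): reconstruction
print(d_rec)  # ~10.0: the true disturbance, known one sample late
```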

Note that, again, the correct value of the true disturbance d'(t) can
only be known at time t+dt, at which moment we observe x'(t+dt): its
estimate always runs one sample late. As a result, we can predict that
the controller will perform badly when the disturbance consistently
varies significantly between samples.

This completes the analysis of this MCT controller family. Which value
of k works best? There are two approaches. In the theoretical approach
one prespecifies how important position resp. velocity control are
relatively. In the practical approach, one tests the controller for
different values of k and picks that value of k that gives the
preferred response.

The above approach is, by the way, the general method that MCT uses to
handle what PCT calls "conflicts".

The code of the combined position/velocity controller follows below.
In order to allow a good comparison with my previous program, I have
used the same names. Note that the only significant change is where u
is computed.

Greetings,

Hans

program x_v_theodolite_model;

const
  J = 1.0; {theodolite characteristic}

var
  k, {k=0: velocity control; k=large: position control}
  x, {perceived position}
  v, {perceived velocity}
  u, {controller's output}
  rx, rv, {TWO reference levels}
  d, {disturbance estimate}
  p, {prediction of next observation}
  dt, {time increment; var so it can be changed}
  t: real; {time; for display/loop control purposes}

procedure observe;
{this procedure implements the theodolite motion equations; it
generates the NEXT observation(s) x(t+dt) and v(t+dt), given u(t)
and d(t); thus it also shifts the time by dt; note that aa, vv and
xx are LOCAL variables, not visible outside this procedure; declaring
them as "const" saves their value from call to call; aa, vv and xx are
initially known to be zero}

  function disturbance: real;
  {this function implements the disturbance, which is known only
   to procedure observe, the theodolite's motion simulator}
  begin
    if (t < 4.0) or (t > 5.0) then
      disturbance := 0.0
    else
      disturbance := 10.0
  end;

const
  aa: real = 0.0;
  vv: real = 0.0;
  xx: real = 0.0;
begin
  aa := (u + disturbance) / J;
  xx := xx + vv*dt + 0.5*aa*dt*dt;
  vv := vv + aa*dt;
  t := t + dt;
  x := xx; v := vv {the controller may use these}
end;

procedure reference (t: real);
{this procedure specifies the reference levels for x(t) and v(t);
the initial and final steps of rx are half their normal value}
begin
  if t < 2.0 + 0.25*dt then {no movement}
    begin rx := 0.0; rv := 0.0 end
  else
  if t < 6.0 + 0.25*dt then {constant speed}
    begin rx := t - 0.5*dt - 2.0; rv := 1.0 end
  else {and halted again}
    begin rx := 4.0; rv := 0.0 end;
end;

procedure control;
{this procedure implements the controller; it knows the motion
equations and J, is allowed to observe x(t) and v(t), and delivers
u(t)}
var
  a: real;
begin
  a := (rv - v + 0.5*k*(rx-x-v*dt)) / (1.0 + 0.25*k*dt)/dt;
  u := J * a - d
end;

begin
  dt := 0.01; {or any other value}
  k := 1.0; {SCALED value w.r.t. dt, i.e. k*dt}
  t := 0.0; {start at t=0}
  x := 0.0; {initially zero}
  v := 0.0; {initially zero}
  rx := 0.0;
  rv := 0.0;
  d := 0.0; {estimate of disturbance}
  repeat
    writeln (t:9:3, rx:9:3, x:9:3, v:9:3, d:9:3, u:9:3);
    reference (t+dt); {specify desired x and v}
    control; {controller computes u}
                                  {predict next observation}
    p := x + v*dt + 0.5 * (u + d) * (dt*dt) / J;
                                  {time shifts by dt here}
    observe; {observe x (and v?)}
                                  {update disturbance estimate}
    d := d + 2.0 * J * (x-p) / (dt*dt)
  until t > 9.0
end.
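For those who prefer to experiment without a Pascal compiler, the whole program can be condensed into a rough Python sketch (same structure, same names where possible; this is my translation, not the original code):

```python
# Combined position/velocity MCT controller, Python rendering of the
# Pascal program above. k is again the SCALED value, i.e. k*dt.
J, dt, k = 1.0, 0.01, 1.0
t, d = 0.0, 0.0        # time and disturbance estimate
x = v = u = 0.0        # perceived position/velocity, controller output
xx = vv = 0.0          # the simulated theodolite's true state

def reference(t):
    """Reference levels rx(t) and rv(t)."""
    if t < 2.0 + 0.25 * dt:
        return 0.0, 0.0                   # no movement
    if t < 6.0 + 0.25 * dt:
        return t - 0.5 * dt - 2.0, 1.0    # constant speed
    return 4.0, 0.0                       # halted again

while t <= 9.0:
    rx, rv = reference(t + dt)
    # control law (10)/(11)
    a = (rv - v + 0.5 * k * (rx - x - v * dt)) / (1.0 + 0.25 * k * dt) / dt
    u = J * a - d
    # prediction (12)
    p = x + v * dt + 0.5 * (u + d) * dt * dt / J
    # theodolite motion (1)-(3); the disturbance is known only here
    dist = 10.0 if 4.0 <= t <= 5.0 else 0.0
    aa = (u + dist) / J
    xx = xx + vv * dt + 0.5 * aa * dt * dt
    vv = vv + aa * dt
    t = t + dt
    x, v = xx, vv
    # disturbance estimate update (15)
    d = d + 2.0 * J * (x - p) / (dt * dt)

print(x, v)  # close to 4.0 and 0.0: moved and halted despite the disturbance
```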
