SIMCON

[From Bill Powers (970121.0100 MST)]

Martin Taylor 970120 11:10 --

One point: my revised manual also notes the computational artifact (found
by Bill P. and me some years ago) involved in the calculation of "optimum
gain" for the "Simple Control System" program, and suggests as an exercise
for the reader that they rewrite the program to simulate the same analogue
network with values of the compute iteration cycle (dt) reduced by a
factor of ten. This augments the discussion of computational artifacts
later in the manual.

That's a good suggestion. I should point out that the hazard of
computational artifacts (which I first discussed in the appendix of B:CP,
and later with more competence in the "Spadework" article in 1979) arose
originally because of the limited speed of digital computers and the problem
of handling huge volumes of output data. When, for practical reasons, you
want to use as few iterations of a program as possible, you make dt as large
as possible, and that's when you run into this problem. Now, with today's
fast computers, this is not so much of a consideration. Wolfgang Zocher has
included, in the latest version of Simcon, the ability to save data only for
every nth iteration (an obvious solution to the data volume problem which I
should have thought of years ago), so there is little remaining reason from
the standpoint of simulation speed or data storage to use large values of
dt. It's a good idea, as you say, to try different values of dt to make sure
that shortening it makes no material difference in the way the program runs,
but there's no longer any need to flirt with values that could create
computational artifacts.

There's a second reason for using small values of dt. The integrations that
are always involved in models of real control systems suffer an
approximation defect, in that the "real" value of the integral is changing
continuously during an iteration, while on the computer it can change only
at discrete intervals. If we were dealing with open-loop systems, the simple
means of integration ("Euler integration") that we use would be completely
inadequate unless we used _extremely_ small values of dt. We would have to
use Runge-Kutta or even more advanced methods of numerical integration in
order to keep the simulated integral from wandering far from the correct
value over even rather short runs of a model. (This problem is greatly
reduced in models of closed-loop control systems, because integration errors
do not accumulate; they are handled just like disturbances and are corrected
automatically and invisibly -- if the errors aren't too great.)

One way to convince yourself of this is to program a sine-wave oscillator:

x := 100;
y := 0;

{iteration loop}
  dx := 0.1*y;
  dy := -0.1*x;

  y := y + dy*dt;   {Euler integrations}
  x := x + dx*dt;
{end loop}

If you plot x against y, you should get a circle. Making dt as large as the
"optimum" value, or even 1/10 of the optimum value, will produce a pretty
poor circle -- it will grow appreciably on every iteration (with simple Euler
integration the radius always grows). You have to make dt VERY small (like
0.00001*k) in order for the circle to repeat itself for more than a few
revolutions without any visible change in size.
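
A rough way to see this for yourself is to run the same oscillator in any
ordinary language. The sketch below is in Python rather than Simcon (the run
length and the particular dt values are arbitrary choices); with simple
forward Euler the radius is multiplied by sqrt(1 + (0.1*dt)^2) on every
step, so shrinking dt only slows the outward drift.

# Forward-Euler "circle" test at several step sizes.
def final_radius(dt, t_total=500.0):
    x, y = 100.0, 0.0
    for _ in range(int(t_total / dt)):
        dx = 0.1 * y
        dy = -0.1 * x
        y += dy * dt          # Euler integrations
        x += dx * dt
    return (x * x + y * y) ** 0.5

for dt in (1.0, 0.1, 0.001):
    print(dt, final_radius(dt))   # should stay at 100; larger dt drifts more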

The Nyquist sampling criterion says that to preserve all the information in
a sine-wave, you theoretically need to sample it no more than twice per
cycle. But that has nothing to do with the problem here: here the
problem is to generate a sine-wave that continues for some time and actually
repeats. In general, the problem is to compute the integral of a varying
quantity in a way that will continue to represent the value of the integral
correctly over a long run of the simulation. The value of dt needed to do
this, even in a closed-loop system, is far smaller than the value that would
lead to computational artifacts, and several orders of magnitude smaller
than what the Nyquist sampling criterion would suggest. This is the
principal reason that, in tracking experiments, we have to sample the handle
and cursor positions at least 25 times per second and preferably 60 times
per second or more, even though the bandwidth of a human tracker extends
only from 0 to about 2.5 cycles per second. The reason has nothing to do with
"sampling rate." It has everything to do with accurate integration using the
simple method of Euler integration.

In simulating the environment, this is an important consideration. Suppose
we have a position control system, and the object being positioned is a pure
mass on which the system exerts position-independent forces (as in a rocket
ship). The velocity of the mass is the integral of the applied force, and
the position is the integral of the velocity. Any integration errors in
computing velocity are integrated again in computing position, so the errors
are (roughly) squared. To get an accurate model of the environment, you must
either use microscopic values of dt, or use some advanced
predictor-corrector method of integration that minimizes the errors. If you
just use Euler integration with a dt calculated on the basis of the Nyquist
criterion and the maximum expected frequency of variation of position, you
will end up with a horribly inaccurate model of the environment -- too
inaccurate even if the model is a closed-loop control system.
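
As a rough numerical check (a sketch in Python, with an assumed 1-kilogram
mass driven by a sinusoidal force; the numbers are illustrative only), the
following prints the velocity error and the twice-integrated position error
of plain Euler integration for a few values of dt:

import math

# Mass of 1 kg driven by f(t) = cos(w*t).  Exact solution with v(0)=x(0)=0:
#   v(t) = sin(w*t)/w,   x(t) = (1 - cos(w*t))/w**2
def euler_errors(dt, t_end=20.0, w=2.0 * math.pi * 0.5):
    v = x = t = 0.0
    while t < t_end:
        f = math.cos(w * t)   # applied force
        x += v * dt           # second integration: velocity -> position
        v += f * dt           # first integration: force -> velocity
        t += dt
    v_err = abs(v - math.sin(w * t) / w)
    x_err = abs(x - (1.0 - math.cos(w * t)) / w ** 2)
    return v_err, x_err

for dt in (0.1, 0.01, 0.001):
    print(dt, euler_errors(dt))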

--------------------------
Incidentally, there is a related problem I have seen in Hans' way of
discrete modeling. The value of dt is a way of introducing physical time
into a computer simulation. If the model is set up correctly to handle
physical phenomena, then changing dt should change only the simulated
time-scale on which the model runs, not the actual physical time or the
relationships among physical quantities. In effect, the plots look exactly
the same if the same physical time scale is used, with only the density of
plotted points changing as dt is changed (Simcon works this way). In
particular, the model of the plant should run at the same speed in physical
time regardless of the size of dt (as long as it is comfortably shorter than
the minimum value for accurate computation).

However, when the plant is modeled simply in terms of successive states
indexed by k, the speed at which it runs is proportional to the time assumed
for one iteration. The value of dt is always 1 but the units are
unspecified. If you change the association of one iteration with physical
time (in your head), the plant simply runs at a different speed. When you
use indexes that have no specified meaning in physical time, there is no way
to anchor the behavior of simulated physical processes to their actual
physical properties.

Let me use a simple example. Suppose you're simulating the way a mass
converts a force into a velocity. If you simply say

v := v + 0.01*f, in MKS units,

then on each iteration the velocity will increase by the amount 0.01*f, no
matter how much physical time you imagine one iteration to represent. But
how long a time IS represented by one iteration of the loop containing this
expression? If one iteration stands for 0.001 second of physical time, then
physically a force of one newton acting on a 100-kilogram mass should change
its velocity by only 0.00001 meter per second in that iteration. But if one
iteration is imagined to correspond to 1 second of physical time, the same
force should change the velocity by 0.01 meter per second -- which is
exactly what the code adds.

Obviously, simply thinking of one iteration as representing different
lengths of time CHANGES THE ASSUMED MASS OF THE OBJECT! There is nothing in
the innocent line of program code to suggest this change -- if one weren't
thinking about the physics of what is being simulated, the hidden assumption
of a value of dt, with the subtle effect of changing the modeled mass, could
completely escape notice. In fact, as written and with the units involved, dt
has an implicit value of 1 second, and the coefficient of 0.01 represents
the effect of a mass of 100 kilograms. So you have no choice about the
correspondence between iteration time and physical time: whether you realize
it or not you have assumed a specific value of dt, exactly 1 second.

The correct way to write the program line is

v := v + 0.01*f*dt

where dt is the explicitly assumed duration of one iteration of the program.
Now, if you change dt, you will find that the same physical force always
produces the correct change of velocity in one iteration, assuming a
constant mass of 100 kilograms. For shorter dt's, the velocity changes
appropriately less in one iteration.

Notice that dt enters ONLY in integration steps or other steps including
time functions. It does not enter any intervening calculations involving
only algebraic relationships. This means that some steps of the program
yield different values depending on the assignment of a value to dt, but
other steps do not. That, in turn, means that you can't simply assign
arbitrary meanings to the physical time occupied by one iteration. You must
actually change some of the computations -- but not all -- when you change
the assumed duration of one iteration. The time scale is NOT arbitrary. If
you don't include explicit dt's in the correct places in the program, the
simulation will be correct only for one physical time scale.
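
A minimal sketch of the difference (in Python, assuming the 100-kilogram
mass and one-newton force of the example): with dt written explicitly, the
velocity reached after a given amount of physical time is the same no matter
how finely the run is chopped up, and the purely algebraic step never sees
dt at all.

# 100 kg mass, constant 1 N force, dt explicit.  The velocity after 10
# seconds of physical time is 0.1 m/s whatever dt is used.
def velocity_after(t_total, dt, force=1.0, mass=100.0):
    v = 0.0
    for _ in range(int(round(t_total / dt))):
        a = force / mass      # algebraic step: no dt here
        v += a * dt           # integration step: dt enters here only
    return v

for dt in (1.0, 0.1, 0.001):
    print(dt, velocity_after(10.0, dt))   # 0.1 m/s each time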

I assume that when the successive-state types of models are applied to real
physical situations, the value of dt somehow gets assigned correctly. But I
have yet to see a computation of this kind in which dt appears explicitly.
The general assumption seems to be that the same model, without dt, will
apply to processes that run at any speed.

Best,

Bill P.

[Martin Taylor 970121 11:05]

Bill Powers (970121.0100 MST) --

I agree with almost everything in Bill's posting. But there's one niggle.

The Nyquist sampling criterion says that to preserve all the information in
a sine-wave, you theoretically need to sample it no more than twice per
cycle.

At the highest frequency of any signal or filter of interest in the analysis
or in the reconstruction (display presentation) of the simulated signals.

But that has nothing to do with the problem here: ...
... In general, the problem is to compute the integral of a varying
quantity in a way that will continue to represent the value of the integral
correctly over a long run of the simulation.

An integrator has a frequency response that declines by half for every octave
increase in frequency. The highest frequency involved in computing a perfect
integration is infinite, and so the Nyquist sampling interval is zero.

(I haven't tried it, but I imagine that you could do quite a good job
of reconstructing the correct integral if you used a high-quality rectangular
filter to filter the samples entering the simulation from the integrator
input and the output samples that are supposed to match the samples that
would be found at the integrator output if you really had a perfect analogue
integrator to check against. Then you could use a sampling interval T = 1/(2W),
where W is the bandwidth of the rectangular filter. But This Is Only
Speculation.)

The value of dt needed to do
this, even in a closed-loop system, is far smaller than the value that would
lead to computational artifacts, and several orders of magnitude smaller
than what the Nyquist sampling criterion would suggest.

Several orders of magnitude smaller than zero? And I would have thought that
"computational artifact" should mean any computation that leads to a conclusion
that is false when applied to the simulated analogue system. I'm not sure
what you intend it to mean.

This is the
principal reason that, in tracking experiments, we have to sample the handle
and cursor positions at least 25 times per second and preferably 60 times
per second or more, even though the bandwidth of a human tracker extends
only from 0 to about 2.5 cycles per second. The reason has nothing to do with
"sampling rate." It has everything to do with accurate integration using the
simple method of Euler integration.

And the reason it has everything to do with accurate integration _is_
sampling rate. The point is that every signal, at whatever frequency, has
an alias. Many of them, actually. An alias is a signal at a different
frequency that has exactly the same sample values as the real signal.
If you sample at regular intervals, you get an infinite sequence of aliases,
rather like the infinite set of images you see in a room with mirrors on
opposite walls. These aliases affect the accuracy of reconstructing a
signal that might have frequencies above 1/(2T) (T being the sampling interval).
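
A small demonstration (a Python sketch; the 0.1-second sampling interval and
the 2 Hz signal are arbitrary choices): a sine wave and another sine wave
whose frequency differs from it by exactly the sampling rate produce
identical sample values.

import math

T = 0.1              # sampling interval, seconds
f1 = 2.0             # original frequency, Hz
f2 = f1 + 1.0 / T    # 12 Hz: an alias of the 2 Hz signal at this rate

for n in range(5):
    t = n * T
    print(round(math.sin(2 * math.pi * f1 * t), 6),
          round(math.sin(2 * math.pi * f2 * t), 6))   # the columns agree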

(Returning to the above speculation: Putting the rectangular filter into the
simulation analysis and reconstruction is like putting a curtain across
the mirror walls. You are left only with the wanted signal, if its frequency
falls within the filter's bandwidth).

But, as Bill points out in a part of his posting that I am not quoting, if
the integrator is inside a control loop, small integration errors will not
accumulate. They will act like little disturbances and be cancelled out over
the long run. So you don't have to go to infinite frequencies just to satisfy
the Nyquist criterion, unless you are looking for perfect integration in
an open-loop situation. What you get for ignoring the criterion is usually
a bit of noise in your results.

In simulating the environment, this is an important consideration. Suppose
we have a position control system, and the object being positioned is a pure
mass on which the system exerts position-independent forces (as in a rocket
ship). The velocity of the mass is the integral of the applied force, and
the position is the integral of the velocity. Any integration errors in
computing velocity are integrated again in computing position, so the errors
are (roughly) squared. To get an accurate model of the environment, you must
either use microscopic values of dt, or use some advanced
predictor-corrector method of integration that minimizes the errors.

Yes, indeed.

If you
just use Euler integration with a dt calculated on the basis of the Nyquist
criterion and the maximum expected frequency of variation of position, you
will end up with a horribly inaccurate model of the environment -- too
inaccurate even if the model is a closed-loop control system.

Yes, if you haven't realized that for an integrator to be simulated accurately
by this method, the Nyquist criterion implies an infinite sampling rate.
(Or, if my speculation is right, you haven't used an effective rectangular
filter to specify your sample values).

As I said before, as a rule of thumb I tend to avoid situations in which
any output can return to affect its own input in less than 2*dt, and to
distrust any result in which any signal or its derivatives change
dramatically in less than 2*dt. But it's much better if one substitutes a
larger number for "2".

Anyway, the point is that one has to be careful, and I'm glad Bill reinforces
it with reference to the "Spadework" article of 1979 (reprinted in Living
Control Systems I).

Martin

[From Bill Powers (970121.1230 MST)]

Martin Taylor 970121 11:05 --

An integrator has a frequency response that declines by half for every
octave increase in frequency. The highest frequency involved in computing
a perfect integration is infinite, and so the Nyquist sampling interval is
zero.

That's interesting. So you're saying that in any control loop that includes
an integrator, the Nyquist criterion requires an infinite sampling rate? I
suppose, then, that this would also be true for any system "up to" an
integrator; a proportional response, for example. A resistor has a frequency
response that is (ideally) independent of frequency. So by the same
reasoning, any system that includes a resistive stage requires an infinite
sampling rate. How would this work with a differentiator? The first
derivative of a sine wave (of any frequency) is a cosine wave of the same
frequency, and so on for any number of derivatives. So by your reasoning, a
system containing any number of integrals, a proportional stage, or any
number of derivatives would also call for an infinite sampling rate. Does
that leave us with any systems that don't require an infinite sampling rate?

I had thought that the Nyquist sampling criterion applied to _variables_
rather than functions. If there are no frequency components higher than some
limiting frequency in a given system, my impression would be that all the
variables could supposedly be adequately represented by sampling at a rate
that is twice the highest frequency. In a human control system which has a
maximum frequency response of about 2.5 Hz, the implied Nyquist sampling
rate would then be about 5 samples per second. Even using a safety factor of
10 for comfort, that would call for only 50 samples per second. Simulating
integrations in a typical environment accurately would certainly require
using a dt a lot smaller than the 1/5 second the bare criterion implies.

However, I'm willing to accept your word for this, because it agrees with my
own feeling regarding dt: the shorter, the better (as long as making it
shorter makes any difference in the results).

The value of dt needed to do
this, even in a closed-loop system, is far smaller than the value that
would lead to computational artifacts, and several orders of magnitude
smaller than what the Nyquist sampling criterion would suggest.

Several orders of magnitude smaller than zero?

OK, I'll retract what I said, since you are now saying that the ideal
sampling frequency is infinite, and the ideal dt is zero, for any kind of
system I can imagine.

And I would have thought that
"computational artifact" should mean any computation that leads to a
conclusion that is false when applied to the simulated analogue system.
I'm not sure what you intend it to mean.

I was referring in particular to the oscillations that occur when the gain
of the discretized system is so large that a one-iteration time delay
results in a 180-degree phase shift. These oscillations are easy to spot
because their period is always exactly 2 * dt. When dt is made smaller (by a
sufficient factor), these oscillations disappear, and whatever variations in
the variables are then left can be considered real. The system may in fact
be unstable and oscillate, but it will oscillate at a lower frequency and
reducing dt will not affect the period of those oscillations (once dt is
small enough).
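
A stripped-down illustration of that artifact (a Python sketch, not one of
the programs discussed here): the analogue loop do/dt = G*(r - o) is a
first-order lag and cannot oscillate, but its discretized form flips sign on
every iteration -- period exactly 2*dt -- as soon as G*dt exceeds 2, and the
oscillation disappears when dt is made smaller.

# Discretized integrating loop: o := o + G*(r - o)*dt
def run(G, dt, steps, r=1.0):
    o, trace = 0.0, []
    for _ in range(steps):
        o += G * (r - o) * dt
        trace.append(round(o, 3))
    return trace

print(run(G=10.0, dt=0.25, steps=8))   # alternating and growing: the artifact
print(run(G=10.0, dt=0.01, steps=8))   # smooth approach toward the reference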

These are all considerations that modelers must be aware of when trying to
represent continuous systems on a discrete machine.

Best,

Bill P.

[Martin Taylor 970121 15:30]

Bill Powers (970121.1230 MST) --

Martin Taylor 970121 11:05 --

>An integrator has a frequency response that declines by half for every
>octave increase in frequency. The highest frequency involved in computing
>a perfect integration is infinite, and so the Nyquist sampling interval is
>zero.

That's interesting. So you're saying that in any control loop that includes
an integrator, the Nyquist criterion requires an infinite sampling rate? I
suppose, then, that this would also be true for any system "up to" an
integrator; a proportional response, for example.

Yes, if you want an accurate reconstruction of the analogue system from the
sampled values.

... How would this work with a differentiator?

Even worse. The equivalent spectral response of a differentiator is a rising
6dB per octave--to infinite frequency.

The first
derivative of a sine wave (of any frequency) is a cosine wave of the same
frequency, and so on for any number of derivatives. So by your reasoning, a
system containing any number of integrals, a proportional stage, or any
number of derivatives would also call for an infinite sampling rate. Does
that leave us with any systems that don't require an infinite sampling rate?

None, so far. But...

I had thought that the Nyquist sampling criterion applied to _variables_
rather than functions. If there are no frequency components higher than some
limiting frequency in a given system, my impression would be that all the
variables could supposedly be adequately represented by sampling at a rate
that is twice the highest frequency.

Yes, that's correct so far as it goes. But when you reconstitute the analogue
system after operating on the sampled values, you have to think of how you
are doing that reconstitution. What you want to see is a (probably
smooth) waveform that replicates the waveform that the analogue system
would have at the same point in the simulated system. But you only have
sample values. How do you go about finding that waveform you want to see?

Let's suppose you do absolutely no simulated operations on your samples. You
only want to reproduce the original waveform. You _can_ do this _exactly_
using samples taken just above the Nyquist criterion rate. For a sine wave
of frequency F, you can do it if you have more than 2F samples per second
(a sampling interval shorter than 1/(2F) seconds). But
will you? If you reconstitute merely by taking the sample values at the
sample times, you get a comb-like waveform that has the right sample values,
but has frequency components up to infinity. Would this waveform have the
right properties if you integrated it? I don't think so.

But we don't ordinarily do anything so silly as to assert that the waveform
had the value 0.0 whenever we weren't looking at it. Often what we do is
to say that it kept the value we saw, until the next time we looked. We
reconstitute a stepped waveform. That's better, but it still isn't our
nice smooth original sine wave.

Now let's see what we did with those two (imperfect) ways of reconstituting
the waveform. The first time, we used a spike of zero width and a peak value
equal to each sample in turn. That's the same as passing the sample sequence
through a filter of infinite bandwidth. We get horrendous aliases, so much so
that the infinite sequence of aliases has _infinitesimal_ energy in any
finite frequency band. In particular, it has no energy in the band up to
the frequency F of the original sine wave. And if you look at it, you'll see
that such a comb waveform will always integrate to zero when passed through
an analogue filter. It won't integrate to zero when passed through a discrete
sampled integrator with the same sample instants, though. The discrete
integrator sees the whole infinite frequency set of aliases, since it
itself has those same alias frequencies. (Another way of looking at it is
that the discrete integrator's infinite aliases overlap onto the frequency band
containing the original sine wave. This also will ensure that the integration
comes out wrong).

The second reconstruction produces a stepped waveform. What is different?
We can look at it in either of two equivalent ways. One is to say that we
are playing back a set of rectangular pulses of time duration equal to the
sampling interval; the other is to say that we are playing back a sequence
of successive differences of the sample sequence, through a step function
that lasts for infinite time after the step. When you do the spectral analysis,
either way you find that the energy drops off like (if my memory serves)
sin(f)/f. Now there is finite energy in the band that includes our original
sine wave, but there's also energy in the aliases outside the band, and
that's going to affect the accuracy with which the simulation models the
analogue system.

One can go further and further, modelling the first derivative so that you
reconstruct the original wave with the correct slope, the second derivative
to model curvature, and so forth. Each such exercise sharpens the filter
that you are implicitly using to relate the analogue wave to its sample
values, and each such exercise reduces the energy in the high-frequency
aliases, making your reconstruction smoother and better, and improving
your simulation when you do other operations on the signals. (Looking at
it from the other view--of the operator functions such as the integrators
and differentiators in your circuit--their aliases in the frequency band
of your signals become more and more attenuated, and your operations are
more like the analogue operations you are trying to simulate).

In the limit, you have a perfectly rectangular filter that allows no energy
to be lost to the aliases...But (and there's always a "but") sharp filters
imply enormous phase shifts. In the limit, the phase shift approaches
infinity. In other words, the effect of any sample extends over infinite
time. You don't want that. (Another way to see that this happens is to
recognize that each successive derivative used to improve the smoothing of
the reconstruction requires you to use another sample point, and therefore
to delay half a sample interval before you can determine what the appropriate
value should be at any moment in time).

So you don't want rectangular filters, which means that you can't work too
close to the theoretical Nyquist limit. What you want is to sample fast
enough that the aliases are pushed up to high enough frequencies that the
filter you do use brings them down to levels that get lost in the noise.
If you have an integrator, that's easier than if you have a differentiator.
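
One can put rough numbers on this progression with a small sketch (Python;
the 1 Hz signal, the two sampling rates, and the use of linear interpolation
as the "one step better than a zero-order hold" reconstruction are all just
illustrative choices): holding the last sample leaves a sizeable error when
you sample close to the Nyquist rate, interpolating between samples reduces
it, and sampling faster reduces both.

import math

# Sample a 1 Hz sine at rate fs, rebuild the waveform between samples with
# (a) a zero-order hold and (b) linear interpolation, and report the RMS
# difference from the true waveform over 5 seconds.
def rms_error(fs, method, f=1.0, t_end=5.0, points=2000):
    T = 1.0 / fs
    total = 0.0
    for i in range(points):
        t = i * t_end / points
        k = int(t / T)                        # index of the latest sample
        s0 = math.sin(2 * math.pi * f * k * T)
        if method == "hold":
            rec = s0
        else:                                 # linear interpolation
            s1 = math.sin(2 * math.pi * f * (k + 1) * T)
            rec = s0 + (s1 - s0) * (t - k * T) / T
        total += (rec - math.sin(2 * math.pi * f * t)) ** 2
    return (total / points) ** 0.5

for fs in (3.0, 30.0):
    print(fs, rms_error(fs, "hold"), rms_error(fs, "interp"))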

In a human control system which has a
maximum frequency response of about 2.5 Hz, the implied Nyquist sampling
rate would then be about 5 samples per second. Even using a safety factor of
10 for comfort, that would call for only 50 samples per second. Simulating
integrations in a typical environment accurately would certainly require
using a dt a lot smaller than the 1/5 second the bare criterion implies.

I think experience with systems similar to the one you are interested in is
as good a guide as any theory. When decreasing dt no longer makes a difference
that matters to you, use the longest dt for which this is true. But if one
is trying to make a general-purpose simulator like SIMCON, it's probably
a good idea to try to put in some checks that people might be able to use
to get warnings such as "Signal vel passes through differentiator dif with
a dt that may be too small" or "The output of amplifier amp returns to its
input in only two compute iterations. You may want to decrease dt and
insert a delay to get an accurate simulation". (I don't intend to put
such checks in my Java SIMCON--if I ever get to the stage of having one--
at least not at first).

Anyway, the whole thing is very subtle. The only way I know to convince
oneself that one isn't the victim of a discrete computation problem is to
do the whole thing at more than one value of dt--and better if the values
of dt don't have integer or small-fraction relationships.
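
In code, such a check might look like this (a Python sketch; the tolerance,
the particular dt values, and the little loop used as the example are all
arbitrary choices, not a proposal for Simcon itself):

# Run the same simulation at several unrelated values of dt and accept the
# result only if the answers agree within a relative tolerance.
def converged(simulate, dts=(0.01, 0.0037, 0.00113), rtol=0.01):
    results = [simulate(dt) for dt in dts]
    ref = results[-1]                      # finest dt serves as the reference
    ok = all(abs(r - ref) <= rtol * abs(ref) for r in results)
    return ok, results

def simulate(dt, G=10.0, r=1.0, t_end=2.0):
    o, t = 0.0, 0.0                        # simple integrating loop as a stand-in
    while t < t_end:
        o += G * (r - o) * dt
        t += dt
    return o

print(converged(simulate))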

Does this help, or does it sound like gobbledygook?

Martin

[From Bill Powers (970121.1655 MST)]

Martin Taylor 970121 15:30 --

Anyway, the whole thing is very subtle. The only way I know to convince
oneself that one isn't the victim of a discrete computation problem is to
do the whole thing at more than one value of dt--and better if the values
of dt don't have integer or small-fraction relationships.

Does this help, or does it sound like gobbledygook?

No, it doesn't sound like gobbledegook -- it obviously comes out of a
well-considered system of thought. But from my standpoint it's overkill. My
explanation for poor integration using large values of dt with a simple
Euler integration is simply that representing a curve by a series of steps
creates errors at every step. By luck these errors can cancel out, but in
general they accumulate, leading to simulations that drift farther and
farther from the correct values as time passes.

Your final suggestions again are good ones. It shouldn't be hard to include
computations in some future version of Simcon, done only once, that detect
problems with the size of dt, and put up some warning messages. But that can
wait.

------------------------------
I'd like to get back to an abandoned project that was started when Simcon
first came out: tutorials on control theory designed to give a feel for how
control systems work. Unfortunately, at the moment we seem to be limited to
Unix platforms, PCs, and Amigas. No Macs yet. Wolfgang has notified me that
he has come into possession of a PC on which he can produce a .EXE
(runnable) version of Simcon. When I receive it, I'll put it, together with
the manual, on my own FTP site so others can download it. We can start with
people who have PCs, at least. Those with Unix systems or Amigas can get the
programs directly from Wolfgang.

I realize that there are lots of people out there who are shy about showing
their ignorance by asking for help at every step. I hope we can coax you
guys into taking the plunge -- there is plenty of patient and willing help
available. If you need to start with 1 + 1 = 2, that's where we'll start.
The only sin I know of in a beginner is not asking for help when it's
needed. Even PhDs don't have a PhD in _everything_. We have between 130 and
150 people listening in -- I don't know of any reason why every one of these
people couldn't learn how to obtain Simcon, run canned programs in it, learn
how simulation really works, and get a gut feeling for what a control system
does. Of course right now only the PC users and those on the other two
machines can participate, but that will be a considerable number of us. Hope
we can get this off the ground.

Best to all,

Bill P.