[From Bill Powers (951003.0500 MDT)]
Tom Bourbon (950929.1340) --
It's always highly validating when someone who knows more than I do
independently offers an analysis reasonably parallel to mine. I think
you laid out the basis for objections to Brian's paper much more clearly
than I did. Hope you're back with us soon.
-----------------------------------------------------------------------
Bruce Abbott (951002.2135 EST) --
I think I see one problem in your interesting statistical analysis; it's
one I've run into before, but your analysis made it clearer. When
psychologists talk about the "amount of effect" of one variable on
another, the methods they use always seem to normalize all the variables
to a peak value of 1.00, so any _absolute_ measure of the amount of
effect is lost. Some of this absolute measure is restored by considering
the regression coefficient, but generally there is no way to make sense
of that coefficient.
I ran into this problem when doing correlations for tracking data. If
you do a simulation of a control system using a moderately good
controller with optimum slowing, you come up with a rather high
correlation between disturbance and cursor behavior, where in a human
subject with the same loop gain the correlation is much lower. The
reason, as you point out, is that there is noise in the human system
that is not present in the control model. In the human system, the noise
swamps the effect of the disturbance (or handle) on the controlled
variable, greatly lowering the apparent effect of disturbance (or
handle) on cursor in terms of "accounting for variance."
But there is actually not a great deal of noise in the human system,
only a few percent of the magnitude of the behavior that is affecting
the controlled variable. The reason it seems so large in the statistical
analysis is that it is being compared with the result of adding two very
large variables together in almost perfect opposition -- the disturbance
and the handle position. In absolute terms, the cursor excursions are
only a small fraction of the excursions of the disturbance and the
handle.
But when you calculate the effect of the disturbance on the cursor using
the noise-free model, you come up with a rather high correlation because
of the lack of noise in the model. This would say that the disturbance
"accounts for a large part of the variance" of the cursor. This is
perfectly true, but in fact the _magnitude_ of this effect can be made
as small as you please by raising the loop gain of the control system.
So you end up with D having a relatively high correlation with C, at the
same time that D is having only a trivial amount of effect on C.
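Here is a minimal numerical sketch of that situation (in Python; the static proportional loop, the gain of 50, and the 5% noise figure are all arbitrary stand-ins, not the tracking model itself):

```python
import numpy as np

rng = np.random.default_rng(0)
G = 50.0                                 # loop gain (arbitrary choice)
d = np.cumsum(rng.normal(size=2000))     # slowly varying disturbance

# Noise-free model: cursor = d / (1 + G); correlation with d is perfect
c_model = d / (1.0 + G)

# "Human": same loop, plus system noise a few percent of d's spread
noise = 0.05 * d.std() * rng.normal(size=d.size)
c_human = d / (1.0 + G) + noise

r_model = np.corrcoef(d, c_model)[0, 1]
r_human = np.corrcoef(d, c_human)[0, 1]

print(r_model)                  # ~1.0: d "accounts for" nearly all of c's variance
print(r_human)                  # much lower: small noise swamps the tiny effect
print(c_model.std() / d.std())  # yet the absolute effect is only ~2% of d's spread
```

The same disturbance, the same loop gain: correlation near 1.00 without noise, far lower with a little noise, while in both cases the absolute effect of D on C is a small fraction of D's excursions.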
In the control system model we get around this problem by comparing the
variance of the cursor with the variance that would have been expected
if the handle varied randomly relative to the disturbance. This gives us
the "stability factor" that has been mentioned occasionally (see
Spadework article in Psych Rev). However, this actually understates the
degree of control, because if there were actually no control system,
there would be no handle effect to consider and the cursor variations
would be exactly the size of the disturbance variations.
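An illustrative computation (the formula here, the square root of expected over observed variance, is one plausible form of the comparison; the exact expression in the Psych Rev article may differ):

```python
import numpy as np

rng = np.random.default_rng(1)
d = np.cumsum(rng.normal(size=2000))                # disturbance
h = -d + 0.05 * d.std() * rng.normal(size=d.size)   # handle nearly cancels d
c = d + h                                           # cursor: what is left over

# If h varied randomly relative to d, cursor variance would be about
# var(d) + var(h); under control it is far smaller.
expected_var = d.var() + h.var()
stability = np.sqrt(expected_var / c.var())  # illustrative "stability factor"
print(stability)
```

A large value says the cursor is varying far less than independent handle and disturbance would produce, i.e., that control is present.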
What seems to be missing from most psychological treatments is a way of
estimating the size of the effect of A on B in comparison with the
maximum effect that could have occurred.
Consider the example in your generated data. You started with X as an
independent variable and Y as a dependent variable. You said that the
underlying relationship was Y = 2*X. But the data you presented was all
normalized: correct me if I'm wrong, but it seems to me that you would
come up with exactly the same numbers if Y = 0.002*X. The standard
deviation, of course, is normalized to the range of Y, so the scaling
factor is irrelevant. Yet in the second case, the effect of X on Y is
only 1/1000 of the effect in the first case. In the first line of your
data analysis, X accounts for 96% of the variance in Y in both cases,
but in the first case the effect is about the same size as the
variations in X while in the second the effect is 0.1% as great. So does
X "account for Y" to the same degree in both cases?
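The point is easy to check numerically; here is a sketch (my own generated numbers, not the ones from your posting, and the noise level is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
e = 0.4 * rng.normal(size=100)        # noise (arbitrary level)

y1 = 2.0 * x + e                      # Y = 2*X plus noise
y2 = 0.002 * x + 0.001 * e            # the same data scaled down 1000-fold

r1 = np.corrcoef(x, y1)[0, 1]
r2 = np.corrcoef(x, y2)[0, 1]
slope1 = np.polyfit(x, y1, 1)[0]      # regression coefficient of y1 on x
slope2 = np.polyfit(x, y2, 1)[0]      # regression coefficient of y2 on x

print(r1**2, r2**2)     # identical "variance accounted for"
print(slope1, slope2)   # absolute effects differing by a factor of 1000
```

The correlation (and hence the proportion of variance accounted for) is identical in the two cases; only the regression slope preserves the thousand-fold difference in absolute effect.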
> The reason correlations are typically so high in tracking
> experiments is that there are no disturbances to cursor position at
> work during the task that are anywhere near the size of the
> disturbance being applied by the program. But make the cursor "too
> sensitive" to mouse movements and watch the numbers deteriorate.
In a pursuit tracking task, there are two disturbances, the target
movements and the disturbance applied directly to the cursor. Any number
of other disturbances can be applied to the cursor without much
affecting it, as long as the total doesn't get too big or fast-changing.
Even noise from random variations in handle position, introduced by the
person doing the tracking, is reduced by the loop gain of the system
(within its bandwidth of control). The high correlations to which you
refer are between handle position and the sum of all disturbances. What
we look for in detecting control is not a _high_ correlation, but a
_low_ correlation between handle position and cursor position. The low
correlation is seen because people typically keep the error small enough
to be comparable with internal system noise, and it is the system noise
that reduces the correlation of output with input, or disturbance with
input.
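A quick sketch of those two correlations side by side (toy data; the 5% handle noise is an arbitrary stand-in for system noise):

```python
import numpy as np

rng = np.random.default_rng(3)
d = np.cumsum(rng.normal(size=2000))                # disturbance
h = -d + 0.05 * d.std() * rng.normal(size=d.size)   # handle opposes d, plus noise
c = d + h                                           # cursor is the residue

print(np.corrcoef(h, d)[0, 1])   # near -1: handle mirrors the disturbance
print(np.corrcoef(h, c)[0, 1])   # near 0: the signature of control
```

The handle-disturbance correlation is nearly -1.00, while the handle-cursor correlation is small, because what remains in the cursor is mostly system noise.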
If there are disturbances of the cursor other than those we intend
during a tracking experiment, and there always are, they simply add to
the total disturbance. If we estimate the quality of control by
comparing cursor excursions to handle excursions (where only system
noise enters), it doesn't matter how many external disturbances there
are, or whether we have identified them all. The cursor will still vary
far less than we would expect from the handle movements alone, so we can
tell that control is occurring and estimate how good it is (the total
effective disturbance, as Hans Blom has pointed out, can be estimated
from C - H).
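Since in the compensatory case the cursor is just the sum of handle effect and total disturbance, c = h + d_total, the estimate is a one-line subtraction; a sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
d1 = np.cumsum(rng.normal(size=1000))   # the disturbance we applied
d2 = np.cumsum(rng.normal(size=1000))   # an unknown extra disturbance
d_total = d1 + d2
h = -d_total + 0.02 * d_total.std() * rng.normal(size=1000)  # near-cancelling handle
c = d_total + h                         # observed cursor

d_est = c - h   # recover the total effective disturbance from C - H
print(np.allclose(d_est, d_total))   # True, whatever the disturbances were
```

Note that the recovery works without identifying the individual disturbance sources at all.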
If there were some external disturbance comparable in size to the ones
we use in the tracking experiments, don't you think it would be rather
easy to find what is causing it?
> In more complex situations where unmeasured and uncontrolled (in
> the experimental sense) sources of error exert as strong an
> influence as the disturbance does, control (in the PCT sense) will
> be poor and the correlations will be correspondingly less
> impressive.
Do you see now that this isn't true? It's true only if you're looking
for a high correlation between disturbance and action. But by looking at
handle and cursor movements, we can always deduce the amount of the net
effective disturbance, and from the absolute size of cursor movements in
comparison with handle movements we can demonstrate that control is
occurring, and even estimate the loop gain (approximately, the regression
of cursor movement on handle movement divided by the physical
proportionality factor between handle and cursor).
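In a toy static loop the estimate can be demonstrated (a sketch under stated assumptions: cursor c = d + k*h with proportionality factor k, handle driven as h = -(G/k)*c, no dynamics or noise; in this noise-free case the regression of handle on cursor and of cursor on handle are simply reciprocals, so either one pins down G once k is known):

```python
import numpy as np

# Toy static loop (assumptions: no dynamics, no noise):
#   c = d + k*h   (cursor responds to disturbance and handle)
#   h = -(G/k)*c  (controller output; G is the loop gain)
# Solving gives c = d/(1+G) and h = -(G/k)*d/(1+G).
rng = np.random.default_rng(5)
G_true, k = 40.0, 2.0
d = np.cumsum(rng.normal(size=1000))
c = d / (1.0 + G_true)
h = -(G_true / k) * c

slope_h_on_c = np.polyfit(c, h, 1)[0]   # equals -G/k in this noise-free loop
G_est = -k * slope_h_on_c
print(G_est)    # recovers 40.0
```

With noise and real dynamics the regression only approximates G, but the principle is the same: the handle-cursor relationship plus the known physical factor k is enough.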
Anyway, to get back to my main point:
It seems to me that there can be a statistically "large" effect of X on
Y, where X accounts for a large part of the variance of Y, while at the
same time X is hardly affecting Y at all in comparison with the possible
range of effect that could occur. The only reason I ever started using
correlations as a way of measuring control was to create a bridge to
those who were used to measuring all relationships in terms of
correlations. I can now see that this was a mistake, because entirely
the wrong impression can be created about the actual magnitudes of
effects. As I said, the disturbance can account for a high percentage of
the variance of the cursor while still having a negligible effect on it.
That is very hard to say in the language of statistical analysis, for me
at any rate.
-----------------------------------------------------------------------
Hans Blom (951002b) --
I appreciate your efforts to clarify my thinking about the strategy of
your model, but you're still giving me too much credit for familiarity
with your methods. What I'm trying to do is eliminate all the
statistical aspects of the strategy and just see what is going on in the
absence of noise.
This is useful:
delta_K := delta_K + constant1 * estimate (dK/dt) * (x-y)
delta_D := delta_D + constant2 * estimate (dD/dt) * (x-y)
The constants, I presume, are to prevent computational oscillations --
to keep the corrections from undershooting and overshooting.
I'd just as soon leave out the Kalman Filtering aspects of the situation
for now; instead of
x [i] = b * u [i] + noise [i],
I just want to think about
x = b*u + d,
where d is just a variable.
Here's where I am now:
In general, x will be some function f, with d being added to x (because
if d is an unknown waveform, it doesn't matter whether it is passed
through some function before having its effect on x):
x = f(u) + d
In the second step of your method, it seems to me that it's necessary to
compute
u = f^-1(r - d).
In short, whatever world-model is being used, the u that is calculated
for simultaneous application to the real plant and the world-model is
derived from the inverse of the world-model, with the reference signal r
substituted for x.
Is this correct?
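For the noise-free linear case the whole scheme fits in a few lines. The normalized-gradient step and the constants below are my own stand-ins for your delta_K/delta_D corrections, not your actual code:

```python
# Plant: x = b*u + d, with b and d unknown to the controller.
b_true, d_true, r = 2.0, 0.5, 1.0
b_est, d_est = 1.0, 0.0   # initial world-model parameters
mu = 0.5                  # adaptation rate (stand-in for constant1/constant2)

for _ in range(200):
    u = (r - d_est) / b_est    # inverse of the world-model, r substituted for x
    x = b_true * u + d_true    # real plant responds
    err = x - (b_est * u + d_est)           # model-prediction error (x - y)
    b_est += mu * u * err / (1.0 + u * u)   # normalized-gradient corrections
    d_est += mu * err / (1.0 + u * u)

print(x)   # converges to r = 1.0
```

The model parameters need not converge to the true b and d individually; it is enough that the model matches the plant at the operating point, at which point applying the inverse of the world-model drives x to r.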
-----------------------------------------------------------------------
Best to all,
Bill P.