[From Bill Powers (960923.1500 MDT)]
Martin Taylor 960920 17:00 --
Task   pursuit         compensatory    disk-on-circle  pendulum        Number-at-50
Drug   cor    Ang      cor    Ang      cor    Ang      cor    Ang      cor    Ang
Plac   0.95   0.95     0.87   0.68     0.93   0.79     0.68   0.82     0.93   0.79
Amph   0.96   0.95     0.88   0.67     0.94   0.79     0.70   0.78     0.94   0.79
Modaf  0.96   0.95     0.88   0.70     0.93   0.82     0.67   0.80     0.93   0.82
If my interpretation of the criteria is anywhere near right, what this
says is that the model is about as good as one could hope for when treating
simple pursuit tracking. (By the way, these results are only for the G and
U disturbances). However, for the other tasks, even when the correlation
is pretty good, there may be structural differences between the model and
what the human actually does. This is particularly
true of the pendulum task, where the correlation is poor ...
I finally found a version of viewdata that I hadn't modified, and did a run
with the "pendulum" experiment. In fact, I did three runs:
Model vs real correlation:        0.9814   0.9882   0.9784
Real vs disturbance correlation: -0.9584  -0.9660  -0.8906
The disturbance difficulty was 4 for the first two and 6 for the last one.
These numbers are displayed in the upper right corner of the viewdata
screen. The "real vs disturbance" correlation would be -1.0 if tracking were
perfect. So this correlation gives an idea of the quality of the control
under the disturbance being used. The "Model vs real" correlation shows how
closely the model matched the real behavior.
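For anyone without the program, both figures are just ordinary correlations
computed over the recorded traces of a run. A minimal sketch in Python
(illustrative, not the viewdata code; the arrays are hypothetical, one
sample per frame):

import numpy as np

def fit_correlations(real_handle, model_handle, disturbance):
    # "Model vs real": how closely the model's handle trace tracks the
    # person's actual handle trace.
    model_vs_real = np.corrcoef(model_handle, real_handle)[0, 1]
    # "Real vs disturbance": -1.0 would mean the handle exactly mirrors
    # the disturbance, i.e. perfect compensatory control.
    real_vs_dist = np.corrcoef(real_handle, disturbance)[0, 1]
    return model_vs_real, real_vs_dist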
These numbers are similar to your measures, in that they show the model
matching the real behavior better than the real behavior matches that of a
perfect control system. So this is not just "generic control" -- the model
specifically fits the imperfect control of the real person, me.
Note the difference between the fit of the model to the performance of a
properly practiced subject and the fit to the performance of your subjects.
The model fit my behavior with a correlation of 0.98; it fit the performance
of your subjects with a correlation of 0.70 at best. I expect that the
second figure, the degree of opposition of the behavior to the disturbance,
was much lower for your subjects, too. A correlation of 0.98 implies an error
of fit of about 7% of the peak-to-peak excursion of the target. A
correlation of 0.70 implies an error of fit of about plus or minus 36% of
the peak-to-peak excursion (that is, plus or minus 72% of the maximum
deviation from the mean). Your subjects were hardly controlling at all on
this task.
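For reference, the conversion from a correlation to an error of fit can be
sketched as follows. This assumes the residual rms is sqrt(1 - r^2) times
the signal rms, and that the excursion is roughly sinusoidal, so peak-to-peak
is 2*sqrt(2) times rms; the exact convention is not stated here, so treat the
constants as assumptions (they reproduce the 7% and the 10-to-12% figures
mentioned in this post; the scaling behind the 36% figure is not stated).

import numpy as np

def error_of_fit_pct(r):
    # Residual rms as a fraction of signal rms: sqrt(1 - r^2).
    # Peak-to-peak of a sinusoid is 2*sqrt(2) times its rms.
    return 100.0 * np.sqrt(1.0 - r ** 2) / (2.0 * np.sqrt(2.0))

print(error_of_fit_pct(0.98))   # ~7.0% of peak-to-peak
print(error_of_fit_pct(0.95))   # ~11.0% of peak-to-peak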
I explained before the experiments began that the pendulum task was not what
it seemed. It is actually a simple compensatory tracking task. What makes it
seem difficult is that both the pendulum and the moving dot are moved from
side to side in a sine wave, the SAME sine wave. If there were no
disturbance they would swing exactly together. The disturbance simply makes
the pendulum move left or right of the position of the dot. The regular
swinging motion was a distraction, not a disturbance of the controlled variable.
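To make this concrete, here is a minimal simulation sketch (the geometry,
gain, and disturbance are assumptions for illustration, not taken from
viewdata) showing how the common sine wave drops out of the controlled
variable:

import numpy as np

dt = 0.02
t = np.arange(0.0, 60.0, dt)
swing = 50.0 * np.sin(2 * np.pi * 0.3 * t)             # the SAME sine wave
rng = np.random.default_rng(0)
disturbance = np.cumsum(rng.normal(0.0, 1.0, t.size))  # slow random drift

handle = np.zeros(t.size)
k = 4.0                                                # illustrative gain
for i in range(1, t.size):
    dot = swing[i - 1]
    pendulum = swing[i - 1] + disturbance[i - 1] + handle[i - 1]
    error = pendulum - dot        # = disturbance + handle: the sine cancels
    handle[i] = handle[i - 1] - k * error * dt         # integrating control

# What remains is a plain compensatory tracking loop; the visible
# swinging never enters the controlled variable at all.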
If the subjects had all been trained to criterion, they would all have
understood the nature of this task, and would have performed as well as I
did (I haven't done these experiments for over a year, and the figures above
are for my first three trials).
On simple pursuit tracking, the model usually fits my behavior with a
correlation of 0.98 or better with a fairly difficult disturbance. That
implies about an 8% rms error between model and data. The correlation of
0.95 or 0.96 that you report seems hardly any lower, but it implies an error
of 10 to 12%. And your subjects were young men in their 20s and 30s, I
presume, while I am 70. If they had learned the task well, they should have
outperformed me and their behavior should have been more consistent than
mine. Particularly on the easier disturbance, the model fits to their
behavior on this task should have been 0.99+.
I don't really need to run all these tasks again. I always get model fits in
the high 0.90s, well above 0.95 even with a disturbance of 6, except for the
numbers task (which I find difficult). My k factor varies only by about 5%,
if that, for each task (although it is somewhat different on different
tasks). The model always fits my behavior better than my behavior matches
the disturbance. I am a practiced subject, always working near asymptote. If
you were to test me under conditions of sleeplessness, I suspect that my k
factor would change, and that the change would show up very clearly because
the factor is so repeatable under normal conditions.
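For concreteness, the kind of model and fit being talked about can be
sketched like this. The pure-integrator form and the grid-search fit are
assumptions for illustration; the actual viewdata model and fitting
procedure may differ (for instance by including a damping or transport-lag
term):

import numpy as np

def run_model(k, target, disturbance, dt=1.0 / 60.0):
    # Pure-integrator control model: handle velocity = k * error.
    handle = np.zeros(target.size)
    for i in range(1, target.size):
        cursor = handle[i - 1] + disturbance[i - 1]  # what the subject sees
        error = target[i - 1] - cursor
        handle[i] = handle[i - 1] + k * error * dt
    return handle

def fit_k(real_handle, target, disturbance):
    # Pick the k whose model run best matches the real handle trace (rms).
    ks = np.linspace(1.0, 20.0, 191)                 # illustrative range
    rms = [np.sqrt(np.mean((run_model(k, target, disturbance)
                            - real_handle) ** 2)) for k in ks]
    return ks[int(np.argmin(rms))]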
That is the entire point of my critiques of the experiment as it was done.
My vision was to obtain stable performance from all subjects, performance
that the model could match very closely, with errors of perhaps only 5 to 8
percent on the difficult disturbances and less on the easier ones. I have
repeatedly shown
that this is easily possible for most people, given enough practice. This
would have meant that a change in the parameters for a SINGLE SUBJECT on a
SINGLE TASK of 5 to 10 percent would have been meaningful, so we could see
that some subjects begin to control worse sooner than others do, and that
drugs affect some of them more than others.
I think you will agree that as matters worked out, this vision was doomed
from the start. By clever use of statistics, averaging over runs and
conditions and subjects, you may well be able to discern the shadows of
regularities in the data, but they will be about the group, not about the
individual. They will be, in short, pretty much what we have come to expect
from psychological experiments.
I don't blame you for what I see as a failure of the project, and I don't
blame you for attempting to find information in the data that were obtained.
I just can't get very interested in the outcome any more.
Best,
Bill P.