[Martin Taylor 950113 16:30]
Rick Marken (950112.1020)
It seems to me that the last time you talked about
loop delays, it was your way of showing how it was possible for information
about the disturbance to be carried by (in, whatever) the perceptual signal.
Your understanding mystifies me once again. But that's becoming an
entertainment rather than a disturbance--kind of fun, but not requiring
output to correct the perceptual error. My output function gain is getting
quite low.
However, a mild comment: What loop delays enforce is that control cannot
be maintained if the disturbance changes too fast. With zero delay,
control could, in principle, be perfect no matter how fast the disturbance
varied. The speed at which the disturbance can vary is one aspect of the
relevant information rates, so loop delay is indeed quite closely related
to the informational analysis of the behaviour of the control loop.
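Just to make the point concrete, a toy simulation of a one-level integrating
control loop with a transport delay in the perceptual path shows it. (This is
only my illustration--the gain, delay, and disturbance frequencies are
arbitrary numbers, not anything from the real analysis.) With a 150-ms loop
delay the gain has to stay modest and the error climbs toward (and past) the
disturbance amplitude as the disturbance speeds up; with zero delay you can
make the gain as high as you like and the error stays small at every one of
these frequencies:
------------------------------
import math

def rms_error(freq_hz, delay_s, gain, dt=0.001, T=20.0):
    """Integrating controller holding a variable at zero against a
    sinusoidal disturbance, with a transport delay in perception.
    Returns the RMS of the controlled quantity (i.e. the error)."""
    steps = int(T / dt)
    buffer = [0.0] * int(round(delay_s / dt))   # delayed perceptions
    output, sumsq = 0.0, 0.0
    for i in range(steps):
        disturbance = math.sin(2 * math.pi * freq_hz * i * dt)
        qi = output + disturbance        # controlled input quantity
        buffer.append(qi)
        p = buffer.pop(0)                # perception, delay_s seconds old
        output += gain * (0.0 - p) * dt  # integrate the error (reference = 0)
        sumsq += qi * qi
    return math.sqrt(sumsq / steps)

for f in (0.1, 0.5, 1.0, 2.0):
    # 150 ms of loop delay caps the usable gain; zero delay does not
    print(f, rms_error(f, delay_s=0.15, gain=5.0),
             rms_error(f, delay_s=0.0, gain=500.0))
------------------------------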
Perhaps you could describe some of your research on the relationship between
disturbance bandwidth and control. It sounds like it is quite interesting.
Not my research. Ask Bill P, or look back at the stuff we dealt with last
winter on the Laplace analysis of the control loop.
=========================
Did you go through separate parameter fits for each of the six trials on
which the median fit is based?
I assume, by the way, that you are
describing the fit of the model to data from the standard tracking task.
Yes. And yes. Here are some examples from the first experimental set.
More commentary and answers and questions follow:
------------------------------
Correlations between best-fit model and actual handle positions for tracking
vertical-line cursor against vertical-line target in pursuit and compensatory
mode, with two difficulty levels of Gaussian disturbance. The tables are
by subject, in the following order of rows: compensatory difficulty 6,
pursuit difficulty 6, compensatory difficulty 3, pursuit difficulty 3.
This is the order in which these particular trials were done within each 6-hour block,
so the tables can be read from top to bottom of a column, and the columns
from left to right to get the order of the trials if you want to judge the
effect of learning. The timings represent the start hour of a six-hour
block within which the particular track was done. The offset from this hour
was the same for each track of a particular kind, and is noted in the first
column of the matrix; for example, the first compensatory run at difficulty 3
occurred at 14:40 on Monday (12 plus 2:40 offset). The same timings and
offsets apply for each subject, and are not repeated. (Blanks represent
trials that either were skipped for some reason or for which the subject
did not move the mouse for long periods). In the "hour" row, "V" represents
approximately when the drug dose was given (amphetamine for these subjects).
It was actually given at 23:30 Tuesday, and at 05:30 and 15:30 Thursday.
Subject 111:
Monday>----------Tuesday-----------Wednesday-------Thursday------|Friday
hour 12 08 14 20 V 02 08 14 20 02 V 08 V14 12
Offset
0:20 .884 .941 .959 .960 .875 .938 .953 .913 .966 .945 .888 .972
1:20 .884 .941 .959 .960 .875 .938 .953 .913 .966 .945 .888 .972
2:40 .961 .892 .989 .966 .987 .956 .953 .965 .964 .968 .701 .991
4:20 .991 .934 .996 .990 .988 .995 .924 .945 .970 .992 .931 .988
Subject 112:
.699 .883 .873 .886 .956 .943 .766 .685 .726 .485 .693 .807
.962 .957 .967 .976 .978 .964 .972 .967 .824 .978 .956 .970
.986 .764 .435 .993 .996 .996 .988 .995 .960 .771 .302 .990
.994 .991 .995 .956 .979 .988 .996 .993 .898 .298 .740 .998
Subject 113:
.827 .840 .012 .807 .829 .883 .838 .675 .697 .899 .819 .864
.939 .945 .923 .927 .957 .950 .824 .648 .884 .883 .939
.987 .951 .977 .690 .988 .984 .944 .972 .870 .987 .826 .989
.905 .963 .950 .952 .994 .977 .974 .985 .691 .972 .831 .914
Subject 114:
.732 .640 .644 .273 .638 .631 .626 .686 .638 .561 .693 .784
.816 .875 .947 .785 .944 .858 .937 .825 .848 .910 .943 .838
.965 .922 .975 .944 .941 .942 .830 .987 .912 .983 .986 .809
.947 .978 .937 .937 .989 .940 .994 .993 .982 .916 .923 .702
Subject 115:
.485 .413 .823 .721 .904 .785 .340 .518 .565 .323 .647 .711
.821 .889 .953 .926 .967 .967 .978 .925 .854 .811 .943 .964
.899 .879 .995 .975 .994 .985 .993 .956 .935 .959 .994 .986
.660 .992 .993 .996 .989 .983 .972 .876 .724 .991 .993 .997
Subject 116:
.611 .765 .498 .726 .865 .857 .690 .827 .753 .285 .365 .371
.939 .891 .917 .914 .959 .942 .873 .317 .934 .934 .943 .831
.847 .971 .927 .996 .992 .987 .984 .988 .953 .446 .990 .981
.923 .986 .956 .992 .996 .974 .973 .986 .647 .928 .992 .990
-------------------------------
If so, it might be that during this early part
of the experiment the subjects' control parameters are still unstable
(reorganization is still happening) during a run. I would expect things to
get better (for the subjects with the low median fits) for later runs.
This doesn't seem to be happening in this first experimental set. If you
compare the Monday data with the first set of Tuesday data, the Tuesday runs
provide better fits on 14 of the 24 comparisons. These subjects had both
some unrecorded runs and 30 "official" practice runs before any of these
data were collected. For the most part, the good and bad model fits seem
to be fairly randomly intermixed, at least in the first four experimental
blocks (up to the one ending 06:00 Wednesday).
If
the fits don't improve using the simple model -- especially for later runs --
I would suggest that you look at a graph of the time variations in model and
subject handle movements (overlaid on the graph) to see what's going on; it
may be that you just had a sticky handle or something like that.
Mouse, not handle. And there were six independent PC workstations, so
the explanation might work for one, but not for all.
I've checked the quality of fit overall for the first four blocks (i.e.
before there was any serious issue of sleep deprivation, if we are
to believe previous work on sleep deprivation). These subjects have
fits that are about as good as those in the second set, and much better
than the fits for the subjects in the third set, but worse than those
in the final three sets. (Even those last three sets didn't come up to
the standard I had expected, by a long shot).
As for tracking time variations of the model parameters, that was the prime
objective of the study. But I don't trust the results unless I get a
decent model fit, reliably. I'm tempted to try adding a dead zone, partly
because Bill P suggested it, and partly because it agrees with my subjective
experience, especially with the "jump" disturbance, but also with the
smooth disturbances represented in the tables above.
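To be concrete about what I mean, here is the sort of model I have in mind,
sketched in Python rather than in the fitting program's own code (so treat it
as a sketch, not the program): the usual two parameters, gain k and perceptual
delay d in frames, plus a dead-zone half-width w; with w = 0 it reduces to the
plain two-parameter model. The delayed cursor is reconstructed from the
recorded disturbance as handle + disturbance, and the target track is the
recorded one (constant in the compensatory case).
------------------------------
def simulate(disturbance, target, k, d, w, dt=1.0/60):
    """Model handle track for one run.
    disturbance, target : recorded per-frame values; k : output gain;
    d : perceptual delay in frames; w : dead-zone half-width."""
    handle = 0.0
    handles = []
    for i in range(len(disturbance)):
        handles.append(handle)                # handle position at frame i
        j = max(i - d, 0)                     # frame the subject is "seeing"
        cursor = handles[j] + disturbance[j]  # reconstruct the delayed cursor
        error = target[j] - cursor            # perceived error, d frames old
        if abs(error) <= w:                   # dead zone: small errors ignored
            effective = 0.0
        else:
            effective = error - w if error > 0 else error + w
        handle += k * effective * dt          # integrating output function
    return handles
------------------------------
The fit is then the correlation between the returned handle track and the
subject's recorded mouse positions, maximized over k, d, and w.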
=====================
Bill Powers (950112.0805 MST)
I suggested a number of strategies for getting good data when the
project was starting out. As I recall, it was not possible, in the last-
minute rush to get the project under way, to implement many of them.
Also, so many different conditions were studied that it was not possible
to get much practice with each one before the effects of drugs and
sleeplessness became important.
Yes, you did. But I do not see much evidence of learning, in the sense
of having better values of "k" or "d" in the fitted model, at least in the
results I have looked at. In preliminary tests with my colleagues, the
quality of tracking didn't seem to change much after the second or third
run. But there may be subtle effects I haven't spotted. These, I would
have thought, should have affected the model parameters, rather than the
model fit.
Each control task was probably novel to
most participants, and some were perceptually difficult.
Yes, that's true.
I also pointed out that it is probably a good idea to make sure that all
the subjects were using the same output degree of freedom, and proposed
that a fixed elbow rest be used, and a mouse pad that would assure
minimum slippage of the mouse ball. These are secondary considerations,
but could be important (particularly if the mouse were being used on a
surface that allowed a great deal of slippage).
I do know that one problem was that the mouse sometimes came off the pad,
and this would affect the model fit. But I think that it happened quite
a bit less often than once per track, just from watching the subjects on
the video monitors. When it did happen, the effect would be a glitch
during which the "handle" didn't move, leading to a momentarily large
error. The model doesn't take this into account, but I very much doubt
it accounts for much of the problem. As for the "slippery surface," there was
at least one occasion when we spotted a subject using the mouse on the
desk-top. But this shouldn't matter unless the slippage was gross, because
both the real system and the model use the actual counts coming from the
mouse, and we know that tracking quality is not much affected by small
changes in the environmental feedback function.
I also suggested trying to make sure
that the eye-screen distance was standard, perhaps by using a chin-rest.
Yes, but not long ago you posted results that seemed to show that this
in fact didn't matter.
These latter considerations are obviously not very important for a well-
practiced subject because the same subject would probably settle into a
more or less standard configuration
!!! For sure--our subjects did just that.
The model uses constant parameters during a prediction run, so there is
the implicit assumption that the participant's parameters are constant
during the run. During learning, this is not true -- not only are the
parameters changing during a run, but the participant is reorganizing in
many other ways, such as the way the mouse is held, the position of the
body, what aspect of the display is being watched, and so forth.
Yes, I expect that's important, but I do think that if they hadn't done so
before Monday, they had at least become comfortable with all the tasks
by mid-Monday--though they might not have reached peak performance. On
Monday, they did 72 runs, 12 with each kind of display.
The effects of reorganization would be most pronounced on the tasks that
were the most difficult to grasp, such as the pendulum task or the one
with the rotating relationship between direction of cursor movement and
direction of handle movements. Participants might take a lot of practice
to master these tasks.
Interesting, but only the rotating one seemed to involve a distinctly
noticeable reorganization; the others didn't. And it was Tom's task that was
initially hardest. We don't have any analytic data from Tom's task.
The results you give would be more informative if they weren't averaged
over tasks, difficulties, and types of disturbance. I suspect that the
model fit best for the first task, simple pursuit tracking, and much
worse for some of the others.
Yes. I think you have those data. (I've found another bug in the fitting
program, but fixing it results in very slight improvements in the correlations,
and what you have will be either exactly the same or within .01 or so.)
The obvious explanation would be that on
the harder tasks, learning was far from complete by the end of the first
experimental day -- only a few repetitions of the task.
I'll check that.
The basic problem is that the experimenters wanted to get
too much out of the data by using 5 experiments with 2 degrees of
difficulty each and 3 different kinds of disturbances -- 30 different
conditions!
Each of which was tested 12 times, in 50-second runs, by each subject
over the course of the experiments. Maybe more would have been better,
but it OUGHT to be possible to fit the same model with the same parameters
to each condition for any given display. Even if we eliminate the "jump"
disturbance as "unfair" to the simple model, there are still 48 runs for
each display. And if I remember your earlier discussion, you thought that
a non-linear output function would help with the "jump" fit. It seems to
me that the same model should fit all the tracks for any given display,
with the same parameters, since the model fits the subject, not the
environment.
There simply wasn't time to establish a baseline for each
condition, and to be sure performance had stabilized so differences in
performance would be meaningful.
There's nothing very obvious in any changes in fitted d or k for any of the
displays that would support this. It could be true, though, and perhaps
a more critical analysis would support your claim.
The best you can do is to take the tasks where the subjects showed the
least tracking error by the end of the first day, and compare runs under
the same conditions for the same experiment each day.
That's what I had done for myself, and what I provide for you, above.
There is no way to
get good information by averaging together experiments of different
types and disturbances of different difficulties.
No. I wasn't doing that, and I thought that what I wrote had made that
clear. It obviously didn't, since Rick also had to ask whether I had done
that.
Anyway, if you haven't gone any further with the dead-zone idea to improve
the fit, I may try it out. But it will greatly increase the compute time
for the fit if we use the same method as in your "Viewdata" program. I was
wondering whether an e-coli fitting technique might not work better:
adjust all three parameters (d, k, and the dead-zone width) simultaneously,
stepping in a random direction in the 3-space; whenever the fit gets better
between attempts, keep going in the same direction, and when it gets worse,
tumble to a new random direction ... Should work, shouldn't it?
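Roughly this (only a sketch; the step size is a placeholder, the error measure
would be something like 1 minus the model-handle correlation, and the
parameters would want scaling so a single step size makes sense for all three):
------------------------------
import random

def ecoli_fit(error_of, start, step=0.05, max_evals=2000):
    """E. coli-style fit: keep stepping in the current direction while
    the fit improves, tumble to a new random direction when it doesn't.
    error_of(params) -> scalar to minimize (e.g. 1 - correlation);
    start is the initial [k, d, dead_zone_width]."""
    best = list(start)
    best_err = error_of(best)
    direction = [random.gauss(0.0, 1.0) for _ in best]
    for _ in range(max_evals):
        trial = [p + step * u for p, u in zip(best, direction)]
        err = error_of(trial)
        if err < best_err:                    # improved: keep this direction
            best, best_err = trial, err
        else:                                 # worse: tumble
            direction = [random.gauss(0.0, 1.0) for _ in best]
    return best, best_err
------------------------------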
Martin