Feedforward yet again

[From Bill Powers (2010.01.06.1506 MST)]

Rick Marken (2009.01.06.1230) –

BP earlier: If you and Rick haven’t already seen the implication of this, allow me, belatedly, to point it out: positive feedback does not reduce error, it increases error.

RM: The way I did it, it doesn’t seem to (but given my programming credentials I’m not saying this with a lot of confidence). But here’s what I did. I changed the controlled variable in my tracking program to one that has been incorrectly dubbed a “predictive” variable; it’s just the usual controlled variable (Target-Cursor) but it includes the derivative of cursor (or target) movement. In the pursuit tracking case, the controlled variable is [I rearranged the terms a little]:

P = Target - Cursor + Kd * (Target - Targetp)

where Targetp is the target position 1/60 of a second earlier. In the compensatory tracking case the controlled variable is:

P = Target - Cursor + Kd * (Target - Cursorp)

BP: It’s better to say the controlled variable is Cursor - Target, since
the effect of a positive mouse movement on cursor position is positive
and the error is r - p, which inverts the effect of the cursor on the
error signal. We want only one negative sign (or an odd number of them)
in the loop.

This isn’t the same as Martin’s model. In Fig. 2 of his paper, Martin has
substituted “Target” for what we would call Cursor, and
“Target position” for what we would call P. That makes his
model into compensatory tracking with the actual target position (as we
would name it) constant at zero (he then calls it a “marker”
without mentioning what it marks). Reducing his model to canonical form
we then have

P = (Target_Position) - d(Target_Position)/dt

e = R - P

and so on.
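As a minimal sketch of what this canonical loop computes (not code from either model; the integrating output, the one-sample backward difference, and all parameter values are illustrative assumptions), one compensatory run looks like this:

#include <stdio.h>

int main(void) {
    double dt = 1.0 / 60.0;            /* sample period, 1/60 s as above             */
    double k  = 8.0;                   /* output gain (illustrative)                 */
    double kd = 0.05;                  /* weight on the derivative (illustrative)    */
    double r  = 0.0;                   /* reference R: keep the cursor on the marker */
    double dist = 5.0;                 /* a constant disturbance on the cursor       */
    double out = 0.0, cursor = 0.0, cursor_prev = 0.0;

    /* "cursor" here is what the canonical form calls Target_Position:
       the controlled element's position relative to the stationary marker. */
    for (int i = 0; i < 600; ++i) {                    /* 10 simulated seconds       */
        double dxdt = (cursor - cursor_prev) / dt;     /* d(Target_Position)/dt      */
        double p = cursor - kd * dxdt;                 /* P = Target_Position - derivative term */
        double e = r - p;                              /* e = R - P                  */
        out += k * e * dt;                             /* integrating output         */
        cursor_prev = cursor;
        cursor = out + dist;                           /* environment: output plus disturbance */
    }
    printf("cursor after 10 s: %f (reference %f)\n", cursor, r);
    return 0;
}

With kd set to 0 this is an ordinary compensatory loop; with kd > 0 the extra term feeds the cursor's own velocity back into the perception, which is where the positive velocity feedback discussed later in the thread comes from.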

If you copied your equations correctly, I can’t figure out what you’re
proposing. In the pursuit case, you say (again rearranging a
little)

P = (Target - Cursor) + Kd*(Target - Targetp)

So you’re adding only the derivative of the target, not the cursor.
That’s not the derivative of the separation between target and cursor. I
don’t see why the person perceives the derivative of target position but
not the derivative of cursor position, especially when the controlled
variable is the separation of cursor and target, Cursor -
Target.

The compensatory case is also messy: the derivative is the target
position minus the previous value of the cursor, which makes no sense to
me at all.

The target value is zero, so it reduces to

P = 0 + Kd * (0 - Cursorp) - Cursor or

P = -Kd*Cursorp - Cursor

which is nonsense.

I was thinking for a while that the main difference between your model
and Martin’s was that you gave the derivative the opposite sign. But now
I have no idea what you did. How about checking back to the original
program and making sure what the equations are? I suspect that you
miscopied something and it escaped your proofreading.

Best,

Bill P.

[From Rick Marken (2010.01.06.1835)]

Bill Powers (2010.01.06.1506 MST)--

This isn't the same as Martin's model. In Fig. 2 of his paper, Martin has
substituted "Target" for what we would call Cursor, and "Target position"
for what we would call P. That makes his model into compensatory tracking
with the actual target position (as we would name it) constant at zero (he
then calls it a "marker" without mentioning what it marks). Reducing his
model to canonical form we then have

P = (Target_Position) - d(Target_Position)/dt

e = R - P

So Target_Position is the instantaneous difference between target and
cursor? If so, then I think I can implement this correctly in my
spreadsheet.

If you copied your equations correctly, I can't figure out what you're
proposing.

I didn't mean to be proposing anything. I was trying to copy what I
thought I said in that 1995 post that Martin reveres;-) I will change
the equations to make them consistent with your proposal above.

The compensatory case is also messy: the derivative is the target position
minus the previous value of the cursor, which makes no sense to me at all.
The target value is zero, so it reduces to

P = 0 + Kd * (0 - Cursorp) - Cursor or

P = -Kd*Cursorp - Cursor

which is nonsense.

I agree. But it works;-)

I was thinking for a while that the main difference between your model and
Martin's was that you gave the derivative the opposite sign. But now I have
no idea what you did. How about checking back to the original program and
making sure what the equations are? I suspect that you miscopied something
and it escaped your proofreading.

The equations are what I actually used, nonsensical or not. And when
Kd>0 the performance of the model did improve. But I'm just interested
in testing the model Martin used so I will change the equations for P
to the one you specify above.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[Martin Taylor 2010.01.06.23.19]

[From Bill Powers (2010.01.06.0100 MSTG)]

Martin Taylor 2010.01.06.00.16 --

I feel a lot of empathy for your position, Martin, and I don't want to be cruel about this.

Thanks, but I don't feel nearly as disparaging about the results of the study as you do. Yes, there were problems, and yes it would be nice to have the raw data and reanalyze. But some of your criticisms really don't apply with the force you give them. The problems you correctly describe for the most part enhance the import of the results rather than detract from their value.

1. Training: It would have been better had the subject been able to get more training. We knew that before we started, and I agitated to get it. But the actual tasks were easily understood by all the subjects, after at most half a run. Yes, more training would have resulted in better tracking, but since the experiment was expected to result in a deterioration of tracking over time, which would oppose the improvement to be gained from the continuing learning across sessions, the result would be a reduction in any effects that might have been observed.

2. Consistency of working conditions: Inconsistency works against getting meaningful results, so any that do show up (particularly the clear effect of adding the predictor to the reference signal when people are sleepy, undrugged) are more solid than they might have been if they were just tiny effects that showed up against a clean background.

3...

When I saw the results that started coming in, I knew that our ambitions were doomed. The "microsleep" phenomenon finished it off for me. The integrators in the models went right on integrating while the subject held the mouse motionless or moved it randomly and the error signal zoomed upward, and there was no way to just cut out the bad part and stitch the before and after parts together and still get a good match of model to data, at least not a match that meant anything.

That's a major point of disagreement. You can model the microsleeps in various ways. At this remove I can't remember which way I actually used. I said yesterday (?) that the microsleep periods were cut out of the modelling, along with a guard period to allow the leaky integrators to catch up, but on thinking about it, I might just have set the model gain to zero for those periods. I don't remember and I don't think there's an obvious way to find out, since I doubt those programs are recoverable from any known archive.

Either way, yes the microsleeps do affect the model fits, but as with the inconsistencies of working conditions you reference, the effect is to make it harder to discover effects that are really there, making more important any effects that do nevertheless stand out.

4...

I think we're now stumbled onto the last fatal flaw, the fact that the derivative you added and called "prediction" actually introduced positive feedback into the loop.

Not into the position control loop. The position control loop is still negative feedback. The positive feedback is in the velocity, and of course if you put too much gain into the prediction loop the velocity will go into oscillation. I haven't analyzed or simulated the effect in a linear loop, but subjectively, if you exaggerate the velocity you observe, you then have to undershoot to get the position correct again, and then again overshoot, and that could certainly lead to runaway. In the raw (human) tracks, one quite often sees this kind of running ahead and then waiting and then running ahead again, so it's possible that this kind of positive feedback occurs, but not with enough gain to cause the runaway. Anyway, what we do know is that by including the derivative predictor, the fit to human data was improved, not necessarily when people were normally wakeful, but clearly so when they were sleepy -- definitely in the placebo condition, and slightly but consistently in the two drug conditions as well.

If the analysis says the fit shouldn't improve by adding in the predictor, then the analysis should be rethought.

I can now almost see what happened. The clue is the fact that you had to divide the multiplier for the derivative, z, by the gain of the output integrator, k. If you didn't do that, any loop gain greater than 1 would produce instant oscillations or runaway for z greater than 1.

Yes, it would. In practice it didn't, either for the human or in the model that tracked human behaviour better by incorporating a positive z than by setting it to zero.

I think you are mixing up the analyst's view with the internal control system view in your comment. Whether the parameter is called z or z/k is entirely irrelevant to the model. Either way, the amount of derivative that's added to the reference signal in the best fit model is the same. I've forgotten why I chose z/k. It was probably because I anticipated (or had observed) that the gain would be influenced by sleep loss, and imagined that the same effect would apply to the predictor part of the loop, so dividing the best fit predictor parameter by "k" would decorrelate at least that part of the sleepy effect. Whether it did or not is irrelevant to the result that 17 of the 18 best-fit values of z were positive when the subjects were sleepy, and the other one was zero (and that one was the first one after a sleepless night), whereas positive and negative values were more or less balanced before the first sleepless night (suggesting that prediction was not much used, if at all, when the subjects were not sleepy -- or perhaps not before they were well practiced, since the z values were all positive even after the subjects had their 14 hours recovery sleep).
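To make that point concrete (an illustrative sketch, not Martin's code; the function names are invented): with a pure-integrator output h += k*e, adding (z/k)*deltap to the reference is numerically the same as adding z*deltap directly at the output, which is the form that actually appears in the C code posted later in the thread (h += k*e + z*deltap). Renaming the fitted parameter from z to z/k therefore only relabels the same family of models.

/* Two parameterizations of the same update; names are illustrative. */
double step_pred_in_reference(double h, double k, double z,
                              double ref, double p, double deltap)
{
    double e = (ref + (z / k) * deltap) - p;   /* prediction added to the reference */
    return h + k * e;                          /* = h + k*(ref - p) + z*deltap      */
}

double step_pred_at_output(double h, double k, double z,
                           double ref, double p, double deltap)
{
    double e = ref - p;                        /* plain position error              */
    return h + k * e + z * deltap;             /* prediction added at the output    */
}
/* For k != 0 the two functions return identical values for all inputs. */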

Anyway, I stand by the results of the experiment, as much because of all the problems you correctly describe as in spite of them.

Martin

[From Rick Marken (2010.01.06.2200)]

Bill Powers (2010.01.06.1506 MST) --

Reducing his model to canonical form we then have

P = (Target_Position) - d(Target_Position)/dt

e = R - P

and so on.

OK, I rewrote the program to be consistent with this model (I think).
The code is given below. The result of using this model is basically
what I got with my nonsense versions. Without any derivative added (Kd
set to 0 in the code below), the fit of the model to the human data
was _better_ than it was when the derivative is added (Kd>0). However,
tracking performance in both the compensatory and pursuit case
_improves_ (smaller RMS error) when the derivative is added.

Best

Rick

Here's the code:

For i = 3 To 3601

    ' Target and disturbance values for this sample (pursuit and compensatory)
    PTargetp = PTarget
    CTarget = 0
    PTarget = Cells(i, 8)
    CDist = Cells(i, 8)

    ' Save previous cursor positions, then update cursors from output (plus disturbance)
    CCursorp = Ccursor
    Pcursorp = Pcursor
    Ccursor = Coutput + CDist
    Pcursor = Poutput

    ' Controlled variable (target - cursor) and its one-sample change
    CTargetPositionp = CTargetPosition
    PTargetPositionp = PTargetPosition
    CTargetPosition = CTarget - Ccursor
    PTargetPosition = PTarget - Pcursor
    dCTargetPosition = CTargetPosition - CTargetPositionp
    dPTargetPosition = PTargetPosition - PTargetPositionp

    ' Perceptual/error signal, with Kd weighting the derivative (reference implicitly zero)
    CErr = CTargetPosition - Kd * dCTargetPosition
    PErr = PTargetPosition - Kd * dPTargetPosition

    ' Leaky-integrator output functions
    Coutput = Coutput + (Gain * CErr - Damping * Coutput) * dt
    Poutput = Poutput + (Gain * PErr - Damping * Poutput) * dt

Next i

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[From Bill Powers (2010.01.07.0935 MST)]

Rick Marken (2010.01.06.2200) --

RM: OK, I rewrote the program to be consistent with this model (I think).
The code is given below. The result of using this model is basically
what I got with my nonsense versions. Without any derivative added (Kd
set to 0 in the code below), the fit of the model to the human data
was _better_ than it was when the derivative is added (Kd>0). However,
tracking performance in both the compensatory and pursuit case
_improves_ (smaller RMS error) when the derivative is added.

BP: Very good. I'll convert your program to Delphi and check it out, but I have no reason to think the results will be any different. The most important thing you've said here is "Without any derivative added (Kd set to 0 in the code below), the fit of the model to the human data was _better_ than it was when the derivative is added (Kd>0)."

This is what I expected: the model can be made to control better than the real person does, and that's a good thing because it says that the optimization of the model is not just making it control better, but making it match real imperfect behavior better. You are truly finding a "best fit" and not just a "better fit."

Your result shows that the human being in the experiment does not use this kind of predictive control. It can also show that a subject does use it if he or she does, so it's a proper scientific finding.

If you publish this finding, I recommend showing the data for the fit so the reader can judge whether the conclusion is appropriate. We need more debunking of the old myths about control. Prediction is one of them.

Best,

Bill P.

[From Bill Powers (2010.01.07.1415 MSRT)]

Rick Marken (2010.01.06.1835) --

BP earlier: P = (Target_Position) - d(Target_Position)/dt

e = R - P

RM: So Target_Position is the instantaneous difference between target and
cursor? If so, then I think I can implement this correctly in my
spreadsheet.

BP: No; what he calls "target" in the diagram is what we call "cursor." The actual target is what he calls a "marker" as near as I can tell. The marker appears to be stationary so his model is for compensatory tracking. I assume that his language was adopted for the sake of people unfamiliar with the language we use, but it makes for horrible confusion here.

BP earlier: If you copied your equations correctly, I can't figure out what you're proposing.

RM: I didn't mean to be proposing anything. I was trying to copy what I
thought I said in that 1995 post that Martin reveres;-) I will change
the equations to make them consistent with your proposal above.

BP: No, wait, because Martin thinks he simply copied your model from 1995, and I can't see any relationship between what is shown in fig. 2 of his diagram and your model as you show it here. I'm trying to understand what happened, and checking to see if he added the derivative where you subtracted it. So far there isn't sufficient resemblance between his model and what you're describing as yours to draw any conclusions -- I'm comparing apples and ducks. Maybe Martin can supply the missing information.

RM: The equations are what I actually used, nonsensical or not. And when
Kd>0 the performance of the model did improve.

BP: Improving the performance of the model wasn't the point. It's easy to improve the performance of the model. What we want is to improve the match of model behavior to real behavior. When Kd>0, you said the match got worse, not better, so this shows real people don't use the derivative. But Martin seems to think that the improvement in the match did occur, largely on the basis of what he understands you to have said, and therefore he thinks people do use the derivative. He didn't present any data of his own showing that the match was improved. So at the moment I don't know what anybody is talking about.

If you want to try to replicate Martin's model, at least for normal conditions, that's a good idea. I may try it, too, though I don't know when the opportunity will arise.

Best,

Bill P.

[Martin Taylor 2010.01.07.17.24]

[From Rick Marken (2010.01.06.2200)]

   Without any derivative added (Kd set to 0 in the code below), the fit of the model to the human data was _better_ than it was when the derivative is added (Kd>0).

That implies that when you optimized Kd in the fit, the optimum was always zero or less.

Two questions:

1. How did you set the "sluggishness"? When you originally introduced this idea, you said that adding the predictive component led to an improvement only if the control was sluggish. In my sleep-loss study, as the figure I presented showed, the fit to the human data was better with prediction only after a night's sleep loss, which I assumed would have an effect similar to your "sluggishness".

2. How did you optimize the model fit? One would expect the optimum value of Kd to fluctuate around zero for different runs if the subject was not predicting (as it did for my non-sleepy subjects). It would be very strange if the optimum fit were exactly zero in every run. If, as you say, the fit to human data was typically better when Kd = 0 than when Kd > 0, was it better yet when Kd < 0? And was that difference consistent across runs? If so, either we have a theoretical problem or we have a software bug. What would be nice to see would be a histogram of the optimum Kd over several runs, as a function of "sluggishness". I would expect such a histogram to be centred near zero for an alert subject with snappy tracking feedback.

The interesting question for me isn't whether prediction is used by people, but under what conditions it is used, because we know of at least one condition when it is (sleepy subjects), and one (or more) when it isn't (alert subjects doing the same tracking tasks).

You originally (1995) gave "sluggishness" as a condition in which a predictive control model gave improved tracking. If that is generally true (and you say it still holds in your new experiment), it would be odd indeed if evolution had not found a way to allow predators and/or prey (such as humans) to take advantage of this opportunity to improve their control. We would be searching for plausible reasons why this strange lacuna existed in the biological repertoire, rather than searching for demonstrations that active organisms lack such a useful ability.

I think it is great that you are doing this kind of experiment, and hope to see the general results after you have been able to run a few subjects under a range of conditions.

Martin

[From Rick Marken (2010.01.07.1550)]

Bill Powers (2010.01.07.1415 MSRT)
So far there isn't sufficient resemblance between his model
and what you're describing as yours to draw any conclusions -- I'm comparing
apples and ducks. Maybe Martin can supply the missing information.

Yes, I think Martin will have to help out with this.

What we want is to improve the match
of model behavior to real behavior. When Kd>0, you said the match got worse,
not better, so this shows real people don't use the derivative. But Martin
seems to think that the improvement in the match did occur, largely on the
basis of what he understands you to have said, and therefore he thinks
people do use the derivative. He didn't present any data of his own showing
that the match was improved. So at the moment I don't know what anybody is
talking about.

In 1995, I think the only "improvement" I was talking about was
improvement in the performance of the model. Martin will have to tell
us what model he used that gave him an improvement in prediction of
human behavior. What I have found so far is what you say: when Kd>0,
putting the derivative into controlled perception, the match of model
to actual behavior gets worse, not better, and I agree that this shows
that real people don't seem to use the derivative.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[From Rick Marken (2010.01.07.1600)]

Martin Taylor (2010.01.07.17.24) --

1. How did you set the "sluggishness"?

Using the damping parameter; the smaller damping, the more "sluggish".

2. How did you optimize the model fit?

Manually. I first fitted the model, with Kd=0, by finding values of
Gain and Damping that gave the smallest RMS error between model and
subject. Then I increased Kd. As soon as Kd>0 the fit of the model to
the actual behavior worsened (RMS deviation of model from actual
behavior increased).

One would expect the optimum value of
Kd to fluctuate around zero for different runs if the subject was not
predicting (as it did for my non-sleepy subjects).

Nope. Any increase in Kd above 0 worsened the fit of the model for
all the subject runs I've looked at. But I'll look at other runs and
see if it holds.

It would be very strange if the optimum fit were exactly zero in every run.

It's not strange at all if the subjects are not including the
derivative in their perception of the controlled variable.

If, as you say, the fit
to human data was typically better when Kd = 0 than when Kd > 0, was it
better yet when Kd < 0?

When Kd<0 the model controls more poorly (it controlled better when
Kd>0, remember) and the match of model to human is also worse.

And was that difference consistent across runs? If
so, either we have a theoretical problem or we have a software bug. What
would be nice to see would be a histogram of the optimum Kd over several
runs, as a function of "sluggishness". I would expect such a histogram to be
centred near zero for an alert subject with snappy tracking feedback.

Actually, what would be nice would be to see the actual computer code
you used for your model, the one where the fit to human data improved
with an increase in the weight (Kd) given to the derivative.

The interesting question for me isn't whether prediction is used by people,
but under what conditions it is used, because we know of at least one
condition when it is (sleepy subjects), and one (or more) when it isn't
(alert subjects doing the same tracking tasks).

We don't know that until we see the code for the "predictive" model
that resulted in a better fit to the human data when prediction is
included. The model I'm using leads me to the conclusion that
"prediction" is never used in tracking tasks like the ones used in
your "sleep" study.

I think it is great that you are doing this kind of experiment, and hope to
see the general results after you have been able to run a few subjects under
a range of conditions.

Before I waste too much more time on it I think it's important to make
sure that I am using the same model as you were. Because right now the
model I'm using says that people don't use prediction, ever. In my
version of the model there is no variation in the values of Kd that
give a best fit to the human data. Increasing Kd above zero (including
the "predictive" component of the controlled variable) always makes
the prediction of human behavior worse.

I think you really have to find the code for the model that produced
the improvement in fit to the human data that you speak of.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[Martin Taylor 2010.01.07.

[From Bill Powers (2010.01.07.1415 MSRT)]

Rick Marken (2010.01.06.1835) –

BP earlier: P = (Target_Position) - d(Target_Position)/dt

e = R - P

RM: So Target_Position is the instantaneous difference between target and
cursor? If so, then I think I can implement this correctly in my
spreadsheet.

BP: No; what he calls “target” in the diagram is what we call “cursor.”
The actual target is what he calls a “marker” as near as I can tell.
The marker appears to be stationary so his model is for compensatory
tracking. I assume that his language was adopted for the sake of people
unfamiliar with the language we use, but it makes for horrible
confusion here.

What is called “Target” in my diagram is whatever the subject is trying
to control, whether it be in a pursuit or a compensatory tracking task.
The velocity is the derivative of that perceptual variable. The
prediction parameter z affects how far into the future the model
projects a straight-line motion in resetting the reference value.
Numerically it’s the same as subtracting the scaled derivative from the
perceptual signal, but conceptually, it isn’t, since adding it to the
reference implies that the perception remains veridical, and what
changes is where the subject wants to put the cursor in anticipation of
where the perception seems likely to go.
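The numerical equivalence Martin describes is easy to check (illustrative sketch only; the names are invented):

/* Same number, two readings: the prediction added to the reference, or the
   scaled derivative subtracted from the perception. */
double err_prediction_in_reference(double r, double p, double z, double dpdt)
{
    return (r + z * dpdt) - p;     /* reference moved ahead of the perception */
}

double err_prediction_in_perception(double r, double p, double z, double dpdt)
{
    return r - (p - z * dpdt);     /* derivative folded into the perception   */
}
/* The two functions return the same value for all inputs; only the reading
   (what the subject wants vs. what the subject sees) differs. */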

BP: No, wait, because Martin thinks he simply copied your model from
1995, and I can’t see any relationship between what is shown in fig. 2
of his diagram and your model as you show it here. I’m trying to
understand what happened, and checking to see if he added the
derivative where you subtracted it. So far there isn’t sufficient
resemblance between his model and what you’re describing as yours to
draw any conclusions – I’m comparing apples and ducks. Maybe Martin
can supply the missing information.

Here are the two diagrams, copied from my Xmas message [Martin Taylor
2009.12.27.11.25]:

Rick (1995) (sorry about the formatting – that’s what cut-and-paste
did, for some reason).

Modafinil_Model_small1.jpg

···

[Rick's 1995 ASCII diagram was garbled by the cut-and-paste. The legible fragments show a box |ct+(ct-pt)| feeding the reference r, a comparator |C|, a perceptual signal p, a "System" row of boxes |f||i||o|, and, in the Environment, the symbols m, t, c (with an arrow feeding into c), and d.]

Martin (1995)

Rick took his velocity from a separate observation, whereas I took it
from the perception being controlled. At the time, I thought these were
numerically the same. Anyway, if you want to test his model, use his
circuitry. If you want to test mine, use my circuitry. They ought to be
the same, but I could be wrong in that, as I may have misinterpreted
his model. It’s hard to interpret using only his diagram, since the
symbols are not explained, but I think the accompanying text supports
my interpretation (I copied that in my Xmas message, too).

RM: The equations are what I actually used, nonsensical or not. And when
Kd>0 the performance of the model did improve.

BP: Improving the performance of the model wasn’t the point. It’s easy
to improve the performance of the model. What we want is to improve the
match of model behavior to real behavior.

Quite so.

When Kd>0, you said the match got worse, not better, so
this shows real people don’t use the derivative.

Under the conditions of Rick’s test. As I have pointed out previously,
my own results agree with his, but not with your interpretation unless
you add “under those conditions”.

But Martin seems to think that the improvement in the
match did occur, largely on the basis of what he understands you to
have said, and therefore he thinks people do use the derivative. He
didn’t present any data of his own showing that the match was improved.
So at the moment I don’t know what anybody is talking about.

And I don’t know what you are talking about. I don’t know where you got
“Martin seems to think that the improvement in the match did occur,
largely on the basis of what he understands you to have said”, and in
light of the messages of the last few days I don’t know how it is
possible for you to say “He didn’t present any data of his own showing
that the match was improved”.

What I showed was that in my study there was no evidence that the match
was improved by adding in the prediction pathway when the subjects were
normally awake, but there was evidence that the match was improved
after the end of the first sleepless night, and more so after the end
of the second sleepless night. Up to this point, my results and Rick’s
are in agreement, since Rick presumably used himself as a subject
without depriving himself of a night’s sleep (and so far as he has told
us to date, without making the feedback “sluggish”, which was the
condition he said was necessary for the model to show improved
tracking).

I have no data relevant to whether sluggish feedback when the subject
is normally alert leads to an improvement in the match by incorporating
the prediction pathway in the model. That’s why I encouraged Rick to
perform the required set of experiments. It’s something that would be
very good to know.

Martin

[From Rick Marken (2010.01.07.2150)]

Martin Taylor (2010.01.07.)

Rick took his velocity from a separate observation, whereas I took it
from the perception being controlled. At the time, I thought these were
numerically the same. Anyway, if you want to test his model, use his
circuitry.

I don’t know what this means. But what we want to do is test the version of the model that you used in the sleep study. Is the model I implemented just now the same as the one you implemented in that study? Here’s the relevant section of my code again so you can verify:

CCursorp = Ccursor
Pcursorp = Pcursor

Ccursor = Coutput + CDist
Pcursor = Poutput

CTargetPositionp = CTargetPosition
PTargetPositionp = PTargetPosition
CTargetPosition = CTarget - Ccursor
PTargetPosition = PTarget - Pcursor
dCTargetPosition = CTargetPosition - CTargetPositionp
dPTargetPosition = PTargetPosition - PTargetPositionp

CErr = CTargetPosition - Kd * dCTargetPosition
PErr = PTargetPosition - Kd * dPTargetPosition

Coutput = Coutput + (Gain * CErr - Damping * Coutput) * dt
Poutput = Poutput + (Gain * PErr - Damping * Poutput) * dt

If you want to test mine, use my circuitry,

Does the above code implement your circuitry?

What I showed was that in my study there was no evidence that the match
was improved by adding in the prediction pathway when the subjects were
normally awake,

And what I am finding is that the match is always made worse by adding the prediction pathway.

but there was evidence that the match was improved
after the end of the first sleepless night, and more so after the end
of the second sleepless night.

I find that hard to believe. Losing sleep should just make control worse. I’ve analyzed data where subjects did quite poorly because of a very difficult disturbance and I still find that adding the prediction pathway makes the match to the data worse.

Up to this point, my results and Rick’s
are in agreement, since Rick presumably used himself as a subject
without depriving himself of a night’s sleep

Actually, the data is from an experiment where several subjects, with no prior experience with this tracking task, were tested.

(and so far as he has told
us to date, without making the feedback “sluggish”, which was the
condition he said was necessary for the model to show improved
tracking).

I’m sure that the “sluggish feedback” applied to the fact that the performance of the MODEL (not the fit of model to human) improves when prediction is added. That is something that I still find to be true. The model controls better when prediction is added. The model just doesn’t match the subject’s behavior when prediction is added as well as it does without the prediction added.

That’s why I encouraged Rick to
perform the required set of experiments. It’s something that would be
very good to know.

What experiments?

Best

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

www.mindreadings.com

[Martin Taylor 2009.01.08.00.35]

[From Rick Marken (2010.01.07.1600)]
Martin Taylor (2010.01.07.17.24) --
1. How did you set the "sluggishness"?
Using the damping parameter; the smaller damping, the more "sluggish".

That’s what you did in the model. How did you vary the “sluggishness”
for the human subject? In your 1995 work, you said that including the
derivative in the model improved tracking more when the feedback was
sluggish than when feedback was crisp. If that’s the case for the
model, it would seem natural to test whether this holds true for the
human as well, so I assume you did. Did you add a simple fixed delay
between moving the joystick/mouse and seeing the cursor move or did you
smear out the effects of the output, such as by putting a leaky
integrator into the feedback path between joystick/mouse and cursor? Or
did you do something else, such as jitter the lag between
joystick/mouse movement and cursor movement?
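For concreteness, here is one way the second of those options might look (an illustrative sketch, not code from either study; the function name and the rate parameter are invented): a leaky integrator between the handle position and the on-screen cursor, so the display lags the hand.

/* "Sluggish" feedback path: the cursor approaches (handle + disturbance)
   at a limited rate instead of jumping there in one sample. The smaller
   'rate' is, the more the display lags the subject's movement. */
double sluggish_cursor(double cursor, double handle, double dist,
                       double rate, double dt)
{
    double aim = handle + dist;                 /* where the cursor is being pushed */
    return cursor + rate * (aim - cursor) * dt; /* leaky approach to that value     */
}

A fixed or jittered transport delay, the other options mentioned, would instead be implemented with a ring buffer, like the hdelay[] buffer in the C code that appears later in the thread.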

2. How did you optimize the model fit?
Manually. I first fitted the model, with Kd=0, by finding values of
Gain and Damping that gave the smallest RMS error between model and
subject. Then I increased Kd. As soon as Kd>0 the fit of the model to
the actual behavior worsened (RMS deviation of model from actual
behavior increased).

That isn’t a proper 3-parameter fit, is it? If you have optimized two
parameters with the third fixed, varying the third is almost guaranteed
to give worse results, except in the special case where the effects of
the third are orthogonal to the effects of the other two. To see why
this is so, consider a 2-D example:

optimizing.jpg

The figure shows what would happen using your method if you had a
system that had an optimum value at the star, and you chose to
optimize Y with a particular fixed value of X and then varied X to get
the 2-D optimum. When you optimize Y for a given value of X, you find a
point at the bottom of the valley of the ellipses. When you then vary
X, you find that each direction makes matters worse, so you conclude
that the circle represents the true optimum. The only time this result
might be correct would be when X and Y had independent effects (or when
you had a very lucky initial choice of X). That is almost certainly not
the situation in this case. So your result might be right, or it might
be an artifact of your optimization method.

There are various ways to do the N-parameter fit, e-coli being a pretty
good one. Hill-climbing probably works in this case, though it’s not
terribly reliable when the landscape is lumpy, as does happen with
human tracking. In my later sleep study, using a 5-parameter two-level
model that I described in the Toronto CSG meeting (rather like what you
suggested you wanted to try), I used a Genetic Algorithm approach to
get a pretty effective solution in what turned out to be a rather
irregular fitting landscape.
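For readers who haven't met it, here is a rough sketch of the e-coli idea applied to a simultaneous fit of Gain, Damping and Kd (illustrative only: the toy rms_fit() bowl stands in for a real run of the model against a recorded human track, and the step size and iteration count are arbitrary). The rule is to keep stepping in the same random direction while the fit improves, and to tumble to a new random direction when it stops improving.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Toy stand-in for the real criterion, with its minimum at
   (Gain, Damping, Kd) = (5.0, 1.0, 0.0). In real use this would run the
   tracking model over a recorded run and return the RMS model-human
   deviation for those parameter values. */
static double rms_fit(double gain, double damping, double kd)
{
    return sqrt((gain - 5.0) * (gain - 5.0) +
                (damping - 1.0) * (damping - 1.0) +
                kd * kd);
}

/* Pick a new random direction ("tumble"). A real implementation would also
   shrink the step as the fit stops improving. */
static void tumble(double step[3])
{
    for (int i = 0; i < 3; ++i)
        step[i] = 0.05 * ((double)rand() / RAND_MAX - 0.5);
}

int main(void)
{
    double p[3] = { 1.0, 0.1, 0.3 };   /* starting Gain, Damping, Kd */
    double step[3];
    double best = rms_fit(p[0], p[1], p[2]);

    tumble(step);
    for (int n = 0; n < 100000; ++n) {
        double t[3] = { p[0] + step[0], p[1] + step[1], p[2] + step[2] };
        double err = rms_fit(t[0], t[1], t[2]);
        if (err < best) {              /* improvement: keep moving the same way */
            best = err;
            p[0] = t[0]; p[1] = t[1]; p[2] = t[2];
        } else {                       /* no improvement: pick a new direction  */
            tumble(step);
        }
    }
    printf("best fit: Gain %.3f  Damping %.3f  Kd %.3f  (RMS %.4f)\n",
           p[0], p[1], p[2], best);
    return 0;
}

The important feature, for the point above, is that all three parameters move together, so the search is not trapped in the valley that a two-then-one fit settles into.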


It would be very strange if the optimum fit were exactly zero in every run.
It's not strange at all if the subjects are not including the
derivative in their perception of the controlled variable.

Yes it is very strange, to the extent of being statistically almost
impossible, regardless of whether they actually are or are not
including the derivative in their reference value for the controlled
variable.


If, as you say,  the fit
to human data was typically better when Kd = 0 than when Kd > 0, was it
better yet when Kd < 0?
When Kd<0 the model controls more poorly (it controlled better when
Kd>0, remember) and the match of model to human is also worse.

That does seem to provide more evidence that your result is an artifact
of the optimization method. Almost confirmation, actually. The artifact
is almost the only way you could consistently get the fit to be worse
with Kd both greater and less than zero.

The model I'm using leads me to the conclusion that
"prediction" is never used in tracking tasks like the ones used in
your "sleep" study.

How many nights of sleep deprivation had your subject(s) experienced
when you took your data? As I showed you, mine didn’t seem to use
prediction (at least not to a detectable extent) until after they had a
whole night without sleep. Unless your subject(s) had lost a night’s
sleep, your result (if it isn’t an artifact of the optimization
technique) agrees with mine.


Before I waste too much more time on it I think it's important to make
sure that I am using the same model as you were. Because right now the
model I'm using says that people don't use prediction, ever.

What a breathtaking extrapolation! I think it belongs in the Guinness
Book of Records as the greatest extrapolation from the least data ever
proposed in a serious scientific discussion. It’s a bit like looking
for two seconds at an intersection in New York City and concluding that
all cars that are pointing north or south in New Delhi or elsewhere
never move, whereas all cars pointing east or west move quite fast.

You really are grabbing at pretty weak straws to keep the idea afloat
that humans never use the opportunities available to them to improve
their controlling, opportunities you yourself have shown to exist.

Martin

[From Rick Marken (2010.01.08.0930)]

Martin Taylor (2009.01.08.00.35)–

2. How did you optimize the model fit?
Manually. I first fitted the model, with Kd=0, by finding values of
Gain and Damping that gave the smallest RMS error between model and
subject. Then I increased Kd. As soon as Kd>0 the fit of the model to
the actual behavior worsened (RMS deviation of model from actual
behavior increased).

That isn’t a proper 3-parameter fit, is it? If you have optimized two
parameters with the third fixed, varying the third is almost guaranteed
to give worse results, except in the special case where the effects of
the third are orthogonal to the effects of the other two.

This is a good point. I have just tried fitting the model with all three parameters – Gain, Damping and Kd – varying simultaneously. The result is a 1% improvement in the fit when Kd is included in the model as opposed to when Kd is set to 0 and only Gain and Damping are varied (the way I did it before, so that the addition of Kd always makes things worse).

I’m actually not sure what to make of this. The model sans inclusion of the derivative in the input accounts for just about as much variance in the human behavior as the model can account for. My guess is that the 1% reduction in RMS deviation (between model and human) that you get when you vary 3 rather than just 2 parameters to fit the model results from the fact that you now have more predictor variables (as when you add a predictor variable in multiple regression). Given the extra “degree of freedom” it seems to me that you are bound to increase the amount of variance accounted for by the prediction equation. But if the increase in variance accounted for by the increase in number of simultaneous predictors (from 2 to 3, in this case) is very small (as it is in this case, just an increase of 1%) the added predictor would be discounted as being insignificant.

So the results of my analysis still suggest to me that humans do not include the derivative in the controlled perception in a tracking task.

Best

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

www.mindreadings.com

[Martin Taylor 2010.01.08.15.01]

[From Rick Marken (2010.01.07.2150)]

Martin Taylor (2010.01.07.)

Rick took his velocity from a separate observation, whereas I took it
from the perception being controlled. At the time, I thought these were
numerically the same. Anyway, if you want to test his model, use his
circuitry.

I don’t know what this means. But what we want to do is test the
version of the model that you used in the sleep study. Is the model I
implemented just now the same as the one you implemented in that study?
Here’s the relevant section of my code again so you can verify:

I can’t tell from your code. A few comments about where the variables
fit on the diagram (yours or mine) might help.

Anyway, my model is supposed to be what my diagram describes, so you
are the one best able to tell whether your code fits it. It is, of
course, always possible that my code failed to implement my model.

What I showed was that in my
study there was no evidence that the match
was improved by adding in the prediction pathway when the subjects were
normally awake,

And what I am finding is that the match is always made worse by adding
the prediction pathway.

Is this still true when you do the optimization properly?

but there was evidence that
the match was improved
after the end of the first sleepless night, and more so after the end
of the second sleepless night.

I find that hard to believe. Losing sleep should just make control
worse. I’ve analyzed data where subjects did quite poorly because of a
very difficult disturbance and I still find that adding the prediction
pathway makes the match to the data worse.

I suppose that “hard to believe” puts you into Bruce Gregory’s “belief
trumps facts” camp. You’ve seen the results, but state that they aren’t
true because they don’t fit with your beliefs and with results you
found under quite different experimental conditions, even though my
results agree with yours when the experimental conditions were similar.

I guess you don’t understand the conditions in which one would expect
prediction to be useful. There are different reasons why control might
be worse in one condition than in another. Two of those reasons are
variations in disturbance bandwidth and variations in loop delay and/or
bandwidth (I guess that’s three reasons, really). My hypothesis for the
sleepy people was just that their internal processes became slower, but
that in itself doesn’t explain the results, since the modelled lag (d)
didn’t change very much.

“Making the disturbance more difficult” usually means increasing its
bandwidth, which, in turn, means prediction of its future position
becomes worse more rapidly. So, for any fixed look-ahead time, the
value of prediction should be less the more difficult the disturbance.

On the other hand, if control is more difficult because of time lag
between output and effect on the controlled variable, the more
difficult the control, the more valuable prediction should be expected
to be, so long as the lag remains short compared to the inverse of the
bandwidth of the disturbance (if it’s too long, the control loop goes
into oscillation or runaway, but even if that doesn’t happen, it’s
impossible to control at all if the loop delay is longer than the time
to zero autocorrelation of the disturbance, and difficult if the
autocorrelation is significantly less than 1.0 at a lag equal to the
loop delay).
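The feasibility test implied by that last clause is simple to compute from a recorded disturbance (illustrative sketch; the function, the synthetic disturbance, and the 200 ms loop delay are all assumptions, not material from the studies):

#include <stdio.h>
#include <math.h>

/* Normalized autocorrelation of a disturbance record at a given lag (in
   samples). If this has fallen to near zero at a lag equal to the loop
   delay, the disturbance is effectively unpredictable over that delay. */
double autocorr_at_lag(const double *d, int n, int lag)
{
    double mean = 0.0, var = 0.0, cov = 0.0;
    for (int i = 0; i < n; ++i) mean += d[i];
    mean /= n;
    for (int i = 0; i < n; ++i) var += (d[i] - mean) * (d[i] - mean);
    for (int i = 0; i + lag < n; ++i)
        cov += (d[i] - mean) * (d[i + lag] - mean);
    return (var > 0.0) ? cov / var : 0.0;
}

int main(void)
{
    double d[600];                          /* 10 s of a slow disturbance at 60 Hz */
    for (int i = 0; i < 600; ++i)
        d[i] = 50.0 * sin(6.2831853 * 0.2 * i / 60.0);   /* 0.2 Hz component */

    int loop_delay = 12;                    /* e.g. a 200 ms loop delay, in samples */
    printf("autocorrelation at the loop delay: %.3f\n",
           autocorr_at_lag(d, 600, loop_delay));
    return 0;
}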

Up to this point, my results
and Rick’s
are in agreement, since Rick presumably used himself as a subject
without depriving himself of a night’s sleep

Actually, the data is from an experiment where several subjects, with
no prior experience with this tracking task, were tested.

And had they been deprived of a night’s sleep?

(and so far as he has told
us to date, without making the feedback “sluggish”, which was the
condition he said was necessary for the model to show improved
tracking).

I’m sure that the “sluggish feedback” applied to the fact that the
performance of the MODEL (not the fit of model to human) improves when
prediction is added.

Yes, that’s what you said.

That is something that I still find to be true. The model
controls better when prediction is added. The model just doesn’t match
the subject’s behavior when prediction is added as well as it does
without the prediction added.

Again I ask whether you now have done the optimization properly, and in
situations where there is an appreciable delay (sluggishness of one
kind) between the HUMAN’S movement of the joystick/mouse and changes in
the on-screen display, or where the human has difficulties in actually
perceiving the environmental variable that is influenced by the
joystick/mouse movement to affect the controlled perception (e.g. a
noisy or very low contrast display).

That’s why I encouraged Rick
to
perform the required set of experiments. It’s something that would be
very good to know.

What experiments?

The ones you said you wanted to do to test your proposed two-level
model against the model represented in my diagram (a simple control
loop with a linear addition of scaled velocity to the position
reference) under conditions that varied in sluggishness.

Martin

PS. There is yet hope that I may some day be able to provide you with
the raw data from the sleep study. Not finding anything related on any
of my currently used computers, I fired up a 10-year-old machine not
yet scrapped, and on its disk was a backup of an earlier computer’s
disks, which included a backup of a yet earlier machine, and that
included a folder for “Sleep Studies”, which includes both the
modafinil and the later “sleepy teams” study. I’ve imported that folder
to my current main machine.

The problem is that many of the files seem to have been compressed by
some compressor that I don’t now know. They show up in the Mac Finder
as “UNIX Executable” which they are not, I suppose because they are
non-ASCII. I tried Mac Disk Doubler, pkzip, and an all-seeing
decompressor I found by googling. None worked. All is not lost,
however, as the C code is in the clear (though the Makefiles are not),
and I think it quite likely that the disturbance and real track files
are in formats readable by the results of compiling the C code. The
problem there is that it will be a question of porting from an old
SPARC Unix system to the Mac, and that is something I’ve never tried
(nor have I ever compiled a C program on the Mac). So as I said, there
is yet hope, but no guarantees.

[From Bill Powers (2010.01.09.1328 MST)]

Martin Taylor 2010.01.08.15.01 –

What I showed was that in my study there was no evidence that the
match was improved by adding in the prediction pathway when the subjects
were normally awake,

And what I am finding is that the match is always made worse by adding
the prediction pathway.

BP: I suggest again that you guys read each other’s posts more carefully
and study each other’s models until you understand them. You are not
talking as if you each understand the other’s model. Or, for that matter,
your own.

Rick’s model behaves as it does because when the derivative is removed,
the reference input becomes exactly equal to the target movements. The
control system, which makes the cursor position match the target
position, then controls to make the cursor position equal to the target
position. The difference between target position and cursor position is
not explicitly sensed and compared with a reference signal; therefore it
is not a controlled variable.

When the coefficient of the derivative term is nonzero, the reference
signal becomes equal to the target position plus the position
extrapolated into the future by a time determined by the size of the
coefficient of the derivative term. Apparently the scaling is such that
extrapolating by any amount is extrapolating too much. Performance might
improve, but the match with real behavior gets worse.
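Arithmetically, the extrapolation works like this (illustrative only; the function name is invented): if the target is moving in a straight line, adding Kd times the last one-sample change to the current target position lands on where the target will be Kd samples later.

/* With straight-line motion, target[n] = target[n-1] + v, so
   target[n] + Kd*(target[n] - target[n-1]) = target[n] + Kd*v = target[n+Kd]:
   the reference becomes the target position Kd samples (Kd/60 s at 60 Hz)
   into the future. */
double extrapolated_reference(double target_now, double target_prev, double kd)
{
    return target_now + kd * (target_now - target_prev);
}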

MT: Is this still true when you do the optimization properly?

MT: but there was evidence that the match was improved after the end
of the first sleepless night, and more so after the end of the second
sleepless night.

RM: I find that hard to believe. Losing sleep should just make control
worse. I’ve analyzed data where subjects did quite poorly because of a
very difficult disturbance and I still find that adding the prediction
pathway makes the match to the data worse.

MT: I suppose that “hard to believe” puts you into Bruce
Gregory’s “belief trumps facts” camp. You’ve seen the results,
but state that they aren’t true because they don’t fit with your beliefs
and with results you found under quite different experimental conditions,
even though my results agree with yours when the experimental conditions
were similar.

BP: Martin, What you say might seem perfectly true and reasonable to you,
but what did you think the results of saying it would be?

And Rick, you haven’t yet acknowledged that your model is different from
Martin’s (as he hasn’t, either) You can’t compare your results with his.
His model includes, in the feedback to the reference signal, the
derivative of cursor position. Yours doesn’t do that for pursuit
tracking; only for compensatory tracking in which the derivative of
target position is always zero because the target doesn’t move. Yours
assumes a zero reference signal for compensatory tracking and I guess
Martin’s does, too. Yours includes a disturbance of the cursor during
pursuit tracking, so there are two disturbances. Martin’s
doesn’t.

It's really pointless to be arguing about the performance of these
models when different models are involved, especially when you’re arguing
as if they’re the same. You should both try both models.

I guess you don’t understand the
conditions in which one would expect prediction to be useful. There are
different reasons why control might be worse in one condition than in
another. Two of those reasons are variations in disturbance bandwidth and
variations in loop delay and/or bandwidth (I guess that’s three reasons,
really). My hypothesis for the sleepy people was just that their internal
processes became slower, but that in itself doesn’t explain the results,
since the modelled lag (d) didn’t change very much.

“Making the disturbance more difficult” usually means
increasing its bandwidth, which, in turn, means prediction of its future
position becomes worse more rapidly. So, for any fixed look-ahead time,
the value of prediction should be less the more difficult the
disturbance.

On the other hand, if control is more difficult because of time lag
between output and effect on the controlled variable, the more difficult
the control, the more valuable prediction should be expected to be, so
long as the lag remains short compared to the inverse of the bandwidth of
the disturbance (if it’s too long, the control loop goes into oscillation
or runaway, but even if that doesn’t happen, it’s impossible to control
at all if the loop delay is longer than the time to zero autocorrelation
of the disturbance, and difficult if the autocorrelation is significantly
less than 1.0 at a lag equal to the loop delay).

Nicely argued. Now prove it.

Best,

Bill P.

[From Rick Marken (2010.01.09.1530)]

Bill Powers (2010.01.09.1328 MST)--

Rick's model behaves as it does because when the derivative is removed, the
reference input becomes exactly equal to the target movements.

I don't understand that. I think the reference in my model is always
implicitly zero. The computer code for the model's behavior is:
Output = Output + (Gain * Err - Damping * Output) * dt

The Err variable is actually the perceptual variable. Maybe that was
what was confusing.

The control
system, which makes the cursor position match the target position, then
controls to make the cursor position equal to the target position.

Ah, you must be talking about the 1995 model. Not the one I just implemented.

I'm just trying to make what I'm doing now consistent with what Martin
did in the sleep study.

And Rick, you haven't yet acknowledged that your model is different from
Martin's (as he hasn't, either) You can't compare your results with his.

I'll acknowledge whatever anyone wants. I'm not trying to sell a
model here; just trying to see what Martin did so I can reproduce his
results. Now that I have done the analysis where I varied the three
parameters simultaneously I can see why Martin might have gotten the
results he did in the sleep study. It may actually be true (if I have
something like the correct model now) that the fit to the data
obtained from the sleep deprived group would be much better with Kd
(the "predictive component") included in the prediction. If Martin
finds his old data and we can all agree that my implementation of the
model is similar to his, then we can, hopefully, see why Martin found
what he found.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[Martin Taylor 2010.01.08.16.54]

[From Rick Marken (2010.01.08.0930)]

Martin Taylor
(2009.01.08.00.35)–

2. How did you optimize the model fit?
Manually. I first fitted the model, with Kd=0, by finding values of
Gain and Damping that gave the smallest RMS error between model and
subject. Then I increased Kd. As soon as Kd>0 the fit of the model to
the actual behavior worsened (RMS deviation of model from actual
behavior increased).

That isn’t a proper 3-parameter fit, is it? If you have optimized two
parameters with the third fixed, varying the third is almost guaranteed
to give worse results, except in the special case where the effects of
the third are orthogonal to the effects of the other two.

This is a good point. I have just tried fitting the model with all
three parameters – Gain, Damping and Kd – varying simultaneously. The
result is a 1% improvement in the fit when Kd is included in the model
as opposed to when Kd is set to 0 and only Gain and Damping are varied
(the way I did it before, so that the addition of Kd always makes
things worse).

I’m actually not sure what to make of this. The model sans inclusion of
the derivative in the input accounts for just about as much variance in
the human behavior as the model can account for. My guess is that the
1% reduction in RMS deviation (between model and human) that you get
when you vary 3 rather than just 2 parameters to fit the model results
from the fact that you now have more predictor variables (as when you
add a predictor variable in multiple regression). Given the extra
“degree of freedom” it seems to me that you are bound to increase the
amount of variance accounted for by the prediction equation. But if the
increase in variance accounted for by the increase in number of
simultaneous predictors (from 2 to 3, in this case) is very small (as
it is in this case, just an increase of 1%) the added predictor would
be discounted as being insignificant.

So the results of my analysis still suggest to me that humans do not
include the derivative in the controlled perception in a tracking task.

OK. I’ll agree with you about the added predictor, so we seem to be in
agreement that awake and alert subjects don’t use much if any
prediction when there is little or no “sluggishness” in the control
loop.

When you say “The model sans inclusion of the derivative in the input
accounts for
just about as much variance in the human behavior as the model can
account for.” how much is that, on average? It makes a difference in
how you view the 1%. If the non-predictive model accounts for 80%,
there’s 20% unaccounted for, of which the predictor accounts for 5%,
but if the non-predictor model accounts for 99%, then the addition of
the predictor accounts for all that’s left, making it quite important
to consider the effect of the predictor, small though it may be.

On looking now at the 14-year-old C code for my analysis of the sleep
study, I see that the way I optimized might not have been ideal,
either. What I did was to start by computing a best value for delay d
and then for gain k (as Rick did), and use those values, with
prediction z at zero, as a starting point for further optimization. The
continued optimization was done by getting the RMS deviation between
the human and the model track with the parameter values at d, k, z, and
vary each of d, k, and z by an amount determined from earlier steps in
the analysis, doubling or halving the step in each independently
according to the PEST procedure. I then determined a point with a new
d, k, z, by adding those newly calculated increments in each
dimensions, and another new d’, k’, z’ at the same distance in the
opposite direction (because simply stepping across the minimum diagonal
could be going in the wrong direction). Whichever of the new d, k, z or
d’, k’, z’ gave the better fit with the human track was taken as the
start point for the next move. The idea, I imagine, was to work with
the assumption that the landscape was smooth, and to find a direction
that aimed most directly at the optimum. I must have thought that would
work better than e-coli, but I don’t remember ever doing a comparison
to see which gave more consistent results.

Looking at the C code, here’s how I eliminated periods of the data that
seemed likely to reflect times when the subject was not tracking
“normally”, meaning that the model was not designed to fit those
periods. There were three kinds of “not normal” periods, microsleeps,
subject getting lost (perhaps another kind of microsleep), and
“bobbles” when the human’s tracking error was unusually great:

  1. Eliminate “microsleep” periods. These were defined as periods in
    which the mouse did not move at all for at least 1 second. Data within
    a microsleep period were designated as having validity 0.

  2. Compute the real tracking error sample by sample except for the
    points already marked as not valid.

  3. Mark as invalid those places where the subjects seemed “lost”,
    defined as having either an error greater than half a screen, or having
    a derivative greater than 1000 (pixels per sample period over 4 samples)

  4. Eliminate “bobbles”, according to a suggestion by Bill P.: Eliminate
    points for which the tracking error was greater than 4 times the rms
    tracking error.

In running the model, no account was taken of these periods insofar as
the model continued running throughout, but in computing the RMS error
to determine the goodness of fit for a particular set of parameter
values, the “non-valid” samples were ignored.
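To make the four rules concrete, here is a sketch reconstructed from the description above (not Martin's actual validity-setting code: the half-screen width, the choice of the tracking error as the signal whose 4-sample derivative is tested, and the array names are all assumptions):

#include <math.h>

#define SAMPLES_PER_SEC 60
#define HALF_SCREEN     500.0   /* assumed half-screen width in pixels */

/* mouse[], target[], cursor[] are per-sample records of one run;
   valid[] receives 1 for samples used in the RMS misfit, 0 otherwise. */
void mark_validity(const double *mouse, const double *target,
                   const double *cursor, int n, int *valid)
{
    int i, j;
    for (i = 0; i < n; ++i) valid[i] = 1;

    /* 1. Microsleeps: the mouse did not move at all for at least 1 second. */
    for (i = 0; i + SAMPLES_PER_SEC <= n; ++i) {
        int moved = 0;
        for (j = i + 1; j < i + SAMPLES_PER_SEC; ++j)
            if (mouse[j] != mouse[i]) { moved = 1; break; }
        if (!moved)
            for (j = i; j < i + SAMPLES_PER_SEC; ++j) valid[j] = 0;
    }

    /* 2. Tracking error, accumulated only over samples still marked valid. */
    double sumsq = 0.0;
    int count = 0;
    for (i = 0; i < n; ++i) {
        double err = target[i] - cursor[i];

        /* 3. "Lost": error over half a screen, or error changing by more
           than 1000 pixels per sample period, measured over 4 samples. */
        if (fabs(err) > HALF_SCREEN) valid[i] = 0;
        if (i >= 4) {
            double err4 = target[i - 4] - cursor[i - 4];
            if (fabs(err - err4) / 4.0 > 1000.0) valid[i] = 0;
        }
        if (valid[i]) { sumsq += err * err; ++count; }
    }

    /* 4. "Bobbles": error more than 4 times the rms tracking error. */
    double rms = (count > 0) ? sqrt(sumsq / count) : 0.0;
    for (i = 0; i < n; ++i)
        if (fabs(target[i] - cursor[i]) > 4.0 * rms) valid[i] = 0;
}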

Here’s the C code loop for running the model, after a lot of
initialization. I’ve added a few comments to help in its interpretation.
This seems to be the latest of several versions in the folder I
recovered, so I expect it is the one that was used in the reported data.

···

============

for(j= -200; j<datasize; ++j)    /* Both human and model have a 200 sample
                                    run-in period before measures are used */
{
  if (--m <= 0) m = 30;          /* m is an index for a ring buffer used to
                                    implement delays */
  i = abs(j);

  hdelay[m] = h + distsign * (float)dist1[i];
                                 /* h is the output variable (don't ask why
                                    it's h rather than "output"); hdelay will
                                    become the perceptual value after
                                    dy + dfract samples */

  p1 = hdelay[(dy+m)%30 + 1];    /* Assuming the delay is all perceptual and
                                    not in the environmental feedback path!
                                    MMT comment 2010.01.10: This differs from
                                    the diagram, in which the delay is placed
                                    between the output and the point where the
                                    disturbance enters the loop */

  p3 = hdelay[(dy+m-1)%30 + 1];  /* Added by MMT 950929 */

  pz = p3*(1-dfract) + p1*dfract;
                                 /* MMT 950929: to centre derivative on same
                                    moment as current perception.
                                    MMT 2010.01.10: dfract is the fractional
                                    part of the floating point delay being
                                    tested on this run */

  e = ref - p;

  deltap = (pold - pz)/2.0;      /* pz is one sample time before p, pold one
                                    sample time after p.
                                    MMT 2010.01.10: This line looks like a bug,
                                    since pz is the perception at some
                                    fractional sample time. The divisor should
                                    presumably be 1+dfract, rather than 2.0.
                                    This bug should make the fits noisier than
                                    they need be, since from one model run to
                                    the next the effect could be similar to a
                                    change in z by a factor of as much as 2,
                                    which might well affect the optimization
                                    process */

  h += k*e + z*deltap;           /* pold and deltap added 950929 MMT.
                                    Here is the output function */

  pold = p; p = pz;

  if (h > 1000.0) h = 1000.0;
  if (h < -1000.0) h = -1000.0;

  if (j < datasize && j >= 0)
  {
    modelh[j] = (short)h;
    u = (float)(modelh[j] - realh1[j]);
    if (valid[i]) {              /* MMT 2010.01.10: We avoid the not-valid
                                    periods only when computing the RMS misfit.
                                    I thought I had added a guard band around
                                    each "not valid" period, but I see none in
                                    the validity-setting code. Perhaps they
                                    should be added if the old data is to be
                                    reanalyzed */
      errsum += u;
      errsq += u*u;
    }
  }
}

errsq = errsq - errsum*errsum/validsize;

return errsq;
}

============

I hope this helps in understanding just what was fitted to the data.
You will note that the integrator output function does not have a leak.
The leak rate would have been a fourth parameter in the fit. Maybe that
was a bad decision, but fitting four parameters offers more opportunity
for noisy interactions to affect the important parameters in the
optimization. With the extra compute power we have now as compared to
the mainframe I used in 1995, it should be possible to take four
parameters into account. In 1995, the fitting of three parameters to
over 400 tracks took rather a lot of computer time! I don’t know how
badly, if at all, the bug noted in the comments above would have
affected the published data, but if it did affect them, the effect
would have been to make them noisier than they should have been,
making it harder for real effects to show up above the noise.

I haven’t got to the stage of trying to figure out how to go about
compiling the code, yet. But I think I’m beginning to understand it at
least a little. I used to write a lot of C, but I haven’t looked at it
since these studies, and reading the code Bill and I wrote at that time
is a bit like trying to read French after not having looked at French
for a decade. It comes back, but not immediately. Next I have to learn
what all the compiler flags must be, and how to interface with the
graphics. It might be easier to use the C code as a guide and rewrite
the whole thing in Ruby!

Martin

[From Rick Marken (2010.01.10.0930)]

OK. I'll agree with you about the added predictor, so we seem to be in
agreement that awake and alert subjects don't use much if any prediction
when there is little or no "sluggishness" in the control loop.

When you say "The model sans inclusion of the derivative in the input
accounts for just about as much variance in the human behavior as the model
can account for." how much is that, on average?

Actually, if you look at it in Rsquared terms, the model sans predictor
picks up 99.97% of the variance in the output. With the predictor,
Rsquared increases, but only in about the 8th decimal place, ie,
without predictor R2 = .99973894 with predictor R2 = 0.99973895.

On looking now at the 14-year-old C code for my analysis of the sleep study,
I see that the way I optimized might not have been ideal, either...

Thanks for this Martin. It will take me some time to figure out your
code and reconcile it with what I did. But I'll get back to you soon.
And if you manage to find some data to check this out on -- maybe just
a non-sleepy and sleepy run from the same subject -- that would be
great.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[Martin Taylor 2010.01.10.15.40]

[From Rick Marken (2010.01.10.0930)]

When you say "The model sans inclusion of the derivative in the input
accounts for just about as much variance in the human behavior as the model
can account for." how much is that, on average?
     

Actually, if you look at it in Rsquared terms, the model sans predictor
picks up 99.97% of the variance in the output. With the predictor,
Rsquared increases, but only in about the 8th decimal place, ie,
without predictor R2 = .99973894 with predictor R2 = 0.99973895.

So what did you mean when you said it gave a 1% improvement?

Martin

···


[From Rick Marken (2010.01.10.1255)]

Martin Taylor (2010.01.10.15.40) --

So what did you mean when you said it gave a 1% improvement?

I meant in terms of RMS deviation of model from person. Without
prediction the RMS deviation from the model for one subject was 6.15.
With prediction it was 6.05, a 1.6% improvement in prediction
accuracy.

Best

Rick

···

--
Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com