Buzz, buzz, buzz

From Greg Williams (930108)

AVERY: The PostScript files are BIG. The best thing to do is send Bill Powers
money for postage for a hard copy. It would cost me a lot to send via e-mail,
and disks would probably cost as much as the paper itself.

CHUCK: PictureThis works ONLY with PostScript printers. It is possible to get
programs to emulate PostScript on dot-matrix printers, but I don't recommend
it. If you have FULL-TIME access to a LaserJet with a PostScript cartridge,
let me know.

Bill Powers (930107.0830)

But Newton DID propose a generative model. It went "Every bit of
matter in the universe attracts every other bit of matter with a
force proportional to the product of the masses and inversely
proportional to the square of their separation." This was
certainly not what was observed. The observations had to do with
behavior of planets, the moon, and thrown and dropped objects.
Newton proposed an underlying set of entities called "masses"
which had the property stated in his universal law.

I think that Newton's "model" does not postulate an underlying mechanism for
gravity (in fact, folks are still working on models to do this), but only
generalizes the observations, as Skinner's functional relations among
observables describe behavior. Skinner could not say WHY a rat should EVER
become hungry; Newton could not say WHY gravitation should be at all (or why
it should not turn off at odd times, or even why the power involved is 2,
rather than something else). Both Skinner and Newton simply had faith that the
next bit of data (rat or matter) would be like previous ones they had
described correctly (in hindsight). Skinner deferred to physiologists and
evolutionary biologists to make generative models which would go beyond his
faith; Newton deferred to later physicists.

Newton did propose that an entity he called "mass" was the important feature
of matter with respect to gravitational attraction. That entity might be
construed as "underlying" the observable phenomena, but it doesn't seem to me
to provide an underlying mechanism. To me, it seems that Newton took exactly
the same position as Skinner: that functional relationships among observables
which have provided good predictions in the past are acceptable for making
future predictions without their being explained and delimited by models of
underlying mechanisms. If Newton had been asked, "But why is there such an
entity as 'mass'?", he probably would have answered that he wasn't about to
speculate on that, and not just because of timidity -- his purposes didn't
require it. Ditto for Skinner on "hunger."

Still, there is a sense in which generalized descriptions/functional relations
can be said to be "generative": precise predictions are "generated," even
though the basis of the "models" lies entirely at the level of the phenomena
being described, with no reference to unobservables. In PCT tracking models,
an unobservable reference level for target position is hypothesized in the
tracker, built into a supposedly "underlying generative model," and used to
predict handle movements. The behaviorist can build a mathematically similar
(even identical) "model" and claim that it makes no reference to
unobservables; he/she simply proposes a function relating observable "stimuli"
to observable "responses." He/she takes the "stimuli" as cursor positions and
velocities relative to the target position, and the "responses" as handle
velocities. Both the PCT "underlying generative model" and the behaviorist's
"functional relation" include a feedback connection from handle to cursor;
if they didn't, they wouldn't reflect the experimental set-up. And both can
generate precise predictions of handle and cursor movement (in fact, identical
predictions, if their forms and parameters are the same). Better models, in
both cases, will be judged to be ones which accurately predict both handle and
cursor positions when the tracker is given a different disturbance. Both
models are subject to limits on their predictions because of "noise" in the
subject. Such "noise" could be modeled either descriptively or with an
underlying generative model; the behaviorist would choose the former. If the
PCT modeler chose the latter, then there would be a genuine difference between
the models at the mathematical level. Otherwise, they would differ in
interpretation only.

It's not worthwhile trying to make a model fit every anomaly.

There goes the PCT standard of "accept nothing less than
99.9999999... correlations."

Curb your gazelle, sir. I have advocated requiring correlations
of at least 0.95 before taking data very seriously. If there are
so many anomalies in a data run that the correlation of model
versus real behavior drops below that level, the anomalies must
be investigated or the model must be improved. You shouldn't put
quotation marks around things I never said.

You say I'm a (the?) CSG gadfly. I'm also the CSG archivist, and here are some
quotes from previous posts on the net to show what you (and Rick) actually
have said. At one point, you say that correlations of "0.99 upward" are needed
for a "true science." And Rick says that .99+ is a "reasonable" goal. My point
is that you have not reached that goal in predicting cursor position. Does
that mean that PCT (so far) is not a "true science"? Perhaps the most
interesting line in the excerpts below concerns the "brag" that "When you do a
real PCT experiment, you get an exact match between the model and the real
behavior." Apparently, no one to date has ever done a "real PCT experiment." I
believe that my quote accurately reflects the comments below, although you are
correct that it is not an actual quote of what you said. I'll even drop the
nines after the decimal point and apologize for the exaggeration -- but, hey,
what's a few tenths between friends? When I said "there goes the PCT
standard," I was referring to your apparently selective application of your
own contention that "somebody has to aim for 0.997 or better, and keep aiming
for it no matter how slow the progress. Because only in that way are we going
to understand and not just fool ourselves into believing that we understand."
So let's quit fooling ourselves about PCT models (to date) being able to
predict cursor position well enough, OK? Correlations of less than .9 just
aren't acceptable for true science, unless your definition of true science has
changed recently.

(BEGIN INCLUDED QUOTES)

[From Bill Powers (920112.1700)]

I don't consider any correlation of less than 0.95 to be of scientific
interest, and for correlations that low, a lot of added work is implied to
reduce the span of the error bars.

[From Bill Powers (920113.1200)]

Again, I don't think that any correlation lower than the 0.90s would be
scientifically usable. And you don't get results that you could call
*measurements* until you're up around 0.95 or better.

A true science needs continuous measurement scales so that theories about
the forms of relationships can be tested. This means that correlations
have to be somewhere in the high nineties. True measurements, with normal
measurement errors, require correlations of 0.99 upward.

[From Bill Powers (920213.1300)]

One of my objections to the statistical approach to understanding
behavior is that after the first significant statistical measure is
found, the experimenter quits the investigation and publishes. If you get
a correlation of 0.8, p < 0.05, your next question should be, "Where is
all that variance coming from?" If you set your sights on 0.95, p <
0.0000001, you won't quit after the preliminary study, but will refine
the hypothesis until you get real data.

[From Bill Powers (920222.1400)]

You can't base
a science on facts that have only a 0.8 or 0.9 probability of being true.
Such low-grade facts can't be put together into any kind of extended
argument that requires half a dozen facts to be true at once. You need
facts with probabilities of 0.9999 or better -- if you want to build an
intellectual structure that will hang together.

[From Bill Powers (920515.2000)]

Traditional statistical analysis is
based on very low standards of acceptance and extremely noisy data. I would
rather see less data and higher standards: say, correlations above 0.95 and
p < 0.000001. This should reduce the literature to a readable size and make
its contents worth reading.

[From Rick Marken (920624.1030)]

As I said in an earlier post, if
the relationships in your data are not .99 or greater, then you should
try to fix the research until you get such relationships.

[From Rick Marken (920624.1320)]

I said (or meant to say) that the criterion for what constitutes a
scientific fact in psychology should be far stricter than it is. I think
a reasonable goal is correlations of .99+. This can be done when you are
studying control -- at least when you are studying variables that can be
quantified relatively easily.

[From Bill Powers (920625.0830)]

Suppose that you're a psychologist just starting in with HPCT. You hear a
lot of bragging: "We can get correlations of 0.997 that hold up with
predictions over a span of a year." Or "When you do a real PCT experiment,
you get an exact match between the model and the real behavior."

When you've thought up an experiment to test a model, carried it out, and
found a correlation of 0.997 between what the model does and what a real
person does, there's only one response: jubilation. You have actually
discovered a real true fact of nature, a high-quality fact, a fact that
sticks up out of the mass of other facts like a lighthouse.

If you can get 0.997 in a simple experiment, maybe you can get the same
result with a slightly more complicated one. Yes, you can, it turns out.

Once you've set foot on this road, you can see that it leads where we want
to go. Eventually it will lead to a solid, reliable understanding of all
that it is possible to know about human behavior. There's no point in trying
to skip ahead and guess how it will all come out. There's no point in using
methods that produce bad data and bad guesses and lead to knowledge that
has only a minute chance of being correct. Certainly, those bigger problems
are important. Certainly we need to solve them, as soon as we can.
Certainly, we have to go on trying to cope with them using experience as a
guide where we have no understanding. But if it's a science of life we
want, somebody has to aim for 0.997 or better, and keep aiming for it no
matter how slow the progress. Because only in that way are we going to
understand and not just fool ourselves into believing that we understand.

(END INCLUDED QUOTES)

In principle, underlying generative models are more complete
than descriptions at the level of the phenomena. But in
practice, the former might not be able to predict better than
the latter, due to the complexity of the underlying mechanisms
and/or hair-trigger situations.

I suppose that in some imaginary case that might be true. So far,
however, all the generative models actually developed and tested
under PCT have predicted a lot better than any descriptive models
have done. I'm not going to worry about complexity of underlying
mechanisms and hair-trigger effects until I run into them.

But you say that PCT tracking models fail to predict cursor position over time
with sufficient accuracy for true scientific work. The reason, you say, is
"noise" or "anomalies." Have you tried to make underlying generative models
for the "noise"? I think you have run into complexity and/or hair-trigger
effects and not realized it. I suspect that it will be impossible to predict
cursor position over time better by using underlying generative models of
"noise" than by using descriptive statistics to "model" the "noise."

Fine. I'll give you a data record showing the cursor, target, and
handle behavior point by point -- the raw data. I will generate
the disturbance pattern at random, and, scout's honor, will fit the
control model to the behavior without using any information about
the disturbance or even knowing what disturbance pattern was
created.

To adjust the parameters in a PCT model, you run trial models successively,
with the loop closed, using the given disturbance, and look for the best fit
to "real" handle position over time, right? To adjust the parameters in a
descriptive "model," the behaviorist would need to do the same. To be more
explicit, the fitting process requires that one USE the disturbance, even
though one needn't KNOW what it is. OK? You sent me data for handle and cursor
positions in a run, but what I really need to fit the parameters is the
ability to run the model with the feedback connection operative. I could do
that with your software, I suppose.
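To make that concrete, here is a minimal sketch (Python with NumPy; the
function name, the 60-per-second step size, and the simple Euler update are
my assumptions, not anything taken from your software) of running a trial
model with the feedback connection operative:

    import numpy as np

    def run_closed_loop(output_fn, disturbance, target, dt=1.0/60.0):
        # One trial run: the loop is closed through C = H + D at every
        # step, so the recorded disturbance is USED by the simulation
        # even though the model never receives it as a separate input.
        n = len(disturbance)
        H = np.zeros(n)
        C = np.zeros(n)
        for t in range(n - 1):
            C[t] = H[t] + disturbance[t]    # feedback connection
            H[t + 1] = H[t] + output_fn(C[t], target[t]) * dt
        C[-1] = H[-1] + disturbance[-1]
        return H, C

The disturbance appears inside the loop, which is just the point: the fit
USES it, though the model function never sees it.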

You devise an S-R model that will predict the cursor and handle
behavior for a new randomly-generated disturbance pattern,
unknown to either of us in advance, with exactly the same target
behavior. I will predict the new handle and cursor behavior with
the control-system model; you predict them with your S-R model.
We will then compare the results. You can use any definition of S
and R that you please, and any number of integrals or derivatives
that you can get from the data.

Any operations you like. But whatever you use, the S-R model must
be expressed as H = f(C,T).

I will be extremely interested to see what you decide to use for
C, without knowing what the disturbance is.

How about H = K * integral(C - T)? (K is a constant to be adjusted for best
fit to the data by running the simulation with the model in it). That's just a
first cut, of course, since it doesn't predict the cursor position well enough
for "true science."

Then you go through a long list of red herrings. What I meant
to say is that various PCT models such as proportional,
proportional-integral, and proportional-integral-derivative,
with various nonlinearities and various parameters, and various
descriptive "models" with various functions and parameters, ALL
give about equally high correlations between predicted handle
positions and actual handle positions, but all do NOT give
equally high correlations between predicted cursor positions
and actual cursor positions.

How do you know they all give about equally high correlations?
Isn't all this a conjecture?

A testable conjecture. If I make a PID descriptive "model," maybe it will fit
the cursor position better. And a bit of lag might help, too. Or maybe your
model is already "noise"-limited. If so, then adding a noise term should
improve prediction of cursor position IN A STATISTICAL SENSE, AVERAGED OVER
MANY RUNS.
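
For what it's worth, here is one way to write that conjecture down so it
could actually be run (Python again; the placement of the lag and of the
noise term are my guesses, and the gains come out negative with the error
written as C - T):

    import numpy as np

    def simulate_pid(disturbance, target, Kp, Ki, Kd, lag=0,
                     noise_sd=0.0, dt=1.0/60.0, seed=0):
        # Closed-loop run of a PID output function acting on a lagged
        # error, with an additive noise term on the handle velocity.
        rng = np.random.default_rng(seed)
        n = len(disturbance)
        H = np.zeros(n)
        C = np.zeros(n)
        integ = 0.0
        prev_err = 0.0
        for t in range(n - 1):
            C[t] = H[t] + disturbance[t]
            s = max(t - lag, 0)             # transport lag, in samples
            err = C[s] - target[s]
            integ += err * dt
            deriv = 0.0 if t == 0 else (err - prev_err) / dt
            prev_err = err
            H[t + 1] = (H[t] + (Kp * err + Ki * integ + Kd * deriv) * dt
                        + noise_sd * rng.standard_normal() * dt)
        C[-1] = H[-1] + disturbance[-1]
        return H, C

Averaging the predicted cursor positions over many seeds is what "in a
statistical sense, averaged over many runs" would amount to.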

I believe that the PCT models used to predict tracking can be
interpreted also as functional "models" -- the choice of
interpretation (underlying generative model or function
"model") depends only on whether one envisions a reference
signal for target position inside the organism (thus explaining
why cursor position is subtracted from target position) or one
simply notes that prediction is good if there is such a
subtraction in the function relating "input" to "output" at
each moment and doesn't care about explaining why.

Sure. The canonical mathematical forms are the same whether you
think of the signals as being real or not. If you're incurious
about how the mathematical functions are actually implemented,
you can just leave it there.

Then the argument is over interpretation of models, not over the models
themselves. The behaviorist will happily use your (or a better predicting)
closed-loop model (perhaps with a descriptive statistical term to model
"noise") and call it an input-output/descriptive/functional relationship
"model." And you will yell "Foul!" The sole determinant distinguishing your
interpretation from that of the behaviorist is the external stimulus vs.
internal reference signal issue. The math is the same.
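
To spell the identity out in the simplest proportional case (my notation,
not anybody's published formalism):

    PCT reading:  perception p = C - T;  reference r = 0;
                  error e = r - p;  output dH/dt = G * e = -G * (C - T)

    S-R reading:  stimulus S = C - T;  response R = dH/dt = K * S,
                  with K = -G

Term for term, the same equation; only the names attached to the terms
differ.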

As ever,

Greg