Newton; S-R and PCT

[From Bill Powers (930108.0800)]

Greg Williams (930108a) --

I think that Newton's "model" does not postulate an underlying
mechanism for gravity (in fact, folks are still working on
models to do this), but only generalizes the observations ...

The observations are things like cannonball trajectories, the
orbits of the moon and planets, and so forth: macroscopic
composite objects following paths according to applied forces.
Also observed is the fact that objects of different sizes have
various weights. Generalizing from the observations would not
involve postulating forces that depend on the geometry of the
arrangement of infinitesimal point-masses. It would involve
observing many cannonball trajectories and many orbiting objects,
and finding a typical behavior of a typical object having certain
gross observable characteristics. The lowest level of abstraction
would be the observed behavior of the objects.

Newton proposed a level of abstraction lower than the observed
behavior of the objects. He proposed, in fact, that it is not the
observable characteristics of the objects such as ponderosity,
impulse, affinity, density, size, or shape that counts: it's only
the fundamental property common to all matter of any kind or
observable characteristics, called mass, that counts.
Furthermore, his universal law spoke only of forces between
infinitesimal bits of mass; to deduce the force between the Earth
and Moon, for example, you must compute the forces between each
point in each object and all points in the other object,
integrating over the volumes of the objects under certain
assumptions about internal mass distribution. Doing this reveals
such things as the zonal harmonics in the gravitational "fields"
of objects that are not perfect spheres of uniform density.

Newton created an imaginary universe with far more detail in it
than we observe. The law he proposed consisted of force-distance
relationships between those detailed point-masses, not between
objects at the level where we observe them. In order to apply his
law, it is necessary to calculate its consequences for objects of
specific shape and size. The attraction of a cube for a
tetrahedron does not follow an inverse-square law at distances
comparable to the dimensions of the objects, according to Newton's
law. In order to find out what the attraction is predicted to be,
one must do that volume integration, applying the law to the
hypothetical particles of mass making up each object. This
predicted attraction can then be compared with the observed
attraction to test the law.
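
A minimal numerical sketch of that volume integration, under
simplifying assumptions that are not part of the argument above:
two identical uniform cubes stand in for the cube and tetrahedron,
each cube is chopped into n**3 equal point masses, G and both total
masses are set to 1, and the function names are hypothetical. It
sums the point-to-point inverse-square forces and divides by what
an object-level inverse-square law would give at the same
centre-to-centre distance.

```python
import numpy as np


def cube_points(center_x, side=1.0, n=6):
    """Centres of the n**3 equal cells of a uniform cube centred at (center_x, 0, 0)."""
    offsets = ((np.arange(n) + 0.5) / n - 0.5) * side
    x, y, z = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    points = np.stack([x.ravel(), y.ravel(), z.ravel()], axis=1)
    points[:, 0] += center_x
    return points


def attraction_ratio(distance, side=1.0, n=6):
    """Summed point-mass attraction divided by the object-level 1/distance**2 value.

    Assumes G = 1 and total mass 1 for each cube; distance is centre to
    centre and must exceed side so the cubes do not overlap.
    """
    a = cube_points(0.0, side, n)
    b = cube_points(distance, side, n)
    point_mass = 1.0 / n**3
    # Vector from every point of cube a to every point of cube b.
    separation = b[None, :, :] - a[:, None, :]
    r = np.linalg.norm(separation, axis=2)
    # Component along the line of centres; the others cancel by symmetry.
    force_x = np.sum(point_mass**2 * separation[:, :, 0] / r**3)
    return force_x / (1.0 / distance**2)


if __name__ == "__main__":
    for d in (1.5, 3.0, 20.0):
        print(f"centre distance {d:5.1f} sides: ratio = {attraction_ratio(d):.6f}")
```

With the cubes half a side apart the ratio departs noticeably from
1; at twenty sides apart it is very nearly 1, which is why the
object-level behavior looks inverse-square from far away.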

[Here there was a 17-hour power failure while we finished getting
2 feet of snow]

To me, it seems that Newton took exactly the same position as
Skinner: that functional relationships among observables which
have provided good predictions in the past are acceptable for
making future predictions without their being explained and
delimited by models of underlying mechanisms.

So you see, I don't agree that Newton only proposed functional
relationships among observables. What made his proposal so
powerful was that he made a generative model, and did NOT simply
generalize from observations.

Both the PCT "underlying generative model" and the
behaviorist's "functional relation" include a feedback
connection from handle to cursor; if they didn't, they wouldn't
reflect the experimental set-up.

What's odd about this, in Skinner's case, is that he paid
elaborate attention to the feedback path and essentially no
attention to the forward path through the organism. The schedule
of reinforcements could be characterized exactly: one
reinforcement for so many bar-presses, according to a regular or
randomized (but known) function. So the reinforcement was
dependent on the behavior in a clearly defined way. In the
forward direction, however, all he could say about the effect of
reinforcement on behavior was that it increased the probability
of a behavior or caused an increment in behavior. As to the form
of the forward relationship he had nothing more specific to say.
If he had proposed a particular forward function, he could have
solved the two relationships together and come up with a control-
system model. But he was fixated on environmental control of
behavior, and was forced to conclude that behavior is controlled
by its consequences, even though the only CLEAR relationship he
could see was that of consequences being controlled by behavior.
I have always considered this to be his most intellectually
dishonest ploy.
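
To make "solving the two relationships together" concrete, here is
a small sketch. The feedback function is the fixed-ratio schedule
described above (one reinforcement per `ratio` presses). The
forward function is purely a hypothetical assumption made for
illustration: behavior rate equals `gain` times the shortfall of
the reinforcement rate from some `reference` rate. Skinner proposed
no such function, and the names and numbers are invented.

```python
def solve_ratio_schedule(ratio, gain, reference):
    """Solve the two relationships simultaneously.

    Feedback path (the schedule):     R = B / ratio
    Forward path (ASSUMED, see text): B = gain * (reference - R)
    Returns (B, R): behavior rate and reinforcement rate.
    """
    reinforcement_rate = gain * reference / (ratio + gain)
    behavior_rate = ratio * reinforcement_rate
    return behavior_rate, reinforcement_rate


if __name__ == "__main__":
    for ratio in (10, 20, 40):
        b, r = solve_ratio_schedule(ratio, gain=100.0, reference=2.0)
        print(f"ratio={ratio:3d}  behavior rate={b:7.2f}  reinforcement rate={r:5.2f}")
```

In this sketch, raising the ratio makes the behavior rate climb
while the reinforcement rate changes comparatively little, which is
the pattern expected of a control system rather than of
reinforcement driving behavior.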

From another standpoint, the behaviorist COULDN'T characterize
the experimental setup correctly. To do so would be to see that
the stimulus is not an independent variable. The assumption is
that the stimulus varies, and as a consequence of that the
response varies. To measure the response, one arbitrarily varies
the stimulus, so the stimulus has a known value or pattern that
is independent of the behavior. If the stimulus is defined so it
depends on the response, it's impossible to perform this
manipulation (without breaking any actual feedback loop that's
present). This is why behaviorists have been unable to grasp the
relationships in a control system. They are unwilling to give up
their own, or environmental, control of the stimulus.

Skinner saw the reinforcer as a consequence of behavior. But
being unable to give up the idea that the environment controls,
he then treated this consequence as an independent variable, and
said that it controls the behavior. To be sure it controls only
FUTURE behavior, but with his blind spot he never saw the obvious
implication: that the BEHAVIOR which produces this consequence
controls ("controls" meaning influences) the future behavior via
the apparatus. To see this loop whole would have meant giving up
the concept that the environment determines behavior, and that,
above all, he was unwilling to do.


Your various quotes from the archives completely support my
position. You have my words right, but the reasoning you apply to
them is all your own, and you're welcome to it.

I have proposed correlations of 0.95 as a practical target for
current research. We commonly get correlations of 0.99 and up in
simple experiments. It is true, however, that to build any
structure that can support a mature science of life, we must be
able to get correlations of 0.999...; the reason, as I have
explained, is the necessity for a mature science to make
deductions based on many facts, not just a small handful.
Clearly, our occasional achievement of correlations of 0.99+ is
not yet sufficient to serve as the base of a mature science. It
enables us to make certain deductions with a high probability
that they are correct, but the scope of these deductions is still

I will admit that speaking of "exact" predictions is premature,
and ultimately foolish. I was carried away by comparing
predictions of individual behavior having 1000% error or worse
(the norm in most behavioral studies) and predictions with 5%
error. I admit that 5% error is not 0% error.

But you say that PCT tracking models fail to predict cursor
position over time with sufficient accuracy for true scientific

If making the best predictions one can is not true scientific
work, I don't know what is. Do you expect a full blown science of
life as complete as that of physics to spring into being
overnight? Even worse, do you really think I expect it to? You
persist in misquoting or misunderstanding me by taking my
statements and carrying their interpretations to ridiculous
extremes. In talking about a mature science of life, I'm
describing a reference condition, a goal in which the standards
are set several orders of magnitude higher than those currently
accepted in the conventional behavioral sciences.

I did not say that PCT tracking models are unusable for true
scientific work. You got that by reasoning from disconnected
statements taken out of context and interpreted according to your
own agenda. I have no defense against that kind of intellectual
strategy. Nor do I need one.

Since sending you the data, I have had some additional thoughts,
some of which are reflected above. You say

To adjust the parameters in a PCT model, you run trial models
successively, with the loop closed, using the given
disturbance, and look for the best fit to "real" handle
position over time, right? To adjust the parameters in a
descriptive "model," the behaviorist would need to do the same.

The basic problem is that behaviorists can't close the loop,
because they must consider the stimulus to be an independent
variable. Only Skinner came close to seeing that it is a
dependent variable just like the behavior. If you were really
putting on a behaviorist hat, you would have seen nothing wrong
with being presented only with the cursor and handle records. You
would have assumed that I generated the cursor positions
independently, and that the subject simply moved the handle as a
response to them. You wouldn't need to ask me for my experimental
setup; it would be self-evident. To do a new experiment, you
would simply present a cursor moving in a new pattern, and
measure the new pattern of handle positions. And you would
probably be quite satisfied with the resulting modest

If I gave you the disturbance record too, you would quickly find
that the disturbance correlated very highly with handle
positions, and would conclude that the disturbance was the real
stimulus. The correlation of the cursor position with handle
position would be much lower, so the cursor would be judged to
contribute little to the variance.
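
A minimal simulation sketch of why those two correlations come out
so differently. Everything specific here is an assumption for
illustration: the integrating control model, the gain, the time
step, the two-sine disturbance, and the function name. It is not
the data set or the fitted model discussed in this exchange.

```python
import numpy as np


def simulate_tracking(gain=20.0, dt=0.01, duration=60.0):
    """Closed-loop compensatory tracking: cursor = handle + disturbance.

    The handle integrates gain * (target - cursor), i.e. negative feedback.
    Returns disturbance, handle and cursor records as arrays.
    """
    t = np.arange(0.0, duration, dt)
    # Hypothetical slow disturbance: two sines, whole periods in 60 s.
    disturbance = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.15 * t + 1.0)
    target = 0.0
    handle_record = np.zeros_like(t)
    cursor_record = np.zeros_like(t)
    handle = 0.0
    for i in range(len(t)):
        cursor = handle + disturbance[i]         # environment equation
        handle_record[i] = handle
        cursor_record[i] = cursor
        handle += gain * (target - cursor) * dt  # organism equation (Euler step)
    return disturbance, handle_record, cursor_record


if __name__ == "__main__":
    d, h, c = simulate_tracking()
    print("corr(disturbance, handle):", np.corrcoef(d, h)[0, 1])
    print("corr(cursor, handle):     ", np.corrcoef(c, h)[0, 1])
```

In this sketch the disturbance-handle correlation comes out close
to -1 while the cursor-handle correlation is near zero, even though
the cursor is the only thing the simulated subject "sees."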

And there's one final aspect of the behaviorist approach that you
put in but no conventional behavioral scientist would. I
presented a data set. The conventional approach would be to look
for a relationship between them -- not point by point, over time,
but as a complete data set. The result of the analysis would be a
correlation, a regression equation, and a statement of the
standard deviation. The thought of using these results to predict
the point-by-point handle positions in a new experiment simply
would not occur to a conventional behavioral scientist.

I will be extremely interested to see what you decide to use
for C, without knowing what the disturbance is.

How about H = K * integral(C - T)?

But C is not known until H and D are known, is it? And you can't
know H until you know the pattern of disturbances, can you?
You're expressing H as a dependent variable. What it depends on,
the independent variables, are T and C. In order to make a
prediction using the above model only, you would have to specify
the variations in C and T first.

If you include the equation for the experimental setup,
C = H + D, your equation plus this one constitute the equations for
a control system. In the solution of these equations, you find that
both C and H depend on the disturbance (and the omitted reference
signal, here assumed 0). You can't specify C in advance. So this
is not an S-R model. You can't manipulate S.
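
The joint solution can be written out for the simplest case. This
is a sketch under assumptions not stated above: the integral is
taken over time, D and T are held constant, H(0) = 0, and K is
negative so that the loop is stable (negative feedback).

```latex
\begin{aligned}
H(t) &= K \int_0^t \bigl(C(\tau) - T\bigr)\,d\tau, \qquad C(t) = H(t) + D \\
\frac{dH}{dt} &= K\,(H + D - T) \\
H(t) &= (T - D)\bigl(1 - e^{Kt}\bigr), \qquad C(t) = T + (D - T)\,e^{Kt}
\end{aligned}
```

Both H and C come out as functions of D and T. With K < 0 the
cursor settles at T and the handle at T - D, so neither can be set
in advance by the experimenter.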

So I really don't see how you are going to handle this as an S-R
system instead of as a control system.

... maybe your model is already "noise"-limited. If so, then
adding a noise term should improve prediction of cursor

No, it will make the fit worse. You will never do better with a
noisy model than with a noiseless one. Noise adds in quadrature.
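
A small numerical sketch of "adds in quadrature," assuming the
added noise is independent of the error the model already makes.
The Gaussian residual, the noise level, and the variable names are
hypothetical stand-ins, not values from any tracking run.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000

# Hypothetical model-minus-data error of the noiseless model (rms about 1).
residual = rng.normal(0.0, 1.0, n_samples)
# Independent noise added to the model's output (rms about 0.5).
noise = rng.normal(0.0, 0.5, n_samples)


def rms(x):
    """Root-mean-square value of an array."""
    return float(np.sqrt(np.mean(x ** 2)))


rms_clean = rms(residual)
rms_noisy = rms(residual + noise)
rms_quadrature = float(np.sqrt(rms(residual) ** 2 + rms(noise) ** 2))

if __name__ == "__main__":
    print("noiseless model rms error:", rms_clean)
    print("noisy model rms error:    ", rms_noisy)
    print("quadrature prediction:    ", rms_quadrature)
```

The noisy model's error is larger than the noiseless one's and sits
close to the square root of the sum of the squares, which is the
sense in which the fit gets worse on average.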

Then the argument is over interpretation of models, not over
the models themselves. The behaviorist will happily use your
(or a better predicting) closed-loop model (perhaps with a
descriptive statistical term to model "noise") and call it an
input-output/descriptive/functional relationship "model." And
you will yell "Foul!"

No, the behaviorist will yell "Foul!" when I point out that the
so-called stimulus is not an independent variable.

Bill P.