Baseball

[From Bruce Gregory (2000.1202.1224)]

If operants are strengthened through reinforcement, I have always wondered
why I couldn't develop fantastic pitching control simply by throwing in the
direction of home. No matter how infrequently I got the ball over the plate
low and outside, those operants should be the ones that get strengthened
and the operants leading to wild pitches should be selected against. The
more I throw the more quickly I will develop incredible control. Yet this
doesn't seem to happen. Why?

BG

[From Bruce Abbott (2000.12.02.1525 EST)]

Bruce Gregory (2000.1202.1224) --

If operants are strengthened through reinforcement, I have always wondered
why I couldn't develop fantastic pitching control simply by throwing in the
direction of home. No matter how infrequently I got the ball over the plate
low and outside, those operants should be the ones that get strengthened
and the operants leading to wild pitches should be selected against. The
more I throw the more quickly I will develop incredible control. Yet this
doesn't seem to happen. Why?

"Strengthened" means that they come to occur more often relative to variants
that are not reinforced. According to theory, if getting the ball over the
plate low and outside is a reinforcing event for you, then variants that
accomplish this should be selected over those that fail, and the more you
throw while trying to get the ball to cross the plate at that position, the
better you should become at achieving that end. So what prevents you from
developing "incredible control"? There could be several problems -- off the
top of my head I can think of three: (1) You may have uncontrollable muscle
twitches that act as disturbances. The reinforcement process could lead to
the selection of a particular way of throwing the ball, but if the
physiological mechanisms cannot reproduce that act accurately, throws will
still go wild. (2) It may be difficult for you
to sense what you were doing when the ball went where you wanted it to go.
The reinforcement mechanism cannot "select" components of behavior for
"strengthening" that cannot be distinguished from other components. (3) You
may not be producing variant operants that permit the most accurate
throwing, so of course they cannot be selected. (Some sort of "response
shaping" may be required to bring these variants into being.)

I'm guessing, of course, so at this point these explanations are only
"just-so" stories in need of empirical testing.

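To make the selection story concrete, here is a minimal simulation sketch
of it (the set of throwing variants, their strike probabilities, and the
multiplicative strengthening rule are all assumptions chosen only for
illustration):

import random

# Each throwing variant has some fixed chance of producing a strike;
# reinforcement multiplies the selection weight of whichever variant
# just produced one (a deliberately bare-bones selectionist rule).
variants = {"A": 0.8, "B": 0.5, "C": 0.2}   # assumed P(strike) per variant
weights = {v: 1.0 for v in variants}        # initial selection weights

def throw():
    names = list(weights)
    v = random.choices(names, weights=[weights[n] for n in names])[0]
    if random.random() < variants[v]:       # a strike: reinforce variant v
        weights[v] *= 1.1

for _ in range(1000):
    throw()
print(weights)   # nearly all weight ends up on the most accurate variant

On this bare account the best variant always wins out eventually, which is
exactly why the three problems above matter: noise, indistinguishable
variants, or a missing variant can each keep the selection from converging.
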
If you think about it, I think you'll see that essentially the same
explanations are still logical possibilities if we substitute an adaptive
control model for the reinforcement model. We know that a set of
hierarchically organized control systems is working to keep muscles
contracting at specified time-varying rates, generating certain time-varying
tensions, attempting to produce the specified motion of the arm and ball
through space and over time despite disturbances such as uncontrolled muscle
twitches, muscle fatigue, wind against the body, and so on. But how does
the system "know" what motions to produce? Generally, from experience.
Errors in ball position as it crosses the plate must somehow lead to a new
selection of reference signal values, gains, and so on -- but which ones,
and by how much should they be changed? If the "right" variations are
eventually tried, error is minimized, or in other words, throwing becomes
accurate, but only if they are tried, and only if, with the "right"
selections, the system so configured is capable of opposing all disturbances
to the extent necessary.
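
As a minimal sketch of that kind of parameter search (the error function,
the step size, and the "keep the change while error shrinks, otherwise
pick a new random direction" rule are assumptions, in the spirit of an
E. coli-style reorganization process):

import random

# One adjustable parameter (say, a reference value for release angle).
# After each throw, keep changing it in the same direction while error
# shrinks; pick a new random direction whenever error grows.
true_best = 5.0                       # assumed error-minimizing value
param = 0.0
step = random.choice([-1, 1]) * 0.5
last_err = abs(param - true_best)

for _ in range(200):                  # 200 practice throws
    param += step
    err = abs(param - true_best)      # error where the ball crossed
    if err >= last_err:               # no better: reorganize direction
        step = random.choice([-1, 1]) * 0.5
    last_err = err
print(round(param, 2))                # hovers near 5.0 after practice

In these terms, whether the "right" variations ever get tried is whether
the random steps reach the region where error can be driven low.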

I have a feeling that by now most everyone has drawn the wrong conclusion
about my theoretical preferences, so I'll state them explicitly. I favor
(too weak a word) an adaptive control model over the so-called reinforcement
model. I see what has been labeled as "reinforcement" as the outward
manifestations of an adaptive process that (usually) leads to better
control. The reinforcement explanation simply describes (from the
observer's point of view) changes observed in the behavior of the system as
adaptation takes place. However, I believe that, by and large, the changes
in behavior that reinforcement theory describes as taking place under
specific conditions are real; if true, it follows that any proposed adaptive
control system model must prove capable of changing its behavior as
described in the reinforcement account if it is to be considered a
legitimate model.

This is why I stated at the outset of this discussion my belief that the
reinforcement and control-system views are not necessarily incompatible, at
least if one sticks to the purely descriptive version of reinforcement
theory. The latter amounts to a theoretical (generalized) description of how
the system behaves under various conditions. PCT, on the other hand, is a
theory of how the system is organized. The latter is the more fundamental,
as the behavior of a system can be deduced from its organization, whereas
its organization can only be inferred from its behavior. However, to the
extent that the description of behavior is accurate, a proposed system
organization can be considered valid only if it behaves in conformance with
the description.

Bill and Rick continue to disagree with me on this, and that, I believe, is
the source of any conflict we may have.

Bruce A.

[From Bruce Gregory (2000.1202.1644)]

Bruce Abbott (2000.12.02.1525 EST)

(2) It may be difficult for you
to sense what you were doing when the ball went where you wanted it to go.
The reinforcement mechanism cannot "select" components of behavior for
"strengthening" that cannot be distinguished from other components.

Distinguished by whom (or what)? Do I have to be aware of what I am doing?
Must the rat be aware of what it is doing?

(3) You
may not be producing variant operants that permit the most accurate
throwing, so of course they cannot be selected. (Some sort of "response
shaping" may be required to bring these variants into being.)

Should my continuing to throw the ball provide all the shaping needed? The
more variable my initial performance, the more likely it is to generate the
"right" operants. Or so I would think.

I'm guessing, of course, so at this point these explanations are only
"just-so" stories in need of empirical testing.

If you think about it, I think you'll see that essentially the same
explanations are still logical possibilities if we substitute an adaptive
control model for the reinforcement model.

Yes, I'm just as puzzled by the PCT model.

Bill and Rick continue to disagree with me on this, and that, I believe, is
the source of any conflict we may have.

My only question is, is it possible to build a working EAB model?

BG

[From Bill Powers (2000.12.02.1826 MST)]

Bruce Abbott (2000.12.02.1525 EST)--

If you think about it, I think you'll see that essentially the same
explanations are still logical possibilities if we substitute an adaptive
control model for the reinforcement model. We know that a set of
hierarchically organized control systems is working to keep muscles
contracting at specified time-varying rates, generating certain time-varying
tensions, attempting to produce the specified motion of the arm and ball
through space and over time despite disturbances such as uncontrolled muscle
twitches, muscle fatigue, wind against the body, and so on. But how does
the system "know" what motions to produce? Generally, from experience.
Errors in ball position as it crosses the plate must somehow lead to a new
selection of reference signal values, gains, and so on -- but which ones,
and by how much should they be changed? If the "right" variations are
eventually tried, error is minimized, or in other words, throwing becomes
accurate, but only if they are tried, and only if, with the "right"
selections, the system so configured is capable of opposing all disturbances
to the extent necessary.

This has got to be one of the main misunderstandings that exists between
us. When I say that organisms "learn control systems" I do not mean that
they learn specific acts as responses to specific discriminative stimuli
(or to anything else). What they acquire are functions that create
quantitative relationships among continuous variables.

In tracking, one of the functions that must be acquired is the conversion
of a distance between target and cursor into a rate of change of mouse
position. This does not mean learning to respond to each possible degree of
error with a response that generates a specific rate of change of position.
It means developing a neural circuit that converts _any_ degree of error
into a proportional velocity of mouse movement:

ds/dt = K*e

where e = error (the separation of target and cursor) and
      s = mouse position

What is acquired is a neural function with an input e and an output s that
is K times the time-integral of e, so that the above equation holds true.
There are
various ways to construct such a function from neurons, and none of them
requires learning specific pairs of values of e and s, or e and ds/dt. Once
the neural function exists, the velocity will be properly computed for all
values of e, whether or not any specific value has ever been experienced
before.
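
A minimal sketch of such a function in simulation (the gain, time step,
and target value are assumed numbers, not fitted ones):

K, dt = 8.0, 0.01           # assumed gain and integration time step
target, mouse = 3.7, 0.0    # here the cursor simply sits at mouse position

for _ in range(500):
    e = target - mouse      # error: separation of target and cursor
    mouse += K * e * dt     # ds/dt = K*e, integrated one step at a time
print(round(mouse, 3))      # ~3.7; no table of previously experienced
                            # (e, ds/dt) pairs is consulted anywhere

The same loop handles a target of 3.7, -120, or 0.001 without ever having
"seen" those values before: what is acquired is the function, not the pairs.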

It's possible, of course, to set up a model in which the variables are
discrete events, such that what the model learns is a specific S-R link.
This kind of model can't handle continuous variables, and furthermore it
can't handle inputs it has never experienced before. Organisms, of course,
can.

I can think of a simple test. Let a naive subject practice a tracking task
in which only one sign of disturbance is ever used. Thus all input-output
pairings will result in converting one sign of error into actions. After
some criterion level of skill is reached, we make the disturbance
bidirectional. If input-output pairs are learned, the person will be unable
(without further practice) to correct errors caused by disturbances in the
direction that was not previously experienced, which can be opposed only by
acting in a direction that was never previously produced. If a control
system is learned, the person will control just as well with either kind of
disturbance pattern, from the moment the disturbance is changed.
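
A minimal simulation sketch of why the prediction comes out this way for
the control model (the gain, time step, and constant disturbances are
assumptions; a real experiment would use smoothly varying disturbances):

K, dt = 10.0, 0.01

def track(disturbance, steps=1000):
    mouse = 0.0
    for _ in range(steps):
        cursor = mouse + disturbance   # cursor = mouse plus disturbance
        e = 0.0 - cursor               # reference: cursor at zero
        mouse += K * e * dt            # same integrating output as above
    return abs(e)                      # residual error after the run

print(round(track(+2.0), 6))   # practiced sign: error driven to zero
print(round(track(-2.0), 6))   # never-practiced sign: also zero, with no
                               # new learning; the law is sign-symmetric

An S-R lookup of practiced error-action pairs has no entry for the new
sign of error, so it makes the opposite prediction.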

Best,

Bill P.

[From Chris Cherpas (2000.12.02.2010 PT)]

Bill Powers (2000.12.02.1826 MST)--

I can think of a simple test. Let a naive subject practice a tracking task
in which only one sign of disturbance is ever used. Thus all input-output
pairings will result in converting one sign of error into actions. After
some criterion level of skill is reached, we make the disturbance
bidirectional. If input-output pairs are learned, the person will be unable
(without further practice) to correct errors caused by disturbances in the
direction that was not previously experienced, which can be opposed only by
acting in a direction that was never previously produced. If a control
system is learned, the person will control just as well with either kind of
disturbance pattern, from the moment the disturbance is changed.

When you say naive, I assume you mean someone who has never used a mouse.
Otherwise, you'd have subjects with experience, for example, selecting
icons from a variety of starting positions.

What measure would you use to determine that the person is "unable
(without further practice) to correct errors caused by the disturbances
in the direction that was not previously experienced" versus the person
being able to "control just as well with either kind of disturbance
pattern, from the moment the disturbance is changed?"

If it takes them more than, say, 0.5 sec to track after the first
opposite-direction disturbance, is control theory then refuted?

Best regards,
cc

[From Rick Marken (2000.12.02.2120)]

Chris Cherpas (2000.12.02.2010 PT)--

What measure would you use to determine that the person
is "unable (without further practice) to correct errors
caused by the disturbances in the direction that was not
previously experienced" versus the person being able to
"control just as well with either kind of disturbance
pattern, from the moment the disturbance is changed?"

Measure the deviation of subject performance from that of
a simple control model. The subject should react to the
opposite-direction disturbance exactly as the control model
does if what is learned is to control. So this measure of
deviation should be close to zero if control theory is right;
it should be quite large if reinforcement theory is right.
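
A minimal sketch of that measure (the use of RMS deviation rather than
some other statistic, and the variable names, are my assumptions):

import math

def rms_deviation(subject_handle, model_handle):
    # Root-mean-square difference between the subject's handle positions
    # and the control model's, sampled at the same instants.
    n = len(subject_handle)
    return math.sqrt(sum((s - m) ** 2
                         for s, m in zip(subject_handle, model_handle)) / n)

print(rms_deviation([0.0, 1.0, 2.0], [0.0, 1.1, 1.9]))  # toy numbers only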

If it takes them more than, say, 0.5 sec to track after
the first opposite-direction disturbance, is control theory
then refuted?

Almost certainly yes, because the control system would take
virtually no time at all to vary its actions as necessary
to protect the controlled variable from the new, opposite-
direction disturbance.

Best

Rick


--

Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: marken@mindreadings.com
mindreadings.com

[From Bruce Gregory (2000.1203.0429)]

Rick Marken (2000.12.02.2120)]

Almost certainly yes, because the control system would take
virtually no time at all to vary its actions as necessary
to protect the controlled variable from the new, opposite-
direction disturbance.

I wish my thermostat/furnace combination were a control system. It never
controls for disturbances in the summer.

BG

[From Bill Powers (2000.12.03.0325 MST)]

Bruce Gregory (2000.1203.0429)--

I wish my thermostat/furnace combination were a control system. It never
controls for disturbances in the summer.

On the other hand, it never _learns_ to control in the summer, either, even
though, by refraining from heating the house, it lets the house cool off
every night.

I think that the experiment I proposed would also establish that tracking
does not involve a one-way control system.
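
For comparison, a minimal sketch of a one-way (heat-only) controller,
which is what the thermostat example amounts to (all numbers assumed):

K, dt = 2.0, 0.1
temp, ref = 15.0, 20.0          # room temperature and thermostat setting

for _ in range(400):
    e = ref - temp
    heat = max(0.0, K * e)      # furnace output cannot go negative
    temp += (heat - 0.5) * dt   # 0.5 = assumed steady heat loss outdoors
print(round(temp, 1))           # holds near 20 against cooling; against a
                                # warming disturbance it can only shut off

A subject organized this way would fail on one sign of disturbance in the
tracking test; the prediction is that practiced subjects do not.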

Best,

Bill P.

[From Bruce Gregory (2000.1203.0837)]

Bill Powers (2000.12.03.0325 MST)

On the other hand, it never _learns_ to control in the summer, either, even
though, by refraining from heating the house, it lets the house cool off
every night.

I didn't realize that PCT was a theory of learning. I thought it was a
model of control. I've seen no tests of PCT as a learning theory, have I?

I think that the experiment I proposed would also establish that tracking
does not involve a one-way control system.

I doubt anyone needs convincing. My point, perhaps trivial or too well
hidden, is that the failure to counter unique disturbances does not
necessarily rule out the existence of control.

All one needs to do to "save" reinforcement, it seems to me, is to argue
that what is reinforced is not atomic operants, but the mechanism used to
generate these operants. But hell, I'm no expert on either PCT or EAB.

BG


[From Bill Powers (2000.12.04.0530 MST)]

Bruce Gregory (2000.1203.0837)--

I didn't realize that PCT was a theory of learning. I thought it was a
model of control. I've seen no tests of PCT as a learning theory, have I?

There have been a few tests of the concept of reorganization (Robertson's
"Phantom plateau" was one), as well as observations from the literature
concerning how the variability of behavior increases just before something
is learned. Of course you're right that we don't have any neat demos like
those we have for control processes. The biggest difficulty is in getting
anyone's agreement about what would constitute a test of reorganization
theory, or a refutation of any opposing theory.

What PCT offers any learning theory, potentially, is a way of measuring
"before and after" characteristics of behavior. Just what is different
about behavior after learning has taken place? PCT casts the answer in
terms of changes in parameters of control. That might be more useful than
just counting correct answers on verbal tests.
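
A minimal sketch of what that measurement could look like: fit the gain
of a simple integrating control model to a subject's tracking record
before and after practice, and compare the fitted values (the model form,
the grid search, and the variable names are assumptions for illustration):

def fit_gain(target, handle, dt=0.01):
    # Find the gain K whose simulated handle track best matches the
    # subject's recorded handle positions.
    best_K, best_err = 0.0, float("inf")
    for i in range(1, 61):
        K, sim, err = 0.5 * i, 0.0, 0.0     # candidate gains 0.5 to 30
        for t, h in zip(target, handle):
            sim += K * (t - sim) * dt       # model: ds/dt = K*e
            err += (sim - h) ** 2
        if err < best_err:
            best_K, best_err = K, err
    return best_K

Run on records from before and after practice, a rise in the fitted K (or
a drop in the residual fit error) is the kind of parameter change PCT
points to, rather than a count of correct responses.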

I think that the experiment I proposed would also establish that tracking
does not involve a one-way control system.

I doubt anyone needs convincing. My point, perhaps trivial or too well
hidden, is that the failure to counter unique disturbances does not
necessarily rule out the existence of control.

No, but the system had better not fail to counter the disturbances in terms
of which control was defined in the first place.

All one needs to do to "save" reinforcement, it seems to me, is to argue
that what is reinforced is not atomic operants, but the mechanism used to
generate these operants. But hell, I'm no expert on either PCT or EAB.

That doesn't save it; to save it, you would have to prove that there was a
strengthening of a given behavior rather than merely a cessation of
switching to other behaviors. Since these are both _interpretations_ and
not facts, one can equally well argue either way in the absence of a
critical test, if "saving the theory" (and the theoretician's ego) is the
only point.

Thinking up plausible scenarios is not the same thing as testing a theory.

Best,

Bill P.