Human Op Cond Expt; PCT & EAB

[From Bill Powers (951206.1500 MST)]

Rick Marken (951206.0930) --

The human operant conditioning experiment is a great idea. As we design
it, we need to keep in mind that it must _discriminate_ between the two
types of explanation. When an experiment is set up so that the action
works on an undisturbed controlled variable, most of the discrimination
is lost, because it then seems that a particular action is needed to
produce a particular value of the controlled variable or reinforcer, and
it can be argued that the reinforcer is causing that particular action
to recur in the future.

Whatever is chosen to play the role of controlled variable or
reinforcer, it must be easily disturbed in a quantitative way. The
experiment can then be run in two modes: with and without the
disturbance.

In mode 1, without the disturbance, the person will learn to perform the
specific action that will produce some amount of reinforcer. It will
then seem that the probability of that action increases with time, until
it is producing some equilibrium rate of reinforcement. PCTers can then
claim that the action is controlling the rate of reinforcement, and
EABers can claim that the final rate of reinforcement is maintaining the
final rate of behavior.

In mode 2, with the smoothed random disturbance acting, PCT will predict
that the action will change through its entire range in a way closely
related to the disturbance waveform while the same rate of reinforcement
(near enough) is maintained. However, there will be no tendency for any
particular pattern of action to show an increased probability of
occurrence. To make this difference more dramatic, all learning can take
place without the disturbance; then both theories will be required to
predict the behavior on the first run after a known (to the predictors
but not to the subject) table of disturbances is introduced.
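The two modes can be sketched with a minimal simulation (a hypothetical illustration, not part of the proposed protocol; the loop gain, slowing factor, and disturbance parameters are invented). A simple integrating control system holds the controlled variable near its reference; when a smoothed random disturbance is added, the output mirrors the disturbance waveform (with opposite sign) while the controlled variable barely moves:

```python
import random

def run_trial(disturbance, reference=10.0, gain=5.0, slowing=0.1):
    """Simple integrating control system: the controlled variable (cv)
    is output plus disturbance; the output integrates the error."""
    output, outputs, cvs = 0.0, [], []
    for d in disturbance:
        cv = output + d                # environment: action plus disturbance
        error = reference - cv         # comparator
        output += slowing * gain * error
        outputs.append(output)
        cvs.append(cv)
    return outputs, cvs

def smoothed_noise(steps, smoothing=0.99, seed=1):
    """Low-pass-filtered random waveform to serve as the disturbance."""
    rng = random.Random(seed)
    d, series = 0.0, []
    for _ in range(steps):
        d = smoothing * d + (1 - smoothing) * rng.uniform(-250, 250)
        series.append(d)
    return series

steps = 2000
out1, cv1 = run_trial([0.0] * steps)   # mode 1: no disturbance
dist = smoothed_noise(steps)
out2, cv2 = run_trial(dist)            # mode 2: disturbance acting
```

In mode 1 the output settles at a single value, which looks like a "learned action" at an equilibrium rate; in mode 2 the output varies over a wide range in opposition to the disturbance while the controlled variable stays close to the reference, and no particular pattern of action recurs.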

Chris Cherpas proposes using a VI choice experiment. I would resist that
on three counts. First, PCT has no model of choice behavior to offer
yet, and this experiment should be something we can do now. Second,
choice behavior introduces a second level of control, the first being
simply pressing a lever or pecking a key to obtain reinforcers, and the
second being a variation in _where_ the pecking takes place. I think
that this introduces complications that will obscure the basic
comparison. We could possibly come up with a two-level choice model, but
since we have none right now I would not like to get into that as a
starting point. Finally, a _variable_ interval schedule puts gratuitous
noise into the data, which means we would have to do long runs to get
accurate measurements of mean statistical relationships and would have
to introduce another parameter, the averaging time of the perceptual
function. I would be happy to use a fixed interval schedule because we
know the feedback function at all times. But a fixed ratio schedule
seems simplest to me.
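The "known feedback function" point can be made concrete with a toy sketch (the reference reinforcement rate, gain, and ratios are all invented for illustration): on a fixed ratio schedule, reinforcement rate is exactly press rate divided by the ratio, so a control system holding reinforcement rate at its reference must scale its press rate with the ratio.

```python
def fr_feedback(press_rate, ratio):
    """Fixed-ratio feedback function: one reinforcer per `ratio`
    presses, so reinforcement rate = press rate / ratio, known
    exactly at every moment (no statistical averaging needed)."""
    return press_rate / ratio

ref_rate = 2.0                     # desired reinforcers per minute (illustrative)
press_rate = 0.0
settled = []
for ratio in [5, 10, 20]:          # experimenter varies the schedule
    for _ in range(400):           # let the loop settle at each ratio
        error = ref_rate - fr_feedback(press_rate, ratio)
        press_rate += 1.0 * error  # integrating output, unit gain
    settled.append((ratio, round(press_rate, 2)))
# settled -> [(5, 10.0), (10, 20.0), (20, 40.0)]
```

Doubling the ratio doubles the settled press rate while the reinforcement rate stays at 2 per minute; with a variable interval schedule no such exact relation holds at any moment.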

I can foresee some disputes with EABers on just what constitutes a valid
experiment. In standard operant conditioning experiments, a certain
level of deprivation is maintained, with reinforcement levels being
kept, by adjustment of the schedule and reward size, far below the
levels that the animal would maintain for itself on a generous schedule.
Also, the reinforcement theory approach contains no notion of the
"desired amount" of reinforcer; the general assumption is, the more the
better. So when we design the experiment for PCT purposes, what
reference level should we assume, and how close to it will we allow the
subject to bring the controlled variable/reinforcer? If we say the goal
is just to "get as high a score as you can" or "make the picture remain
on the screen as long as possible," we're leaving it up to the subject
to decide what "as high as you can" is, or how long is "possible." If we
set a specific target, but far beyond what the subject could achieve, we
will never see the behavior of the control system in the vicinity of
zero error, and of course we will see errors of only one sign. To an
EABer, such conditions are quite normal, but in PCT terms we would be
seeing the control behavior only near one extreme of its range.
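The one-sided-error situation is easy to picture with a toy loop whose output saturates (all numbers invented for illustration): with a reachable reference the error shrinks toward zero and the system operates in the vicinity of zero error, while a reference set far beyond the output limit leaves a large error of constant sign, with behavior pinned at its maximum.

```python
def run(reference, max_rate, steps=500):
    """Integrating control loop with a hard output limit."""
    rate, errors = 0.0, []
    for _ in range(steps):
        error = reference - rate    # say the score simply equals the rate
        errors.append(error)
        rate = min(max_rate, rate + 0.5 * error)
    return errors

reachable = run(reference=50, max_rate=100)     # error decays toward zero
unreachable = run(reference=500, max_rate=100)  # error stuck at 400
```

In the second case changes in behavior can have no further effect on the error, which is exactly the regime the experiment should avoid.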

What we don't want, from the PCT viewpoint, is a situation that is so
extreme that we would see maximum behavior rates, or errors so large
that changes in behavior could have only small effects on them. I hope
we can find some compromise on these matters.

Maybe one way to avoid the dispute is to set up two experiments, one
which seems "normal" from the EAB point of view, and one which PCTers
would consider as involving normal behavior. If the situations are kept
simple enough, each side can design its own experiment and ask the other
to make predictions about it, as well as making predictions about its
own experiment.

···

-----------------------------------------------------------------------
Chris Cherpas (951206.0851 PT) --

Gee, it's so much easier to say that behavior which produces food is
produced to alleviate food deprivation.

     cc: Perhaps if you like to put causes inside the organism, then,
     yes, it is easier to say it like that. The EABer (but, apparently
     not Skinner) would say deprivation has (at least) two momentary
     effects: 1) evoking behavior and 2) altering reinforcement
     effectiveness.

I have no objection to putting causes inside the organism, although the
main thrust of PCT concerns a relationship between inner and outer
variables in which causation has little meaning. Do you have an
objection?

     The term is "establishing operations," [sorry! WTP] not "enabling
     operations." Nobody is forbidden from drawing logical conclusions;
     the question is whether the particular conclusions add anything to
     one's effective repertoire in that domain. The answer is
     determined by how well the person who has drawn the conclusion can
     subsequently control (and/or predict) the subject matter. Without
     direct evidence of what's going on inside the organism, theories
     about what's in there may be regarded as gratuitous physiologizing
     -- making the statement seem more scientific than it actually is by
     raising images of biological structures (which are, in such a case,
     only imagined).

Right, the criterion is how well the model predicts.

     Without direct evidence of what's going on inside the organism,
     theories about what's in there may be regarded as gratuitous
     physiologizing -- making the statement seem more scientific than it
     actually is by raising images of biological structures (which are,
     in such a case, only imagined).

Is all physiologizing gratuitous? Are all biological structures, like
sensory organs, neurons, and muscles, imagined?

     Please understand that "reinforcement" is _defined_ as what
     increases the probability of behavior. Period.

I was trying to give the benefit of the doubt. This way of defining
reinforcement means that you can have no basis at all for predicting
whether something will be reinforcing. You have to wait for the results
of the experiment, and THEN you can decide. This pretty much takes the
wind out of the sails of reinforcement theory as an explanation of
behavior, doesn't it?

In PCT, we have at least some physiological basis for assuming that
certain variables like temperature, food intake, and so on will be
controlled: the organism has to do so to survive. We can at least make
the connection that if the organism is controlling for X, depriving it
of X will result in behavior that tends strongly to produce X. Do
behaviorists really eschew talking even about physiological needs?

If this is what we would run into at the meeting, then maybe it's futile
to think of an experiment that would discriminate between reinforcement
theory and PCT. If the person doesn't show a regular pattern of behavior
when the disturbance is acting, the critic can just say "Well, then,
when your disturbance is acting the outcome is not reinforcing."
---------------------
     It's not always clear what a contradiction is empirically.

If your data are bad enough that may be true. But if you predict that
behavior will follow some time course to within 5% and the observed
behavior deviates from it by 50%, it's not hard to say that the results
of your experiment contradict the prediction.

     Even matching has variable support, but perhaps the right procedure
     still hasn't been settled on. Is searching for, and trying,
     different procedures considered a mere ad-hoc patch, or is it just
     a recognition that perfect knowledge doesn't just arrive in toto in
     a theory that need never change?

That depends on whether the patch becomes a permanent modification of
the theory and is applied in all future predictions. You're really in
trouble when you need a new patch for every new experiment.

     Can PCT ever change, or is it perfect now? If it's not perfect
     now, by what means would you change it? Through an ad-hoc patch?

Through modifying it until it ceases to predict incorrectly in future
experiments. That's what has been going on all along.
----------------------------------------------------------------------
Best to all,

Bill P.