# Modeling operant conditioning

Hi, Isaac --

you're trying to solve. I realize once again why I have done so little
modeling with reorganization. When you actually start figuring out what
has to be modeled in the situation you describe, the problems simply
multiply.

One thing that helps is to put yourself inside the rat. Mary and I did
that for a while this morning. She moved her finger around and whenever
she made a move in the right direction, I said "click," meaning that
water was now available. At first I said click for any move vaguely in
the right direction. Then I demanded more and more, and eventually got
her finger to touch the cup I had in mind as the "instrumental
behavior." Then each time she touched the cup I said "click." She would
run her finger over to where we sort of agreed the water was, then back
to the cup to touch it again. Then I changed the "schedule" so two
touches were required. She touched the cup -- no click! So immediately
she touched it again and I said "click" because that was two touches.
About two trials later she was touching the cup twice in a row and
getting a "click" right away.
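The game Mary and I played can be put in code form. This is only an illustrative sketch, not a rat model -- the function name, the parameters, and the one-dimensional "finger position" are all my inventions for the example. The subject explores around the last rewarded position, the experimenter "clicks" whenever a move lands within the current criterion distance of the target, and each click tightens the criterion, just as I demanded more and more of Mary's finger:

```python
import random

def shape(target, start_radius=5.0, tighten=0.8, min_radius=0.5,
          max_moves=100000, seed=1):
    """Shaping sketch: 'click' whenever a move lands within the current
    criterion radius of the target; every click tightens the criterion."""
    rng = random.Random(seed)
    base, radius, clicks = 0.0, start_radius, 0
    for _ in range(max_moves):
        pos = base + rng.uniform(-1.0, 1.0)   # explore around the last rewarded spot
        if abs(pos - target) <= radius:       # good enough under the current criterion
            clicks += 1                       # "click" -- water is now available
            base = pos                        # resume the search from the rewarded position
            radius *= tighten                 # the experimenter demands more
            if radius <= min_radius:
                break
    return base, clicks

# the experimenter has position 3.0 "in mind" as the instrumental behavior
pos, clicks = shape(target=3.0)
```

Notice the one assumption that makes this work: after each click the subject resumes the search from the rewarded position rather than from scratch, which is exactly what Mary did with her finger.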

You might try this with a friend, both as experimenter and as subject.
This process is known as shaping. It obviously depends on some rather
complex processes going on in the rat, as well as an alert experimenter.
The basic idea of reorganization still works -- as long as there's a
deficit, changes in behavior continue; the changes cease only when the
deficit is corrected. But for this to work, there must already be a lot
of organized behavior available, and there's no real reason to think
that simple random reorganization is the only operative factor.
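Just that core idea -- changes in behavior continue while the deficit persists, and cease when it is corrected -- can be sketched as an E. coli-style loop. Every particular here (the two parameters being reorganized, the quadratic "deficit," the step size) is an arbitrary choice for illustration:

```python
import random

def reorganize(deficit, step=0.1, tol=1e-3, max_iter=10000, seed=0):
    """E. coli-style reorganization sketch: random changes of direction
    while a deficit persists; the changes cease once it is corrected."""
    rng = random.Random(seed)
    x = [5.0, -3.0]                               # current "organization" (arbitrary start)
    direction = [rng.uniform(-1, 1) for _ in x]
    prev = deficit(x)
    for _ in range(max_iter):
        if prev < tol:                            # deficit corrected: reorganizing stops
            break
        x = [xi + step * di for xi, di in zip(x, direction)]
        err = deficit(x)
        if err >= prev:                           # no improvement: "tumble" to a new random direction
            direction = [rng.uniform(-1, 1) for _ in x]
        prev = err
    return x

# the "deficit" is zero at (0, 0); reorganization carries x toward it
final = reorganize(lambda x: x[0] ** 2 + x[1] ** 2)
```

The loop never knows which way the goal lies; it only keeps a direction that is improving matters and tumbles when it isn't -- which is the whole E. coli trick.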

For example, after a "click," Mary would run her finger over to where
the water was agreed to be, and then _back to where the finger was when
the click occurred_. Do real rats do that? After the interruption of
going to get the water or whatever, do they try to re-establish the
conditions that held at the time the water was seen to be available, and
continue the search from there?

I've never seen any data on how rats behave while they're in the middle
of finding out what it is that makes the water or food appear. Skinner
just says that they emit behavior at random until a reinforcement
selects one of the behaviors to be repeated. But it seems to me that
there must be some very systematic rules in effect. The learning we see
is a _capability of the rat_ and not an _effect of the environment on
the rat_.

But the concept of reinforcement makes intuitive sense. It does seem
that as soon as the reinforcement occurs, there's a tendency to go back
and repeat the same behavior that was going on when the reinforcement
was first noticed. At least I think that happens; as I say, I don't know
of any data. We could say that after the interruption of getting the
reinforcement ingested, the control systems simply go back to the
condition that fits the reference signals that were present at the time
of the interruption -- maybe the reference signals simply haven't
changed, but the control systems were just turned off for a moment to
permit the reinforcer to be obtained, then turned on again, which would
naturally bring the rat back into the original condition.
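That conjecture can be sketched with a one-dimensional proportional control loop -- a toy stand-in, not a rat, with all the names and numbers invented for the example. The loop is run until the variable settles at the reference, switched off while the "reinforcer" displaces it, then switched back on with the reference unchanged, and the variable comes back to its pre-interruption condition on its own:

```python
def run_control(x, reference, gain=0.5, steps=40, enabled=True):
    """One-dimensional proportional control loop: while enabled, the output
    moves the controlled variable toward the reference; disabled, x is frozen."""
    for _ in range(steps):
        if enabled:
            x += gain * (reference - x)   # output proportional to error
    return x

ref = 10.0                                # reference signal: "finger touching the cup"
x = run_control(0.0, ref)                 # control active: x settles at the reference
x = run_control(x + 4.0, ref, enabled=False)  # loop off while the water is obtained; x stays displaced
x = run_control(x, ref)                   # loop on again, reference unchanged: x returns
```

No memory of the original condition is needed anywhere in the loop; the unchanged reference signal is the memory.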

Anyway, as you can see, this is a far more complex situation than we have
with E. coli, and you may be biting off a much bigger chunk than you
realize. As Bill Leach said yesterday, one of the problems with
explanation in psychology is using a very simple model to try to explain
vastly complex processes. In trying to explain how a rat learns to press
a bar to get water, we're jumping into the middle of a very large
modeling problem, armed only with what we have guessed from E. coli. I
really think that to explain this behavior, we're going to have to build
up to it gradually, starting with much simpler problems. It may be that
the E. coli approach really will work, but that we have to understand a
lot more about multidimensional reorganization before we can see how it
will work.

Bill

···

CC: bourbon