[From Bill Powers (950712.0825 MDT)]
Bruce Abbott (950711.1615 EST) --
Same question as before: did the authors make any comment about the
fact that decreasing reinforcement _increases_ behavior?
They noted that response rate varied inversely with rate of
reinforcement and that this is opposite to what had been reported
in previous research for interval schedules.
OK, we now have two papers in which the authors noted the reversal, but
said nothing about its implications. By implication, there is previous
research for interval schedules showing only the left part of the
Motheral curve. It would be interesting to see if these papers show any
awareness that the opposite relationship can be obtained.
In learning one would expect that following a response with a so-
called reinforcing consequence would "strengthen" the behavior
(relative to other behaviors not so reinforced).
Right, so at least the relationship between reinforcement and behavior
goes in the "right" direction while a given behavior is being selected
from among alternate behaviors. Any mechanism that gradually shifts from
a search or trial-and-error pattern to a consistent pattern that
produces a specific consequence would create an appearance that supports
the basic reinforcement model (or would not contradict it).
Maintained performance is another game altogether--a dynamic
equilibrium involving all sorts of effects. You don't necessarily
expect _rate_ of responding to be directly related to _rate_ of
reinforcement under these conditions. There are too many other
factors to consider that could alter this simple relationship.
No, I can't accept this explanation. I hope you're just playing Devil's
Advocate. These are just words that vaguely indicate a lot of possible
interfering factors without naming any of them or laying out what their
individual effects would be -- or showing that they exist. This is an
age-old excuse that is used when a prediction fails: well, the spell
really does work, but the Moon was rising and Og was probably casting a
counter-spell while we were all asleep, and anyway you probably didn't
believe hard enough. That's why you still have warts.
In fact, the reinforcement model, taken as a general model, fails in the
case where only the amount of a single behavior needs to be varied in
order to produce the reinforcer -- and, of course, when the reward size
is large enough to put the behavioral regime to the right of the
Motheral peak.
Other, more recent data with pigeons show rate of responding
increasing with rate of reinforcement over a range from 8.4 (VI
427-s) to 300 (VI 12-s) reinforcements/hour in a manner that
resembles the left limb of the Motheral curves (Catania, A. C., &
Reynolds, G. S. (1968). A quantitative analysis of the responding
maintained by interval schedules of reinforcement. _JEAB_, _11_,
327-383). These authors also plot data from two other studies in
which the data follow the same curves as theirs. Those studies
included an 1800 rft/hr point which showed a further increase, as
if the curve were approaching an asymptote from below rather than
curving back downward as in the right limb of the Motheral-type
curves.
This is consistent with what I've been claiming: that the experimental
data supporting the positive relationship between reinforcement and
behavior are taken under conditions that put the behavioral
relationships to the left of the Motheral peak. If we could verify
experimentally that this is the region where the animal is spending less
and less time on task, thus reducing kt in my model, this argument would
be strongly supported.
--------------------------------
I believe what Sam may have said was that response rates _increase_
with increasing rate of reinforcement on interval schedules. When
you compare ratio and interval schedules on which the rate of
reinforcement is the same, ratio schedules support a higher rate of
responding.
Thanks, I must have misinterpreted.
Response rates would increase with increasing reinforcement rates if the
reinforcement size were small enough to put the behavior on the left
side of the Motheral peak. However, Staddon's Fig. 7.15 shows the
relationship bending over, as on the right side of the Motheral peak,
at the highest rates of reinforcement. The behavior rate is about 16/min
at one reinforcement per 1/4 minute (r = 4/min), and about 9/min at one
per 1/8 minute (r = 8/min).
If reward size is small enough, my model will make the entire curve show
an increasing rate of response with increasing rate of reinforcement
over the whole range, although for FR-1 there is usually a decline. If
you play around with the program I posted yesterday you can set up
almost any effect you want.
------------------------------------
I'd like to see your model extended so that kt emerges from the
competition of the two systems rather than being fit to the data.
But perhaps that can wait.
That's worth trying. I found, by the way, that the left limb of the
Motheral curve can be created (contrary to what I said earlier) by using
a zero offset and a cost function that simply goes as the square of the
error. You were right about that. And now that I think of it, the cost
should really go as the square of the behavior rate, not the error --
I'll try that when I'm finished here.
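For anyone who wants to experiment without the program I posted, here is
a bare sketch of the idea -- not my actual program, and every number in
it is invented. The output is gain times error minus a cost that goes as
the square of the error, as just described, and the loop is settled on a
series of FR schedules:

# Minimal sketch only; all parameter values are invented for illustration.
def equilibrium(m, gain=50.0, cost=15.0, ref=4.0, size=1.0, steps=5000):
    # Settle the loop on an FR-m schedule; return (presses/min, rft/min).
    b = 60.0                               # start high so the loop settles on
    for _ in range(steps):                 # the controlling equilibrium
        r = b / m                          # FR-m feedback function
        e = ref - size * r                 # error in reinforcement input
        drive = gain * e - cost * e * e    # output with quadratic cost on error
        b += 0.01 * (max(0.0, drive) - b)  # damped approach to equilibrium
    return b, b / m

for m in (1, 2, 5, 10, 15, 20, 25, 40):
    b, r = equilibrium(m)
    print(f"FR-{m:<3d}  responses/min = {b:5.1f}   rft/min = {r:5.2f}")

Run over that range of ratios, it traces both limbs: response rate rises
as the ratio increases from FR-1, peaks, and then collapses at the high
ratios where the cost term overwhelms the error-driven output. Changing
ref, size, and cost moves the peak around, which is the sense in which
you can set up almost any effect you want.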
------------------------------------
The reason these rats continued to respond on the ratio schedule at
ratios up to 5000 is that this was their only way to obtain food--
it's that or starve. Food deprived rats working on a ratio
schedule in a 1-hr session will give up responding on ratio
schedules far less demanding. It's interesting to think about why.
It is obviously necessary to provide enough food or water at the
extremes of the schedule to keep the animals alive when they are in the
apparatus full-time. This puts a lower bound on the reinforcer size. I
have speculated that this is the size that determines where the Motheral
peak will be found. The Timberlake data show that as the ratio
increases, the amount of reinforcer actually obtained per access
increases by a factor of almost 10, so the reinforcer size is not fixed.
This is a problem with using "access" as a reinforcer. You would really
need a two-level control model for such a case, with the lower system
controlling rate of ingestion, and the reference level for rate of
ingestion being determined by error in the obtained amount of reinforcer
per unit time.
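In skeleton form, with invented constants and no claim to fit the
Timberlake data, that two-level arrangement might look like this -- the
upper loop controls obtained amount per unit time by adjusting the
reference signal of a lower loop that controls rate of ingestion:

# Skeleton of the two-level arrangement suggested above; all constants
# are invented. The upper loop sets the lower loop's reference signal.
def two_level(d, amount_ref=8.0, g_hi=5.0, g_lo=20.0, slow=0.01, steps=4000):
    ingest_ref = 0.0              # reference for the ingestion-rate system
    ingest = 0.0                  # actual rate of ingestion
    for _ in range(steps):
        amount = ingest + d       # obtained amount per unit time; d disturbs it
        ingest_ref += slow * g_hi * (amount_ref - amount)         # upper loop
        ingest += slow * (g_lo * (ingest_ref - ingest) - ingest)  # lower loop
    return ingest, amount

for d in (0.0, 2.0):              # e.g. d > 0: more reinforcer per access
    ingest, amount = two_level(d)
    print(f"d = {d:+.1f}  ingestion rate = {ingest:5.2f}  amount/time = {amount:5.2f}")

With the disturbance present, the lower system's reference is adjusted
downward until the obtained amount per unit time is back at its
reference value.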
In 1-hr sessions, it's not necessary that the rate of reinforcement be
capable of sustaining life, so reward sizes can be much smaller, and
probably are. The long-term deficit is made up between experimental
runs.
-------------------------------------
RE: interval schedules
That would fit the Catania and Reynolds data. The interesting
question is, does the curve just continue to climb to an asymptote
or is it an inverted U-shaped function, as Staddon's analysis
suggests? My guess is that rates of responding decline
monotonically (more or less exponentially) as size of the
interreinforcement interval increases, from a maximum at CRF.
Staddon's Fig. 7.15 suggests that there is a peak, not an asymptote. My
model and that Figure imply that if the reward size is large enough you
should get a peak. Think about it. If the animal gets as much reinforcer
as it wants without pressing the bar very fast, why should it maintain a
high rate of responding? To get more than it wants? Remember that for any
fixed interval setting, responding at one press per interval is
equivalent to FR-1. If that provides nearly the reference amount of
reinforcement (rate times size or "value"), the rat will not respond
faster than the interval suggests (or not much faster). If the interval
is then increased, the rat will press faster because the error becomes
larger; we are to the right of the Motheral peak. So if we explore a
wide enough range of conditions, we will see the Motheral peak in
interval data just as in ratio data. The rat model is the same in either
case; the only difference is in the external feedback function. I'll bet
that we find essentially the same parameters for the rat model
regardless of the type of schedule in effect.
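To put that concretely: in a simulation, switching from a ratio to an
interval schedule means replacing one function, and nothing in the rat
model. The variable-interval form below, r = 1/(I + 1/b), is a standard
textbook approximation, used here only as an assumption about the
environment:

# Same rat model either way; only the environment function differs.
def fr_feedback(b, m):
    # Fixed-ratio m: reinforcement rate is press rate over the ratio.
    return b / m

def vi_feedback(b, interval):
    # Variable interval I: rate saturates at 1/I as press rate grows.
    return 1.0 / (interval + 1.0 / b) if b > 0 else 0.0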
Remember that as intervals _increase_, rate of reinforcement is
_decreasing_. Your guess applies to the left side of the Motheral curve;
as interval increases toward infinity, rate of both reinforcement and
responding decreases toward zero. A decrease in the interval corresponds
to moving rightward on the Motheral curve.
-----------------------------------------------------------------------
Rick Marken (950711.1615) --
So reinforcement stops "strengthening" behavior when responses have
been learned and are being "maintained". How does the reinforcement
know when to stop strengthening and start maintaining? What are all
the sorts of effects that maintain dynamic equilibrium (between
responses and reinforcements, I presume)? What factors enter the
picture during the "maintaining" stage that alter the simple
relationship between reinforcement and response rate that
presumably existed during the learning stage?
Good questions. It's obvious to me that two quite different mechanisms
are involved; calling them both "reinforcement" is a mistake. During the
"selection" phase, one process is at work that shifts the proportion of
time spent behaving in different ways and places, the shift slowing down
when a particular way or place produces reinforcement. This process
continues to work even after the selection is complete, but keeps
converging back to the same selection as long as the same amount of
reinforcer is obtained. Some version of the E. coli effect might serve
as a model of this process, or perhaps there is a systematic method that
would also work.
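By a version of the E. coli effect I mean a tumble-and-run search. In
bare-bones form, with one adjustable parameter and invented step sizes,
it might look like this:

# Tumble-and-run ("E. coli") search over a single parameter w.
# The step size and the error function are invented for illustration.
import random

def reorganize(error_of, w=0.0, steps=400):
    direction = random.choice((-1.0, 1.0))
    last_err = error_of(w)
    for _ in range(steps):
        w += 0.05 * direction       # keep "running" while error falls
        err = error_of(w)
        if err > last_err:          # error rising: "tumble" to a new
            direction = random.choice((-1.0, 1.0))  # random direction
        last_err = err
    return w

print(round(reorganize(lambda w: abs(3.0 - w)), 2))  # drifts toward 3.0

The walk keeps converging back to the same value as long as the error
function stays the same, which is the property wanted here.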
The other process is simply control. When control is predominant, we
have the negative relationship between behavior and reinforcement, with
a reference level at some value of (size times reinforcement rate). After the
right behavior is found, the loop gain of the control system gradually
increases until there is a comfortable margin of benefit over cost of
behaving.
Anyway, that's the general model I'm anticipating now.
(Marken commenting to Bruce Abbott)
Isn't it possible that what is happening is actually quite
different than what you describe -- and much simpler, too? Isn't it
possible that organisms are controlling the rate of reinforcement
as best they can under the circumstances (the reinforcement
"schedule")? Nothing is "maintaining performance" (rate of
responding). Rate of responding is just one variable that affects
another variable that is under control. Nor is there a "dynamic
equilibrium" between input and output variables; the apparent
equilibrum disappears when there is a randomly varying disturbance
is added to the controlled variable; in that case response rate
changes randomly while reinforcement rate remains constant.
Yes, I think you're quite right about that. But the problem is to prove
it by using actual data from reinforcement experiments. There will
actually be a whole series of tests by which we can check the
predictions of both theories and show how well the data agree with the
theories. We're working on it.
-----------------------------------------------------------------------
Bruce Abbott (950711.2120 EST) --
Eventually, Skinner asserted, you will be led back to observable
events in the environment.
Yes. It's hard to find explicit statements of his underlying model, but
occasionally they show up. In _Science and human behavior_ (p. 28) we
find this:
Eventually, a science of the nervous system based on direct
observation rather than inference will describe the neural states
and events which immediately precede instances of behavior. We
shall know the precise neurological conditions which precede, say,
the response "No, thank you." These events in turn will be found to
be preceded by other neurological events, and these in turn by
others. This series will lead back to events outside the nervous
system and, eventually, outside the organism.
This is clearly the basic stimulus-response model. Neural inputs cause
neural outputs. I've never understood how EAB types could deny that they
are S-R theorists, when this model underlies all their thinking. I
suppose they must identify "S-R theory" with a _particular version_ of
this model, and because they use a more complex version, think that they
are doing something different. In fact, S-R theory utterly dominates
most branches of scientific psychology, including EAB.
It is not that Skinner believed such inner workings do not exist,
but that he believed that one could dispense with them by backing
up to the observable events: Being hit -----> Hitting back rather
than being hit -----> anger ------> hitting back. By eliminating
the ghostly middleman, he hoped to eradicate a nonscientific appeal
to the ethereal wants, wishes, and desires of the ghost in the
machine.
What could be clearer? All that the inner workings of the brain can do
is to relay causal events to the muscles. Skinner believed that we could
therefore dispense with any discussion of the inner workings, because
whatever they might be, we would still have the same outputs being
generated by inputs, and we can observe those input-output relationships
without speculating on the intermediate processes that we can't see.
This goes all the way back to Watson, who saw all behavior as composed
of elementary reflexes and reached the same conclusion.
This is what I call the S-R model. Perhaps I can call it that only
because I have PCT with which to contrast it. I don't think that most
psychologists realize that there is any possible alternative to the S-R
model. The S-R model isn't seen as a model: it's just a description of
the nervous system, one of the facts that you start with before you try
to develop a theory. Psychologists who disclaim being S-R theorists,
like Skinner, are really arguing about _which_ S-R theory is right. It
has never entered their minds that ALL S-R theories could be wrong.
Skinner assumed that internally
generated behavior had to be random or "capricious."
Far from it. Skinner assumed that most behavior is in fact
"internally generated" but that its frequency of occurrence is
modified through experience and brought under control of
environmental conditions (stimulus control). Skinner was no S-R
psychologist.
Well, he used the term "capricious" a number of times, though I couldn't
quickly put my finger on an example. Basically, he said that behavior
would be capricious if the inner workings of the brain were not under
the lawful control of environmental conditions -- that was the only
alternative he could see. Skinner WAS an S-R psychologist, even though
he may have rejected other versions of S-R explanations such as those
that relied on single-purpose intervening variables like "anger."
As I pointed out earlier (it seems to have been overlooked),
Skinner did not believe there were no such things as wishes,
purposes, intentions, only that they could be explained by
environmental contingencies (past and present), and that a
scientist would do better to stick with the observable precursors
and conditions rather than appealing to these constructs for
explanation.
But Skinner saw these things only as intermediate stages of the
processes by which inputs caused outputs, so he missed their essential
character as reference signals. Skinner simply _renamed_ these processes
as a means of removing them from the set of causal variables. The result
was that whenever a change in reference signals occurred, shifting the
relationships between inputs and outputs, the only way to account for
the change was to say that some other reinforcing effect or some other
stimulus must have come into play, even though it was not noticed. So
Skinner really ended up no better off. Instead of relying on
explanations like changes in intentions or desires, he relied on
explanations that invoked unobserved external influences. Which is
better, to imagine an unseen internal cause or an unseen external cause?
----------------------------------
Seems to me that no
matter what happens, you have it covered.
That's only because of a lack of sufficient information to make a
prediction of outcome. I could say much the same thing, couched in
PCT language and referring to the various conflicting control
systems.
The difference is that when we run into unexplained effects in PCT
experiments, we track down their causes and incorporate them into the
model, or guess what they are and arrange things so they should be held
constant. What I was trying to point out was that the explanation you
offered was just one possibility, and when all possibilities are brought
out, you have to admit that you can't explain the effects observed or
predict what the outcome will be. EABers often act as though they have
an explanation of all behaviors, when in fact many other explanations,
even within EAB terms, are equally possible. The only correct answer in
such cases is "I don't know."
The predicted result of that conflict could go various ways
depending on parameters. If you just invent your parameters to fit
what does happen, then no matter what happens, the PCT model
covers it too. So what seems like a criticism of the theory is
only a problem of insufficient information.
Yes, that's true. And when there is insufficient information, you have
no reason to assert that your model, whether it be EAB or PCT, can
explain the phenomenon in question or make a correct prediction. If you
asked me to explain how social conflicts work, and especially to predict
how any particular conflict would turn out, I would simply say "I
can't."
If you want to be completely consistent in
your theoretical views, you must dismiss all prescriptive statements
and simply wait for the environment to shape your behavior as it will.
No, it does not tell you that goals are figments of the
imagination, but that they are products of your biology, your
history of experience, and your current environment, and that you
should appeal to the latter rather than to goals to explain your
behavior.
All that you have said is that goals are elements of experience. But
goals have no goal-setting effects, so what's the difference? If your
biology, your history, and your environment determine the goals, and the
goals determine the behavior, you can drop the middle term as Skinner
said. It still comes out that there is no point in trying to _set_
goals; these other factors, out of your control, either set them for you
or don't set them -- you have nothing to say about it.
But whatever it is, Skinner's view doesn't imply that he can't have
the equivalent of wishes, desires, and goals, and act on them. If
that were an implication of Skinner's view, it would have been
abandoned long ago.
He can "act on them" only in the sense that the goal, externally set and
manipulated, creates actions. But this middle term, the goal, has no
effects independent of the inputs; it can't be set arbitrarily or
"capriciously." This eliminates Skinner as an agent capable of setting
goals independently of the current environment. Skinner may _experience_
goals, wishes, and desires, but they have no independent causal force on
behavior. They are epiphenomena, links in a chain of causation that runs
from input to output.
As to such a view being abandoned if actually held, I remind you that
there are tremendous advantages in attributing one's behavior to
external influences and forces, particularly when other people don't
like what you do, or when what you do turns out badly. A person who
acquires great wealth ("Behind every great fortune there is a great
crime") does not like to say "I'm so wealthy because I want money and
power more than anything else, and don't care what I do to get them."
No, this person says "It's just human nature to want the best. I was
trained to accept the responsibilities of great wealth and to keep it in
trust. Self-preservation is the natural law. And anyway, I've been very
lucky." One of the first things psychotherapists have to deal with in
new patients is the long list of other people who will have to be
changed before the patient can be happy. Putting the blame on the
outside world is one of the most widespread indoor sports. The dog ate
my homework.
Skinner's humane leanings are in direct conflict with
his intellectual understanding of how behavior works. There is no way
out of it.
This is directly equivalent to stating that because evolution is
blind, the breeding of animals (artificial selection) is
impossible. Simply wrong.
Don't shift theories in the middle of the argument. The intentional
breeding of animals IS impossible, according to Skinner. Animals under
human care will be mated according to the pairings that the human being
is caused to create by his or her history of reinforcements and
heredity, and by the reinforcing or punishing outcomes of the matings.
So the environment determines the matings just as much as it determines
other forms of evolution. You're not allowed to put aside the thesis of
external causation and bring up a purposive notion like "artificial
selection." There is no such thing, because whatever selection happens
is also caused by the environment.
A control system is not a list of behaviors elicited by specific
stimuli.
And neither, as a matter of fact, are the functional relationships
empirically described in EAB. You're thinking S-R psychology
again.
I think that EAB is semi-aware of this consideration. But in many cases,
what is said to be learned is a specific response in a specific
situation, such as jumping when a puff of air occurs. In PCT we would
say that what is learned is like a constant of proportionality -- for
example, we might propose that action is proportional, by some large
factor, to the difference between perceived air velocity and a reference
air velocity. The PCT picture remains true no matter what the air velocity,
even if it is zero. In the usual psychological experiment, it is assumed
that there is some special connection between one specific event and
another specific event, rather than a relationship between a continuum
of values of variables. The two approaches are somewhat related, in that
the event-based interpretation handles specific pairs of values of
variables, while the PCT approach considers the whole range of possible
values of the variables, with particular values being just samples. In
a PCT model of operant behavior, FR-10 is not considered a different
situation from FR-1. The only difference is the value of one parameter,
which doesn't have to be given a numerical value in the general model.
----------------------------------
I've looked and looked at the standard diagram of a control system
and I just can't find the place where disturbances enter the
perceptual input function. All I see there is the input variable.
The state of that variable is a joint effect of disturbances and
actions. As I stated, the system knows nothing of the disturbances
themselves--it can't distinguish them from the effect of actions.
That's certainly true. I was speaking, not very clearly, of the
relationship of disturbances to the actions of the system. Because of
the properties of the closed loop, it turns out that actions
automatically adjust to have equal and opposite effects to the effects
of disturbances on the controlled variable. There is no special
provision for this result; it falls out of the control equations.
Because of this effect, there can be no single relationship between the
amount of action and the amount of reinforcer, where the reinforcer is
in the position of the controlled variable. For any given state of the
controlled variable, any action within the range of possible actions may
be present, depending on the amount of disturbance and the setting of
the reference signal. It is not necessary to learn a different action
for each different amount and direction of disturbance. All that has to
be learned is a certain _organization_, from which the opposition to
disturbances will develop as an inevitable consequence.
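Here is that consequence in runnable form, with illustrative numbers and
a simple leaky-integrator output function. The program never looks at d;
the opposition to it falls out of the loop equations:

# The system senses only the input quantity q = o + d, never d itself.
def simulate(d, ref=10.0, gain=50.0, slow=0.01, steps=2000):
    o = 0.0                          # action (output quantity)
    for _ in range(steps):
        q = o + d                    # controlled input variable
        e = ref - q                  # error signal
        o += slow * (gain * e - o)   # leaky-integrator output function
    return o, o + d

for d in (-5.0, 0.0, 5.0):
    o, q = simulate(d)
    print(f"d = {d:+5.1f}   action = {o:+6.2f}   input quantity = {q:5.2f}")

The action swings through about ten units as the disturbance goes from
-5 to +5, while the input quantity stays within a few tenths of the
reference value of 10.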
-----------------------------------
Other than our little disagreement about whether control systems as
normally diagrammed can directly sense disturbances, I can't tell
much difference between these descriptions. First you claim that
structure does not determine actions (I said "depends," which is
weaker than your substitution of "determines.") Then you claim
that the actions somehow "act through" [depend on?] the existing
structure, which seems to me to restate my assertion while
appearing to contradict it.
This is the result of trying to put mathematical relationships into
words. The "structure" of a control system is the set of connections,
functions, and parameters which remain constant no matter what values
the variables may have. So structure itself doesn't produce any actions.
Actions arise only when independent variables are nonzero: in this case,
disturbances and reference signals.
In a control system, the action will be maintained by the operations of
the whole loop at the level that keeps the controlled variable matching
the reference level set by the reference signal. When there are no
disturbances of the controlled variable, the action can be deduced from
knowing the state of the controlled variable and the form of the
external feedback function. If the feedback function says that a
repetitive action of b actions per minute is required to maintain the
reinforcer at a level of b/m reinforcements per minute, then if we
observe r reinforcements per minute, we can deduce that there must be
m*r actions per minute going on.
If we observe a disturbance of d reinforcements per minute (plus or
minus), and that r total reinforcements per minute are occurring, we can
then deduce that there must be m*(r-d) actions per minute going on.
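A quick check of that arithmetic with made-up numbers:

# r = b/m + d, so the response rate must be b = m*(r - d).
m, r = 10, 3.0                  # FR-10, observing 3 reinforcements/min
for d in (0.0, 1.0, -1.0):      # disturbance, in reinforcements/min
    b = m * (r - d)
    print(f"d = {d:+.1f} rft/min  ->  b = {b:4.0f} presses/min")

On FR-10, the same observed 3 reinforcements per minute implies 30
presses per minute with no disturbance, 20 when the disturbance adds one
reinforcement per minute, and 40 when it removes one.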
When a control system is observed, we see that to a first approximation,
r remains about the same with and without the disturbance. From this we
can deduce that the action must change equally and oppositely to the
disturbance. So the action no longer corresponds to the amount of
reinforcement; some of the reinforcement is being added by the
disturbance, or taken away by it. The net amount of reinforcement
remains nearly constant, but the action changes radically. So we have
lost the regular relationship between b and r.
That's the point I was trying to make.
-------------------------------
I'm not convinced the experiment [with a disturbance] will be all
that crucial, but I agree that it's a good idea to do it. Some
clever reinforcement theorist will be able to "explain" the data,
but I strongly suspect that the explanation (if offered) will have a
decidedly ad hoc ring to it.
An explanation already exists: "noncontingent reinforcement reduces
behavior" takes care of the case where disturbances add reinforcements.
A disturbance that reduces reinforcements (and creates more behavior)
might be explained as an effective change in the schedule of
reinforcements. However, anyone who offers those explanations in the
face of the simple explanation using the PCT model can probably be
written off as a lost cause anyway.
-----------------------------------------------------------------------
Richard S. Marken (950711.2040) --
Yes, that was an important post.
There is one potential problem in your explanation:
Operant behavior is a control, not an equilibrium, phenomenon. The
right side of the Motherall curve is one piece of evidence that
this is the case. Reinforcement rate remains approximately the same
despite changes in the effect that response rate can have on this
variable. The reinforcement rate remains nearly constant WHILE the
disturbances (different environmental functions relating behavior
to reinforcement rate) are in effect.
Unfortunately, the reinforcement rate does NOT remain approximately the
same. In the Motheral curves, it varies from about 200 per session down
to about 8 per session. These experiments subject the control system to
an extremely large range of conditions, so large that control is
essentially abolished at one end of the curves. And they do it in a way
that drastically changes the loop gain: from about 30 or 40 under an
FR-1 schedule down to 0.2 or 0.25 under an FR-160 schedule (assuming
that the output gain remains at 30 or 40).
If we put the animal on an FR-1 schedule, with a loop gain of 40, we
could expect VERY strong resistance to additive disturbances and
behavior much like what you describe. But on an FR-40 schedule, the loop
gain would be down around 1, and we would expect only half the effect of
an additive disturbance to be cancelled instead of 97 percent or so. The
schedules of reinforcement are designed, it seems, to make control very
difficult, so the behavior that is observed is seen mostly under
conditions of poor control. The same control model applies over most of
the range (FR-1 to FR-40), but the approximations by which we
characterize the _ideal_ control system are grossly violated. We have to
go to the exact equations, and our usual simple descriptions of how
control works, p = r and o = -d, are no longer valid.
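For anyone checking that arithmetic: the loop gain on FR-m is the output
gain divided by m, and the fraction of an additive disturbance cancelled
in the steady state is G/(1 + G).

# Loop-gain arithmetic for FR schedules; output gain of 40 is the
# assumption used above.
output_gain = 40.0
for m in (1, 40, 160):
    G = output_gain / m            # loop gain on an FR-m schedule
    cancelled = G / (1.0 + G)      # steady-state disturbance rejection
    print(f"FR-{m:<4d}  loop gain = {G:6.2f}   cancelled = {cancelled:.0%}")

That gives 98 percent at FR-1, 50 percent at FR-40, and 20 percent at
FR-160.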
----------------------------------------------------------------------
Enough. Got to do some things for the conference.
Bill P.