[From Bill Powers (950725.0950 MDT)]
Bruce Abbott (950724.2135 EST) --
Staddon's "regulatory" model predicted that certain manipulations
should affect the slopes of the functions without changing the
intercepts and others should change both. The manipulations were
designed to test these predictions.
I was referring mostly to the drug runs. When you administer a drug with
unknown effects on the nervous system and measure its effects on non-
physical arbitrary parameters of a model, I don't think you're learning
much. When you're also changing two or three other experimental
parameters at the same time, I think you're losing ground.
7.15 s/rft - 5.77 s/rft = 1.38 sec/rft;

3600 sec      rft
--------  *  --------  =  2608.70 rft/hr * 2 rsp/rft = 5217.39 rsp/hr
   hr        1.38 sec

Alternatively, 1/(0.68 sec/rsp) = 1.471 rsp/sec * 3600 = 5294.118
rsp/hr; the difference between this result and the previous one is
due to rounding error. Your figure is 4013.36.
This is basically doing what Rick said: applying the inverse of the
function that you used to get the intermediate result. The difference is
that you removed the intercept first, but other than that you haven't
really got any new result.
As you figured out later, my approach was to calculate the time
available within which the actions occurred, and convert that to an
equivalent peak rate of acting according to the proportion of total time
between reinforcements occupied by actions.
[Can we use "actions" when we see "responses" without any corresponding
stimuli? We can see actions, but we can't tell whether they are
responses unless we can see the specific stimulus to which each action
is a response.]
The "peak" rates do not include collection time (or so you
indicated above) so it can have no influence on the numbers at all.
The actual "peak" rates for a given animal do not change _at all_
over the 32:1 range of ratios; for Rat 1 it is about 0.68
sec/response, or something in the 5200+ rsp/hr range as calculated
above, regardless of the ratio requirement.
The "peak" rates are simply the rate of action that would produce the
same number of acts per reinforcement, but within a shortened time
interval during which the actions were occurring. I did not see any
perfect constancy over ratios -- the range was about 50% relative to the
lowest rates, which we have to assume to be real unless further
investigations show it to be a statistical fluctuation.
By the way, not to open another can of worms, but you can estimate
the collection-time from the Motheral-type graph as the point where
a straight line fitted to the right portion of the curve meets the
x-axis. This gives the rate of reinforcement at FR-0. Invert and
convert to seconds for the collection-time. You can estimate the
(constant) response rate by noting where the line crosses the y-
intercept. This gives the rate for FR-infinity, when no
reinforcement would ever be delivered (thus excluding collection-
time from the figure). Invert and convert to get the slope of the
line in the "sec/rft vs. ratio" plot.
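The graphical estimate described above can be sketched numerically. The data points below are invented for illustration (not the actual rat data), and the line fit is a plain least-squares fit to the right-hand limb of a Motheral-type response-rate vs. reinforcement-rate plot:

```python
# Sketch of the graphical estimate described above, with invented data.
# x = reinforcement rate (rft/hr), y = response rate (rsp/hr), taken
# from the right-hand (low-ratio) limb of a Motheral-type curve.
xs = [300.0, 400.0, 500.0, 600.0]
ys = [3000.0, 2400.0, 1800.0, 1200.0]

# Hand-rolled least-squares line fit: y = slope * x + y_int.
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
y_int = my - slope * mx        # response rate at FR-infinity (rft rate = 0)
x_int = -y_int / slope         # reinforcement rate at FR-0 (rsp rate = 0)

collection_time = 3600.0 / x_int   # sec/rft at FR-0: the collection time
sec_per_rsp = 3600.0 / y_int       # slope of the "sec/rft vs. ratio" line
```

With these made-up points the fit gives a collection time of 4.5 s and 0.75 sec/response; the point is only to show the two inversions the paragraph describes.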
I'm not going to do this now, but what is needed is to see whether there
is an assumed collection time that would make an equation fit the right
side of the curve under the assumption of a constant rate of acting
during the remaining time. While I didn't expect to get the relatively
constant rate of acting that you have brought out, this is what I was
after in recommending that we measure the time the rat actually spent in
front of the lever. But there could be additional pauses, so we really
need the detailed action-by-action record as well. I knew that
collection time would have an effect, but I never suspected that it
would be so large!

-----------------------------------------------------------------------
Bruce Abbott (950725.0930 EST) --
What you did was to get the intercept from the regression equation
and subtract it from the seconds per reinforcement number at each
ratio value, leaving the time required to complete the ratio. You
then divided this into the seconds per reinforcement to get the
ratio of total time to ratio-completion time, and used this as a
multiplier of the original response rate figures. You could have
done the same by dividing the time required to complete the ratio
into 3600 and multiplying the result by the ratio value.
Yes, that is what I did. The time that is calculated in this way is in
units of sec/reinforcement, because you had already divided the behavior
rate by the ratio to get reinforcement rate. So the time per
reinforcement has to be changed first to reinforcements per unit time
(which is meaningless because the reinforcements do not occur during the
action time) and then has to be multiplied by the ratio to get back to
behavior rate.
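The calculation just described can be written out as a small function. The figures in the example are the Rat 1 FR-2 numbers quoted earlier in this post; "intercept" is the estimated collection time:

```python
# Burst ("peak") response rate from total seconds per reinforcement,
# after removing the estimated collection time (the intercept).
def burst_rate(sec_per_rft, intercept, ratio):
    """Responses per hour during the active (pressing) phase."""
    completion_time = sec_per_rft - intercept  # sec actually spent pressing
    return 3600.0 / completion_time * ratio

# Rat 1 at FR-2, using the figures quoted earlier: about 5217 rsp/hr.
rate = burst_rate(7.15, 5.77, 2)
```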
The resulting function has a slight curvature in three of the four
rats, with a maximum value at FR-16. The other rat's data are less
regular but show a peak at FR-8. It is possible that the observed
nonlinearity is real, but to a first approximation the data are well
fit by straight lines. (This is especially apparent when you plot
overall reinforcement rates rather than response rates.)
Right about the fourth rat. I think you will agree that nothing is
really "constant" in these curves -- only approximately so. I prefer to
leave approximations to the very end, because introducing them too early
can conceal important relationships.
A problem for interpretation of any curvature is that the values
for the lowest ratios are highly sensitive to the intercept of the
fitted line (as you noted). For example, if the intercept for Rat
1 were Rat 2's 5.77 instead of 5.36 (a difference of 0.47 s), the
computed response rate excluding collection time increases by over
1200.
That is indeed a problem. It's especially a problem because of using the
ratio as an independent variable and extrapolating b/m vs m to b = 0.
Near zero there can be all sorts of departures from straight lines. If
you did a second-order fit to the data, you'd get very different
intercepts for each rat. And because of the extreme sensitivity to
apparent collection time, the results could be very strongly affected.
Also, as I pointed out, when you use the corrected peak action rates
(the apparent mean rate corrected for the fact that actions take place
only part of the time), and THEN do the extrapolation to zero behavior
rate, the intercept would move to much higher reinforcement rates.
The data I presented were measured from graphs showing response
rate versus reinforcement rate. I used the response rate because
it provided a higher resolution and computed reinforcement rate
from this. But I could have used reinforcement rate directly.
Since one is just a multiple of the other, there can be no
"independent" data on reinforcement rate.
I agree. In ratio experiments, reinforcement rate is strictly a
dependent variable: it is completely determined by the behavior rate
when there are no disturbances.
So how would the actual position of the reference level (if there
is one) be determined? Would this be the rate at which the rat
consumed food pellets if given free access to them?
Yes, pretty near (strictly speaking we should probably equalize
exercise). Actually, since reference levels apply to inputs, not
actions, we do get a pretty good estimate of the reference level simply
by extrapolating the mean curves to zero. That is the state of the
(increasing) input variable at which the output just falls to zero, by
definition. What is causing us problems here is the fact that the output
is not simply proportional to the error: the output occurs in bursts
with pauses between them. So we're dealing with a non-simple output
function, which makes the required model more complex.
I find no simple proportional relationship between rate
of responding and reward rate. In fact, the rate of responding remains
essentially constant (plus-minus 25%) while the reward rate varies by
a factor of 5 to 6. Your generalizations above don't seem to fit the
data.
You seem to be misreading me. You just stated what I stated, and
then concluded that what I said was wrong.
What you said was
What does stay the same across all ratio requirements are (a) the
time required to collect the reinforcer and return to the lever
(about the same for all subjects) and (b) the average rate of
responding (which differs across subjects).
The average rate of responding does NOT remain constant: it varies +/-
25% across ratios. As I said, this is too early in the argument to be
making approximations. Your amplification was
For a given animal, the rate of responding between collections is
essentially constant regardless of the ratio requirement (I am
assuming the deviations are experimental error, to a first
approximation).
This is the approximation I don't want to make. It is NOT "essentially
constant." It varies. That variation, although not large, may prove to
be essential in constructing a workable model. The same happens with the
control equations: if you assume that the error is "approximately zero"
too early in the argument, you'll end up being unable to explain why
there is any action from the output function.
So this leads to the question: where is there any evidence for
control of reinforcement rate? It would appear instead that a
given level of deprivation, size of reward, etc. as provided in
these experiments sustains a particular rate of responding. I'm
not really comfortable with that conclusion, but it seems to be
implied by the data.
The proof is in the fact that if it were not for the behavior, the
reinforcement rate would be zero. The behavior brings the reinforcement
rate up to some non-zero value, and when the loop gain is high enough
(low ratios) we can see about where the reference level for
reinforcement is (even though it can't be reached exactly). Applying
disturbances would settle the matter.
The reliable effect apparent in the graphs (of rsp rate vs rft
rate) is a reduction in slope. The 80% and 95% lines tend to
converge as the ratio decreases toward FR-2. The implied
"collection rate" is either constant or, oddly enough, somewhat
higher at 95% than at 80%. (I'm just estimating visually from the
plots.)
To say they "tend to converge" doesn't tell me which curve is higher. I
assume that under 95% body weight the apparent reference level for
reinforcement is lower.
If you maintain the same rate of responding, the reward rate will
decrease as the ratio increases. That's just arithmetic. So it seems
that the reinforcement rate has no effect of its own on the behavior
rate; what does affect behavior rate is the error signal, the level of
deprivation.
Yep. But does this make sense?
Not under our simple model. There are conditions under which the
behavior rate is quite uniform (_Schedules of reinforcement_) but these
are not those conditions.
--------------------------------------------
If you will look at my operant conditioning model in which the temporal
details of behavior are brought out (I'll post it again soon for those
who didn't save it or lost it), you'll see that the behavior in these
Staddon experiments is much more like what that model predicts, or can
predict with the proper adjustment of parameters.
In this model, reinforcements are accumulated in a leaky integrator that
generates the perceptual signal. A single reinforcement causes an abrupt
rise in the perceptual signal which then decays exponentially back
toward zero. When reinforcers are occurring at a given rate, the
perceptual signal becomes a sawtooth wave with abrupt rises and slow
declines, oscillating above and below a mean value set by the mean
reinforcement rate. My latest model offered for the Staddon data uses
only mean values, so the oscillating character of the perceptual signal
is omitted. Note that the size of the upward step in the perceptual
signal depends on the size of the reinforcer.
When we include those oscillations, we can see that with the right value
of decay constant the mean perceptual signal will be below the reference
signal, creating some average rate of action. The peaks may rise above
the reference signal, depending on the decay constant and the reward
size. While the perceptual signal is greater than the reference signal,
the error signal is zero (this is a one-way control system). The result
is that the output rate of action will fall to zero immediately after
each reinforcement, and remain there until the perceptual signal decays
to a value smaller than the reference signal: in short, a pause will be
generated. The output function is a relaxation oscillator which produces
actions at a frequency proportional to the error signal.
When the perceptual signal declines below the reference signal, an error
signal will start to grow and the action rate will start to rise from
zero. How fast it rises, and how far, depends on the characteristics of
the output function set by parameters in the model. With a very
sensitive output function, the rate of action will rise very rapidly as
the error signal begins to rise; it can be made to rise to the maximum
rate (another parameter) as rapidly as you please, in the limit creating
the appearance of an output behavior rate that is either zero or some
nonzero constant value. A nonlinear output function could be used to
produce any desired relationship between on-off and some smooth
relationship.
This model, therefore, automatically generates pauses after each
reinforcement. The actions will occur at some high rate after the pause,
with some rate of rise in action frequency and some maximum value of
frequency. We have been assuming a constant behavior rate during the
active phase, but in fact we couldn't tell (from the mean data values) a
constant rate from a rate that changed during the active phase. This is
one reason we need the actual record of actions and reinforcements -- to
test the assumed form of the output function in this model.
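The description above can be condensed into a few lines of simulation. All parameter values below are invented for illustration (this is not the model code referred to for reposting), but even this sketch generates the post-reinforcement pauses automatically:

```python
# Minimal sketch of the one-way control model described above.
# All parameters are invented; this is NOT the posted model code.
dt = 0.1            # simulation step, seconds
decay = 0.05        # leak rate of the perceptual integrator, per second
ref = 10.0          # reference signal
gain = 2.0          # error-to-rate sensitivity, presses/sec per unit error
max_rate = 3.0      # maximum pressing rate, presses/sec
reinforcer = 4.0    # upward step in perception per reinforcement
ratio = 8           # FR requirement

p = 0.0             # perceptual signal (leaky integral of reinforcements)
phase = 0.0         # relaxation-oscillator accumulator for pressing
presses = 0
paused_steps = 0    # steps with zero output: post-reinforcement pauses

for _ in range(int(600 / dt)):           # ten simulated minutes
    error = max(ref - p, 0.0)            # one-way control: never negative
    rate = min(gain * error, max_rate)   # pressing rate follows the error
    if rate == 0.0:
        paused_steps += 1
    phase += rate * dt
    if phase >= 1.0:                     # one press completed
        phase -= 1.0
        presses += 1
        if presses % ratio == 0:         # ratio met: deliver reinforcement
            p += reinforcer              # abrupt rise in perceptual signal
    p -= decay * p * dt                  # exponential decay toward zero
```

Each reinforcement steps the perceptual signal up; while it sits above the reference there is no error and no pressing, exactly the pause pattern the text describes.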
--------------------------------------
It's obvious that the effect of the collection time will vary greatly
depending on the behavior rates and reinforcement rates that occur at
the lower ratios. In the Motheral data, the maximum rate of
reinforcement was about 400 per session. If a session lasted one hour,
that is one reinforcement per 9 seconds. With a collection time of 5.5
seconds, the burst behavior rate would be 9/3.5 or 2.6 times the
apparent mean rate. In the Staddon data, the minimum time between
reinforcements was only 6.4 seconds and the collection time was
apparently 5.2 sec, leaving only 1.2 seconds for the active phase: the
burst behavior rate is then 6.4/1.2 or 5.3 times the mean behavior rate.
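The two factors computed above follow from one small formula (numbers as quoted in the paragraph):

```python
# Factor by which the burst (active-phase) rate exceeds the apparent
# mean rate, given total time between reinforcements and collection time.
def burst_factor(sec_between_rft, collection_sec):
    return sec_between_rft / (sec_between_rft - collection_sec)

motheral = burst_factor(9.0, 5.5)   # about 2.6, as in the Motheral data
staddon = burst_factor(6.4, 5.2)    # about 5.3, as in the Staddon data
```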
I hesitate to predict what my model will do when an assumed collection
time is included in it. I'll put it in as a parameter before posting the
new code.
---------------------------------------
The main impression I'm getting from all these investigations is how
terribly complex the behavior is that is being explored. When you look
at all the details that have to be accounted for in a working model, it
becomes apparent that a simple empirical approach to this kind of
behavior can't possibly sort it all out. It seems to me that EABers have
jumped into the middle of a huge complex system without having looked
into the simplest behaviors that it produces.
One reason I think this is the very nature of the experiments. The
production of repetitive acts would require, in the HPCT model, at least
five levels of control (events). To vary these behaviors in accordance
with a complicated logical condition so as to produce food would require
even higher levels of organization. I don't see how you can do any sort
of orderly investigation of behavioral organization in this way.
What we can hope for is that all these complex relationships seen under
various complex environmental conditions will prove to be the outcomes
of putting a system with a relatively simple internal organization into
different environments. I suspect that once we have a good working model
for ratio schedules, it will continue to work for all other single-
schedule experiments of all types. But we'll see.
-----------------------------------------------------------------------
Best,
Bill P.