Cyclic-ratio data: one more time

[From Bruce Abbott (950725.2110 EST)]

Bill Powers (950725.0950 MDT) --

I still get an uncomfortable feeling that you do not quite follow my line of
thinking yet on these cyclic-ratio data. You say things like

This is basically doing what Rick said: applying the inverse of the
function that you used to get the intermediate result. The difference is
that you removed the intercept first, but other than that you haven't
really got any new result.

and

That is indeed a problem. It's especially a problem because of using the
ratio as an independent variable and extrapolating b/m vs m to b = 0.

The latter, especially, is a mischaracterization. So I hope you'll forgive
me if I go over it again.

If we take the inverse of the reinforcement rates and convert, we get
seconds per reinforcement. This is the average time required to complete
one reinforcement "cycle": complete each response, collect the pellet, and
return to the lever. This number is NOT, repeat NOT, b/m. If it were, its value
at m = 0 would be undefined, yet we can get a perfectly reasonable number there.
Furthermore, it would reflect a different quantity at each ratio value
rather than the same old seconds to complete one cycle. It is NOT
appropriate to substitute "behavior rate" for reinforcement rate on this
graph, as you seem to insist on doing.

As to the x-axis, the ratio value is the number of responses required to
complete one reinforcement cycle. So we are plotting the average time
required to complete one cycle as a function of the number of responses
required to complete one cycle.

At the y intercept, no responses are required to complete one cycle. Thus
the value of the y intercept is the number of seconds required to leave the
lever, collect the reinforcer, and return to the lever.

The value for a given animal at a given ratio is the number of seconds
required to complete the given ratio (e.g., complete 2 responses on FR-2),
collect the reinforcer, and return to the lever.

The slope of the line for a given animal is the increase in time per
additional required response. Thus, if the slope is 0.50, it requires an
additional 1/2 second per additional lever press to complete the cycle.
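To make the reading of intercept and slope concrete, here is a minimal sketch of the straight-line relation described above. The function name and the intercept of 5.0 seconds are hypothetical, chosen only for illustration; the slope of 0.50 is the example value from the preceding paragraph.

```python
def cycle_time(ratio, collection_time, seconds_per_response):
    """Seconds per reinforcement cycle under a straight-line model."""
    return collection_time + seconds_per_response * ratio

# Hypothetical values, for illustration only (not fitted to any animal).
COLLECTION_TIME = 5.0  # y-intercept: leave lever, collect pellet, return
SLOPE = 0.50           # extra seconds per additional required lever press

for ratio in (0, 1, 2, 8):
    seconds = cycle_time(ratio, COLLECTION_TIME, SLOPE)
    print(f"ratio {ratio}: {seconds:.2f} s per cycle")
```

At a ratio of zero the function returns the collection time alone, and each additional required response adds exactly the slope.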

If you plot the seconds/cycle as a function of the number of responses per
cycle, you get four essentially straight lines (within experimental error).
The lines all converge to nearly the same intercept (collection time) but
have different slopes (rates of increase in cycle time per additional
response). Note that the ratio value should be plotted on a linear, not
log, scale. Minitab gives the following regression results for these four
lines:

Rat   intercept (s)   slope (s/response)     r      r-sq
C1        5.35              0.679          1.000   1.000
C2        5.77              0.528          0.999   0.997
C3        5.47              0.458          0.999   0.999
C4        5.21              0.397          0.999   0.998

You will note that the correlations are extremely high, indicating an
excellent linear fit to the data. When the data are plotted as in the
Motheral-type curve, the points from different ratios are approximately
equally spaced along the x-axis, and this has the effect of exaggerating the
appearance of curvature in the data. When plotted as above, the curvature
all but disappears from view for the three animals whose data seem to
indicate it. Note that the linear function is accounting for at WORST 99.8%
of the variance, leaving only 0.2% to be explained by the nonlinear
component, if such really exists.
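As a rough illustration of what the fitted lines imply, the sketch below plugs the intercepts and slopes from the table above into the straight-line relation and converts the predicted cycle times to reinforcers per hour. The ratio values in RATIOS are an assumption (this post does not list the ratios actually used), and the function names are hypothetical.

```python
# (intercept, slope) in seconds, copied from the Minitab table above.
FITS = {
    "C1": (5.35, 0.679),
    "C2": (5.77, 0.528),
    "C3": (5.47, 0.458),
    "C4": (5.21, 0.397),
}

# Assumed ratio values; the ratios actually run are not given in this post.
RATIOS = (2, 4, 8, 16, 32, 64)

def predicted_cycle_time(rat, ratio):
    """Predicted seconds per reinforcement cycle for one rat at one ratio."""
    intercept, slope = FITS[rat]
    return intercept + slope * ratio

def reinforcers_per_hour(rat, ratio):
    """Reinforcement rate implied by the predicted cycle time."""
    return 3600.0 / predicted_cycle_time(rat, ratio)

for rat in FITS:
    cells = "  ".join(f"{predicted_cycle_time(rat, n):6.2f}" for n in RATIOS)
    print(f"{rat}  seconds/cycle: {cells}")
```

These are predictions from the regression lines, not the observed data points.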

One nice thing about plotting the data this way (in addition to the
excellent linear fit) is that the "collection time" and "peak" response rate
are orthogonal variables; one can vary without necessarily affecting the
other. They may in fact vary together in practice, but they will not do so
out of logical necessity. In contrast, plotting the data as in the Motheral
curve gives a line whose slope depends both on the "peak" response rate and
on the reinforcement rate at a behavior rate of zero.
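The dependence can be checked numerically. If cycle time is intercept + slope * ratio, then reinforcement rate r = 1/cycle and behavior rate b = ratio/cycle satisfy intercept * r + slope * b = 1, which is a straight line in the rate plane whose steepness is set by both parameters together. The sketch below assumes that linear model, takes "peak" response rate to mean 1/slope, and uses hypothetical function names; the numbers are the C1 fit from the table.

```python
def rates(ratio, intercept, slope):
    """Return (behavior_rate, reinforcement_rate) in events per second."""
    cycle = intercept + slope * ratio
    return ratio / cycle, 1.0 / cycle

def motheral_line(intercept, slope):
    """Parameters of the line r = r0 - k * b implied by the linear model."""
    r0 = 1.0 / intercept   # reinforcement rate at a behavior rate of zero
    b_peak = 1.0 / slope   # behavior rate approached as r goes to zero
    k = r0 / b_peak        # steepness depends on BOTH r0 and b_peak
    return r0, b_peak, k

INTERCEPT, SLOPE = 5.35, 0.679  # rat C1, from the regression table
r0, b_peak, k = motheral_line(INTERCEPT, SLOPE)
print(f"r0 = {r0:.4f}/s, peak b = {b_peak:.3f}/s, k = {k:.4f}")

for n in (2, 8, 32):  # illustrative ratio values
    b, r = rates(n, INTERCEPT, SLOPE)
    print(f"ratio {n}: b = {b:.4f}, r = {r:.4f}, line gives {r0 - k * b:.4f}")
```

Changing either the intercept or the slope changes k, whereas in the seconds-per-cycle plot each parameter can be read off separately.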

What is causing us problems here is the fact that the output
is not simply proportional to the error: the output occurs in bursts
with pauses between them. So we're dealing with a non-simple output
function, which makes the required model more complex.

Yes, we have a "chain" in which bringing one variable to its reference value
(pellet present in food cup) must be completed using one set of actions
before a second variable can be brought to its reference value (food in
mouth) by means of a different set of actions. The two control systems must
alternate.

Ettinger and Staddon present data on interresponse times (IRTs) which
indicate that the peak of the IRT distribution is not affected by the ratio
requirement; essentially, when the animal responds to knock off the ratio,
it does so at the same relatively steady rate regardless of the length of
the ratio. The peak IRT is about 0.2 seconds. This rate is too high to
account for the average time per response as reflected in the slope of the
function relating cycle time to ratio value. The "missing" time is spent
almost entirely in the post-reinforcement pause. The length of this pause
increases in proportion to the number of responses required in the
UPCOMING ratio. Since the "running rate" during completion of the ratio
requirement is essentially constant, the number of responses required
(effort) and the time required to complete the ratio are confounded. So
what we can say at this point is that the length of the pause prior to
beginning the next ratio is proportional to both the effort and time
required to complete that next ratio; we cannot separate these two at present.
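A back-of-the-envelope version of this argument, as a sketch only: if every response in the ratio run took exactly the peak IRT of about 0.2 seconds, and all the remaining time per response went to the postreinforcement pause, then the pause would grow by (slope - 0.2) seconds per required response. Both simplifications are assumptions of this sketch, and the function name is hypothetical; the slopes are those from the regression table.

```python
RUNNING_IRT = 0.2  # seconds; approximate peak IRT from Ettinger and Staddon

# Slopes (seconds per additional required response) from the regression table.
SLOPES = {"C1": 0.679, "C2": 0.528, "C3": 0.458, "C4": 0.397}

def pause_per_response(slope, running_irt=RUNNING_IRT):
    """Pause growth per required response, assuming the rest is run time."""
    return slope - running_irt

for rat, slope in SLOPES.items():
    pause = pause_per_response(slope)
    share = pause / slope
    print(f"{rat}: {pause:.3f} s of pause per response ({share:.0%} of slope)")
```

Under these assumptions the pause accounts for roughly half or more of each animal's slope, which is why the running rate alone cannot explain the average time per response.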

At zero responses per cycle the "collection time" includes whatever
postreinforcement pause may follow consumption of the food pellet. This
would include the "decay time" following consumption of the pellet included
in your operant conditioning model. Raising the response requirement
increases the cycle time both by increasing the time required to complete
the ratio during the ratio run and by increasing the postreinforcement pause
by an amount proportional to the effort/delay imposed by the increased
requirement. Theoretically, increasing the delay from first response to
reinforcement and the effort required to obtain the pellet should reduce the
value of obtaining the pellet, leading to a slower return to activity aimed
at the goal of producing the next pellet.

The main impression I'm getting from all these investigations is how
terribly complex the behavior is that is being explored. When you look
at all the details that have to be accounted for in a working model, it
becomes apparent that a simple empirical approach to this kind of
behavior can't possibly sort it all out. It seems to me that EABers have
jumped into the middle of a huge complex system without having looked
into the simplest behaviors that it produces.

It is all so very seductive. It SEEMS so simple on the surface. What could
be simpler than rewarding every, say, 8th response? Not only that, but you
can get clear, reproducible functions as you manipulate such things as the
ratio requirement, level of deprivation, and effort required to press the lever.

What we can hope for is that all these complex relationships seen under
various complex environmental conditions will prove to be the outcomes
of putting a system with a relatively simple internal organization into
different environments. I suspect that once we have a good working model
for ratio schedules, it will continue to work for all other single-
schedule experiments of all types. But we'll see.

I'm still hopeful. But I think we will be well served by paying attention
to the details revealed by studies like Ettinger and Staddon's and using
that information to improve our guesses about what variables may be under
control or may affect the parameters of the control systems we specify in
the model.

By the way, have you received the material I sent you last week via snail mail?

Regards,

Bruce