[From Bill Powers (951216.1245 MST)]
Bruce Abbott (951215.2030 EST)--
Just to tweak your interest a little, behavior on ratio schedules
has a strong tendency to be two-valued: high rate or zero. A high
rate is not infinite, of course, but the instability is just what
the first equation predicts. Put THAT in your snipe and poke it!
According to your tentative findings last summer, repetitive pressing
behavior on ALL schedules may have a strong tendency to occur at a high
rate or not at all. The main research question before us is whether
animals _ever_ vary their rate of pressing as a function of schedule. It
seems perfectly possible that the main variable is the intervals of non-
pressing that occur during a run. Of course when pressing is computed
over long periods of time as total presses divided by total time, there
can be an appearance that there is a smooth relationship, but this gives
entirely the wrong picture -- as your analysis suggested quite strongly.
Your "interest-tweaking" factoid is misleading, because it refers not
just to behavior on a single schedule but also to the division of
behavior between alternate keys -- so-called "matching." If your prior
analysis holds up, it may be true of both situations, but there are
really two distinct situations that have to be handled.
During the "acquisition" phase of bar-pressing behavior, there are two
dimensions of control to be considered by the rat: _where_ to apply
actions, and _how much_ action to apply (there are others: what kind of
action and under what conditions to act, for example). The "where"
dimension involves moving around in the cage and finding places where
there is a maximum in the rate of reinforcement. Then, when the
coordinates of maximum reinforcement rate are found, the problem is
to vary the behavior in that one place to further increase the
reinforcement, if possible bringing the rate up to the reference level.
The "where" problem involves finding the location in space that goes
with the maximum yield of reinforcers. The organization of the control
system that does this must be like the organization we use for tuning a
radio (if it's systematic). A simple design is to monitor the rate of
change of reinforcement rate and reverse the direction of spatial
movement whenever the rate of change becomes negative. This will result
in an oscillatory solution where the location moves back and forth
across the peak yield. If the velocity of spatial movement of the target
position is proportional to the rate of change of reinforcement rate,
the oscillations can converge to a steady position. Many other designs
are possible, including an E. coli biased random walk. By observing how
animals zero in on the right location, we should be able to model the
strategy that is actually used.
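Just to make the reverse-on-negative-slope design concrete, here is a minimal sketch of it. This is only my illustration, not a model we have agreed on: the Gaussian yield landscape, the gain, and the step size are all invented for the example.

```python
import math

def reinforcement_rate(x, peak=5.0):
    """Invented yield landscape: reinforcement rate is maximal at x = peak."""
    return math.exp(-0.1 * (x - peak) ** 2)

def seek_peak(x0=0.0, dt=0.1, gain=20.0, steps=400):
    """Move along one spatial dimension, monitoring the rate of change of
    reinforcement rate; reverse direction when it goes negative, and make
    speed proportional to its magnitude so the oscillation converges."""
    x, direction, v = x0, 1.0, 1.0
    prev = reinforcement_rate(x)
    for _ in range(steps):
        x += direction * v * dt               # move at the current speed
        rate = reinforcement_rate(x)
        d_rate = (rate - prev) / dt           # rate of change of reinforcement rate
        if d_rate < 0:
            direction = -direction            # reverse when the yield starts falling
        v = min(1.0, gain * abs(d_rate))      # speed ~ |d_rate|: steps shrink near the peak
        prev = rate
    return x
```

Run with the defaults, the searcher settles close to the peak at x = 5 instead of oscillating across it forever, which is the point of making velocity proportional to the rate of change. An E. coli biased random walk would replace the deterministic reversal with a tumble whose probability rises when d_rate is negative.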
The "how much" problem is confounded with the "where" problem when we
use only the apparent rate of behavior at a single location as the
criterion. A rise in the apparent rate of behavior can come from (1) an
approach of the position-control system to the right location with
constant amount of food-getting action, (2) a quantity-control system
that is increasing the amount of food-getting action (i.e., learning the
right pattern of actions to bring the reinforcement rate toward the
reference level) while the position of action remains in the right
place, or (3) any combination of the two. To separate the two dimensions
of control for modeling purposes, we have to keep track of both
variables: position of acting and amount of acting. These can be under
independent control if the controlled variables are reasonably
orthogonal. In systems analysis, the value of a variable and its first
derivative are treated as defining independent dimensions.
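The orthogonality point can be sketched with two decoupled loops, one controlling position of acting and one controlling amount of acting. Again this is only an illustration; the reference levels and gain are made up.

```python
def run_two_loops(ref_pos=5.0, ref_amount=2.0, gain=0.2, steps=200):
    """Two independent integrating control loops. Because each loop's
    controlled variable is unaffected by the other loop's output, each
    converges on its own reference level regardless of the other."""
    pos, amount = 0.0, 0.0
    for _ in range(steps):
        pos += gain * (ref_pos - pos)             # "where" loop corrects position error
        amount += gain * (ref_amount - amount)    # "how much" loop corrects amount error
    return pos, amount
```

If the controlled variables were not orthogonal (say, pressing harder also moved the animal), cross-coupling terms would appear in both loops and the two dimensions could no longer be modeled independently.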
In an ideal experiment, we should be able to monitor control of "where"
and "how much." Imagine a cage with a lot of identical levers ranged
along one wall (I saw a paper in which an arrangement similar to this was
used, but don't have it at hand right now). One lever is the "right
place" and the other levers have no effect except to produce the same
click (but no food). After the rat has learned to press the right lever
for food, we can vary both the schedule and the assignment of the right
lever, and observe both "where" control and "how much" control.
The reinforcement and control models I described apply to the situation
where "where" is being maintained in the right position, and only "how
much" behavior is varied. Each assumes that rate of behavior varies with
the schedule, but if that is not the operative output variable, the
analysis can be done using whatever the correct output variable is. The
actual output variable might be a combination: a crude adjustment of
press/don't-press combined with a fine adjustment of interval between
bursts of pressing. A single control-system model can be made to behave
this way (OPCOND5 will do this if the output gain is set high and the
error limit is set low).
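The two-valued pattern from a single control system can be sketched very simply (this is not OPCOND5 itself; the gain and limit values are invented). With a very high output gain and a saturation limit, any appreciable error drives pressing at the maximum rate, while near-zero error produces essentially none, so observed rates cluster at "high" or "zero":

```python
def press_rate(error, gain=1000.0, max_rate=120.0):
    """High-gain output function with saturation at max_rate presses/min.
    The high gain makes the output effectively two-valued: saturated for
    any error above max_rate/gain, near zero otherwise."""
    return max(0.0, min(max_rate, gain * error))

# Tiny errors give near-zero output; any appreciable error saturates.
rates = [press_rate(e) for e in (0.0, 0.001, 0.5, 2.0)]
```

The fine adjustment of interval between bursts would then ride on top of this crude press/don't-press output.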
Finally, the critical test of the reinforcement model and the control-
system model involves applying a disturbance to the reinforcement rate.
In the reinforcement model, this should _increase_ behavior, or if
limits are used, leave it the same. In the control model, it should
_decrease_ behavior, opposing the disturbance so that the reinforcement
rate stays near its reference level.
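The critical test can be sketched in a few lines (my illustration, not code either of us has written): deliver some reinforcers independently of pressing, and compare the steady-state press rates the two models predict. The schedule ratio, gains, and reference level are invented for the example.

```python
def control_model(disturbance, ref=10.0, gain=5.0, ratio=10, steps=500):
    """Pressing adjusts to keep obtained reinforcement rate near the reference."""
    rate = 0.0
    for _ in range(steps):
        reinf = rate / ratio + disturbance                   # FR schedule plus free reinforcers
        rate = max(0.0, rate + 0.1 * gain * (ref - reinf))   # integrate the error
    return rate

def reinforcement_model(disturbance, k=2.0, ratio=10, steps=500):
    """Pressing rate is strengthened in proportion to obtained reinforcement."""
    rate = 1.0
    for _ in range(steps):
        reinf = rate / ratio + disturbance
        rate = k * reinf
    return rate
```

In this sketch the disturbance raises the reinforcement model's steady-state rate and lowers the control model's, so the sign of the change in pressing is the diagnostic.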
Have fun in the sun. This should give you something to think about while
you're away.