Variable Interval Data

[From Bruce Abbott (950716.1400 EST)]

I was off-line yesterday (Steph's birthday); today I see I've got a bit of
catching up to do.

Bill Powers (950714.1205 MDT) --
    Bruce Abbott (950714.1215 EST)

So we agree that the mechanism involved in behavior maintenance is
different from the mechanism involved in "selection" of behavior, and
that the term "reinforcement," if used at all, should not be used for
both. That's a big step forward.

Yes.

    This latter situation is one in which I have always felt that a
    "regulatory" model (as I used to call control-system models)
    provided a far simpler account of the observations.

What PCT has to offer is the idea that it is perception not behavior
that is regulated, and that behavior (visible action) is simply the
means by which perception is controlled. The behavior varies with every
disturbance, whereas the perception (known to us as an input to the
organism) remains under regulation -- control.

My view of regulation was that the set-point could change and the system
would "follow." [I used to observe this process when I worked at O-I in
Glass Science. We used to program a series of changes in temperature in a
furnace containing a sample of glass; the program was an open-loop
cam-and-follower system which varied the set-point of the (closed-loop)
controller.] But as you note, what is actually controlled is the input to
the controller, not its output.
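
A minimal sketch of the set-point-following arrangement described above,
assuming a single integrating controller and an environment in which the
controlled input is simply output plus disturbance. The function name, the
gain, and the time step are illustrative assumptions; they are not taken
from the furnace controller or from any model discussed in this thread.

```python
def simulate_controller(references, disturbances, gain=50.0, dt=0.01):
    """Step a simple integrating control loop through time.

    references   -- programmed set-point at each step (the open-loop "cam")
    disturbances -- external disturbance added to the controlled input
    Returns a list of (reference, controlled_input, output) tuples.
    """
    output = 0.0
    trace = []
    for ref, dist in zip(references, disturbances):
        # Assumed environment: the sensed input is action plus disturbance.
        controlled_input = output + dist
        error = ref - controlled_input
        # Integrating output function: the action keeps changing until
        # the error is gone.
        output += gain * error * dt
        trace.append((ref, controlled_input, output))
    return trace


# Set-point steps from 100 to 200; a disturbance of -20 appears part-way.
references = [100.0] * 100 + [200.0] * 100
disturbances = [0.0] * 50 + [-20.0] * 150
trace = simulate_controller(references, disturbances)
```

In this sketch the controlled input settles on whatever set-point is
programmed, while the output ends up at set-point minus disturbance: the
output is what varies, and the input is what stays under control.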

In the Staddon-Timberlake approach, this issue gets confused by their
use of "contingent behavior" in place of "reinforcement." It is true
that without eating behavior, the reinforcers produced by the
instrumental behavior would not be consumed, but the goal of the
instrumental behavior is not to produce the contingent behavior; it is
to produce the ingestion of food pellets that the contingent behavior
produces. If the instrumental behavior caused the food to be injected
directly into the rat's mouth, no contingent behavior would have to
interrupt the instrumental behavior.

This conception of behavior-as-reinforcer dates to Premack's early studies
in the 60's and his "probability differential" theory. I never bought into
this way of viewing things, for the reasons you note.

The problem with the models I have seen is that they aren't physical
models. Staddon approaches the problem as one of curve-fitting on a
Cartesian plot -- he actually deduces something resembling a reference
level, but as a geometric distance between a point and a curve.
Timberlake does nearly the same thing.

The result of this way of modeling is that the parameters of the model
have no physical significance, not even a proposed physical
significance. The main aim seems to be to find a mathematical form that
will have the same shape as the data plots. This is not modeling, at
least not the kind that goes on in engineering and PCT.

There may be some merit to this functional approach when the underlying
mechanism is unknown and one wants to make predictions. Boyle's Law is a
good example: it relates changes in pressure, temperature, and volume of an
ideal gas but does not explain why these relationships should exist, nor
does it suggest conditions under which the relationships would break down.
Once you have deduced Boyle's Law you can say that you "understand" the
behavior of gases in some limited sense. Staddon's "minimum distance"
approach is such a model.

A far superior approach (which is not always possible in the early stages of
research when very little is known) is to specify the underlying mechanism
through which the observed functional relationships emerge. This provides a
more fundamental level of explanation than the functional approach and will
indicate not only why the observed functions hold under the usual conditions
in which they are observed, but will also indicate those conditions under
which the observed relationships will no longer hold.

The difference between PCT models and these other models is that PCT models
are mechanistic whereas the others are functional. I don't know why anyone
would prefer a functional approach when there is sufficient information
available to formulate and test mechanistic models, which among other
advantages require the proposed mechanism to be housed in a physically real
structure whose properties can be tested.

This is one reason I had reservations about your Ecoli4 model. The logic
circuits I could accept as being physically realizable, but the
reinforcement process, which changed a probability, simply didn't look
physical to me. A probability isn't a physical entity, and changing a
probability isn't a physical operation. I would have much preferred to
see some output device described in a way that _behaved_ as if a
probability were being changed, but which actually operated in ways that
ordinary neural circuits could carry out in a simple way. The
description in terms of probability changes doesn't propose any physical
means of implementing those changes.

As we discussed some time ago, I would prefer that, too. We talked about
ways in which changes in response frequency might be mechanistically
represented.

Re: Hanson-Timberlake model

All that is really going on is fitting of an arbitrarily-selected
mathematical form to the data. There is no physical reason for choosing
this model, and its parameters have no physical meaning, no meaning in
terms of properties of a behaving system's components. The authors have
set up two arbitrary differential equations and found a solution for
them, with no reason to think that these equations have anything to do
with processes in the animal (other than the fact that the resulting
equations can be fit to the data using 6 parameters).

I did take a look at this model yesterday, but I have to admit I can't
really follow it. I don't think Timberlake pursued it after this paper was
published, although he continues to talk about "behavioral set points." I
don't know what Hanson's training was in; he was at Bell Labs at the time and
apparently supplied the mathematical know-how for the paper.

It is a strange paper. The authors assert that they are going to apply
control theory to the problem and then fail to introduce any element of
control theory beyond the concepts of reference and error. They then attempt
to derive a general model for conflict between two control systems whose
actions mutually disturb each other's controlled quantities (behavioral
rates), without supplying any specific environmental "feedback" functions.
Very odd.

Bill Powers (950714.2130 MDT) --

I've been concerned about how to get the model for the Motheral data to
scale up properly when the reference level is increased. I think we are
agreed that the two curves in question differ because the reference
level is different -- lower for the lower levels of deprivation. I've
been looking for some rational basis for the scaling, and I think I have
it.

Yes.

When you plot the various fixed ratios as reinforcements per unit
behavior, they are straight lines drawn from the origin at various
angles. For each ratio, the actual behavior point must lie somewhere on
the corresponding line.

And if raising the reference simply increases the response rate/reinforcer
delivery, all points should merely move outward the same distance along
their respective lines, as appears to be the case in the Motheral
deprivation data.

The line that passes through the peak for the lower curve also passes
through the peak for the upper curve. The scaling of the two curves is
radial around the origin. This says that the droop on the left begins at
the same ratio for both curves. Now all we need is some physical reason
for that to happen (other than sensing the setting of the apparatus!).
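
The geometry described above can be sketched in a few lines, assuming a
plot of reinforcement rate (vertical) against behavior rate (horizontal).
The function names and the sample numbers are hypothetical; they are not
the Motheral data.

```python
def fr_reinforcement_rate(behavior_rate, ratio):
    """On FR-n, every n responses produce one reinforcer."""
    return behavior_rate / ratio


def scale_radially(point, factor):
    """Move a (behavior_rate, reinforcement_rate) point outward from the origin."""
    behavior_rate, reinforcement_rate = point
    return (behavior_rate * factor, reinforcement_rate * factor)


# Hypothetical point on the FR-40 line, scaled by a hypothetical factor.
low_point = (2000.0, fr_reinforcement_rate(2000.0, 40))
high_point = scale_radially(low_point, 1.5)

# The slope (reinforcers per response) is the same before and after scaling,
# so the scaled point still lies on the FR-40 line.
slope_low = low_point[1] / low_point[0]
slope_high = high_point[1] / high_point[0]
```

Because radial scaling multiplies both coordinates by the same factor, it
leaves every point on its own ratio line, which is why a peak at a given
ratio on the lower curve would map to a peak at the same ratio on the
upper curve.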

I'll need some time to think about this model; it looks reasonable but I'm a
bit concerned about your locating the peak of the curve at the point where
benefit - cost = 0. Ideally behavior should cease when the cost exceeds
benefit, although there are data which indicate that rats and pigeons can be
induced to work at a loss on high-ratio schedules if the ratio is approached
gradually from below.

Samuel Saunders [950716:0055 EDT] --

Last fall when there was some discussion of VI schedules, I mentioned that
Dougan and his colleagues were trying to determine when response rate was a
bitonic function of reinforcement rate rather than the more standard
monotonic function. In the July 1995 JEAB there is an article on this
subject:
  Campbell, L.S., and Dougan, J.D. (1995). Within-session changes in the
     VI response function: Separating food density from elapsed session
     time. _JEAB_, 64, 95-109.

This is one of the two ("count them--TWO") articles I mentioned last week
that appear in the latest JEAB which show this kind of functional
relationship. A problem with the Campbell and Dougan data is that these
authors report only the _programmed_ rates of reinforcement. Before
presenting these data on CSG-L I was hoping to obtain the actual rates (if
available) from the authors.

The mean data are not all that useful as the curves shown for individual
animals are quite different from each other. I plan to present individual
data after I get the obtained reinforcement rates.

Bill Powers (950716.0730 MDT) --

General note: The scheme that I described last night does work: the model's
curves scale up along both axes as the reference level increases just as
the two Motheral curves do. The detailed shape doesn't fit as well as my
linear model did (the point for FR-80 is too high), but at this point we're
talking about nonlinearities in the cost/benefit function and there's no _a
priori_ way to guess what they would be. I think the model works reasonably
well now.

Great! How about posting the code?

Re: Campbell and Dougan

A question: the number of reinforcers per hour seems to be simply 3600
divided by the nominal interval in seconds. In other words, it is the
"scheduled" rate of reinforcement, not the actual rate of reinforcement,
that is shown. It seems unlikely that the actual rate would be exactly that
number. What were the actual (obtained) reinforcement rates? We can't
really compare these data to the FR data without those numbers. The
differences may be small, but we might as well do it right.
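
The conversion in question is simple arithmetic. A small sketch follows;
the function name is an assumption, and the sample intervals are chosen to
match rates and intervals mentioned elsewhere in this thread.

```python
def scheduled_reinforcers_per_hour(interval_seconds):
    """Scheduled (programmed) reinforcement rate for a nominal interval.

    This is the upper limit set by the schedule, not the obtained rate,
    which depends on how the animal actually responds.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return 3600.0 / interval_seconds


for seconds in (480.0, 30.0, 6.4):
    print(seconds, scheduled_reinforcers_per_hour(seconds))
```

A 480-second interval gives 7.5 per hour and a 30-second interval gives
120 per hour; the 6.4-second interval comes out a little above 560 per
hour.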

As I mentioned above, the obtained rates of reinforcement were not reported.
I'll see if I can get them.

    In their discussion, Dougan and Campbell say "Because the present
    experiments showed food density to be an important factor, it may be
    tempting to dismiss all reported instances of bitonicity as being
    merely due to an artifactual satiation process." They present
    several arguments against such a conclusion. They note that little is
    known about satiation.

Bruce Abbott, take note: this is exactly what I guessed would be said. The
dropoff of behavior rate with reinforcement rate, as the authors note,
could be attributed to a "satiation process" even though actual satiation
(cessation of behavior) has not occurred.

The authors are suggesting that at high reinforcement rates the animals
would begin to become satiated (less hungry) earlier in the session than at
low reinforcement rates. From the point of view of the Motheral curves, it
would be as though data from early in the session had come from the 85%
deprivation condition and the data from late in the session had come from,
say, a 95% deprivation condition: the entire curve should shrink without
changing its overall shape. The problem is, however, that rate of satiation
depends on rate of reinforcement, so satiation would begin earlier for the
smaller ratios than for the higher ratios. Thus satiation rate and ratio
requirement are confounded.

The rise in behavior rate as
reinforcement rate rises from 7.5 to 120 per hour is not considered
problematic (in that it fits the Law of Effect). It is the fall in behavior
rate as reinforcement rate rises above 120/hr that presents a puzzle to
reinforcement theory -- the "bitonicity" of the curve. If the behavior rate
had simply leveled off at 1500-1920 per hour, this would have been called
the maximum possible behavior rate, and there would have been no problem:
the conclusion would be that experimental conditions should be confined to
the range from 7.5 to 120 reinforcements per hour in order to see the
effects clearly. How many experimenters have simply ASSUMED that the
leveling-off represented maximum behavior rate or approaching satiation,
and failed to explore higher reinforcement rates or sizes?

In fact there is less evidence in the Campbell and Dougan data of a fall-off
in the early-session data than in the late-session data, and prefeeding
makes early-session data look like late-session data. Furthermore there are
several earlier studies which did not find a fall-off even at high
reinforcement rates. These earlier studies (which I described in an earlier
post) led experimenters to conclude that behavior rate increased to an upper
asymptote.

At the longest intervals, we have almost reached the conditions of the FR-1
schedule: By very rough extrapolation, there would be 1 response per
reinforcement at about 560 reinforcers per hour (interval = 6.4 sec), and
the reference level (actual satiation level) would be 617 reinforcements
per hour (average of all three lines of data). These numbers are at least
comparable to those we have seen on the Motheral curves, where the
reference level at 85% of free-feeding weight is about 400 - 420 responses
per session (what was the session length?). At 80% of ffw that reference
level would have been somewhat higher.

You mean at the _shortest_ intervals, right? Nice to see that consistency
in estimated reference levels between studies after adjustment for
deprivation level.

It looks to me as though the Motheral curve is seen for interval schedules
as well as ratio schedules, and the same control-system model will work for
either kind of schedule.

Yes, although I still harbor some reservations that the same model of the
rat should apply under both schedules. What concerns me is that ratio
schedules permit rate of behavior to control rate of reinforcement
throughout the range of behavior rates, whereas interval schedules do not.
In interval schedules, once the rate of behavior produces the maximum
(scheduled) rate of reinforcement, further increases in behavior rate have
no further effect on reinforcement rate--in this region the control system
is essentially operating in open-loop mode. Wouldn't this change its
performance?
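
The contrast between the two feedback functions can be sketched as
follows. The ratio function follows directly from the schedule definition.
The interval function is an assumed approximation, not something taken
from this thread: it supposes that the mean time between reinforcers is
the scheduled interval plus the mean wait for the next response. It
approaches the scheduled ceiling smoothly rather than hitting it abruptly.

```python
def ratio_feedback(behavior_rate, ratio):
    """Reinforcers per hour on FR-n: proportional to behavior rate."""
    return behavior_rate / ratio


def interval_feedback(behavior_rate, interval_hours):
    """Reinforcers per hour on an interval schedule (assumed approximation).

    Mean time between reinforcers is taken to be the scheduled interval
    plus the mean time between responses (1 / behavior_rate).
    """
    if behavior_rate <= 0:
        return 0.0
    return 1.0 / (interval_hours + 1.0 / behavior_rate)


vi_30_sec = 30.0 / 3600.0  # scheduled ceiling of 120 reinforcers per hour

for rate in (100.0, 500.0, 1500.0, 3000.0):
    print(rate, ratio_feedback(rate, 40), interval_feedback(rate, vi_30_sec))
```

Under this assumption, doubling the behavior rate from 1500 to 3000 per
hour doubles the reinforcement rate on the ratio schedule but raises it by
only a few percent on the interval schedule, so the loop gain through the
environment is close to zero in that region.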

    Whatever the EAB implications, the results fit nicely with the current
    discussion of PCT interpretation of schedules of reinforcement.
    Campbell and Dougan cite 6 previous studies in the literature showing
    "bitonic" functions for interval schedules.

Good, can we get the data for those, too? Gentlemen, I think I smell a
paper coming up here, with at least three authors, two of whom have EAB
credentials. Does it seem that way to you, too?

I think we still have quite a bit of work ahead of us, but my intention all
along has been to get to the point where we can write that paper, in a
language that will be understood by researchers in this area. And it
appears that the theoretical problems posed by data of this type are
beginning to spark interest, so the timing is excellent.

Regards,

Bruce

[From Bruce Abbott (950717.1210 EST)]

Bill Powers (950716.1315 MDT) --
    Bruce Abbott (950716.1400 EST)

    There may be some merit to this functional approach when the
    underlying mechanism is unknown and one wants to make predictions.

I have a problem when the approach treats the plots of variables as
phenomena, as Staddon does, and _visual_ relationships are considered
important. That's getting pretty far from the system being modeled.

Not to defend a purely functional approach, but I don't think this is what
Staddon was doing. The geometrical stuff was just a way to represent how a
system gets from State A to State B when its two rate variables are subject
to the constraint imposed by the schedule, assuming that it functions in an
"optimal" way and given a particular definition of "optimality."

    Boyle's Law is a good example: it relates changes in pressure,
    temperature, and volume of an ideal gas but does not explain why
    these relationships should exist, nor does it suggest conditions
    under which the relationships would break down. Once you have
    deduced Boyle's Law you can say that you "understand" the behavior
    of gases in some limited sense. Staddon's "minimum distance"
    approach is such a model.

Boyle's law at least looked for relationships among physical variables
as measured.

Staddon's plot shows rate of reinforcement versus rate of responding. What
could be more physical than that?

    I don't know why anyone would prefer a functional approach when
    there is sufficient information available to formulate and test
    mechanistic models, which among other advantages require the
    proposed mechanism to be housed in a physically real structure
    whose properties can be tested.

Right, me too. In fact, I can't imagine a functional model that isn't
really just curve-fitting. A useful term is _generative_ model; a model
that generates behavior out of its own structure. To make a generative
model you have to propose components of the system, each with its own
properties, and then put them together to deduce what behavior they
would create. Functional models seem to me to amount to little more than
manipulating equations, hoping that some interesting relationship will
come out of them -- but with no basis for the equations.

Now there's a term I like: _generative model_. Has a nice ring to it. I'll
have to remember that.

Functional models are not entirely without merit; there are some useful
things that can be done with them. For example, you might develop a set of
curves showing how size, moisture content, temperature, and baking time
relate when cooking, say, a turkey. It may be possible to come up with a
detailed generative model of a turkey [and the roasting environment] that
would give you the same answer, but the functional approach is simpler,
easier, and provides a generally adequate solution to the problem. And, of
course, the observed functional relationships provide a test of a given
generative model. If the model is good, it should produce the observed
relationships when subjected to the conditions under which those
relationships were identified. Functional models are a good first step, but
unfortunately in EAB they are considered to be the final step.

    Ideally behavior should cease when the cost exceeds benefit

Behavior should cease _increasing_ when a further increase would produce
no "marginal utility" (as Staddon calls it). Since we're talking about
slopes, think in terms of increments. At FR-40, the animal is expending
energy at a rate that gives it as much reinforcement as possible under
those conditions. If it increased its behavior rate any further, the
increase in cost would exceed the increase in benefit. If it responded
slower, it would lose more benefit than it would gain by a reduction in
cost. So this is the optimum point for gaining as much net benefit as
possible.

Well, I knew there was _something_ wrong! I was thinking in terms of
absolute values rather than rates. Is my face red . . .
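
The distinction between the two criteria can be made concrete with a toy
example. Both functions below are purely hypothetical, chosen only so that
benefit shows diminishing returns and cost rises linearly; they are not
the cost/benefit function of the model under discussion.

```python
import math


def benefit(behavior_rate):
    """Hypothetical benefit with diminishing returns."""
    return 2.0 * math.sqrt(behavior_rate)


def cost(behavior_rate):
    """Hypothetical cost, proportional to behavior rate."""
    return 0.1 * behavior_rate


def net_benefit(behavior_rate):
    return benefit(behavior_rate) - cost(behavior_rate)


def best_rate(candidates):
    """Return the candidate rate with the largest net benefit."""
    return max(candidates, key=net_benefit)


candidates = range(0, 401)
optimum = best_rate(candidates)
```

With these assumed functions the net benefit peaks at a rate of 100, where
the increment in benefit equals the increment in cost, and total benefit
still exceeds total cost there. Benefit minus cost does not reach zero
until a rate of 400, which is the "absolute values" criterion rather than
the marginal one.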

Re: Campbell & Dougan data

In finding the reference levels for these data, you used
responses/reinforcement = 0 rather than response rate = 0. I'm not clear
why. For the ratio data we used the latter.

Regards,

Bruce