# VI feedback function

[From Bruce Abbott (960215.1505 EST)]

Bill Powers (960215.0100 MST) --

Killeen offers this as the formula for the feedback function (in terms
of interresponse intervals):

R = [1 - exp(-IRT/T)]/IRT

where T is the mean interval

I think your formula is an approximation of this, based on the series
expansion of the exponential; if so, it isn't a valid approximation for
all values of IRT/T, as it drops the terms with exponents greater than
1. You might want to plug this formula into your program to see whether
the numbers change very much.
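A quick sketch of that comparison (Bill, your formula isn't quoted here, so the truncated-series version below is my guess at the kind of approximation in question -- keeping only the terms of the exponential with exponents up to 1):

```python
import math

def vi_feedback_exact(irt, T):
    """Killeen's VI feedback function: reinforcement rate R as a
    function of interresponse time (IRT) and mean interval T."""
    return (1.0 - math.exp(-irt / T)) / irt

def vi_feedback_series(irt, T, order=1):
    """Same function with exp(-IRT/T) replaced by its Taylor series
    truncated at the given order.  With order=1 the IRT dependence
    cancels and R reduces to 1/T, which is why the approximation
    breaks down as IRT/T grows."""
    x = irt / T
    exp_trunc = sum((-x) ** k / math.factorial(k) for k in range(order + 1))
    return (1.0 - exp_trunc) / irt

T = 30.0  # mean scheduled interval, s
for irt in (1.0, 5.0, 15.0, 30.0, 60.0):
    exact = vi_feedback_exact(irt, T)
    approx = vi_feedback_series(irt, T)
    print(f"IRT={irt:5.1f}  exact R={exact:.5f}  series R={approx:.5f}  "
          f"rel.err={(approx - exact) / exact:+.1%}")
```

The relative error is small when IRT << T but grows without bound as IRT approaches and exceeds T, which is the point about dropped higher-order terms.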

I worked up a simulation in which a "statrat" presses a lever at some
specified rate and earns reinforcers on a constant-probability VI schedule.
Both the rat and schedule used a RANDOM < p mechanism in which p is the
probability of a response (or reinforcement) on each iteration. After each
"session" the program prints out the number of reinforcers programmed, the
number actually obtained, the average programmed interreinforcement
interval, and the average obtained interreinforcement interval. I tried
several schedule values and response rates; it looks like my formula for the
VI feedback function is a good approximation under these conditions. For
example, over ten simulated 30-min sessions at VI 30-s with a 30-s IRT there
were on average 59 reinforcers programmed, the average programmed interval
was 30.38 s, there were 28.3 reinforcers delivered, and the average obtained
interval was 62.01 s. From the formula the predicted average obtained
interval is 61.29 s, about a 1% error and within the margin of error for
estimation. Repeating with a 1-s IRT produced an average of 28.75
s/interval programmed, 29.61 s/interval obtained, and 29.75 s/interval
expected from the formula, for an error of less than 0.5% and again within
the margin of error for estimation.

It would be very interesting to see where, on interval schedules, the
maximum of the R vs B curve occurs, in terms of the effective loop gain,
g. Keep in mind that we have not established the reason for the low
behavior rates at the lowest reinforcement rates -- whether the animal
is "emitting operants" at a steady low rate, or at a very high rate, but
interspersed with long periods of doing something else. It is possible
that the maximum of the R/B curve represents the point where a
significant proportion of the time is spent not pressing the lever but
engaging in other behaviors. If this turnover region tends to occur
where the loop gain has fallen close to 1 (or some low number), we might
have a regularity that will tell us something interesting.

Yes, I'd like to investigate that -- if I can find some good representative
curves. I've discovered something interesting while simulating VI schedule
performance under the assumption that reinforcement rate is controlled.
Assume that the reference rate is something relatively high, like one pellet
per 10 seconds. On a VI 10-s schedule, this rate equals the programmed rate
of reinforcement. Assuming a gain of 100 for the organism side of the loop,
the rat will respond at an average IRT of 1.05 s and receive one pellet per
11.05 s. The loop gain at this rate is 9.512. If we now raise the schedule
requirement to VI 15-s, the schedule does not permit the rat to reach its
reference level of 1 pellet/10 s. The error increases, driving behavior
rate up into the lower-gain region of the VI curve. IRT falls to 0.29
s/response, loop gain drops to 1.891, and the rat receives one pellet per
15.29 s. Now increase the schedule requirement to VI 30-s. The irreducible
error becomes even larger, driving responding even further up the low-gain
part of the curve: IRT is now 0.15 s, gain = 0.496, and the rat receives 1
pellet/30.15 s. At VI 60-s the IRT becomes 0.12 s, gain = 0.20, and the
rat receives 1 pellet/60.12 s.

Of course, the external observer knows nothing of the rat's 10 s/pellet
reference rate and loop gain. Assuming that the rat is attempting to
collect each reinforcer when it becomes available, the observer uses the
scheduled VI as the reference value and computes the gain from the error
between this reference and the obtained value, obtaining the following:

VIsch   VIobt   Error    Gain
10.00   11.05    1.05    10.5
15.00   15.29    0.29    52.7
30.00   30.15    0.15   201.0
60.00   60.12    0.12   501.0
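The table can be reproduced directly -- my reading of the computation described above is that the observer takes the scheduled interval as the reference, the difference between obtained and scheduled as the error, and the ratio of obtained value to error as the apparent gain:

```python
# Observer's-eye view: reference = scheduled VI, error = obtained -
# scheduled, apparent gain = obtained / error (my reading of the
# computation described in the text).
rows = [(10.00, 11.05), (15.00, 15.29), (30.00, 30.15), (60.00, 60.12)]
print(f"{'VIsch':>6} {'VIobt':>6} {'Error':>6} {'Gain':>6}")
for vi_sch, vi_obt in rows:
    error = vi_obt - vi_sch
    gain = vi_obt / error
    print(f"{vi_sch:6.2f} {vi_obt:6.2f} {error:6.2f} {gain:6.1f}")
```

Because the error term (the rat's IRT) shrinks as the schedule lengthens while the obtained value grows, this ratio is bound to climb with the VI value -- which is exactly the illusion described next.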

Loop gain appears to be _increasing_ with the size of the VI schedule! The
rat appears to be maintaining its obtained rate of reinforcement closer to
the programmed rate as the programmed rate of reinforcement declines. By VI
60-s, the loop gain appears to have grown to 501 when in fact it is only 0.20.

There are, of course, a couple of problems with this scenario. First,
response rate actually _declines_ as the size of the average interval
increases, whereas the simulation predicts that response rate should
increase. Second, we have no idea whether rats and pigeons attempt to
control the _rate_ of food delivery on operant schedules.

If it is assumed in the simulation that the rats are attempting to collect
reinforcers at the scheduled rate (i.e., as soon as available), then the
following is true. First, the loop gain remains constant at 9.512 at all
programmed schedule values. Second, response rate declines as the interval
becomes larger:

VIsch    IRT   Resp/m
   10   1.05     57.1
   15   2.10     28.6
   30   3.15     19.0
   60   6.31      9.5

Note that IRT increases as a linear function of the interval size; here,
IRT = 0.105*VI. Reducing the output gain from 100 to 10 changes the
slope coefficient to 0.370, giving an IRT of 3.7 s at VI 10-s and 22.2 s at
VI 60-s.
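The arithmetic linking the table to the linear relation can be checked in a few lines (the 0.105 slope is the value quoted above for an output gain of 100):

```python
# Response rates implied by the simulated IRTs above, alongside the
# linear prediction IRT = 0.105 * VI quoted for output gain = 100.
table = [(10, 1.05), (15, 2.10), (30, 3.15), (60, 6.31)]
for vi, irt in table:
    resp_per_min = 60.0 / irt          # responses per minute from IRT
    predicted_irt = 0.105 * vi         # linear prediction
    print(f"VI {vi:2d}-s: IRT={irt:4.2f} s  resp/min={resp_per_min:4.1f}  "
          f"0.105*VI={predicted_irt:4.2f}")
```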

These changes are at least in the right direction. However, McSweeney's
data show a larger error between scheduled and obtained reinforcement rates
at VI 15-s than at lower values. This inconsistency may be due to the
inclusion of collection time in the average rates, something not taken into
account in my current simulation. Shorter schedules will pay off more
frequently, leading to a greater proportion of time spent in collection, and
this could account for the discrepancy.

Some time ago I conducted a study in which rats had to respond within 1.5 s
of shock onset in order to guarantee that the shock would end at 1.5 s after
onset. What happened was that the rats adjusted their response rates until
the IRT distribution had few IRTs exceeding the 1.5 s duration. IRTs longer
than that produced shocks lasting the full duration of the IRT, so the rats
were evidently minimizing the average shock duration, although as a consequence
many responses occurred earlier than necessary. Rats may do something
similar when responding on VI schedules, adjusting the mean of their IRT
distributions so that few reinforcers are delayed much beyond their
programmed time of availability.
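The trade-off the rats faced can be illustrated with a simple assumption. If IRTs were exponentially distributed (an assumption for illustration -- the actual IRT distributions from that study are not given here), the fraction of IRTs exceeding the 1.5-s deadline falls off sharply as the mean IRT is pushed down, at the cost of many responses coming earlier than necessary:

```python
import math

def frac_long_irts(mean_irt, limit=1.5):
    """Fraction of IRTs exceeding `limit`, assuming exponentially
    distributed IRTs (illustrative assumption): P(IRT > t) = exp(-t/mean)."""
    return math.exp(-limit / mean_irt)

for m in (1.5, 1.0, 0.5, 0.25):
    print(f"mean IRT {m:4.2f} s -> {frac_long_irts(m):6.2%} of IRTs exceed 1.5 s")
```

The same logic applied to a VI schedule would have the animal setting its mean IRT so that few reinforcers sit uncollected much past their programmed availability.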

Regards,

Bruce