[From Bill Powers (951228.1900 MST)]

To all who wanted one, I hope you had a happy Christmas.

----------------------------------------------------------------------

Samuel S. Saunders (951228:17:37:00 EST) --

A neat pair of answers, showing by demonstration that I was wrong: at

least 1 EABer knows how to solve simultaneous equations. Both of your

answers were right. My first set of data was actually all right -- I was

compounding errors by being in a hurry. The second set of data did fit

the equations I started with, so we only need to consider that set.

You will notice that your answer to my re-stated problem,

B = -10 R + 100

... can also be expressed as

B = 10 (10 - R)

This is the equation of an elementary control system with an output gain

of 10 (behaviors per unit time)/(unit error) and a reference level of 10

(reinforcers per unit time). The _same_ control system equation applies

to both sets of observations (with different ratio schedules).

------------------------------

For those following along, the problem was stated as two sets of

observations, both involving fixed-ratio schedules:

          Ratio    R: Reinforcement rate    B: Behavior rate

Set 1:      5              6.67                   33.3

Set 2:     10              5.00                   50.0

I stated that we were looking for a linear approximation of the equation

describing the behaving system. This means we wanted an equation of the

form B = a*R + b, which is the equation of a straight line. This same

equation should fit both pairs of values of B and R. Thus

Gen. form: B = a * R + b

Eq 1: 33.3 = a * 6.67 + b

Eq 2: 50.0 = a * 5.00 + b

Subtract equation 1 from equation 2, term by term; the "+ b" term will

be eliminated, leaving us an equation in _a_ alone:

16.7 = a*(-1.67)

Therefore a = -10.

Substitute -10 for a in either equation above. Using equation 2, we get

50.0 = -10*5.00 + b, or

b = 50 - (-10*5.00), or

b = 100

If we use equation 1 instead, we get

33.3 = (-10*6.67) + b, or

b = 100,

the same answer. So a = -10 and b = 100, giving us the organism equation

Eq. 3: B = -10*R + 100.

An equivalent form is B = 10*(10 - R).
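
For anyone who wants to check the elimination numerically, here is a minimal Python sketch of the same two steps. The function name fit_line and its argument names are only illustrative; the four numbers are the two observation sets given above.

```python
def fit_line(r1, b1, r2, b2):
    """Solve b1 = a*r1 + b and b2 = a*r2 + b for the slope a and intercept b."""
    a = (b2 - b1) / (r2 - r1)   # subtracting Eq 1 from Eq 2 eliminates b
    b = b2 - a * r2             # substitute a back into Eq 2
    return a, b

# Set 1: R = 6.67, B = 33.3; Set 2: R = 5.00, B = 50.0
a, b = fit_line(6.67, 33.3, 5.00, 50.0)
print(round(a, 2), round(b, 2))
```

Up to floating-point rounding, this reproduces a = -10 and b = 100.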

Notice that the organism equation is completely different from the

apparatus equation, R = B/m. The variable m does not appear in the

organism equation.

-----------------------

We can now combine this equation for the behaving organism with the

equation describing the ratio, which is

Eq. 4: R = B/m,

where m is the ratio.

Substituting B/m for R in Eq. 3, we get

B = -10*(B/m) + 100, or

B(1 + 10/m) = 100, or

Eq. 5: B = 100/(1 + 10/m).

From the equation for the ratio, we know that R = B/m, so

Eq. 6: R = 100/(m + 10)

Equations 5 and 6 are the solutions of the system of equations, showing

B and R respectively as functions of the two system constants and the

one independent variable, m.

Setting m = 5, we find B = 100/(1 + 10/5) = 33.3 behaviors per unit

time, and R = 100/(5 + 10) = 6.67 reinforcements per unit time. Those

numbers match the first set of observations. You can check to see that

putting m = 10 into equations 5 and 6 will yield the correct numbers for

the second set of observations. This is a check to see that the system

of equations was solved correctly.

A natural next step would be to try an intermediate ratio, say 7.

Plugging m = 7 into equations 5 and 6 predicts that B = 41.2 and R =

5.9. We could then set the apparatus to a ratio of 7 and measure R and

B, to see how close the prediction is. Of course we could also try

values of m outside the measured range.
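
The check and the prediction can be sketched in a few lines of Python. The function name closed_loop is only illustrative, and the default values of a and b are the constants derived above; nothing here goes beyond Eqs. 4 through 6.

```python
def closed_loop(m, a=-10.0, b=100.0):
    """Solve B = a*R + b together with R = B/m for a given ratio m."""
    behavior = b / (1.0 - a / m)     # Eq. 5: 100/(1 + 10/m) when a = -10, b = 100
    reinforcement = behavior / m     # Eq. 4, which gives 100/(m + 10), i.e. Eq. 6
    return behavior, reinforcement

for m in (5, 7, 10):
    B, R = closed_loop(m)
    print(f"m = {m}: B = {B:.1f}, R = {R:.2f}")
```

For m = 5 and m = 10 this gives back the two observation sets (33.3 and 6.67, 50.0 and 5.00); for m = 7 it gives the prediction B = 41.2, R = 5.88.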

------------------------------

It's important to see that the only assumption about the organism was

that its behavioral response to reinforcements followed a linear form, B

= a*R + b, with no assumptions about the signs or magnitudes of a or b.

The values of a and b were unknowns. To solve a system of equations for

two unknowns, we must have two equations, which are obtained from

observations made under two conditions of the independent variable m.

With the required two data sets, we could find the values of a and b

that would satisfy both the apparatus equation and the organism equation

under both sets of conditions. This we did, finding that a = -10 and b =

100. Plugging those values into the original equations, we found that

the solutions fit both data sets, as they should if we have made no

mistakes.

Having derived the organism equation and having shown that it does

reproduce the original two sets of observations, we can then go on to

test the assumption that the organism equation is in fact linear. By

picking a new value of the independent variable m between the original

two, we predict new values of B and R. If the observed values of R and B

with the new value of m are close to the prediction, we can say that the

linear equation is sufficient to explain the organism's behavior. If

not, we then have to propose a non-linear organism equation and try

again. We can, of course, explore the organism equation over a wider

range, to see what range of m still produces values of R and B that are

predicted by the same linear equation.

-----------------------------------

In none of this do we have to say out loud that the linear equation we

get is a control-system equation. Recognizing it as such depends on

having seen other control systems and their equations, and using the

appropriate form, B = G*(R0 - R), in which the constants G and R0 are

given the physical meanings of gain and reference level respectively.
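
The change of form is just algebra: expanding G*(R0 - R) gives -G*R + G*R0, so a = -G and b = G*R0. A small Python sketch, with the hypothetical helper name to_control_form, makes the conversion explicit:

```python
def to_control_form(a, b):
    """Rewrite B = a*R + b as B = G*(R0 - R) and return (G, R0)."""
    if a == 0:
        raise ValueError("a = 0: B does not depend on R, so there is no reference level")
    gain = -a             # G*(R0 - R) = -G*R + G*R0, so a = -G
    reference = -b / a    # b = G*R0, so R0 = b/G = -b/a
    return gain, reference

G, R0 = to_control_form(-10.0, 100.0)
print(G, R0)   # 10.0 10.0
```

A positive G here corresponds to the negative slope a found from the data.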

Even without recognizing the organism equation as a control system

equation, we can still see that the effect of reinforcements on the

organism is not like the effect previously imagined by theorists. The

effect is uniformly _negative_, not positive, because the data show

that the constant _a_ must be a negative number. Also, over the range of

ratios considered, the data show that the way the organism reacts to

reinforcers is independent of the ratio: that is, the dependency of B on

R that we derive does not include the variable m.

Remember, this analysis did not begin by assuming a control system. It

began by assuming only a linear form of dependence of behavior rate on

reinforcement rate. The data supplied the values of the system

constants, and showed that the constant _a_ has to be a negative number.

From that result we can conclude that the system is acting like a

control system, not like a system in which more reinforcement produces

more behavior.

----------------------

I think a brief statement of where we stand in the comparison of control

theory and reinforcement theory may help organize the discussion.

Months ago, Bruce Abbott showed that for some real multiple-ratio data

sets in which increasing reinforcement rates go with decreasing behavior

rates (as was the case in my imaginary data sets above), the

reinforcement-collection time confounds the data. What is presented as a

rate of bar-pressing is really a compound of two effects: the real rate

of bar-pressing (unreported) and a collection time (presumed constant

but also unreported). Bruce found that to a first approximation, a

_constant_ actual rate of pressing would account for the data, given the

right assumed constant collection time. I found that by varying the

assumed collection time, curves could be generated that were like the

control-system curves for low ratios, but that tended toward a constant

rate of pressing at high ratios (in any case leaving the fall-off of

behavior rate as ratios increase even further unaccounted for).

Before Bruce came up with this shocker, I had applied an analysis just

like the one above and had found the equation for the organism function

that would match the observed behavior and reinforcement rates rather

well over a wide range of fixed-ratio schedules, from 40 to 1. This was

done under the assumption that the reported behavior rate was the actual

rate of bar-pressing.

Now we have a second possibility. It is possible that under the

experimental conditions that were reported, the animals did not actually

vary their rate of pressing at all, over the range of ratios where the

control system model was fit to the data. Instead, the animals were

simply pressing at a constant rate, taking a short time-out to collect

the reinforcer when the ratio was satisfied and then going back (after a

fixed time) to pressing the bar again.
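
To make this second possibility concrete, here is a rough Python sketch. It is one reading of the description above, not Bruce's actual analysis: it assumes each reinforcer costs m presses at a constant rate plus one fixed collection time. The parameter values (100 presses per unit time, 0.1 time units per collection) are chosen only because they happen to reproduce the imaginary data sets used earlier in this post.

```python
def constant_rate_model(m, press_rate, collect_time):
    """Session-average rates for steady pressing plus a fixed pause per reinforcer."""
    cycle = m / press_rate + collect_time   # time to earn and collect one reinforcer
    reinforcement = 1.0 / cycle             # reinforcers per unit time
    apparent_behavior = m / cycle           # presses per unit time, pauses included
    return apparent_behavior, reinforcement

# Assumed values, picked to match the imaginary data (not measured values)
for m in (5, 10):
    B, R = constant_rate_model(m, press_rate=100.0, collect_time=0.1)
    print(f"m = {m}: B = {B:.1f}, R = {R:.2f}")
```

With these assumed values the sketch returns 33.3 and 6.67 for m = 5, and 50.0 and 5.00 for m = 10, even though the actual pressing rate never changes. As m grows, the apparent rate approaches the constant press_rate.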

All this is a nice lesson in system analysis. I proved that it is

possible to fit a control-system equation to spurious data. Only Bruce's

careful analysis revealed that the data were not what they seemed. So

now we have to scratch the original analysis and start again, this time

getting the information we need, which was not reported in the

literature.

We are going to get data in which we can detect the actual pressing

rates at all times, and the actual collection times. If Bruce's analysis

holds up, a case can be made that behavior rate rises with reinforcement

rate for very high ratios, and then simply levels off, becoming constant

for all lower ratios. If this were the only consideration, it would be

very strong support for the basic concept of reinforcement, under which

increasing rates of reinforcement must produce increasing behavior

rates. The leveling off would then be just some sort of saturation or

limiting effect, which any model would have to deal with at some point.

However, there is a third possibility which bears on the behavior under

high ratios. We have been discussing this recently as the difference

between _acquisition_ and _performance_. When ratios become very high

and reinforcement rates very low, animals may not actually remain at the

bar, pressing slowly. Instead, they may wander away from the bar (or

key) and search the cage elsewhere, then return to the bar and execute a

very fast string of behaviors continuously until the ratio is satisfied.

So the apparent drop-off of behavior rate at very low reinforcement

rates may also be an artifact of the way behavior rates are reported (as

total recorded behaviors per session divided by total time per session).

It is thus possible that the actual rate of pressing remains essentially

constant over the _entire range of ratios_, with the apparent changes in

behavior rate being caused simply by the fact that the animal spends

more or less time doing something besides pressing the bar. At high

ratios we may be seeing an effect of _where_ the animal is doing its

behaving, rather than an effect on the amount of behavior. At lower

ratios, we are possibly seeing the effect of the animal's spending some

small fixed time collecting the reinforcer (which is also time not spent

pressing).
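
The arithmetic behind this artifact is simple enough to sketch in Python. The function name and all the numbers below are made up purely for illustration; the only point is that dividing total presses by total session time mixes the pressing rate with the time spent at the bar.

```python
def apparent_rate(true_rate, time_at_bar, session_time):
    """Reported rate: total presses divided by total session time."""
    presses = true_rate * time_at_bar   # presses occur only while at the bar
    return presses / session_time

# Hypothetical numbers: the same true pressing rate, different time at the bar
print(apparent_rate(true_rate=100.0, time_at_bar=30.0, session_time=60.0))   # 50.0
print(apparent_rate(true_rate=100.0, time_at_bar=15.0, session_time=60.0))   # 25.0
```

The reported rate halves although the animal presses exactly as fast as before whenever it is pressing.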

The third possibility gives us quite a different picture of behavior.

What we see under this third choice are animals that learn one simple

act as a way of controlling the reinforcers: rapidly pressing the bar.

The apparent changes in rate are due entirely to the execution of other

behaviors that take the animal away from the bar, so its activities are

simply not recorded. There is some indication that the actual rate of

pressing increases with deprivation, but this may well apply to the

rates of behaviors away from the bar, too, so it is not specific to

bar-pressing.

Only good data can tell us what we actually have here. We have to record

each bar-press with the exact time at which it occurred (with a

resolution of at worst 0.01 sec). Likewise, we have to be able to tell

where the rat was at any given time, which Bruce proposes to do with

videotaping. From the record of bar-pressing we can measure the actual

rate of pressing and the duration of collection-times; the video tapes

will show us whether an apparent slowdown in pressing rate is actually

caused by being physically located away from the bar. Also, the detailed

record of presses will show whether the true pressing-rate actually

varies with the schedule, and if so, along exactly what curve.
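
As a rough idea of what can be pulled out of such a record, here is a Python sketch. Everything in it is hypothetical: the function name, the made-up timestamps, and the simplifying assumptions that every m-th press delivers a reinforcer and that the animal stays at the bar (the video record would be needed to check that).

```python
from statistics import mean

def summarize_presses(press_times, m):
    """Estimate the within-run pressing rate and the mean post-reinforcer interval.

    press_times: sorted times of each bar-press, in seconds.
    m: fixed ratio; every m-th press is assumed to deliver a reinforcer.
    """
    gaps = [t1 - t0 for t0, t1 in zip(press_times, press_times[1:])]
    # gap i (counting from 1) lies between press i and press i + 1;
    # it contains a collection when press i completed a ratio
    pauses = [g for i, g in enumerate(gaps, start=1) if i % m == 0]
    runs = [g for i, g in enumerate(gaps, start=1) if i % m != 0]
    return 1.0 / mean(runs), mean(pauses)

# Made-up record: ratio 3, presses 0.1 sec apart, 1.0 sec between runs
times = [0.0, 0.1, 0.2, 1.2, 1.3, 1.4, 2.4, 2.5, 2.6]
rate, pause = summarize_presses(times, m=3)
print(f"pressing rate ~ {rate:.1f} per sec, post-reinforcer interval ~ {pause:.2f} sec")
```

The post-reinforcer interval is only an upper bound on collection time, since it also includes the wait before the first press of the next run. The sketch needs at least one completed ratio in the record, otherwise mean() has nothing to average.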

I hope that all concerned realize that we are ALL going back to the

drawing board. At stake are both the control-system model and the

reinforcement model. It is perfectly possible that both will fail to

account for the data. The reinforcement model is in jeopardy because of

the way behavior rate is customarily reported, being confounded with

time spent doing something else. The control-system model is in jeopardy

because there is no guarantee that any controlled variable will be

found, or that any organism equation derived from the actual data will

show a stable reference level and a stable negative gain factor.

Let the chips fall where they may.

-----------------------------------------------------------------------

Best,

Bill Powers