simultaneous equations; where we stand

[From Bill Powers (951228.1900 MST)]

To all who wanted one, I hope you had a happy Christmas.

···

----------------------------------------------------------------------
Samuel S. Saunders (951228:17:37:00 EST) --

A neat pair of answers, showing by demonstration that I was wrong: at
least one EABer knows how to solve simultaneous equations. Both of your
answers were right. My first set of data was actually all right -- I was
compounding errors by being in a hurry. The second set of data did fit
the equations I started with, so we only need to consider that set.

You will notice that your answer to my re-stated problem,

B = -10 R + 100

... can also be expressed as

B = 10 (10 - R)

This is the equation of an elementary control system with an output gain
of 10 (behaviors per unit time)/(unit error) and a reference level of 10
(reinforcers per unit time). The _same_ control system equation applies
to both sets of observations (with different ratio schedules).
------------------------------
For those following along, the problem was stated as two sets of
observations, both involving fixed-ratio schedules:

          Ratio    R: Reinforcement rate    B: Behavior rate
Set 1:      5             6.67                   33.3
Set 2:     10             5.00                   50.0

I stated that we were looking for a linear approximation of the equation
describing the behaving system. This means we wanted an equation of the
form B = a*R + b, which is the equation of a straight line. This same
equation should fit both pairs of values of B and R. Thus

Gen. form: B = a * R + b

Eq 1: 33.3 = a * 6.67 + b
Eq 2: 50.0 = a * 5.00 + b

Subtract equation 1 from equation 2, term by term; the "+ b" term will
be eliminated, leaving us an equation in _a_ alone:

          16.7 = a*(-1.67)

Therefore a = -10.

Substitute -10 for a in either equation above. Using equation 2, we get

50.0 = -10*5.00 + b, or

b = 50 - (-10*5.00), or

b = 100

If we use equation 1 instead, we get

33.3 = (-10*6.67) + b, or

b = 100,

the same answer. So a = -10 and b = 100, giving us the organism equation

Eq. 3: B = -10*R + 100.

An equivalent form is B = 10*(10 - R).
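
The elimination above can be sketched in a few lines of Python. This is
only an illustration of the arithmetic, not part of the original
exchange; the function name is made up for the example, and the inputs
are the rounded values from the table.

```python
def fit_line_through_two_points(r1, b1, r2, b2):
    """Solve b1 = a*r1 + b and b2 = a*r2 + b for the unknowns (a, b)."""
    if r1 == r2:
        raise ValueError("the two R values must differ")
    a = (b2 - b1) / (r2 - r1)  # subtracting Eq 1 from Eq 2 eliminates b
    b = b2 - a * r2            # substitute a back into Eq 2
    return a, b


# Set 1: R = 6.67, B = 33.3; Set 2: R = 5.00, B = 50.0
a, b = fit_line_through_two_points(6.67, 33.3, 5.00, 50.0)
print(round(a, 2), round(b, 2))  # a is about -10, b is about 100
```

Because the table values are rounded, the computed a and b agree with
-10 and 100 only to within floating-point error.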

Notice that the organism equation is completely different from the
apparatus equation, R = B/m. The variable m does not appear in the
organism equation.
-----------------------
We can now combine this equation for the behaving organism with the
apparatus equation describing the ratio schedule, which is

Eq. 4: R = B/m,
where m is the ratio.

Substituting B/m for R in Eq. 3, we get

B = -10*(B/m) + 100, or

B(1 + 10/m) = 100, or

Eq. 5: B = 100/(1 + 10/m).

From Eq. 4 we know that R = B/m, so dividing Eq. 5 by m gives

Eq. 6: R = 100/(m + 10)

Equations 5 and 6 are the solutions of the system of equations, showing
B and R respectively as functions of the two system constants and the
one independent variable, m.

Setting m = 5, we find B = 100/(1 + 10/5) = 33.3 behaviors per unit
time, and R = 100/(5 + 10) = 6.67 reinforcements per unit time. Those
numbers match the first set of observations. You can check to see that
putting m = 10 into equations 5 and 6 will yield the correct numbers for
the second set of observations. This is a check to see that the system
of equations was solved correctly.

A natural next step would be to try an intermediate ratio, say 7.
Plugging m = 7 into equations 5 and 6 predicts that B = 41.2 and R =
5.9. We could then set the apparatus to a ratio of 7 and measure R and
B, to see how close the prediction is. Of course we could also try
values of m outside the measured range.
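
As a rough sketch, equations 5 and 6 can be wrapped in a small Python
function and evaluated at the two measured ratios and at the proposed
intermediate one. The function name and the parameter names gain and ref
are assumptions made for this illustration; they stand for the constants
10 and 10 in B = 10*(10 - R).

```python
def closed_loop(m, gain=10.0, ref=10.0):
    """Return (B, R) satisfying B = gain*(ref - R) and R = B/m together."""
    if m <= 0:
        raise ValueError("ratio m must be positive")
    b_rate = gain * ref / (1.0 + gain / m)  # Eq. 5 when gain = 10, ref = 10
    r_rate = b_rate / m                     # Eq. 4, equivalent to Eq. 6
    return b_rate, r_rate


for m in (5, 7, 10):
    b_rate, r_rate = closed_loop(m)
    print(f"m = {m:2d}: B = {b_rate:.1f}, R = {r_rate:.2f}")
```

The rows for m = 5 and m = 10 should reproduce the two observation sets,
and the row for m = 7 is the prediction to be compared with a new
measurement.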
------------------------------
It's important to see that the only assumption about the organism was
that its behavioral response to reinforcements followed a linear form, B
= a*R + b, with no assumptions about the signs or magnitudes of a or b.
The values of a and b were unknowns. To solve a system of equations for
two unknowns, we must have two equations, which are obtained from
observations made under two conditions of the independent variable m.
With the required two data sets, we could find the values of a and b
that would satisfy both the apparatus equation and the organism equation
under both sets of conditions. This we did, finding that a = -10 and b =
100. Plugging those values into the original equations, we found that
the solutions fit both data sets, as they should if we have made no
mistakes.

Having derived the organism equation and having shown that it does
reproduce the original two sets of observations, we can then go on to
test the assumption that the organism equation is in fact linear. By
picking a new value of the independent variable m between the original
two, we predict new values of B and R. If the observed values of R and B
with the new value of m are close to the prediction, we can say that the
linear equation is sufficient to explain the organism's behavior. If
not, we then have to propose a non-linear organism equation and try
again. We can, of course, explore the organism equation over a wider
range, to see what range of m still produces values of R and B that are
predicted by the same linear equation.
-----------------------------------
In none of this do we have to say out loud that the linear equation we
get is a control-system equation. Recognizing it as such depends on
having seen other control systems and their equations, and using the
appropriate form, B = G*(R0 - R), in which the constants G and R0 are
given the physical meanings of gain and reference level respectively.
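
The change of form is pure algebra: G*(R0 - R) = -G*R + G*R0, so a = -G
and b = G*R0. A minimal sketch of the conversion follows; the function
name is hypothetical.

```python
def to_control_form(a, b):
    """Rewrite B = a*R + b as B = G*(R0 - R) and return (G, R0)."""
    if a == 0:
        raise ValueError("slope a must be nonzero")
    gain = -a       # matching the R terms gives a = -G
    ref = b / gain  # matching the constant terms gives b = G*R0
    return gain, ref


gain, ref = to_control_form(-10.0, 100.0)
print(gain, ref)  # gain 10, reference level 10
```

The rewrite works for any nonzero slope, but G comes out positive only
when a is negative, which is what the data above require.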

Even without recognizing the organism equation as a control system
equation, we can still see that the effect of reinforcements on the
organism is not like the effect previously imagined by theorists. The
effect is uniformly _negative_, not positive, because the data show
that the constant _a_ must be a negative number. Also, over the range of
ratios considered, the data show that the way the organism reacts to
reinforcers is independent of the ratio: that is, the dependency of B on
R that we derive does not include the variable m.

Remember, this analysis did not begin by assuming a control system. It
began by assuming only a linear form of dependence of behavior rate on
reinforcement rate. The data supplied the values of the system
constants, and showed that the constant _a_ has to be a negative number.

From that result we can conclude that the system is acting like a
control system, not like a system in which more reinforcement produces
more behavior.
----------------------
I think a brief statement of where we stand in the comparison of control
theory and reinforcement theory may help organize the discussion.

Months ago, Bruce Abbott showed that for some real multiple-ratio data
sets in which increasing reinforcement rates go with decreasing behavior
rates (as was the case in my imaginary data sets above), the
reinforcement-collection time confounds the data. What is presented as a
rate of bar-pressing is really a compound of two effects: the real rate
of bar-pressing (unreported) and a collection time (presumed constant
but also unreported). Bruce found that to a first approximation, a
_constant_ actual rate of pressing would account for the data, given the
right assumed constant collection time. I found that by varying the
assumed collection time, curves could be generated that were like the
control-system curves for low ratios, but that tended toward a constant
rate of pressing at high ratios (in any case leaving the fall-off of
behavior rate as ratios increase even further unaccounted for).

Before Bruce came up with this shocker, I had applied an analysis just
like the one above and had found the equation for the organism function
that would match the observed behavior and reinforcement rates rather
well over a wide range of fixed-ratio schedules, from 40 to 1. This was
done under the assumption that the reported behavior rate was the actual
rate of bar-pressing.

Now we have a second possibility. It is possible that under the
experimental conditions that were reported, the animals did not actually
vary their rate of pressing at all, over the range of ratios where the
control system model was fit to the data. Instead, the animals were
simply pressing at a constant rate, taking a short time-out to collect
the reinforcer when the ratio was satisfied and then going back (after a
fixed time) to pressing the bar again.
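
One simple way to formalize this second possibility, offered only as a
sketch and not necessarily the model Bruce used: assume a constant
pressing rate p and a fixed collection time c, so that each reinforcer
takes m/p + c units of time to earn and collect. The apparent rates are
then B = m/(m/p + c) and R = 1/(m/p + c), which works out algebraically
to B = p*(1 - c*R), the same straight-line form with a negative slope.
The values p = 100 and c = 0.1 below are my own choices for the
illustration, picked because they reproduce the imaginary data sets.

```python
def apparent_rates(m, press_rate, collect_time):
    """Apparent (B, R) for constant pressing plus a fixed pause per reinforcer."""
    if m <= 0 or press_rate <= 0 or collect_time < 0:
        raise ValueError("m and press_rate must be positive, collect_time >= 0")
    cycle = m / press_rate + collect_time  # time to earn and collect one reinforcer
    return m / cycle, 1.0 / cycle


# Assumed values: p = 100 presses per unit time, c = 0.1 time units
for m in (5, 10, 40):
    b_rate, r_rate = apparent_rates(m, press_rate=100.0, collect_time=0.1)
    print(f"m = {m:2d}: apparent B = {b_rate:.1f}, R = {r_rate:.2f}")
```

Under these assumptions the apparent behavior rate climbs toward the
constant pressing rate p as the ratio grows, even though the animal's
actual rate of pressing never changes.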

All this is a nice lesson in system analysis. I proved that it is
possible to fit a control-system equation to spurious data. Only Bruce's
careful analysis revealed that the data were not what they seemed. So
now we have to scratch the original analysis and start again, this time
getting the information we need which was not reported in the
literature.

We are going to get data in which we can detect the actual pressing
rates at all times, and the actual collection times. If Bruce's analysis
holds up, a case can be made that behavior rate rises with reinforcement
rate for very high ratios, and then simply levels off, becoming constant
for all lower ratios. If this were the only consideration, it would be
very strong support for the basic concept of reinforcement, under which
increasing rates of reinforcement must produce increasing behavior
rates. The leveling off would then be just some sort of saturation or
limiting effect, which any model would have to deal with at some point.

However, there is a third possibility which bears on the behavior under
high ratios. We have been discussing this recently as the difference
between _acquisition_ and _performance_. When ratios become very high
and reinforcement rates very low, animals may not actually remain at the
bar, pressing slowly. Instead, they may wander away from the bar (or
key) and search the cage elsewhere, then return to the bar and execute a
very fast string of behaviors continuously until the ratio is satisfied.
So the apparent drop-off of behavior rate at very low reinforcement
rates may also be an artifact of the way behavior rates are reported (as
total recorded behaviors per session divided by total time per session).
It is thus possible that the actual rate of pressing remains essentially
constant over the _entire range of ratios_, with the apparent changes in
behavior rate being caused simply by the fact that the animal spends
more or less time doing something besides pressing the bar. At high
ratios we may be seeing an effect of _where_ the animal is doing its
behaving, rather than an effect on the amount of behavior. At lower
ratios, we are possibly seeing the effect of the animal's spending some
small fixed time collecting the reinforcer (which is also time not spent
pressing).
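
The reporting artifact can be illustrated with a short sketch. The
function name and all the numbers are hypothetical; the only point is
that dividing total presses by total session time mixes the pressing
rate with the fraction of the session spent at the bar.

```python
def session_rate(press_rate, time_at_bar, session_time):
    """Reported rate: total presses divided by total session time."""
    if session_time <= 0 or not 0 <= time_at_bar <= session_time:
        raise ValueError("need 0 <= time_at_bar <= session_time, session_time > 0")
    total_presses = press_rate * time_at_bar  # pressing occurs only at the bar
    return total_presses / session_time


# Hypothetical: 2 presses/sec whenever at the bar, in a 3600-sec session
for time_at_bar in (3600.0, 1800.0, 900.0):
    print(time_at_bar, session_rate(2.0, time_at_bar, 3600.0))
```

The reported rate falls from 2.0 to 1.0 to 0.5 presses per second while
the actual pressing rate stays fixed at 2.0 throughout.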

The third possibility gives us quite a different picture of behavior.
What we see under this third choice are animals that learn one simple
act as a way of controlling the reinforcers: rapidly pressing the bar.
The apparent changes in rate are due entirely to the execution of other
behaviors that take the animal away from the bar, so its activities are
simply not recorded. There is some indication that the actual rate of
pressing increases with deprivation, but this may well apply to the
rates of behaviors away from the bar, too, so it is not specific to
bar-pressing.

Only good data can tell us what we actually have here. We have to record
each bar-press with the exact time at which it occurred (with a
resolution of at worst 0.01 sec). Likewise, we have to be able to tell
where the rat was at any given time, which Bruce proposes to do with
videotaping. From the record of bar-pressing we can measure the actual
rate of pressing and the duration of collection-times; the videotapes
will show us whether an apparent slowdown in pressing rate is actually
caused by being physically located away from the bar. Also, the detailed
record of presses will show whether the true pressing-rate actually
varies with the schedule, and if so, along exactly what curve.
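
A minimal sketch of how a time-stamped record might be reduced, assuming
a fixed-ratio schedule in which every m-th press delivers a reinforcer
and the animal collects it before the next press. The function name and
the timestamps are invented for the illustration; they are not data.

```python
from statistics import mean


def split_intervals(press_times, m):
    """Split inter-press gaps into within-run gaps and post-reinforcer gaps."""
    within, post = [], []
    for i in range(1, len(press_times)):
        gap = press_times[i] - press_times[i - 1]
        # The press before this gap is press number i (counting from 1);
        # if i is a multiple of m it completed a ratio, so the gap that
        # follows it includes the collection time.
        if i % m == 0:
            post.append(gap)
        else:
            within.append(gap)
    return within, post


# Invented timestamps in seconds for a ratio of 3
times = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0, 6.0]
within, post = split_intervals(times, m=3)
press_rate = 1.0 / mean(within)             # actual pressing rate within runs
collect_time = mean(post) - mean(within)    # extra time after each reinforcer
print(press_rate, collect_time)
```

Time spent away from the bar would show up here as unusually long gaps,
which is why the video record is needed to tell the two apart.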

I hope that all concerned realize that we are ALL going back to the
drawing board. At stake are both the control-system model and the
reinforcement model. It is perfectly possible that both will fail to
account for the data. The reinforcement model is in jeopardy because of
the way behavior rate is customarily reported, being confounded with
time spent doing something else. The control-system model is in jeopardy
because there is no guarantee that any controlled variable will be
found, or that any organism equation derived from the actual data will
show a stable reference level and a stable negative gain factor.

Let the chips fall where they may.
-----------------------------------------------------------------------
Best,

Bill Powers