The "Matching Law"

[From Bruce Abbott 931014.1440]

Bill Powers (941012.1155 MDT) --

Bill, it's interesting that you chose the "matching law" as an example of
quantitative analysis gone wrong in EAB. As a graduate student I spent quite
a bit of time thinking about the matching law and trying to understand its
implications. For the benefit of those who may be less familiar with this
area of research, I will provide a brief synopsis.

The relationship first emerged when pigeons were exposed to a so-called
"concurrent VI-VI schedule," in which two variable-interval schedules of grain
reinforcement are programmed simultaneously on two separate response keys. A
single VI schedule delivers a reinforcer on the first response to occur AFTER
a given time interval has elapsed; when that response occurs, a new interval
is selected at random from a series having a given average duration. Thus, a
VI 30-s schedule "sets up" the reinforcer for delivery on average once each 30
seconds, giving an average rate of 2 reinforcers per minute. Such a schedule
normally produces a relatively steady rate of responding sufficiently high to
collect the reinforcer as soon as it becomes available (although occasional
lapses do occur).
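
For those who like to see the contingency spelled out, here is a minimal
Python sketch of a single VI schedule. It is an illustration only, not code
from any actual experiment. The assumptions are mine: intervals are drawn
from an exponential distribution (real VI tapes use a finite series of
intervals), the bird pecks at random moments at a steady average rate, and
the names simulate_vi, mean_interval_s and peck_rate_per_s are hypothetical.

```python
import random


def simulate_vi(mean_interval_s, peck_rate_per_s, session_s, seed=0):
    """Simulate one VI schedule; return (pecks, reinforcers).

    Assumptions (not from the original post): exponential intervals and
    Poisson-timed pecks at a constant average rate.
    """
    rng = random.Random(seed)
    # Time at which the first reinforcer is "set up" by the schedule.
    setup_time = rng.expovariate(1.0 / mean_interval_s)
    t = 0.0
    pecks = 0
    reinforcers = 0
    while True:
        t += rng.expovariate(peck_rate_per_s)  # time of the next peck
        if t > session_s:
            break
        pecks += 1
        if t >= setup_time:
            # First response after the interval has elapsed is reinforced,
            # and only then is a new interval selected.
            reinforcers += 1
            setup_time = t + rng.expovariate(1.0 / mean_interval_s)
    return pecks, reinforcers


if __name__ == "__main__":
    pecks, reinforcers = simulate_vi(30.0, 1.0, 3600.0)
    print("pecks:", pecks, "reinforcers per minute:", reinforcers / 60.0)
```

With a VI 30-s schedule and about one peck per second, the obtained rate
should come out a little under 2 reinforcers per minute, because each
reinforcer waits briefly for the next peck before the following interval
starts timing.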

With two such schedules simultaneously setting up reinforcement opportunities
on two separate keys, how would the pigeons divide their responses? The
answer was summarized by the matching law: the proportion of keypecks emitted
on a key equaled the relative rate of reinforcement provided by
the schedule associated with that key:

P1/(P1 + P2) = R1/(R1 + R2) ["P" represents keypecks]

The denominators of the left and right terms of the equation represent total
keypecks emitted during the session and total reinforcers delivered,
respectively. The matching law thus indicates that if 75% of the
reinforcement opportunities are programmed to occur on the schedule associated
with Key 1, then 75% of the keypecks will occur on Key 1.
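
As a quick worked example of that arithmetic, here is a short Python sketch.
The schedule values (VI 20-s on Key 1, VI 60-s on Key 2) and the session
total of 3000 keypecks are made up for illustration, and the function name
predicted_proportion is hypothetical.

```python
def predicted_proportion(r1, r2):
    """Matching-law prediction for P1/(P1 + P2), given reinforcement rates."""
    return r1 / (r1 + r2)


if __name__ == "__main__":
    # Hypothetical schedules: VI 20-s gives 3 reinforcers per minute,
    # VI 60-s gives 1 reinforcer per minute.
    r1, r2 = 3.0, 1.0
    share = predicted_proportion(r1, r2)
    total_pecks = 3000  # made-up session total
    print(share)                # 0.75
    print(share * total_pecks)  # 2250.0 keypecks predicted on Key 1
```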

The matching law represents nothing more than an empirical generalization,
like Boyle's Law--you vary this, and that changes--and was originally intended
only to summarize behavior on concurrent VI-VI schedules. Bill, keeping this
context in mind, let's now have a look at what you had to say about the
matching law.

After a bit of manipulation, you came up with an alternative form of the same
relationship, which I will re-express slightly as follows:

B1/B2 = R1/R2

(You used "B" where I have used "P" but the meaning is the same.) This
equivalent form of the matching law is well-known to researchers working in
this area. From here it is an easy step to

B2/R2 = B1/R1

You then note the following:

> The ratio Bn/Rn is the average ratio of bar-presses per unit time to
> reinforcements per unit time, for a particular choice. With unit time in
> numerator and denominator, it is also the ratio of total bar-presses to
> total reinforcements. It is, in fact, the average schedule of

What is missing here is an appreciation of the fact that variable interval
schedules typically generate response rates far in excess of those required to
collect all reinforcers as they become available. The ratio Bn/Rn need not
equal B'n/Rn, where B'n is the minimum response rate required by the schedule.
When we observe that these rates track the schedule requirement, we do learn
something about the animal's ability to adjust to schedule requirements; we
are not fooling ourselves by re-expressing the schedule requirements in
behavioral units.

This point stands out clearly when you examine the procedure actually used to
obtain empirical matching. Switching from one key to the other produces a
"changeover delay" (COD) of about 2 seconds, during which responses will not
produce the reinforcer EVEN IF SET UP BY THE SCHEDULE. On concurrent VI
schedules, the longer the subject responds on one key, the more likely it is
that the reinforcer has been set up on the other key. Without the COD, the
first peck on the key following a switch often produces immediate
reinforcement; with a bit of exposure to this contingency, the birds often
adopt a simple strategy: peck left, peck right, peck left, peck right. The
proportion of keypecks emitted on a key becomes 0.5 REGARDLESS OF VI SCHEDULE
VALUES. It is only by imposing the COD that matching emerges. Clearly,
subject behavior is not simply a mathematical reexpression of the schedule
requirements.
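
The no-COD case is easy to sketch in Python. This is a toy illustration
under my own assumptions, not a model of real pigeons: the bird alternates
keys on every peck at a fixed pace, intervals are exponential, and the names
alternate_without_cod and peck_gap_s are hypothetical.

```python
import random


def alternate_without_cod(mean1_s, mean2_s, peck_gap_s, session_s, seed=0):
    """Concurrent VI-VI with no changeover delay and strict alternation.

    Returns (pecks, reinforcers), each a two-element list indexed by key.
    Assumes exponential intervals and evenly spaced pecks (illustrative only).
    """
    rng = random.Random(seed)
    means = [mean1_s, mean2_s]
    # Independent set-up times for the two schedules.
    setup = [rng.expovariate(1.0 / m) for m in means]
    pecks = [0, 0]
    reinforcers = [0, 0]
    t = 0.0
    key = 0
    while t + peck_gap_s <= session_s:
        t += peck_gap_s
        pecks[key] += 1
        if t >= setup[key]:
            # No COD: the first peck after a switch can be reinforced.
            reinforcers[key] += 1
            setup[key] = t + rng.expovariate(1.0 / means[key])
        key = 1 - key  # peck left, peck right, peck left, ...
    return pecks, reinforcers


if __name__ == "__main__":
    pecks, reinf = alternate_without_cod(20.0, 60.0, 0.5, 3600.0)
    print("peck proportion, key 1:", pecks[0] / sum(pecks))
    print("reinforcer proportion, key 1:", reinf[0] / sum(reinf))
```

Under these assumptions the peck proportion sits at 0.5 whatever the two VI
values are, while the reinforcers still divide roughly according to the
schedules, so the behavior ratio is plainly not forced to equal the schedule
ratio.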

> for fixed-ratio experiments it is exactly the schedule
> in bar presses required per reinforcement. If m is that ratio, then the
> matching law states that m1 = m2.

For concurrent ratio schedules this fact is well known. What happens
empirically is that the pigeons learn to respond exclusively to the schedule
offering the lower ratio, making matching trivially true, since then

P1/(P1 + P2) = R1/(R1 + R2) = either 0 or 1.

> Look what the equation says: it says that for any choice experiment, THE
> is what that big fat equal sign in the middle means. What the
> generalized matching law says is B1/R1 = B2/R2 = ... Bn/Rn: it says that
> all schedules of reinforcement in choice situations, in terms of total
> bar presses over total reinforcements or rates over rates, are the same.

The equation we have been working with is not the generalized matching law but
the matching law in its original form. The original form was developed as a
mathematical summary of empirical findings from concurrent VI-VI schedules.
Its original scope included only concurrent VI-VI schedules, but it was
natural for researchers to ask whether matching would be observed on other
concurrent schedules, or in other, similar situations such as multiple
schedules, in which two schedules of reinforcement take turns on a single key.
Empirical work showed that matching was systematically violated by certain
schedule combinations, such as concurrent VI-FI. A little fooling around with
the matching equation in ratio form showed that the data could be "fit" in
many of these schedules by raising (R1/R2) to some exponent and/or multiplying
it by some constant. These modifications produced the "generalized" matching
law:

B1/B2 = b*[(R1/R2)^k]

The constant b represents response bias, the tendency of subjects to prefer
one key over the other even when schedules on the two keys are identical. The
exponent k allows the equation to model "undermatching" (k < 1, when the
response ratio tends to be closer to 1.0, i.e., to indifference, than the
reinforcement ratio) and "overmatching" (k > 1, when the reverse is true).

In my view the generalized matching law is far less satisfactory than the
original as it introduces two free parameters whose values are derived from
the data rather than from any theoretical considerations. However, neither
the matching law (ORIGINAL flavor!) nor the generalized matching law (NEW and
IMPROVED!) was intended as anything more than an empirical summary of the kind
B. F. Skinner was so fond of (i.e., "functional relationships"). They represent
empirical relationships that an actual THEORY of behavior must explain. (Rick
Marken, are you listening?)
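
To make concrete what "derived from the data" means for b and k, here is a
minimal Python sketch of one common way to estimate them: take logs of
B1/B2 = b*[(R1/R2)^k] to get log(B1/B2) = log(b) + k*log(R1/R2), then fit a
straight line by least squares. The function names are hypothetical, and the
numbers in the usage lines are synthetic values generated from the equation
itself, not experimental data.

```python
import math


def generalized_ratio(r1, r2, b=1.0, k=1.0):
    """Response ratio B1/B2 predicted by the generalized matching law."""
    return b * (r1 / r2) ** k


def fit_bias_and_exponent(reinf_ratios, resp_ratios):
    """Least-squares fit of log(B1/B2) = log(b) + k * log(R1/R2).

    Returns (b, k). Needs at least two distinct reinforcement ratios.
    """
    xs = [math.log(r) for r in reinf_ratios]
    ys = [math.log(r) for r in resp_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                         # slope: the exponent
    b = math.exp(mean_y - k * mean_x)     # intercept: the bias
    return b, k


if __name__ == "__main__":
    # Synthetic illustration only: ratios generated with b = 1.2, k = 0.8.
    reinf = [0.25, 0.5, 1.0, 2.0, 4.0]
    resp = [generalized_ratio(r, 1.0, b=1.2, k=0.8) for r in reinf]
    print(fit_bias_and_exponent(reinf, resp))
```

Note that with b = 1 and k = 1 the equation reduces to the original matching
law, which is why the fitted values say nothing by themselves about WHY a
given schedule combination produces bias or undermatching.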

PCT is, I am certain, that theory. The actual problem illuminated by all that
research conducted to investigate the limits of the matching law is to
identify what perceptual variables the subjects of these experiments are
attempting to control. On the surface, the answer seems easy--being hungry,
they are trying to maximize their access to food. However, there are more
interesting questions here. Why, for example, when confronted with a
concurrent VI-FI schedule, do pigeons stubbornly respond at higher rates on
the VI key when such behavior leads to a higher ratio of responses to
reinforcement than could otherwise be attained? What is the perceptual
variable being controlled here? Clearly it is not average reinforcement rate,
for that would lead to matching rather than overmatching.

> This is not obvious from the way it is customarily written; you have to
> reduce the stated equations to simplest terms before you can see this
> "law" as an assertion. And then you can see that the assertion is false
> unless all the schedules of reinforcement are in fact the same, in terms
> of average ratios.

I am now in a position to address the above statements. The matching law as
an empirical generalization describes responding on VI-VI schedules, where the
observed proportions do not NECESSARILY reflect those that would be optimal in
terms of schedule parameters. (The empirical fact is that the observed
proportions are close to optimal.) It is not and never was intended to model
behavior on any type of schedule whatsoever. In some cases what is actually
observed are systematic deviations from matching that can be DESCRIBED by
introducing the parameters of the "generalized" matching law, but this only
provides a convenient functional representation of the data: the real work is
then to explain what conditions lead to bias, overmatching, matching, or
undermatching. Much of the research in this area has been conducted in an
attempt to identify those factors.

What if those studies showed that subjects are in fact optimizing, not the
rate of reinforcement, but the perceived delay to reinforcement, where this
perception is some non-linear (perhaps negative exponential) function of the
actual delay? Would finding the function that yields matching be tantamount
to discovering the perceptual variable being controlled?

Bill, I want to reemphasize my view that these problems can be approached much
more effectively by adopting PCT, which would give a Newtonian-style
theoretical account of the Keplerian matching law--and perhaps demonstrate
that it is a trivial consequence of control. That is in the (perhaps near)
future. Be that as it may, the matching law is not vulnerable to the
criticism that it states an obvious absurdity. That criticism depends on
incorrect assumptions about its scope and application.

Now what would be really fun to do is to develop a computer simulation of
concurrent schedule behavior, complete with those independent equations of
which you spoke, and blow these curve-fitters out of the water. Perhaps we
could collaborate on a paper for JEAB....



p.s. Thanks for the ARM1 source code!