PCT and op cond

[From Bill Powers (951028.0600 MDT)]

Bruce Abbott (951027.1600 EST) --

You seem to have been reading ahead in the text, and have somehow
managed to complete PCT 301, which is pretty close to all there is.
However, there is still one outstanding paper from 101, concerning
reinforced behavior in the presence of unsystematic disturbances of the
reinforcement rate.

I concur in your invitation to Chris Cherpas to join up and share the
work.

···

-----------------------------------------------------------------------
CHUCK TUCKER (951027) --

Nice exercise in separating observation from interpretation. For another
slant on the same subject, I recommend the Coin Game (B:CP p.235f).
-----------------------------------------------------------------------
Chris Cherpas (951027.1626 PT) --

      Matching is based on interval schedules. With concurrent ratios,
     you usually just get exclusive responding on the smaller ratio,
     which, of course, still yields the matching law, since it's based
     on obtained, not programmed, rates of reinforcement. If b1/r1 is
     the smaller ratio, you get: b1/(b1 + 0) = r1/(r1 + 0) ==> 1 = 1. Of
     course, the matching law itself doesn't tell you if the exclusive
     preference would be for the smaller or the larger ratio!

The amount of responding on each key for a ratio schedule doesn't affect
the "matching" relationship. b1/(b1 + b2) = r1/(r1 + r2) is
algebraically identical to b1/r1 = b2/r2. That is, the first equality is
just a more elaborate way of expressing the second one. With exclusive
responding on schedule 1, you have b1/r1 = m1, but

limit(b2/r2) = m2
  b2 --> 0

as before.

     cc:
      Viewed this way, matching is impossible in concurrent ratios
      unless the ratios programmed are identical. If one alternative is
      FR10 (10/1) and the other is FR20 (20/1), can 10/1 = 20/1?

Exactly. For all pairs of schedules that are not identical, the matching
law for ratio schedules is simply a mathematically false statement.

The matching law for ratio schedules is merely a way of describing the
apparatus, which assures that for every m presses there will be exactly
1 reinforcement. This ratio is unaffected by how often or how many times
or at what rate (or by what agency) the lever is pressed. Whatever leads
to the animals' pressing one lever more than the other can't be
expressed simply in terms of the ratio schedules.
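The point is easy to check numerically. Here is a minimal sketch; the
press counts are arbitrary, and ratio_schedule is nothing but the
apparatus rule, one reinforcer per m presses:

```python
def ratio_schedule(presses, m):
    """The apparatus on an FR-m schedule: exactly one reinforcer per m presses."""
    return presses / m

# Hypothetical press totals on two levers -- any allocation whatever.
b1, b2 = 137.0, 42.0

# Equal ratios: "matching" holds identically, regardless of allocation.
m = 10
r1, r2 = ratio_schedule(b1, m), ratio_schedule(b2, m)
assert abs(b1 / (b1 + b2) - r1 / (r1 + r2)) < 1e-12

# Unequal ratios (FR10 vs FR20): b/r just reads back each schedule's m,
# so the matching equality is false no matter how the presses are split.
r1, r2 = ratio_schedule(b1, 10), ratio_schedule(b2, 20)
assert abs(b1 / r1 - 10) < 1e-12 and abs(b2 / r2 - 20) < 1e-12
print(b1 / (b1 + b2), r1 / (r1 + r2))  # unequal unless the ratios match
```

The assertions pass for any press totals you substitute, which is the
point: the equalities describe the apparatus, not the behavior.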

Fixed-interval i schedules are equivalent to FR-1 for all rates of
behavior b such that 1/b > i, and produce a constant reinforcement rate
of 1/i when the behavior rate exceeds that critical rate. When noise
is added to this relationship, as in a variable-interval schedule, the
observations simply become more uncertain. There is still nothing that
constrains the behavior rate to have any particular relationship to the
scheduled reinforcement rate. Given the pattern of behavior rate, we
could calculate exactly, from the properties of the apparatus, the
pattern of reinforcement rate. But given only the schedule, we can't
calculate the behavior rate OR the reinforcement rate. Again, whatever
it is that makes the animals press one lever more than another can't be
expressed strictly in terms of the schedule.
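Under the idealization of perfectly regular responding, the obtained
reinforcement rate on FI-i reduces to min(b, 1/i). A two-line sketch
(ignoring the short lag between the interval elapsing and the next
press):

```python
def fi_reinforcement_rate(b, i):
    """Idealized obtained reinforcement rate on an FI-i schedule for
    steady responding at rate b.  Below the critical rate 1/i every
    press is reinforced (FR-1 territory); above it, the schedule caps
    reinforcement at one per interval."""
    return min(b, 1.0 / i)

i = 30.0  # a hypothetical 30-second fixed interval
print(fi_reinforcement_rate(0.02, i))  # slow pressing: r = b
print(fi_reinforcement_rate(0.50, i))  # fast pressing: r = 1/i, constant
```

Note that this function describes only the environment's side of the
loop; nothing in it constrains what b the animal will produce.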

      However, as Rachlin (1973) points out, if the first law of
     thermodynamics were found to be violated, one would assume
     something wrong with the procedure or measurement, not the
     (tautological) law.

This isn't the problem. The matching law for ratios doesn't actually say
anything about the pattern of behavior; it is strictly a description of
the schedule, but put into a form that disguises this fact. The
expression b1/r1 = b2/r2 (in whatever form this relationship is
expressed) will remain true even if the two keys are pressed at random.
Provided, of course, that it IS a correct statement, which is true only
if the two schedules are the same.

For interval schedules the extension of the ratio law is purely a shot
in the dark; there is no reason to suppose that b1/r1 = b2/r2 when the
schedules are nonlinear functions unless the functions are identical. In
fact, we can express the general matching law as

b1/f1(b1) = b2/f2(b2) = ...

where the functions f1, f2 and so on represent the (arbitrary,
generally different) schedules, so that ri = fi(bi). This relationship
can be true only for certain forms of the
functions, and is false for all other forms. The truth or falsity of the
asserted equality does not depend on empirical data, but only on the
rules of mathematics and the forms of the functions. No matter how
elaborate and complicated the expression, it is still describing only
the apparatus, and it either describes it correctly or incorrectly.
Nothing is said that determines the behavior.

    The matching law is a tautology (e.g., see Rachlin, 1971). What
    matches is local reinforcement rates (at steady state); matching
    theory requires "melioration" theory to explain why you get it.

It requires _some_ model of the organism, and the final behavior you get
depends on the parameters assigned to the model as well as on the
schedule. Somehow you have to supply the missing equations.

Also, in the few examples of matching data that I have seen, it seems
that behavior analysts have been driven to the same practices they decry
in mainstream psychology, using statistics to excuse rather large
differences between theory and data. The "matching" that is demonstrated
by experiments I have seen is not exactly an impressive demonstration
that matching occurs.

There is one test that nobody seems to have thought of. What would
happen if you simply applied random presses to the schedules and
recorded the resulting relationships between reinforcement rates and
"behavior" rates? This could easily be done in simulation. That kind of
result is the baseline that is needed against which actual data can be
compared to determine how much more predictability is added by the
theory.
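A rough sketch of such a baseline simulation, assuming a standard VI
implementation (a reinforcer is armed after an exponentially
distributed wait and collected by the next press) and, for simplicity,
sampling each schedule on its own timeline rather than as a true
concurrent pair:

```python
import random

def vi_session(press_prob, mean_interval, seconds=3600, rng=None):
    """Apply purely random presses to a VI schedule; return the obtained
    (behavior rate, reinforcement rate) in events per second.  Each
    second a press occurs with probability press_prob; the first press
    after the schedule arms collects the reinforcer."""
    rng = rng or random.Random(0)
    presses = rewards = 0
    armed = False
    wait = rng.expovariate(1.0 / mean_interval)
    for _ in range(seconds):
        wait -= 1.0
        if wait <= 0 and not armed:
            armed = True                       # reinforcer is now available
        if rng.random() < press_prob:
            presses += 1
            if armed:                          # random press collects it
                rewards += 1
                armed = False
                wait = rng.expovariate(1.0 / mean_interval)
    return presses / seconds, rewards / seconds

# Identical random "behavior" applied to VI-30 and VI-60 schedules:
b1, r1 = vi_session(0.25, 30.0, rng=random.Random(1))
b2, r2 = vi_session(0.25, 60.0, rng=random.Random(2))
print(b1 / (b1 + b2), r1 / (r1 + r2))
```

Whatever relationship falls out of runs like these is produced by the
apparatus alone, and is exactly the baseline against which any claimed
"matching" by real animals would have to be measured.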

Have you seen any evidence that behavior analysts know how to use
simultaneous equations to analyze a system?
-----------------------------------------------------------------------
Best to all,

Bill P.

[From Bruce Abbott (951028.1110 EST)]

Bill Powers (951028.0600 MDT) --

Bruce Abbott (951027.1600 EST) --

You seem to have been reading ahead in the text, and have somehow
managed to complete PCT 301, which is pretty close to all there is.
However, there is still one outstanding paper from 101, concerning
reinforced behavior in the presence of unsystematic disturbances of the
reinforcement rate.

Dear Professor Powers:

Enclosed is my final paper for PCT 101. I hope you can get it graded soon
so I know what courses to take next semester.

Sincerely,

Bruce

···

----------------------------------------------------------------------------
  REINFORCED BEHAVIOR IN THE PRESENCE OF DISTURBANCE TO REINFORCEMENT RATE

                            Bruce Abbott
                              PCT 101

In my previous paper I described how reinforcers can be defined as events
that, when made contingent on an action, tend to reduce error in the system
producing that action. In the prototypical operant procedure, error is
produced by food deprivation and the event that tends to correct this error
is delivery of food. If the error (deprivation level) is constant the
output function will develop a particular level of output (e.g., rate of
lever-pressing). That rate of output will produce a given rate of food
delivery, depending on the schedule imposed by the experimenter.

If the rate of food delivery is itself a controlled perception, then
disturbances to this rate will be opposed. One way to disturb food rate is
to interpose free (noncontingent) food deliveries between those produced by
the rat's lever-pressing actions. The extra deliveries would be expected to
raise the rate of food delivery above its reference level, developing an
error in the food-rate control system (perceived rate above reference rate).
This would produce a reduction in the rat's rate of lever-pressing and thus,
through the schedule parameters, a reduction in food-rate, reducing the
error. Withholding some proportion of scheduled (response-contingent) food
deliveries would be expected to have the opposite effect, reducing food rate
below reference and producing a compensatory increase in response rate. In
consequence the overall rate of food delivery will remain roughly constant
despite these experimenter-imposed disturbances.
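A minimal one-level control loop illustrates the predicted pattern.
The loop gain, slowing factor, and FR-10 schedule below are
illustrative choices, not a fitted model of the rat:

```python
def simulate(free_rate=0.0, withhold=0.0, ratio=10, ref=2.0,
             gain=500.0, slow=0.01, steps=500):
    """Control of perceived food rate (an illustrative sketch).  Output
    is lever-press rate; the environment converts it to food through an
    FR schedule, minus withheld deliveries, plus free food."""
    output = 0.0
    for _ in range(steps):
        food = (1.0 - withhold) * output / ratio + free_rate
        error = ref - food                        # reference minus perception
        output += slow * (gain * error - output)  # leaky-integrator output
        output = max(output, 0.0)                 # press rate can't go negative
    return output, (1.0 - withhold) * output / ratio + free_rate

b0, f0 = simulate()                        # baseline
b_free, f_free = simulate(free_rate=1.0)   # free food interspersed
b_cut, f_cut = simulate(withhold=0.5)      # half the scheduled food withheld
print(b_free, b0, b_cut)   # pressing falls with free food, rises when food is cut
print(f_free, f0, f_cut)   # obtained food rate stays near the reference
```

Free food depresses pressing and withheld food elevates it, while the
obtained food rate is defended near the reference in all three
conditions, which is exactly the compensation described above.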

The expected result is paradoxical from the point of view of reinforcement
theory, if one assumes that the disturbances have little effect on
deprivation level: interspersed free food deliveries would appear to
suppress responding on the lever, whereas deleted food deliveries would
appear to reinforce responding.

In order to simplify the explanation, I have ignored several important
details, such as the fact that the observed performance (lever pressing)
would involve several levels of control; in addition we have evidence that,
in at least one situation (ratio schedules), one kind of disturbance to
reinforcement rate (changing the ratio requirement) did not appear to
produce the expected compensatory changes in response rate (although we need
better data to confirm this conclusion). The predictions offered here are
therefore intended only to indicate how control theory can be applied to the
problem of disturbance to reinforcement rate, under the assumption that
reinforcement rate is in fact a controlled perception under these
conditions. Developing a proper model of actual rat behavior under these
schedules will require additional research to determine what perceptual
variables are in fact being controlled and how the various control systems
involved interact.
------------------------------------------------------------------------------