
[From Bill Powers (951215.0600 MST)]

Martin Taylor 951214 17:20 --

ME:
     In order to convert the positive exponential runaway predicted by
     the basic definition of a reinforcement effect into a negatively-
     accelerated curve, you must not only propose a source of cost, but
     you must propose that it increases nonlinearly, and at some
     behavior rate becomes large enough to halt the runaway effect.

YOU:
     I thought SS had done exactly that:

     Saunders:
     1. In order to engage in the target response, it is necessary to
     forgo other sources of reinforcement (scratching an itch;
     meditating on the implications of PCT; whatever) which are
     contingent on other behavior which is incompatible with the target
     response. The more of the target response is made, the greater the
     loss of other reinforcement. No runaway.

You are forgetting that additive combinations of linear functions
produce a net linear function.

Consider two behaviors, B1 and B2. B1 produces reinforcers at a rate R1
= k1*B1 and B2 produces them at a rate R2 = k2*B2. In a linear system,
therefore, if an increase in B2 implies a corresponding decrease in B1,
the net change in total reinforcement goes as (k2-k1)*B2. If k2 is greater
than k1, the increase in net reinforcement due to an increasing rate of
B2 is still positive and runaway will still occur.
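The linear argument can be sketched in a few lines of Python. The
constants k1 and k2 here are arbitrary illustrative values (not from any
data), and the feedback loop is a deliberately crude caricature of
"reinforcement raises the behavior rate":

```python
# Sketch of the linear reallocation argument (hypothetical parameters).
k1, k2 = 0.5, 0.8   # reinforcers per unit of behavior, k2 > k1

# Shift dB units of behavior from B1 to B2; the net change in total
# reinforcement is (k2 - k1) * dB, positive whenever k2 > k1.
def net_change(dB):
    return (k2 - k1) * dB

# Crude positive-feedback iteration: extra reinforcement raises B2
# further, so B2 grows geometrically -- runaway, with no leveling off.
B2 = 1.0
for _ in range(10):
    B2 += net_change(B2)
print(B2)  # 1.3**10, about 13.79
```

With any linear cost term the same thing happens: the net gain per unit
of B2 stays constant, so nothing stops the growth.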

If I'm wrong about the runaway, you'll have to prove it mathematically.

This argument is coming close to assuming that different kinds of
reinforcement are interchangeable; that a loss of reinforcement from a
running wheel implies a loss of food reinforcement. The interaction is
not through the reinforcements, but through the partitioning of
behaviors. If an increase in food-getting activities causes a decrease
in running-wheel activities, the running-reinforcement decreases, and
the tendency to engage in running behavior should begin to extinguish,
which reduces the competition for time. This should lead to an
accelerating curve for B2, not the decelerating curve that is needed.

In order to get the net behavior curve to rise at first and then level
out at some specific value, you have to do two things: name the
competing behaviors and assign specific reinforcement/decay functions to
them, and propose specific forms of nonlinearity for each function that
will cause the net reinforcement per action B2 to fall back to zero at
high rates of behavior -- at just the rate of behavior where the real
behavior is observed to level out (or a little higher if there is a
decay term assumed in order to produce extinction).

I wouldn't venture to guess at the actual outcome when you have two or
more nonlinear positive-feedback systems in conflict. The result will
depend not only on the relative reinforcement constants, but on the
exact forms of nonlinearity that are chosen. Determining the parameters
from the data for both (or all) systems would be quite a job.

YOU:
     As you demand, the cost is low when the operant takes little of the
     subject's resources, so that the other sources of reinforcement are
     only slightly affected. As the operant comes to take a substantial
     amount of time, the proportionate effect on the other possible
     activities grows until at some fixed level of the operant there is
     nothing left for the other sources of reinforcement--effectively
     infinite cost.

No, not infinite cost; merely a cost equal to the benefit. The net
reinforcement is not the ratio of benefit to cost, but the difference.
The costs burn up some of the reinforcement effect, leaving a net amount
of reinforcement. Are you really proposing that a loss of running-wheel
reinforcement is equivalent to a loss of food reinforcement?

The main problem with purely verbal arguments is that you can't keep in
mind all the relationships you have proposed; you tend to view the
system through a small moving window that takes only part of the whole
system into account at a given time. You focus on implications of the
statement under immediate consideration, and forget the implications of
statements you made before.

When you convert from words into mathematical statements, ALL the
proposed relationships have to be considered simultaneously; that is
what simultaneous equations mean, and in solving them simultaneously,
you are bringing in ALL the constraints that have been stated. Modelers
need the same skill that good liars need: they have to remember
everything they have said. That's much easier to do with mathematics --
in fact, the mathematics does it for you. Simulations do the same thing.

YOU:
     It seems to me that this cost function goes up somewhat faster than
     exponentially, and must "become large enough to halt the runaway
     effect."

I don't think you know that. But if you can demonstrate it I'll believe
you.


-----------------------------------------------------------------------
Bob Clark (951214.1440 EDT) --

Your suggestion about the relationship between the autonomic nervous
system and the reorganizing system is very interesting. I would like to
see more information on that. Other candidates (not mutually exclusive)
are the limbic system and the reticular formation, systems widely
connected to all parts of the brain. Unfortunately, I'm too far out of
touch with modern neurological research to say anything useful about
this.
-----------------------------------------------------------------------
Best to all,

Bill P.

[Martin Taylor 951218 15:00]

Bill Powers (951215.0600 MST)

Let me say before starting that I think the "reinforcement" notion that
underlies this discussion is as uninteresting as the precise layout of
epicyclic centres. But it is still an intellectual exercise to make the
epicycles fit the planetary orbits, so I am responding to Bill's request:

If I'm wrong about the runaway, you'll have to prove it mathematically.

Here goes:

Martin Taylor 951214 17:20 --

WTP:
    In order to convert the positive exponential runaway predicted by
    the basic definition of a reinforcement effect into a negatively-
    accelerated curve, you must not only propose a source of cost, but
    you must propose that it increases nonlinearly, and at some
    behavior rate becomes large enough to halt the runaway effect.

MMT:
    I thought SS had done exactly that:

    Saunders:
    1. In order to engage in the target response, it is necessary to
    forgo other sources of reinforcement (scratching an itch;
    meditating on the implications of PCT; whatever) which are
    contingent on other behavior which is incompatible with the target
    response. The more of the target response is made, the greater the
    loss of other reinforcement. No runaway.

You are forgetting that additive combinations of linear functions
produce a net linear function.

No, I wasn't forgetting. I was just under the impression that we aren't
talking about linear functions, and that SS had proposed what you ask
for: a source of cost that increases nonlinearly and at some behavior
rate becomes large enough to halt the runaway effect.

Consider two behaviors, B1 and B2. B1 produces reinforcers at a rate R1
= k1*B1 and B2 produces them at a rate R2 = k2*B2. In a linear system,
therefore, if an increase in B2 implies a corresponding decrease in B1,
the net change in total reinforcement goes as (k2-k1)*B2. If k2 is greater
than k1, the increase in net reinforcement due to an increasing rate of
B2 is still positive and runaway will still occur.

You are treating B1 and B2 as if they had access to infinite resources, and
therefore had zero cost. I thought Saunders was dealing with costs, where:

    1. In order to engage in the target response, it is necessary to
    forgo other sources of reinforcement...
            ... The more of the target response is made, the greater the
    loss of other reinforcement.

The cost function I envisage can be seen by analogy. To a person with $2,
the "cost" of a $1 item is greater than it is to a person with $10. After
the item has been purchased, the first person is left with $1 for all other
purchases, whereas the second person is left with $9. I therefore make the
first-order presumption that the "cost" of the $1 item is 9 times greater
for the first person than for the second. This presumption is clearly
simplistic but probably OK for a first-order approach to the problem.
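The money analogy can be checked directly. The cost function here is the
one proposed in the text, C1/(C0-C1), applied to the two people in the
example:

```python
# Taylor's first-order cost presumption: the cost of spending c out of
# a total resource c0 is taken to be c / (c0 - c).
def cost(c, c0):
    return c / (c0 - c)

rich = cost(1, 10)   # $1 item, $10 in pocket -> 1/9
poor = cost(1, 2)    # $1 item, $2 in pocket  -> 1
print(poor / rich)   # 9.0: nine times greater for the poorer person
```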

I don't believe my simple relationship is likely to be correct, but
then I don't believe the whole notion of reinforcement, anyway. Any
sufficiently rising function will do the same job, and the cost of using
the last unit of resources must be extreme, because it could be fatal
to the organism if another requirement for resources came up when the
last unit was already in use.

In the case of the behaviours B1 and B2, the presumption is that
B1 takes C1 of a limited resource C0, and the "cost", C, of doing so is
C = C1/(C0-C1). If B2 is also taking up resources, the "effective" value
of C0 for B1 is reduced by C2. Forgetting B2 for the moment, the marginal
cost of B1 is dC/dB1, where C1 = h*B1, and h has the dimension "quantity
of resource/units of behaviour".

dC/dB1 = h*C0/(C0-h*B1)^2 (if I got the algebra right). This rises with B1,
and approaches infinity for h*B1 approaching C0, when B1 is taking up almost
all the available resource. If at the same time, B2 is taking C2 of the
same resource, the effective C0 for behaviour B1 is (C0-C2), and the
marginal cost for increasing B1 goes to infinity for B1 approaching (C0-C2)/h.
So competing reinforcements would lead to stabilization at different values
of behaviour. Whether this is what "matching" is about, I cannot say.
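The algebra can be checked numerically. This sketch uses illustrative
values for h and C0 (my choices, not anything from the discussion) and
compares the stated derivative against a central finite difference:

```python
# Check of the marginal-cost formula from the text:
#   C(B1)  = h*B1 / (C0 - h*B1)
#   dC/dB1 = h*C0 / (C0 - h*B1)**2
# h and C0 are illustrative values only.
h, C0 = 0.5, 10.0

def C(B1):
    return h * B1 / (C0 - h * B1)

def dC_analytic(B1):
    return h * C0 / (C0 - h * B1) ** 2

B1, eps = 4.0, 1e-6
dC_numeric = (C(B1 + eps) - C(B1 - eps)) / (2 * eps)
print(dC_analytic(B1), dC_numeric)  # the two estimates agree
```

The analytic form also shows the claimed blow-up: as h*B1 approaches C0,
the denominator goes to zero and the marginal cost grows without bound.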

You are treating only the "values" of the reinforcers R1 and R2, when you say

B1 produces reinforcers at a rate R1 = k1*B1 and B2 produces them at
a rate R2 = k2*B2. In a linear system,
therefore, if an increase in B2 implies a corresponding decrease in B1,
the net change in total reinforcement goes as (k2-k1)*B2.

I don't think the cost function comes into this way of looking at the
cause-effect relation, but if you look at it the other way around as if the
reinforcement caused the behaviour:

B1 = R1/k1

then you can put the cost function with R1, since the value of a reinforcer
is a net value rather than a gross value. If it costs more to get a reinforcer
than the reinforcer is worth, it won't act as a reinforcer (or so I
understand the argument to run). It will be a "punisher". R1 is the
"intrinsic" value of a reinforcer--which itself is presumably a function
of how much of the reinforcer the subject already has, though that's another
story I'm not proposing to get into. So we can write:

B1 = (R1 - C'1)/k1 where R1 is now the value of getting one reinforcer,
and C'1 is the cost of getting it, which is C1/(C0-C1):

As B1 increases, there comes a point where R1 = dC'1/dB1 = h*C0/(C0-h*B1)^2.
The marginal cost of getting a reinforcer has grown to equal the value
of getting it. This is where the runaway stops.
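Under these assumed forms the stopping point can be solved in closed
form. A sketch with hypothetical values for h, C0 and R1 (chosen only so
the solution lies in the valid range h*B1 < C0):

```python
import math

# Runaway stops where the reinforcer's value equals the marginal cost:
#   R1 = h*C0 / (C0 - h*B1)**2
# Solving for B1:  C0 - h*B1 = sqrt(h*C0/R1)
h, C0, R1 = 0.5, 10.0, 2.0

B1_stop = (C0 - math.sqrt(h * C0 / R1)) / h
print(B1_stop)  # behavior rate at which marginal cost equals value
```

Plugging B1_stop back into the marginal-cost formula recovers R1
exactly, and raising R1 pushes the stopping point closer to the
resource ceiling C0/h, as the argument implies.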

This is more or less what I assumed Saunders to mean by the competition
among behaviours and how it affects the stability of behaviour. With
an infinite resource supply, and therefore zero cost per unit of behaviour,
the system would be as you say. But the cost of the last unit of resources
cannot be the same (for the organism as a whole) as the cost of the first
unit.

If he didn't mean that or something like that, then I retract. And as usual
when I get into algebraic manipulations, I wouldn't be at all surprised to
find errors of sign or missing scale factors somewhere or other. But the
argument is straightforward, regardless.

Martin