[From Bill Powers (950704.0100 MDT)]

Bruce Abbott --

Here is the operant conditioning model I developed and posted last year,

or perhaps it was two years ago, to fit the Motherall data (and, I hope,

other data as well). The following development expresses the

relationships entirely in terms of observable variables in EAB terms,

without any model of the internal organization of the animal. I have

added a "value" parameter and a "relative time" parameter; in the model

I posted last fall, I lumped all the gain parameters into one, but the

effect is the same.

----------------------------------------

The satiation level of reinforcement rs can be defined as that level of

average obtained reinforcement ra at which the average behavior ba will

just go to zero. When ra is less than rs, average behavior ba will occur

at a rate proportional by km to the difference (this is the

"motivation due to deprivation"):

ba = km*(rs - ra)

The average obtained reinforcement is the average of the feedback

function ff of the behavior measure plus the noncontingent reinforcement

rn (in my model, rn was zero):

ra = average(ff(ba) + rn)

The "value" kv of the reinforcer to the organism is measured by an

increase in behavior due to increasing some aspect of the reinforcer

such as its size. This requires kv to be used in a position where

increasing kv will increase average behavior:

ba = kv*km*(rs - ra)

The measured average amount of behavior depends on the fraction of the

time kt that the animal actually spends performing the behavior b, as

opposed to some other behavior, where kt varies from 1 (full-time

behavior) to 0 (none of the time spent on the behavior). As kt decreases

from 1, the apparent average behavior rate also decreases. Thus kt must

also appear in the expression for ba:

ba = kt*kv*km*(rs - ra)

If the average behavior ba consists of continuous behavior bc for a

time tc alternating with zero behavior for a time tz, the

continuous behavior bc can be obtained from ba by

bc = [(tc + tz)/tc]*ba, tc > 0.
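
A quick numeric check of this conversion, as a minimal sketch. The function name and the sample numbers are illustrative assumptions, not values from the Motherall data.

```python
def continuous_rate(ba, tc, tz):
    """Recover the rate during active periods (bc) from the average rate (ba).

    tc is the time spent behaving and tz the time spent not behaving,
    in the same units; tc must be positive.
    """
    if tc <= 0:
        raise ValueError("tc must be greater than zero")
    return (tc + tz) / tc * ba


# Hypothetical numbers: 30 responses/min on average, with 20 s bursts
# alternating with 10 s pauses.
print(continuous_rate(ba=30.0, tc=20.0, tz=10.0))
```

With these numbers the animal is active two-thirds of the time, so the rate while it is actually responding comes out at 1.5 times the average rate.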

This gives us the two main system equations:

(1) ba = kt*kv*km*(rs - ra)

(2) ra = average(ff(ba) + rn)

Note that rn will tend to reduce behavior, by increasing ra and thus

reducing the motivation.
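
To see how equations (1) and (2) close the loop, here is a minimal sketch that solves them for the steady state. It assumes a fixed-ratio feedback function, ff(ba) = ba/ratio, and lumps kt*kv*km into a single constant called gain. The function name, the clamp at zero behavior, and the numbers are illustrative assumptions, not fitted values.

```python
def fixed_ratio_steady_state(gain, rs, ratio, rn=0.0):
    """Solve equations (1) and (2) for an assumed fixed-ratio schedule.

    gain  : the product kt*kv*km, treated as one constant
    rs    : satiation level of reinforcement
    ratio : responses per reinforcer, so ff(ba) = ba / ratio (assumption)
    rn    : noncontingent reinforcement
    Returns (ba, ra).
    """
    drive = rs - rn
    if drive <= 0:
        # Assumption: behavior rates cannot go negative, so behavior stops.
        return 0.0, rn
    # Substitute ra = ba/ratio + rn into ba = gain*(rs - ra), solve for ba.
    ba = gain * drive / (1.0 + gain / ratio)
    ra = ba / ratio + rn
    return ba, ra


for ratio in (1, 10, 40):
    ba, ra = fixed_ratio_steady_state(gain=10.0, rs=5.0, ratio=ratio)
    print(f"FR-{ratio}: ba = {ba:.2f}, ra = {ra:.2f}")
```

With these made-up constants, behavior climbs from about 4.5 at FR-1 to 40 at FR-40 while obtained reinforcement falls from about 4.5 to 1. Passing rn=1.0 at FR-10 lowers ba from 25 to 20, which is the effect of noncontingent reinforcement noted above.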

---------------------------------------------

These equations apply to the right-hand side of the Motherall curve. As

we move left on that curve, the observed behavior rate begins to fall

below the straight line predicted by the above equations, with the curve

turning downward. One possible explanation is that as the average

motivation ma increases beyond some critical value, the animal begins to

spend more time on other behaviors, thus reducing the value of kt.

Average motivation ma is defined as

ma = (rs - ra)

A linear point-slope model for the effect of ma on kt can be defined as

(3) kt = kt0 - k1*(ma - mc), ma >= mc

where mc is a critical amount of motivation.

When this expression is used for kt in equation (1) and the constants

are properly adjusted, the model fits the Motherall data (for body

weight equal to 80% of normal) over the entire range of ratios. Other

mathematical forms may give nearly the same results.
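
The following is a rough sketch of how equations (1) through (3) might be solved together for a fixed-ratio schedule. It is not the program I posted. The feedback function ff(ba) = ba/ratio, the clamp that keeps kt from going negative, the bisection solver, and every constant are assumptions chosen for illustration; none of them is fitted to the Motherall data. Bisection returns one steady state, and for some constants there could be more than one.

```python
def time_fraction(ma, kt0, k1, mc):
    """Equation (3): kt falls linearly once motivation ma exceeds mc."""
    if ma < mc:
        return kt0
    return max(0.0, kt0 - k1 * (ma - mc))  # clamp at zero: an assumption


def solve_fixed_ratio(ratio, rs, kv, km, kt0, k1, mc, rn=0.0, steps=60):
    """Find ma such that ma = rs - ra, ra = ba/ratio + rn, ba = kt*kv*km*ma.

    Returns (ba, ra, ma).
    """
    def behavior(ma):
        return time_fraction(ma, kt0, k1, mc) * kv * km * ma

    lo, hi = 0.0, max(0.0, rs - rn)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        # A positive residual means the guessed motivation is too small.
        if rs - rn - behavior(mid) / ratio - mid > 0:
            lo = mid
        else:
            hi = mid
    ma = 0.5 * (lo + hi)
    ba = behavior(ma)
    return ba, ba / ratio + rn, ma


# Illustrative constants only.
params = dict(rs=5.0, kv=1.0, km=10.0, kt0=1.0, k1=0.3, mc=2.0)
for ratio in (1, 2, 5, 10, 20, 40):
    ba, ra, ma = solve_fixed_ratio(ratio, **params)
    print(f"FR-{ratio}: ra = {ra:.2f}, ba = {ba:.2f}, ma = {ma:.2f}")
```

With these constants the behavior rate rises from FR-1 through FR-10 and then falls at FR-20 and FR-40, which is the downturn on the left side of the curve.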

--------------------------------------

The loop gain of this control system is kt*kv*km*(partial of ra with

respect to ba). The output sensitivity is kt*kv*km, which I treated as a

single gain constant in the posted model.

This model makes average behavior a single-peaked function of average

reinforcement, with a maximum point. Left of the maximum point, average

behavior increases as average reinforcement increases. Right of the

maximum point, average behavior decreases as average reinforcement

increases, reaching zero when the average reinforcement reaches the

satiation level. The effects of costs of behavior can be absorbed into

km if they are assumed linear.

The position of the maximum of the curve depends on the output

sensitivity kt*kv*km, the critical motivation value mc, and the

parameter k1.

For any range of schedules of reinforcement and constant values of k1

and mc, the output sensitivity can be adjusted so that behavior rises

with reinforcement over the whole range, decreases with reinforcement

over the whole range, or first rises and then decreases with

reinforcement as in the Motherall data. In general, decreasing any

output sensitivity factor (kt, kv, or km) will move the operating region

toward the condition of increased reinforcement going with increased

behavior; the peak of the curve will move to the right.
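
The three cases can be illustrated with a small sketch that holds k1 and mc constant and changes only kv over one fixed range of ratio schedules. As before, the fixed-ratio feedback function, the clamp on kt, the bisection solver, and all of the constants are illustrative assumptions, not fitted values.

```python
RS, KM, KT0, K1, MC = 5.0, 10.0, 1.0, 0.3, 2.0  # illustrative, not fitted


def behavior(ma, kv):
    """Equations (1) and (3) with ma = rs - ra; kt clamped at zero (assumption)."""
    kt = KT0 if ma < MC else max(0.0, KT0 - K1 * (ma - MC))
    return kt * kv * KM * ma


def operating_point(ratio, kv, steps=60):
    """Bisect on ma for an assumed fixed-ratio schedule, ra = ba / ratio."""
    lo, hi = 0.0, RS
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if RS - behavior(mid, kv) / ratio - mid > 0:
            lo = mid
        else:
            hi = mid
    ba = behavior(0.5 * (lo + hi), kv)
    return ba / ratio, ba  # (ra, ba)


def trend(kv, ratios=(1, 2, 5, 10, 20, 40)):
    """Describe how ba changes as ra increases across the schedule range."""
    points = sorted(operating_point(n, kv) for n in ratios)  # sorted by ra
    rates = [ba for _, ba in points]
    rising = [later > earlier for earlier, later in zip(rates, rates[1:])]
    if all(rising):
        return "rises"
    if not any(rising):
        return "falls"
    if rising == sorted(rising, reverse=True):
        return "rises then falls"
    return "other"


for kv in (0.1, 1.0, 10.0):
    print(f"kv = {kv}: behavior {trend(kv)} with reinforcement")
```

With these constants, kv = 0.1 gives behavior that rises with reinforcement over the whole range FR-1 to FR-40, kv = 10 gives behavior that falls over the whole range, and kv = 1 gives the rise followed by a fall.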

So, for example, by decreasing the "value" of the reinforcer (as by

decreasing its size), the behavior can be made to rise monotonically

with increases in reinforcement over the whole range of schedules. This,

I propose, is how the general rule of "more reinforcement, more

behavior" was initially established, and how, in fact, the concept of

"reinforcement" gained credence. As long as the product kt*kv*km is kept

low enough, this rule will apply. It is always possible, therefore, to

set up an experiment to prove that an increment of reinforcement will

cause an increment of behavior. All that is required is to keep the

product kt*kv*km small enough, which can be done by manipulating kv as

by reducing the size of the reinforcer.

When this relationship is not the primary subject, a large reinforcer

can be used (large value of kv), moving the peak of the curve far to the

left. Then as the schedule of reinforcements, starting with an easy

schedule such as FR-1, is changed to produce less and less average

reinforcement per unit behavior, the behavior will rise as the

reinforcement decreases, leading to very large values of the behavior at

low rates of obtained reinforcement. This is how Skinner demonstrated

shaping the pecking behavior of pigeons to very high rates. He did not

seem to notice that the "normal" relationship of reinforcement rate to

behavior rate had reversed.

-----------------------------------------

In this way of developing the model, we accept certain relationships

without explaining why they hold -- for example, why excessive levels of

motivation lead to spending less of the total time on the behavior in

question. We accept the satiation level as given and fixed. In a more

complete model, we would try to relate a variable satiation level to,

for example, weight gain and loss, and to relate excessive motivation to

the commencement of a trial-and-error search for other sources of

reinforcements. The latter consideration, starting with the search in

progress, would come to explain the apparent "selective" effects of

reinforcement.

-----------------------------------------

All of this, as you know, is pure PCT. Several phenomena that are new

may be explained by this model, particularly the shift of the peak of

the Motherall curve with changes in certain parameters. The apparent

effect of reinforcement on behavior is explained in terms of the model,

with the "standard" effect appearing only over a certain range of

parameters. The reinforcement variable does not play any special role in

behavior other than the apparent one; it does no "maintaining" of

behavior, although the observed relationships can be interpreted in that

way. Causation is completely circular, with the only independent

variables being rs and rn: the satiation level of reinforcement and the

noncontingent reinforcers.

Does this bear any resemblance to the model you're working on?

-----------------------------------------------------------------------

Best,

Bill P.