Killeen Model

[From Bruce Abbott (960122.2115 EST)]

Bill Powers (960122.0930 MST) --

Nice exposition of Killeen's forward organism equation, Bill. Some comments
follow.

                aR
(2) B' = Bmax* ------
              aR + 1

So this is the forward equation of the organism, showing how behavior
rate depends on reinforcement rate.

This isn't the form Killeen uses, but it's the same equation. It says
that at low reinforcement rates (aR << 1), the behavior rate is
proportional to the reinforcement rate, and at high reinforcement rates
(aR >> 1), B' approaches Bmax.

Actually, Killeen _does_ present the equation in this form on some occasions
(using k for Bmax; see equation 3 in the 1995 _JEAB_ article). The product
aR is what Killeen defines as A, or "arousal," in the 1995 paper.
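
A short numerical sketch of equation (2) makes the two regimes concrete
(the values of a and Bmax below are arbitrary illustrations, not parameters
fitted to any data):

```python
def behavior_rate(R, a=0.01, Bmax=100.0):
    """Killeen's forward organism equation (2): B' = Bmax * aR / (aR + 1).
    The values of a and Bmax are arbitrary illustrations."""
    return Bmax * a * R / (a * R + 1.0)

# Low reinforcement rate (aR << 1): behavior rate is nearly proportional to R,
# so doubling R nearly doubles B'.
ratio_low = behavior_rate(2.0) / behavior_rate(1.0)   # close to 2

# High reinforcement rate (aR >> 1): behavior rate saturates near Bmax.
b_high = behavior_rate(1e6)                           # close to 100
```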

If we multiply the original relationship between B' and R by zeta, we
get

                      aR
(2b) B' = zeta*Bmax* ------
                    aR + 1

This, supposedly, is the final organism equation showing how behavior
rate depends on rate of reinforcement. However, this equation is a
function of N (which is part of zeta), so there is a different organism
equation for every different ratio: both B and R are functions of N.
Killeen says nothing about how the organism detects the value of N.
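
To see the point about N-dependence, here is a minimal sketch of equation
(2b) with zeta treated as a free parameter (the zeta values, a, and Bmax
below are illustrative assumptions, not Killeen's fitted numbers):

```python
def behavior_rate_coupled(R, zeta, a=0.01, Bmax=100.0):
    """Equation (2b): B' = zeta * Bmax * aR / (aR + 1).
    Since zeta depends on the ratio requirement N, each ratio value
    yields a different organism equation.  a and Bmax are illustrative."""
    return zeta * Bmax * a * R / (a * R + 1.0)

# Same reinforcement rate, different couplings (illustrative zeta values):
# a different predicted behavior rate for each ratio.
b_weak = behavior_rate_coupled(100.0, zeta=0.34)
b_strong = behavior_rate_coupled(100.0, zeta=0.61)
```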

This is where all the stuff about some representation of the responses being
added into short-term memory comes into play. As each "response" occurs
(whether "target" or "other"), its representation is added to STM; and the
representations of previously experienced responses "decay" to some
fraction of their previous values. The current representation of the
"target" response is the sum of all these decaying representations. When a
target response is finally followed by an incentive, the current
representations of all responses (both target and other) are "associated"
with the incentive delivery according to the current strengths in memory.
The strength of this association is reflected in the "coupling" between the
incentive and the various responses, both target and nontarget. Killeen
says that the incentive delivery, by filling STM with representations of the
consummatory responses, tends to effectively remove the target and other
responses from STM so that their representations are essentially zero after
the incentive has been dealt with.

If the STM representations decay at rate beta and STM begins each new ratio
run essentially reset to zero, and if all responses in the behavior stream
are target responses, then the current representation of the target response
in memory after the jth response is M = beta*[(1-beta)^(1-j)]. The coupling
between incentive and target responses is the value of M when the incentive
is delivered, which on the FR schedule is on the Nth response. There is
thus no requirement that the organism "detect" the value of N; the incentive
just happens on the Nth response (because of the schedule function) and the
state of the organism at that moment determines the coupling.

M is multiplied by rho, the proportion of responses that are target
responses, to give the steady-state value of zeta, the coupling coefficient.
By entering target response values as 1.0 and nontarget response values as
0.0, one can arrive at the value of zeta directly in the iterative version
of the equation for M, which is just our old buddy the leaky integrator:

   M = beta*y + (1 - beta)*M,

where M is initialized at zero after each incentive delivery and y is either
1.0 or 0.0 depending on whether the current "response" is or is not a target
response.
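
This iterative scheme is easy to simulate. A sketch, under the assumptions
just stated (all responses in the ratio run are targets, M resets to zero
after each incentive) plus one added reading: the coupling is taken as the
memory state at the moment the Nth response produces the incentive, i.e.,
after the N-1 preceding updates:

```python
def coupling_fr(N, beta=0.10):
    """Leaky-integrator memory: M = beta*y + (1 - beta)*M, with y = 1.0
    for target responses.  M starts at zero after each incentive; the
    coupling is read off as M when the Nth response delivers the incentive
    (after the N-1 preceding updates) -- an assumed reading, equivalent
    to the closed form 1 - (1 - beta)**(N - 1)."""
    M = 0.0
    for _ in range(N - 1):
        M = beta * 1.0 + (1.0 - beta) * M
    return M

# With beta = 0.10, an FR-5 run gives a coupling of about 0.3439.
zeta_fr5 = coupling_fr(5)
```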

The real problem for Killeen's scheme is a familiar one: defining
"responses." Killeen tells us how to compute the "representation" of the
response but does not really tell us what a "response" is. He seems to view
behavior as a continuous stream that may or may not include the instrumental
act ("target" response) but does not tell us how to parse that stream. And
like most behaviorists he seems to take the response as some repeatable (and
repeated) movement or sequence of movements, apparently unaware of the
difficulties variable environmental disturbances pose for such an analysis.
So long as one is dealing with a repeating _consequence_ of varying
behavioral acts, one can treat all acts having this consequence as instances
of the "same" target response, perhaps even narrowing the definition a bit
by observing the minimal repeating acts required to produce the consequence.

Thus, once the pigeon is standing within a comfortable striking distance of
the key, "responses" can be defined as the cycle of forward head-thrust,
leading to the striking of the key with sufficient force to close the
switch, and the return movement that carries the head back to its initial
position. But this analysis fails to explain why the pigeon moves to the
key in the first place and would use a different set of muscles to strike
the key from a different angle if required to do so by the placement of some
partial obstruction between the pigeon and the key.

The basic notion behind Killeen's scheme is that organisms will tend to
repeat whatever they were doing at the time an incentive suddenly appears,
with the "whatever" being encoded as a series of act memories which are
rapidly fading as new acts enter STM. The notion that they will "do again"
that which they "did before" seems a reasonable one, but
"that-which-is-done" needs to be redefined in terms of reference signals of
control systems at appropriate levels. The pigeon is pecking the key, not
because some incentive has forced it to emit a forward-thrust of the head,
but because it wants to repeat the act of striking the key with its beak.

An important theoretical question to resolve is how the delivery of the
incentive leads to the establishment of this reference.

Regards,

Bruce

[From Bruce Abbott (960123.1005 EST)]

Bruce Abbott (960122.2115 EST)

If the STM representations decay at rate beta and STM begins each new ratio
run essentially reset to zero, and if all responses in the behavior stream
are target responses, then the current representation of the target response
in memory after the jth response is M = beta*[(1-beta)^(1-j)]. The coupling

I'm sure having trouble typing in formulas lately. This one should be:

  the sum from i = 1 to j-1 of beta*[(1-beta)^(i-1)], which sums to:

  M = 1-(1-beta)^(j-1)

If beta = 0.10, then after 5 responses,

  M = 1-(1-0.10)^4 = 1-(0.9)^4 = 1-0.6561 = 0.3439.

zeta is then rho*M. Asymptotically in ratio schedules, rho = 1.0. On an
FR-5 schedule, zeta will equal 0.3439 at the time of incentive delivery.
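
As a quick check of the corrected closed form (rho = 1.0 is the asymptotic
ratio-schedule value stated above):

```python
beta = 0.10
j = 5                                  # FR-5: incentive on the 5th response
M = 1.0 - (1.0 - beta) ** (j - 1)      # corrected closed form
rho = 1.0                              # asymptotically, all responses are targets
zeta = rho * M                         # coupling at incentive delivery
```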

Rick Marken (960122.2130) --

The pigeon is pecking the key, not because some incentive has forced
it to emit a forward-thrust of the head, but because it wants to repeat
the act of striking the key with its beak.

Actually, it probably wants to repeat the _perception_ of striking the
key; and it probably wants to repeat it so that it can keep perceiving
the "incentive" at the desired level. But these are empirical, not
theoretical, possibilities.

No, it wants to repeat the act. It doesn't know the difference between
repeating the act of striking the key and repeating the perception of
striking the key. Perception IS reality.

Your statement about these being empirical, not theoretical, possibilities
needs some explanation. Until they are empirically confirmed, do they not
remain conjecture and thus theoretical? And isn't that at the heart of this
discussion -- which theory best explains the observations?

An important theoretical question to resolve is how the delivery of the
incentive leads to the establishment of this reference.

The important theoretical question to resolve, I believe, is
_whether_ delivery of the incentive leads to the establishment
of a reference for perceiving the key struck. The theoretical answer,
I believe, will be an emphatic (and maddening) "no"; the delivery
of the incentive does not "lead" to the establishment of a reference
for perceiving the key struck (if this reference exists); rather, the
reference for perceiving the key struck is established (by the
reorganization control system) because it allows control of the
delivery of the "incentive".

Let me see if I understand you correctly. You are asserting (sans evidence)
that the emergence of the correct output for control of incentive delivery
(reference for the pecking system, etc.) is unrelated to the appearance of
the incentive (over which the pigeon would like to establish control)
immediately following a keypeck. I take it that you are wedded to the
random reorganization notion. Fine, but keep in mind that it is only one
theoretical possibility among several, and that these are not mutually
exclusive.

Regards,

Bruce

[From Rick Marken (960123.0800)]

Bruce Abbott (960123.1005 EST) --

Me:

Actually, it probably wants to repeat the _perception_ of striking the
key

Bruce:

No, it wants to repeat the act. It doesn't know the difference between
repeating the act of striking the key and repeating the perception of
striking the key. Perception IS reality.

I assumed that you used the word "act" as we use it in PCT, to refer to
the unperceived (and necessarily variable) outputs that influence the state
of a controlled perception. A control system varies its acts to produce the
desired perception. I presume that a pigeon pecking at a key wants to repeat
the perception of a struck key; it does not want to repeat the acts that
produce this result on any particular occasion. The acts (variations in
muscle forces) that produce the perception of a struck key must be different
each time the pigeon strikes the key since it is in a slightly different
orientation with respect to the key each time; organisms repeat (control)
perceptions (ends); not acts (means).

Bruce:

An important theoretical question to resolve is how the delivery of the
incentive leads to the establishment of this reference.

Me:

the delivery of the incentive does not "lead" to the establishment of a
reference for perceiving the key struck (if this reference exists); rather,
the reference for perceiving the key struck is established (by the
reorganization control system) because it allows control of the delivery of
the "incentive".

Bruce:

Let me see if I understand you correctly. You are asserting (sans evidence)
that the emergence of the correct output for control of incentive delivery
(reference for the pecking system, etc.) is unrelated to the appearance of
the incentive

Not quite. I am saying what Bill Powers (960123.0600 MST) said to you --
only more politely;-) Bill said:

Aw, Bruce! How about the theoretical question of what establishes
the reference for the incentive itself, turning an ordinary sensory
experience into a controlled variable?

That's because he apparently understood your initial statement that
"incentive leads to the establishment of this reference" the same way I did;
as a statement about the causal influence of an environmental (or sensory)
variable (the "incentive") on behavior (the reference). In fact, the variable
that conventional psychologists call an "incentive" or "reinforcer" is
actually a controlled variable; it doesn't "lead to" anything (it's part
of a closed loop); rather, the perceptual representation of that variable
is under control.

So your "important theoretical question" (about how the delivery of the
incentive leads to the establishment of this reference) is only important
from a reinforcement theory perspective. From a control theory perspective,
it's neither important nor meaningful.

Best

Rick

[From Bruce Abbott (960123.1305 EST)]

Rick Marken (960123.0800) --

Bruce Abbott (960123.1005 EST)

No, it wants to repeat the act. It doesn't know the difference between
repeating the act of striking the key and repeating the perception of
striking the key. Perception IS reality.

I assumed that you used the word "act" as we use it in PCT, to refer to
the unperceived (and necessarily variable) outputs that influence the state
of a controlled perception. A control system varies its acts to produce the
desired perception. I presume that a pigeon pecking at a key wants to repeat
the perception of a struck key; it does not want to repeat the acts that
produce this result on any particular occasion. The acts (variations in
muscle forces) that produce the perception of a struck key must be different
each time the pigeon strikes the key since it is in a slightly different
orientation with respect to the key each time; organisms repeat (control)
perceptions (ends); not acts (means).

Here we go again with vocabulary. I thought that the preferred descriptor
for behavioral output was _action_, not act. In my definition, an _act_ is
the (usually intended) perceptual consequence of behavior. For example,
pecking the key is an act (which can be accomplished by means of a variety
of somewhat different actions), and so is drawing a circle in the air. I
can do the latter in many ways and even against various disturbances if they
are not too severe. Why say that "I want to perceive my finger-tip moving
in a circular pattern in the air" when "I want to draw a circle in the air"
will communicate the same thing in ordinary English? I can not only perform the
act, I can perceive (through a variety of sensors) what I have done.

Let me see if I understand you correctly. You are asserting (sans evidence)
that the emergence of the correct output for control of incentive delivery
(reference for the pecking system, etc.) is unrelated to the appearance of
the incentive

Not quite. I am saying what Bill Powers (960123.0600 MST) said to you --
only more politely;-) Bill said:

Aw, Bruce! How about the theoretical question of what establishes
the reference for the incentive itself, turning an ordinary sensory
experience into a controlled variable?

That's because he apparently understood your initial statement that
"incentive leads to the establishment of this reference" the same way I did;
as a statement about the causal influence of an environmental (or sensory)
variable (the "incentive") on behavior (the reference). In fact, the variable
that conventional psychologists call an "incentive" or "reinforcer" is
actually a controlled variable; it doesn't "lead to" anything (it's part
of a closed loop); rather, the perceptual representation of that variable
is under control.

If that is the case, then neither of you understood my statement as I
intended. There is a difference between "causal influence" (read S-R) and a
relationship perceived by the organism and made use of by that organism in
its attempts to establish control of a perceptual representation over which
it currently lacks such control. I'm talking about the latter.

So your "important theoretical question" (about how the delivery of the
incentive leads to the establishment of this reference) is only important
from a reinforcement theory perspective. From a control theory perspective,
it's neither important nor meaningful.

Not true if you understand my statement as I intended it to be understood.
From a control theory perspective, it is both meaningful and important; not
only that, it is unresolved.

Or then again, perhaps you have the answer. Why is it that, having pecked
the key and perceived that grain becomes immediately available thereafter,
the pigeon, after eating from the hopper until the hopper drops out of
reach, returns to the key and pecks it again? Assume that, prior to the
conjunction of these perceptual events, the pigeon did not "know" how to
control the hopper, and thus the incentive.

Regards,

Bruce