[From Bruce Abbott (960122.2115 EST)]
Bill Powers (960122.0930 MST) --
Nice exposition of Killeen's forward organism equation, Bill. Some comments
follow.
(2)  B' = Bmax * aR / (aR + 1)
So this is the forward equation of the organism, showing how behavior
rate depends on reinforcement rate.
This isn't the form Killeen uses, but it's the same equation. It says
that at low reinforcement rates (aR << 1), the behavior rate is
proportional to the reinforcement rate, and at high reinforcement rates
(aR >> 1), B approaches Bmax.
Actually, Killeen _does_ present the equation in this form on some occasions
(using k for Bmax; see equation 3 in the 1995 _JEAB_ article). The product
aR is what Killeen defines as A, or "arousal," in the 1995 paper.
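
The two limiting cases of equation (2) can be checked numerically. This is a minimal Python sketch only: the function name behavior_rate and the values chosen for a, Bmax, and R are illustrative assumptions, not figures from Killeen's papers.

```python
def behavior_rate(R, a, b_max):
    """Equation (2): B' = Bmax * aR / (aR + 1)."""
    arousal = a * R  # the product aR, which Killeen calls A ("arousal")
    return b_max * arousal / (arousal + 1.0)

# Illustrative parameter values only (assumed, not taken from any data set).
a, b_max = 0.5, 100.0

# aR << 1: behavior rate is nearly proportional to R (about Bmax * a * R).
low = behavior_rate(0.001, a, b_max)
low_linear = b_max * a * 0.001

# aR >> 1: behavior rate approaches Bmax.
high = behavior_rate(1e6, a, b_max)

print(low, low_linear)
print(high, b_max)
```

With these assumed values, the low-rate result sits just under the linear approximation and the high-rate result sits just under Bmax, as the text describes.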
If we multiply the original relationship between B' and R by zeta, we
get
(2b)  B' = zeta * Bmax * aR / (aR + 1)
This, supposedly, is the final organism equation showing how behavior
rate depends on rate of reinforcement. However, this equation is a
function of N (which is part of zeta), so there is a different organism
equation for every different ratio: both B and R are functions of N.
Killeen says nothing about how the organism detects the value of N.
This is where all the stuff about some representation of the responses being
added into short-term memory comes into play. As each "response" occurs
(whether "target" or "other"), its representation is added to STM, and the
representations of previously experienced responses "decay" to some
fraction of their previous values. The current representation of the
"target" response is the sum of all these decaying representations. When a
target response is finally followed by an incentive, the current
representations of all responses (both target and other) are "associated"
with the incentive delivery according to the current strengths in memory.
The strength of this association is reflected in the "coupling" between the
incentive and the various responses, both target and nontarget. Killeen
says that the incentive delivery, by filling STM with representations of the
consummatory responses, tends to effectively remove the target and other
responses from STM so that their representations are essentially zero after
the incentive has been dealt with.
If the STM representations decay at rate beta and STM begins each new ratio
run essentially reset to zero, and if all responses in the behavior stream
are target responses, then the current representation of the target response
in memory after the jth response is M = 1 - (1-beta)^j, the sum of the
weights beta*(1-beta)^(j-i) over responses i = 1 to j. The coupling
between incentive and target responses is the value of M when the incentive
is delivered, which on the FR schedule is on the Nth response. There is
thus no requirement that the organism "detect" the value of N; the incentive
just happens on the Nth response (because of the schedule function) and the
state of the organism at that moment determines the coupling.
M is multiplied by rho, the proportion of responses that are target
responses, to give the steady-state value of zeta, the coupling coefficient.
By entering target response values as 1.0 and nontarget response values as
0.0, one can arrive at the value of zeta directly in the iterative version
of the equation for M, which is just our old buddy the leaky integrator:
M = beta*y + (1 - beta)*M,
where M is initialized at zero after each incentive delivery and y is either
1.0 or 0.0 depending on whether the current "response" is or is not a target
response.
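
The leaky-integrator update can be run directly over a stream of responses. The following is a minimal Python sketch, not Killeen's own procedure: the function name coupling, the value of beta, and the two example streams are assumptions made for illustration. Starting from M = 0 and feeding in only target responses (y = 1.0), the iteration gives 1 - (1 - beta)^j after j responses, which the sketch checks against the loop.

```python
def coupling(stream, beta):
    """Iterate M = beta*y + (1 - beta)*M over one ratio run.

    stream: sequence of 1.0 (target response) or 0.0 (nontarget response).
    M starts at zero, as if STM had just been cleared by an incentive.
    """
    m = 0.0
    for y in stream:
        m = beta * y + (1.0 - beta) * m
    return m

beta = 0.25  # assumed decay parameter, for illustration only
N = 5        # assumed ratio requirement

# Every response in the run is a target response.
m_iter = coupling([1.0] * N, beta)
m_closed = 1.0 - (1.0 - beta) ** N  # closed form of the same iteration

# Target responses interleaved with "other" responses; the run still ends
# on a target response, since that is what the incentive follows.
m_mixed = coupling([1.0, 0.0, 1.0, 0.0, 1.0], beta)

print(m_iter, m_closed, m_mixed)
```

In the mixed stream the nontarget responses contribute nothing but still push the earlier target representations toward zero, so the value at incentive delivery comes out lower than in the all-target run.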
The real problem for Killeen's scheme is a familiar one: defining
"responses." Killeen tells us how to compute the "representation" of the
response but does not really tell us what a "response" is. He seems to view
behavior as a continuous stream that may or may not include the instrumental
act ("target" response) but does not tell us how to parse that stream. And
like most behaviorists he seems to take the response as some repeatable (and
repeated) movement or sequence of movements, apparently unaware of the
difficulties variable environmental disturbances pose for such an analysis.
So long as one is dealing with a repeating _consequence_ of varying
behavioral acts, one can treat all acts having this consequence as instances
of the "same" target response, perhaps even narrowing the definition a bit
by observing the minimal repeating acts required to produce the consequence.
Thus, once the pigeon is standing within a comfortable striking distance of
the key, "responses" can be defined as the cycle of forward head-thrust,
leading to the striking of the key with sufficient force to close the
switch, and the return movement that carries the head back to its initial
position. But this analysis fails to explain why the pigeon moves to the
key in the first place, or why it would use a different set of muscles to
strike the key from a different angle if some partial obstruction were
placed between it and the key.
The basic notion behind Killeen's scheme is that organisms will tend to
repeat whatever they were doing at the time an incentive suddenly appears,
with the "whatever" being encoded as a series of act memories which are
rapidly fading as new acts enter STM. The notion that they will "do again"
that which they "did before" seems a reasonable one, but
"that-which-is-done" needs to be redefined in terms of reference signals of
control systems at appropriate levels. The pigeon is pecking the key, not
because some incentive has forced it to emit a forward-thrust of the head,
but because it wants to repeat the act of striking the key with its beak.
An important theoretical question to resolve is how the delivery of the
incentive leads to the establishment of this reference.
Regards,
Bruce