[From Bruce Abbott (2000.11.25.1125 EST)]
Bill Powers (2000.11.25.0116 MST) --
I see you're receiving congratulations on your reply even before I've had a
chance to respond to it. That's disappointing, as it shows that at least
some CSGneters have their minds already made up and have taken up
cheerleading rather than serious consideration of my argument. But for
those who have managed to stay open to the argument, here is my return salvo:
Bruce Abbott (2000.11.24.1410 EST)
... you need to be made aware that you really don't
understand how EABers think of reinforcement.
I'm basing my comments on Fred's explanation. Perhaps you would want to
modify it, but here it is:
Fred explained the system's operation as follows. The little bursts of heat
produced when the coal dust ignited, he said, were serving as reinforcers
for relay action. However, these reinforcers became effective only after the
house had cooled below a particular value; cooling the house thus served as
an "establishing operation" for reinforcer effectiveness. The system
generates responses, Fred stated, in the form of the pulses sent to the
currently selected injector. Those responses repeat as long as they
continue to be reinforced (by heat flashes) at least once every few
responses. However, if reinforcement is not delivered after a number of
responses, this behavior extinguishes and the system begins to vary its
actions (the stepper operates), selecting new responses (commanding
different injectors) until reinforcement once again occurs. As the room
heats up, the reinforcers lose their effectiveness (satiation) and
responding stops.
These are theoretical interpretations interspersed with observations of
what happens. The use of terms like "response," "reinforcement," and
"extinction" introduce Fred's theory about what is happening. Let's see if
we can separate the interpretation from the description.
What is observed is that activations of the igniter (not "responses")
generate heat flashes. We observe a physical mechanism by which the puffs
of coal dust and electric sparks cause a flash of heat. These flashes,
repeated, are seen to warm the house. When the temperature of the house
rises to some specific level, the activation of the igniter ceases. If the
igniter, when repeatedly activated, fails to cause a flash, nothing changes
immediately, but if the flash fails to occur for some time, causing the
room temperature to drop significantly, a new behavior occurs: new injector
nozzles (and igniters, I guess) are rotated into place. This behavior
continues until the house temperature begins to rise again because a
working nozzle is in place. Then we revert to the previous behavior: a
(small) decrease in temperature results in activations of the igniter, and
an increase is followed by cessation of the repeated activations.
I believe this is an unbiased description of what is observed, favoring
neither reinforcement theory nor control theory. You may want to correct it
if I have misrepresented your intention, but I'll go on.
I agree that this is a description of what Fred would have observed just by
watching the system in action. But Fred did not arrive at his description
of the system's operation just by watching the system work. (Remember, I
said that Fred was something of a tinkerer himself.) This is extremely
important (see below).
Now Fred's embellishments of this bare description:
1. Something is "responding" to some unnamed stimulus by sending impulses
to the igniter. This unnamed stimulus is not itself observable; all we
observe is that there are impulses sent to the igniter, and that they cause
flashes of heat.
Fred did not like the term "response" because it implied a stimulus to
produce it. I should have used Fred's substitute term "operant." That term
defines behaviors in terms of their observable consequences in the
environment, and does not suggest a stimulus cause. (I'll substitute
"operant" for "response" in the remainder of this post.) The operant in
this analogy is sending an impulse to a particular injector.
2. The fact that the impulses continue to be emitted is interpreted as
meaning that the flashes of heat are "reinforcing" the responses to the
unnamed stimuli. This reinforcing action can't be observed; all we see is
that the igniter continues to operate.
Why do you assume that Fred was not allowed to tinker with the machine --
that he must simply observe the system during normal operation, without
being allowed to perform any tests that might help him to discover the
system's basic operating principles?
A careful experimenter, Fred tried covering the sensor in the firebox.
After a few more injection cycles, the system stopped firing the currently
selected injector, the stepper operated, and impulses were sent to the new
injector. After a few more impulses were delivered to the new injector, the
stepper operated again -- and this continued as long as the sensor was
covered. As soon as the sensor was uncovered and a flash occurred, this
reselection stopped, and the currently selected injector was repeatedly sent
impulses. Fred concluded that receiving the heat-flash stimulus at least
once following every few impulses is a necessary condition for maintaining
the current operant -- sending impulses to the presently selected injector.
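Fred's test is simple enough to sketch in a few lines of Python. The names
and numbers below are my own invention, purely illustrative of the
contingency he demonstrated, not a diagram of the furnace's actual wiring:

```python
def run_cycles(n_cycles, sensor_covered, working=2, n_injectors=4):
    """Sketch of the sensor-covering test: a flash is sensed only when
    the working injector is selected AND the firebox sensor is
    uncovered.  Three consecutive unsensed impulses operate the
    stepper, selecting the next injector."""
    injector = 0      # currently selected injector
    misses = 0        # impulses since the last sensed flash
    steps = 0         # number of times the stepper has operated
    for _ in range(n_cycles):
        flash_sensed = (injector == working) and not sensor_covered
        if flash_sensed:
            misses = 0                    # current selection maintained
        else:
            misses += 1
            if misses == 3:               # no flash within a few impulses
                injector = (injector + 1) % n_injectors  # reselect
                misses = 0
                steps += 1
    return injector, steps
```

With the sensor covered, the stepper keeps operating for the whole run; with
it uncovered, it steps only until the working injector is selected, after
which that selection holds -- which is exactly what Fred observed.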
3. When the nozzle quits working, the "responding" is no longer
"reinforced," so it "extinguishes." This does not imply, however, that
impulses cease to be sent to the igniter. If they did, changing nozzles
would do no good: there would never be another flash of heat even with a
working nozzle in place.
Let's not destroy the analogy by redefining what Fred called operants.
Operants are impulses to a particular injector, not just impulses, period.
Sending impulses to a particular injector ceased. If you extinguish a
flame, it ceases to exist. If you stop sending impulses to an injector,
impulses to that injector cease to exist. The term "extinction" simply
denotes what is observed -- the frequency of the operant declines (in this
case to zero).
4. When reinforcement ceases, "... the system begins to vary its
actions (the stepper operates), selecting new responses (commanding
different injectors) until reinforcement once again occurs." Of course it
is not reinforcement that is observed to occur, but flashes of heat. Their
reinforcing effect is imagined.
Not so! The reinforcing effect is not imagined; it is observed. Fred
demonstrated that the flashes of heat had to be received by the sensor or
the current operant would cease to occur. When those flashes reappeared,
the currently selected operant continued to be repeated. This is a matter
of observation, not imagination.
5. Note that since the emission of impulses must continue if a flash of
heat is to occur, what is reinforced becomes uncertain. The only observable
result of reappearance of flashes of heat is that new nozzles cease to be
rotated into place. Emission of impulses must already have been occurring,
and it continues as long as the room temperature is low enough.
Again, impulses are not analogous to operants in this story. Impulses are
just activity that can operate on the environment in various ways: press
lever A, press lever B, touch nose to upper left front corner of chamber.
What is reinforced is whatever operant is followed by the flash. There is
nothing uncertain about it.
6. "As the room heats up, reinforcers lose their effectiveness (satiation)
and responding stops." Satiation is introduced as an explanation of why the
igniter ceases to be operated when the temperature rises to a certain
level. What is observed is not a loss of effectiveness of the reinforcers,
of course, but cessation of igniter operations.
Flashes following operants no longer keep the current operant selected.
Here my analogy breaks down a bit as a real organism is more complicated
than the furnace analogy would suggest. In the real organism, once
sufficient food had been taken in, the delivery of food pellets for
lever-pressing would no longer keep the rat returning to the lever, and
other activities with different consequences would take the place of
lever-pressing. One would observe that this particular consequence was no
longer keeping lever-press behavior going, and that this loss of the
consequence's effectiveness is related to the amount of food the animal has
consumed.
In the furnace analogy, the system only has a limited "repertoire"
consisting of sending impulses to one or another of several injectors, so it
has no choice but to continue selecting alternative routes to the same
environmental consequence (heat flashes), which is not the case with the
real organism.
I think it's clear that Fred's description of what is happening is not a
description, but a theory that invokes unobservable factors and influences
(as all theories do). A purely factual description does not require the use
of terms like reinforcement, response, and extinction: ordinary English
will do it (for an English-speaker).
If all you have been arguing about is whether there is such a thing as
theory-free observation, we could have saved everyone's time. No, there
isn't. But what happened to trying to understand how the terms of Fred's
explanation relate to the system's components and their operation? That was
supposed to be the topic under discussion. I'm not going to let you wriggle
out of it by diverting the debate to a different topic.
Come on, Bill, it isn't all that difficult. The model I've presented could
hardly be clearer in demonstrating how the consequences of certain behaviors
relate to the selection of those behaviors for repetition over others, and to
their continued repetition (maintenance) so long as those consequences
continue to follow them and continue to be desired. And guess what, the
process I've
described comes down to two control loops, one nested inside the other.
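For the skeptical, here is a minimal sketch of those two loops in Python.
Every constant and name is invented for illustration; the point is only the
structure, an outer loop whose error enables the inner one:

```python
def furnace(cycles, working=2, n_injectors=4, setpoint=20.0):
    """Two nested control loops (a sketch with made-up numbers).
    Outer loop: compares house temperature with the setpoint; its
    error enables firing.  Inner loop: compares sensed flashes with
    the reference 'a flash within every few impulses'; its error
    operates the stepper, selecting a different injector."""
    temp = setpoint - 5.0     # house starts cold
    injector = 0              # currently selected injector
    misses = 0                # impulses since the last flash
    for _ in range(cycles):
        if temp >= setpoint:            # outer loop: no error, no firing
            temp -= 0.1                 # house slowly cools
            continue
        flash = (injector == working)   # impulse sent; flash iff nozzle works
        if flash:
            temp += 0.5                 # each flash warms the house
            misses = 0                  # inner loop: no error
        else:
            temp -= 0.1
            misses += 1
            if misses == 3:             # inner-loop error: operate stepper
                injector = (injector + 1) % n_injectors
                misses = 0
    return temp, injector
```

Run it for a few hundred cycles and the system finds the working injector and
then holds the temperature near the setpoint, each loop correcting its own
error -- the behavior we have been arguing about all along.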
Bruce A.