Feeding finches

[From Bruce Abbott (2000.12.02.1200 EST)]

Bill Powers (2000.11.02.0351 MST) --

Bruce Abbott (2000.12.01.1035 EST)

Nobody has made the claim that all operants in
the same functional class are equally reinforced when one of them is.
(Sheesh!) Membership in the same functional class means they produce the
same end (e.g. they all get the lever down). If you arrange to reinforce a
lever-press, then any act that accomplishes this will be reinforced, but of
course only the act that actually occurs will be marked for repetition.

Double sheesh. What does it mean to speak of "all operants in the same
functional class?" I thought an operant _is_ a functional class, the class
of all actions that have the same effect. Is there some super-class made up
of operants, each operant being a class of actions having the same effect?

_Double_ sheesh? Ouch!

I assume that "reinforcing a lever press" is shorthand for a more precise
meaning, such as "reinforcing the tendency of the organism to do something
that causes the lever to be pressed." You EABers have got to learn how to
talk intelligibly with the outside world.

Well, I'm _trying_ to, if that's any consolation . . .

Perhaps an analogy will help. Consider the "bird feeder." It's called a
bird feeder because it contains food (e.g., seeds) that many birds like and
has other features designed to make it easy, safe, and attractive for birds
to use. There exists a class of birds that this particular sort of feeder
will attract and feed from -- say, finches. Functionally, then, this is a
finch feeder, not a robin feeder or a pelican feeder or an eagle feeder.
Any finch that comes to it will find an attractive meal there. What the
feeder attracts is a class of birds called "finches." When we talk about
the feeder, one finch is functionally the same as any other -- a member of
the class of birds that will use this feeder.

Obviously this does not imply that feeding one finch at the feeder feeds all
members of the class. If we want to talk about why one finch is feeding
there and not some others, then we have to switch our focus from the general
class "finch" to specific individual members. Some have flown over the
feeder, spotted it, and come to investigate. Finding suitable food there,
they now return there when they are hungry. Others have not spotted the
feeder and continue on their ways to other places, even when hungry.

So if you want to talk about what sorts of individuals will be attracted to
the feeder, then all individuals of the class "finch" are functionally the
same. If you want to talk about what individuals from within that class are
actually feeding there, then you have to treat individual members of that
same general class as different. Thus, in the operant conditioning study,
all acts that get the lever pressed are functionally the same operant when
you are talking about what these acts have in common that they accomplish,
but they must be treated as different operants when your focus is on
determining why certain individual members of the class, as opposed to
others, become the dominant ones the rat actually uses to accomplish the act.

Bruce A.

[From Bill Powers (2000.12.03.0330 MST)]

Bruce Abbott (2000.12.02.1200 EST)--

So if you want to talk about what sorts of individuals will be attracted to
the feeder, then all individuals of the class "finch" are functionally the
same.

If you want to talk about what sorts of individuals will be attracted to
the feeder, you've gone back to Aristotle. Informal language like this
(even though it once represented the pinnacle of technical understanding)
is of no use in a modern scientific discussion (I'm not averse to the other
kinds of discussion, but I like to keep them distinct from each other).
Appearances often deceive, and while it may appear that a container of
bird-seed exerts some sort of force on birds bringing them toward it, and
then feeds the seed to them, it is much more likely that the birds exert
the forces on their own bodies with their own wings, and that (relatively
speaking) they use the wings to bring bird seed, wherever they find it,
close enough to be eaten. If they didn't actively eat the food, the
"feeder" wouldn't feed it to them, would it? In informal language, we often
say things we don't mean.

If you want to talk about what individuals from within that class are
actually feeding there, then you have to treat individual members of that
same general class as different. Thus, in the operant conditioning study,
all acts that get the lever pressed are functionally the same operant when
you are talking about what these acts have in common that they accomplish,
but they must be treated as different operants when your focus is on
determining why certain individual members of the class, as opposed to
others, become the dominant ones the rat actually uses to accomplish the act.

The appearance is that the rat's behavior becomes focused on the
manipulandum after and because it produces a food reward. That appearance
is created because that is where the rat had focused its behavior just
before the changes in the focus of its behavior ceased. Given this
appearance, it is not unreasonable, although it's incorrect, to assume that
the food somehow did something to the rat to make it repeat the behavior
that produced the food, interrupting or superseding the behavior that
otherwise would have occurred next. This is simply a figure-ground mistake,
just as it would be a mistake to assume that the little man sawing wood on
the lawn ornament is making the windmill go around and thus causing the
wind, or that the reaction force from the driving wheels of a car is making
the engine turn. It reflects a mistake in the assumption of a mechanism
(whether explicit or implicit).

It's especially easy to make a mistake in assigning causation when a
closed-loop system is involved. In a closed-loop control system, the only
true causes (independent variables) are disturbances d and reference
signals r; both the input quantity and the output quantity are functions of
d and r and thus are dependent variables. When you solve the control
equations for the input quantity, the output quantity drops out, leaving
only d and r as variables. Ditto when the equations are solved for the
output quantity: the input quantity drops out. This encourages the
classical "third party" error: assuming a cause-effect relationship between
A and B when in fact both are being caused by C (or jointly by C, D, ...).
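
In equation form (a minimal sketch using the standard control-loop
algebra, with loop gain G and a constant feedback factor f; the
particular symbols are illustrative):

   qi = d + f*qo           (environment: input = disturbance plus fed-back output)
   qo = G*(r - qi)         (organism: output driven by error)

   Solving for qi:   qi = (d + f*G*r)/(1 + f*G)
   Solving for qo:   qo = G*(r - d)/(1 + f*G)

Only d and r remain on the right-hand sides: qo has dropped out of the
first solution and qi out of the second, which is why treating the
input as the cause of the output (or vice versa) commits the
third-party error.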

The dominant behavior that a rat ends up using is the first one that works
to get it what it wants (that is, that it has a reference level for and
does not already have). Given that the rat is actively seeking to get the
so-called reinforcer, and that its search is driven only by the lack of it,
it is not surprising that the search ceases when any source of the item is
found. Nor is it surprising, in a cage where there is only one source, that
the search ends in the vicinity of the food cup, or that it entails
performing the only action that will make food appear in the cup. The
animal would not stop its search for food until its actions made food appear.

Bruce, I know that you prefer the control-system explanation of behavior,
but what is hanging us up here is your insistence that the behaviorist's
interpretation is simply an objective report of what anyone would see
happening. It is not: it is a description of a subjective impression of
causality.

Look at what you said (speaking for behaviorists) about the language I used
a few posts back: that it smacked of "mentalism" and "spookiness." Why did
it seem to do that? Because it assumed _internal_ causes of behavior,
causes that can't be directly observed. And what is the language that does
_not_ speak of internal causation? It's the language of external causation,
which Skinner insisted upon for most of his life.

Skinner had the mistaken idea that a real scientist sticks strictly to what
he can observe; because of this, he sneered at anyone who spoke of inner
causes of behavior, saying that such behavior would be unpredictable and
"capricious," not subject to natural law. In his ignorance of how real
science works, Skinner branded as unscientific and superstitious anyone who
proposed an internal cause of behavior, or any sort of "intervening
variable" such as a model of how behavior works. If Skinner had actually
been trained as a scientist -- say, a physicist or a chemist -- he would
have realized that all the successful sciences propose and test models of
what cannot be observed. And he would have realized that appearances often
deceive; without a systematic model, human intuition is untrustworthy. But
of all possibilities, that is the very last one that would have occurred to
Skinner: that he, a trained observer, could misinterpret what was before
his very eyes. Skinner was not prepared to admit that _all_ his
observations were subjective and thus fallible. If any one attitude showed
that he did not understand science, that was it.

We're going to have a disagreement as long as you find it reasonable to see
reinforcers as acting on the organism in some way to make more likely the
behavior that produced them. As long as that seems to you to be merely a
statement of an objective observation, you will have to continue defending
behaviorism.

Here's something to think about. Why would we need the concept of the
operant, if not to try to explain why an animal will sometimes switch to a
different means of producing an effect like a key press? That would be hard
to explain in terms of a pure reinforcement theory unless one somehow could
find a connection between the two actions, so that performing one of them
somehow made performing the other more likely. But that could occur only if
the reinforcer reinforced _all_ the actions capable of causing the key
press, even those which hadn't yet been tried. There's a logical
contradiction in here somewhere.

Best,

Bill P.


[From Bruce Abbott (2000.12.05.2035 EST)]

Bill Powers (2000.12.03.0330 MST) --

Bruce Abbott (2000.12.02.1200 EST)

So if you want to talk about what sorts of individuals will be attracted to
the feeder, then all individuals of the class "finch" are functionally the
same.

If you want to talk about what sorts of individuals will be attracted to
the feeder, you've gone back to Aristotle. Informal language like this
(even though it once represented the pinnacle of technical understanding)
is of no use in a modern scientific discussion (I'm not averse to the other
kinds of discussion, but I like to keep them distinct from each other).
Appearances often deceive, and while it may appear that a container of
bird-seed exerts some sort of force on birds bringing them toward it, and
then feeds the seed to them, it is much more likely that the birds exert
the forces on their own bodies with their own wings, and that (relatively
speaking) they use the wings to bring bird seed, wherever they find it,
close enough to be eaten. If they didn't actively eat the food, the
"feeder" wouldn't feed it to them, would it? In informal language, we often
say things we don't mean.

Judging from your reply, I'm afraid you've missed the point of the analogy.

I offered this analogy because you and Rick had made the claim that I was
being inconsistent in my definition of an operant. At one time I had said
that all lever-presses are functionally the same operant (because they
produce the same consequence). You and Rick concluded from this statement
that when one means of pressing the lever is reinforced, the probability of
all means of pressing the lever should be equally affected. My apparent
inconsistency was noted when I objected to this conclusion, now claiming
that these different means of pressing the lever should be considered
different operants.

The bird-feeder analogy shows why there's nothing inconsistent about my
definitions: On the one hand, if you're going to reinforce lever-pressing,
then any act that gets the switch on the lever closed will be counted as the
same operant. All will be treated as functionally equivalent by the
apparatus: all will produce the reinforcer. On the other hand, if you ask
which of that class of acts that gets the lever pressed will actually have
its future rate of occurrence changed by reinforcement, then one has to
distinguish those members of the class that have actually occurred and been
reinforced from those that have not. (That lesson holds even if you rewrite
the story to eliminate the word "attract," which was the focus of your
objection to it.)

I had really expected you to agree, after seeing the logic laid out for you,
that it is not really inconsistent to categorize different acts as the "same"
operant when focusing on their common result (e.g., lever-depression), but
as "different" operants when focusing on which of those acts is actually
getting reinforced. If you said anything at all about that issue, I'm
afraid I missed it in your reply. May I have that agreement now?

Bruce A.

[From Bruce Gregory (2000.12.05.2055)]

Bruce Abbott (2000.12.05.2035 EST) --

I had really expected you to agree, after seeing the logic laid out for you,
that it is not really inconsistent to categorize different acts as the "same"
operant when focusing on their common result (e.g., lever-depression), but
as "different" operants when focusing on which of those acts is actually
getting reinforced. If you said anything at all about that issue, I'm
afraid I missed it in your reply. May I have that agreement now?

May I assume that you agree that this approach predicts that a single way
of accomplishing any outcome will emerge since it will be consistently
reinforced? In other words, behavior will become stereotyped.

BG

[From Bruce Abbott (2000.12.05.2215 EST)]

Bill Powers (2000.12.03.0330 MST) --

The appearance is that the rat's behavior becomes focused on the
manipulandum after and because it produces a food reward. That appearance
is created because that is where the rat had focused its behavior just
before the changes in the focus of its behavior ceased. Given this
appearance, it is not unreasonable, although it's incorrect, to assume that
the food somehow did something to the rat to make it repeat the behavior
that produced the food, interrupting or superseding the behavior that
otherwise would have occurred next. This is simply a figure-ground mistake,
just as it would be a mistake to assume that the little man sawing wood on
the lawn ornament is making the windmill go around and thus causing the
wind, or that the reaction force from the driving wheels of a car is making
the engine turn. It reflects a mistake in the assumption of a mechanism
(whether explicit or implicit).

I disagree. The mistake being made is in the assumption that, when EABers
attribute the observed changes in behavior to the occurrence of the food
pellet, they mean by this that the pellet plays an active causal role in
changing the behavior. The pellet of food is just a pellet of food. It
can't do anything but lie there. However, observations reveal the following:

1. Initially, when a bit of the rat's behavior (depressing the lever) did
_not_ produce a food pellet, the rate at which that act occurred remained low.

2. When it was then arranged that the same bit of behavior now produced a
food pellet each time it occurred, the rate at which that act occurred
increased over the rate observed in (1).

3. When it was then arranged that the same bit of behavior once again did
_not_ produce a food pellet, the rate at which that act occurred diminished
to the level observed under (1).
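
In summary:

   Condition                         Rate of lever-pressing
   (1) press -> no pellet            low
   (2) press -> pellet               increased over (1)
   (3) press -> no pellet again      back down to the level of (1)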

Because nothing else changed except whether lever-pressing produced a pellet
or not, and because these same changes occur reliably every time these tests
are performed, it can be inferred with a high degree of confidence that it
is the delivery of the food pellet as a consequence of a lever-press that is
responsible for the observed increase in the rate of lever-pressing in (2),
and that the loss of this consequence is responsible for the observed
decrease in the rate of lever-pressing in (3). This can be stated with
confidence because nothing else has changed between conditions that could
explain the change in rate of lever-pressing. We can thus conclude that the
increased rate of lever-pressing in (2) relative to (1) and (3) is due to
the fact that in (2) the lever-press produces pellets, whereas in (1) and
(3) it does not.

There is no question here about whether the little man sawing wood is making
the windmill turn or vice versa. It is true that pressing the lever is what
produces food pellets. That is mechanical cause and effect. But the
demonstration shows that the rate of lever-pressing observed in (2) is
higher than the rates observed in (1) and (3) because only in (2) does
lever-pressing produce food pellets. This effect of the contingency between
lever-pressing and food-pellet delivery on response rate is termed
"reinforcement" and if delivering food-pellets for lever-presses has this
effect on lever-presses, then the pellets are said to serve as
"reinforcers." There is no assumed mechanical cause-and-effect in which the
pellet, through some sort of magical powers, forces the rat to emit a
lever-press in the way that the little man might make wind by sawing.
Instead, the term "reinforcement" simply refers to the observation that when
the rat's lever-presses do produce food pellets, the rate of lever-pressing
increases and remains higher than when lever-presses do not produce food
pellets.

Observation reveals that food pellets do not always serve as "reinforcers."
Providing them as a consequence of lever-pressing will not always yield an
increase in the rate of lever-pressing. Their effectiveness has been shown
to depend on a variety of conditions, including the level of food
deprivation, the health of the rat, and certain qualities of the pellets
themselves. One can easily determine experimentally what these conditions
are by varying conditions and observing how much the rate of responding
increases (if at all) when lever-pressing produces the pellets.
"Reinforcingness" is NOT a property of the pellet, but a joint function of
the properties of the pellet (taste, smell, texture, etc.), the
physiological state of the rat (food-deprived, not food-deprived, etc.), and
the availability of alternatives (e.g., an alternate source of food).

If the rat is a living control system (as I believe it to be), then there
must be something about the way this control system is organized that
produces these observed relationships, including the relationship between
lever-press-contingent pellet-delivery and the rate of lever-pressing that
has been labeled as "reinforcement." Instead of arguing about which account
is right, control theory or reinforcement theory, we should be trying to
develop a control model that duplicates in detail the changes in behavior
that are observed between conditions in which a given behavior produces a
food pellet and those in which it doesn't.

I'm only speaking for myself. If there are EABers out there who think that
reinforcement involves something else (e.g., linear cause-effect), then I
say nail 'em. But first we need a model that demonstrates how a living
control-system generates those changes in behavior referred to as
"reinforcement," "extinction," etc. when exposed to the changes in
contingency under which those effects are demonstrated. Thus far we haven't
got one.

Bruce A.

[From Bruce Abbott (2000.12.05.2240 EST)]

Bruce Gregory (2000.12.05.2055) --

Bruce Abbott (2000.12.05.2035 EST)

I had really expected you to agree, after seeing the logic laid out for you,
that it is not really inconsistent to categorize different acts as the "same"
operant when focusing on their common result (e.g., lever-depression), but
as "different" operants when focusing on which of those acts is actually
getting reinforced. If you said anything at all about that issue, I'm
afraid I missed it in your reply. May I have that agreement now?

May I assume that you agree that this approach predicts that a single way
of accomplishing any outcome will emerge since it will be consistently
reinforced? In other words, behavior will become stereotyped.

Yes, except that there are always little variations in behavior due to
different starting positions etc., and these may result in a small pool of
minor variants being supported rather than one single behavior. Also, when
behavior ceases to be reinforced, variation increases. This is adaptive as
it increases the chances that a variant will be found that reestablishes
control.

We also must be careful to define what we mean by a "single way." I
wouldn't imagine it to be a particular set of muscle contractions unless we
specifically reinforced a specific set of muscle contractions independently
of the rat's position or other sensory inputs. Rather I would expect it to
include sensory feedback from the behavior. In theory, what the rat is
reproducing is not a set of movements per se but rather the act it perceives
it was doing at the time of reinforcement. That perception depends on
sensory input from all sorts of receptors including those for touch.

Bruce A.

[From Bill Powers (2000.12.06.0241 MST)]

Bruce Abbott (2000.12.05.2035 EST)--

I offered this analogy because you and Rick had made the claim that I was
being inconsistent in my definition of an operant. ...

The bird-feeder analogy shows why there's nothing inconsistent about my
definitions: On the one hand, if you're going to reinforce lever-pressing,
then any act that gets the switch on the lever closed will be counted as the
same operant. All will be treated as functionally equivalent by the
apparatus: all will produce the reinforcer. On the other hand, if you ask
which of that class of acts that gets the lever pressed will actually have
its future rate of occurrence changed by reinforcement, then one has to
distinguish those members of the class that have actually occurred and been
reinforced from those that have not. (That lesson holds even if you rewrite
the story to eliminate the word "attract," which was the focus of your
objection to it.)

How you class the acts that get the lever pressed is of no significance;
that is merely how the eye of the beholder interprets things. What you're
saying is that only the action that actually gets the lever pressed is
reinforced, isn't it? There can be no physical significance to acts (not
performed) that could have (but didn't) result in a key press.

I had really expected you to agree, after seeing the logic laid out for you,
that it is not really inconsistent to categorize different acts as the "same"
operant when focusing on their common result (e.g., lever-depression), but
as "different" operants when focusing on which of those acts is actually
getting reinforced. If you said anything at all about that issue, I'm
afraid I missed it in your reply. May I have that agreement now?

The problem here is that your definition shifts back and forth between
considering an operant as being a class of actions and considering an
operant as being a class of operants. You've been using the word "operant"
both ways. Why not simply say that whatever action is actually performed
and produces the critical consequence is reinforced? Why do we need the
concept of the operant at all?

I had thought I understood why: it was because different actions are often
used by an organism to produce the same result, for example a key press.
Unless each action that is used can be shown to be explicitly reinforced
and selected by an explicit discriminative stimulus, there is no way to
explain this phenomenon using naive reinforcement theory (understand,
please, that I am using EAB language to explore its own logic here without
necessarily accepting it as valid). How can reinforcing one action increase
the probability that a _different_ action will be emitted? I had thought
Skinner invented the concept of the operant to allow for the fact that such
effects are observed; reinforcement hardly ever leads to repetition of
_exactly_ the same action (and sometimes not even approximately the same
action). Skinner was saying that it is not the specific motor action that
gets reinforced, but the _operant_, the class of all motor actions that
could depress the key or more generally initiate the production of a
reinforcer. So delivering reinforcement after a key press makes _any_
action that can depress the key more likely. If that's not true, it becomes
pretty hard to say just what it is that is reinforced. If it's not true,
then every action that produces the same effect must have been individually
reinforced in the past, and that would be extremely hard to demonstrate.

In case I didn't say it clearly: no, I don't agree that your use of
"operant" is self-consistent.

I'm glad that you said that your purpose is to show that the language of
behaviorism is consistent with control theory. I think you need to
contemplate the significance of that statement. As long as this is your
purpose, I predict that you will succeed in achieving it -- quite
independently of whether such consistency actually exists. Among the
"operants" you will use to demonstrate to yourself this consistency will be
your own sense of conviction of the truth of what you say. But your
personal conviction of truth is not allowed to be a factor in scientific
discourse, for the very reason that it is under your control and not
subject to any external constraints. Just consider the sorts of things that
people are convinced are true. If they can convince themselves of what you
and I would consider far-fetched beliefs, then believe me; so can you and
I. That is why mere personal belief is not considered sufficient to prove
anything in science.

In my opinion, science can't exist where the terms of language are not
crisply and uniquely defined, and where the rules of logic are not equally
explicit and commonly accepted. The rules of grammar and the informal
customs of word usage are not sufficiently disciplined to serve scientific
discourse. At the very least we must devise a technical language in which
the terms always have the same explicit meanings, so we can't shift the
meanings by changes of emphasis, context, or point of view.

If you're going to use the term "operant" in making propositions, then I
must insist that you offer a single clear definition of the term that is
the same in all circumstances, and stick to it. For example, you could
define it as "the set of all physical actions that could in some
circumstances produce a given effect in the environment." Or it could be
"the set of all physical actions that have in the past produced a given
effect in the environment." I hope you will agree that these are two
completely different definitions that are not interchangeable.

Making clear definitions and sticking to them is the first step toward a
mathematical statement of a theory. I hope that's where we're headed.

Best,

Bill P.

[From Chris Cherpas (2000.12.06.0340 PT)]

Bruce Abbott (2000.12.05.2215 EST)--

I'm only speaking for myself. If there are EABers out there who think that
reinforcement involves something else (e.g., linear cause-effect), then I
say nail 'em. But first we need a model that demonstrates how a living
control-system generates those changes in behavior referred to as
"reinforcement," "extinction," etc. when exposed to the changes in
contingency under which those effects are demonstrated. Thus far we haven't
got one.

Let's back up a bit and move the free-feeding condition
into the test chamber. Imagine we have a hole in one wall
of the chamber with food freely available. We also have
a platform in front of the food trough that detects whenever
the rat is standing in front of the trough. (You can probably
think of a better arrangement for rats, but I'm so pigeon-biased
that I naturally think in terms of troughs -- aka "hoppers.")

In the 1st contingency phase, we make the platform
rise slightly whenever the rat is not on it, but at
an angle so that it's raised only on the side touching
the food hole. So, now, whenever the rat is off the
platform, the platform appears as a ramp, leading up
to the wall, somewhat covering the food hole. When the
rat reaches the end of the ramp, it lowers so that the
food is not obscured in any way.

So far, there is virtually no difference from the
free-feeding condition in the rat's ability to get food --
no interruption in eating relative to the previous condition.

Next, we continue to "shape the environment" so that
in addition to the platform being raised, a pedal or
bar, hinged in the platform, protrudes, so that the
rat now presses on the bar as it reaches the end of
the ramp, and this lowers the ramp -- again, making
the food readily available. While there is now a
bar-press contingency, we see little change in the rat's
approaching and eating, as this bar-press has been
introduced so gradually, and is almost indistinguishable
from simply taking the last step before reaching the
food hole.

At this point we may see the rat just hanging at
the end of the ramp, so it never lifts up, but, eventually,
especially if water is provided at the other end of the
chamber, the rat will move to other areas and will
simply walk up the ramp and press the bar every time
it's ready to feed.

I'm not going to say any more at this point, other than
to ask for consideration of the notion of shaping the
environment, so that the rat is never required to reorganize
to get the food, and that, if we continue adding conditions
in this manner, we may see "reinforcement" as the rat
compensating for interruptions/disturbances to having
food freely available.

Best regards,
cc

[From Bill Powers (2000.12.16.0345 MST)]

Bruce Abbott (2000.12.05.2215 EST)--

... observations reveal the following:

I want to demonstrate to you that observations do no such thing. You
interpret the observations in a particular way. They can be interpreted in
other ways you do not mention. I will show another way for each point you
make, not because I defend it, but just to show that more than one
interpretation is possible.

1. Initially, when a bit of the rat's behavior (depressing the lever) did
_not_ produce a food pellet, the rate at which that act occurred remained low.

A. Initially, when pressing the lever did not produce food, the rat moved
about the cage, sniffing, licking, and scratching at things including the
lever. The rate of lever depressions remained low because most of the rat's
actions that could depress the lever were being applied elsewhere.

2. When it was then arranged that the same bit of behavior now produced a
food pellet each time it occurred, the rate at which that act occurred
increased over the rate observed in (1).

B. When depression of the lever produced a food pellet each time it
occurred, the animal would pause and eat the food, and spend slightly more
time in the vicinity of the lever before moving on. As the animal spent
increasing amounts of time in the vicinity of the lever, the rate at which
the lever was depressed increased (over the rate observed in (1)), which
caused the rate of food delivery to increase.

3. When it was then arranged that the same bit of behavior once again did
_not_ produce a food pellet, the rate at which that act occurred diminished
to the level observed under (1).

C. When pressing the lever no longer produced food, the animal changed its
behavior, first to an increase in lever pressing (you left that out), and
then to an increasingly variable pattern until it was back to the original
pattern of sniffing around the cage and scratching and licking at things.
After the initial rise in lever pressing, the decreasing amount of time the
animal spent at the lever resulted in a lower and lower number of presses
until the lever was pressed only when the animal happened to be moving
around near it.

Because nothing else changed except whether lever-pressing produced a pellet
or not, and because these same changes occur reliably every time these tests
are performed, it can be inferred with a high degree of confidence that it
is the delivery of the food pellet as a consequence of a lever-press that is
responsible for the observed increase in the rate of lever-pressing in (2),
and that the loss of this consequence is responsible for the observed
decrease in the rate of lever-pressing in (3). This can be stated with
confidence because nothing else has changed between conditions that could
explain the change in rate of lever-pressing. We can thus conclude that the
increased rate of lever-pressing in (2) relative to (1) and (3) is due to
the fact that in (2) the lever-press produces pellets, whereas in (1) and
(3) it does not.

You can state that the rate of lever pressing in (2) is higher than in (1),
but you can't say it is higher _because_ pressing produces food pellets. In
an exactly parallel way, you can say that the speed of a car is higher when
pressing the accelerator feeds gas to the engine than when pressing it
fails to feed gas to the engine. But we would never say that the fact that
the accelerator works properly is what makes the car accelerate.

In fact, given the availability of food via lever presses, the rate of
pressing can still increase and decrease without any further change in the
availability, increasing and decreasing the rate of food delivery and
consumption. This shows that the availability of food via the lever is a
factor in the changes of behavior, but only an enabling factor, not a
cause: to know whether food is available is necessary, but not sufficient,
to predict subsequent behavior (a "cause" is both necessary and sufficient
to produce its effect, which is why physics does not deal with causes: they
are extremely rare). This is analogous to the way having gas in the gas
tank is necessary for a car to be driven away, but not sufficient to
account for its being driven away. The car is not driven away _because_
there is gas in the tank. A rat does not press a lever _because_ doing so
would produce food. The car cannot be driven away if there is no gas in the
tank, and the rat cannot produce food if the lever is the only means of
producing food and is not working. But given the gas and the contingency,
the behavior is not yet accounted for.

The basic difficulty here is in treating a contingency, which is the
_possibility_ for an act to produce an effect, as if it were a causal
variable. But the possibility of an effect is not the same as the occurrence of
an effect. Before the effect can be produced, the possibility of producing
it must exist, but that is not enough. The action that produces the effect,
given the possibility, must also occur. And even that is not enough: the
system must be so organized that the _lack_ of the effect motivates an
attempt to produce it.

This effect of the contingency between
lever-pressing and food-pellet delivery on response rate is termed
"reinforcement" and if delivering food-pellets for lever-presses has this
effect on lever-presses, then the pellets are said to serve as
"reinforcers."

I hope I have shown that this is an insufficient description. The
contingency makes it possible for an action to have a certain effect, but
is insufficient to account for production of the action. It is not
"delivering food for lever presses" that has the effect of increasing lever
presses; it is the increase in lever presses that makes the contingency
into an actual increase in production of food. It is the driver of the car,
not the gas in the tank, that puts the car into motion and drives it away.

What we actually observe is that when the apparatus is connected so that a
lever press, if it occurred, would produce a food pellet, both lever
pressing and food pellet production increase. The food pellet production
increases _because_ the lever pressing has increased; of that there is no
doubt. Why the lever pressing increases can't be explained without a model.

I'm only speaking for myself. If there are EABers out there who think that
reinforcement involves something else (e.g., linear cause-effect), then I
say nail 'em. But first we need a model that demonstrates how a living
control-system generates those changes in behavior referred to as
"reinforcement," "extinction," etc. when exposed to the changes in
contingency under which those effects are demonstrated. Thus far we haven't
got one.

Well, then, let's get one. Offline, you and I have already made a start
with the "furnace" models. In my model I included a "search" function
which, right now, merely sets a "failed" flag false to indicate success. We
could expand that to include an actual search process which is terminated
when it results in production of a rise in temperature. Of course you would
want it to terminate when a command produced a flash, but the same
principle would apply. This model is becoming a representation of the basic
situation outlined in this post. When we have it working well enough, we
can explore the appropriateness of the way EABers would interpret the
visible part of its behavior.
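
For example, the search process might be expanded along these lines (a
hypothetical sketch in Python; the furnace-model code itself is
offline, so these names and the temperature threshold are invented for
illustration):

import random

def search(candidate_actions, read_temperature, rise=0.1):
    # Emit randomly chosen candidate actions until one is followed
    # by a rise in temperature; then stop and return that action.
    baseline = read_temperature()
    while True:
        action = random.choice(candidate_actions)
        action()
        if read_temperature() - baseline >= rise:
            return action    # the search terminates on success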

When that happens, we can also go public with it and perhaps, for a change,
know what we are talking about.

Best,

Bill P.

[From Rick Marken (2000.12.06.0920)]

Bruce Abbott (2000.12.05.2215 EST)--

Instead of arguing about which account is right, control
theory or reinforcement theory, we should be trying to
develop a control model that duplicates in detail the changes
in behavior that are observed between conditions in which a
given behavior produces a food pellet and those in which it
doesn't.

I think you could build such a model very easily. I would start
with a simple two level model. The top level is a single control
system controlling for, say, pellets/sec. Error in this top
level system drives the reference to the lower level control
systems to which its output is attached. Assume that there are 4
such lower level control systems. The variables controlled by
these systems could be things like lever position, nose position,
left paw position, right paw position.

There is also a learning (reorganization) control system that is
controlling perceived error in the top level control system by
adjusting the gain of the connection between the top level output
and the reference inputs to each of the lower level systems. The
gain of each lower level system can be changed randomly (between, say,
0 and 100) at some rate dependent on the size of the error in the
top level system.

Here's a diagram of the model:

[Diagram: a two-level model. The Pellet Rate control loop compares
pellet-rate feedback qi with the Pellet Rate Reference; its error
signal drives the references r1..r4 of four lower-level control
systems C1..C4 (perceptions p1..p4, gains g1..g4, outputs o1..o4).
The Reorganization system, with its own Error Reference, monitors the
top-level error and issues Rate Change signals that set the gains
g1..g4. Across the system/environment boundary, the outputs o1..o4
pass through the Pellet Rate Feedback Function to produce the input
quantity qi.]

There are two main control loops here; the Pellet Rate control
loop and the Error control loop (the reorganization system). The
pellet rate control loop acts on the controlled variable by
controlling four variables (p1, p2, p3, p4) with gains determined
randomly by the reorganization system. So at any time the gain
for any of the 4 lower level systems could be anywhere between 0
(no control of the perception) and 100 (maximum control of the
perception).

The feedback function determines which of the outputs (o1, o2, o3,
o4) actually has an effect on the controlled variable. So, for example,
if o1 is force exerted on the lever then only o1 might have an effect
on qi.

The important algorithm of the model is the reorganization algorithm.
For starters, I would have error in the reorganization system, e.r,
determine the rate of change in the gains (g1..g4) of the four lower
level systems. Maybe something like this:

if (RND()<e.r) then signi = -signi

Ratei = k*e.r * signi
gi = gi + Ratei

if (gi>100) then gi = 100
if (gi<0) then gi = 0

The idea is that the sign of the gain change (signi) changes
randomly as long as the error is greater than 0 (the range of RND()
is 0 to .999). So as e.r decreases to 0, the probability of a change
in the sign of the gain change becomes 0. At the same time, the size
of the change in gain (Ratei) is determined by the size of the error
in the reorganization system. The idea behind this algorithm is that
the gain of the lower level system that allows the pellet control
system to bring error to zero _and maintain it there_ should end up
remaining large, while the gains of the unproductive systems decrease
to zero.
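
To make that concrete, here is one way the whole thing might be coded
(a minimal sketch, not the finished model: the constants, the
integrating output function, and the assumption that only o1 moves
the lever are all placeholders):

import random

# Two-level control model with random-walk ("E. coli") reorganization.
r_pellet = 1.0                                  # top-level reference: pellets/sec
g = [random.uniform(0, 100) for _ in range(4)]  # lower-level gains g1..g4
sign = [random.choice((-1, 1)) for _ in range(4)]
out_top = 0.0                                   # top-level output signal
k = 0.5                                         # rate constant for gain changes

for step in range(5000):
    o = [0.001 * g[i] * out_top for i in range(4)]  # lower-level outputs o1..o4
    qi = o[0]                   # feedback function: only o1 affects pellet rate
    e = r_pellet - qi           # top-level error
    out_top += 0.1 * e          # integrating output function
    e_r = min(abs(e), 0.999)    # error as seen by the reorganizing system
    for i in range(4):
        if random.random() < e_r:         # sign flips while error persists
            sign[i] = -sign[i]
        g[i] = min(100.0, max(0.0, g[i] + k * e_r * sign[i]))

print("final gains:", [round(x) for x in g])  # g1 should stay large if it works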

If the algorithm works properly, the model should demonstrate the
observed effects of "reinforcement" (the system should start
pressing the lever, o1, to get the pellet rate up) and it should
also show the effects of "satiation" -- once the pellet rate
control system gets its error consistently to zero it will stop
setting a non-zero reference for lever pressing (assuming that's
the lower level system that produces reinforcement) so o1 goes
to 0.

You can write the model in any language you like, Bruce. I should
be able to translate it into java for a web demo.

Best

Rick
---
Richard S. Marken Phone or Fax: 310 474-0313
MindReadings.com mailto: marken@mindreadings.com
www.mindreadings.com