[From Bill Powers (2000.11.02.0351 MST)]

Bruce Abbott (2000.12.01.1035 EST)--
Nobody has made the claim that all operants in
the same functional class are equally reinforced when one of them is.
(Sheesh!) Membership in the same functional class means they produce the
same end (e.g. they all get the lever down). If you arrange to reinforce a
lever-press, then any act that accomplishes this will be reinforced, but of
course only the act that actually occurs will be marked for repetition.

Double sheesh. What does it mean to speak of "all operants in the same
functional class"? I thought an operant _is_ a functional class, the class
of all actions that have the same effect. Is there some super-class made up
of operants, each operant being a class of actions having the same effect?

I assume that "reinforcing a lever press" is shorthand for a more precise
meaning, such as "reinforcing the tendency of the organism to do something
that causes the lever to be pressed." You EABers have got to learn how to
talk intelligibly with the outside world.

OK, so the operant is NOT reinforced as such, but only the specific act
that depresses the lever. This takes care of one concern, that when the
control system changes its action during a disturbance, the EABer could
claim that the changed action had also been reinforced since it is a member
of the same operant. If the organism changes its action so as to continue
opposing a disturbance, reinforcement theory, if I reason right, could not
explain this since the new action has not been reinforced yet (according to
what you're saying).

To answer one of Chris Cherpas' comments, there may be many versions of
reinforcement theory, but I've assumed they all propose some strengthening
effect of a reinforcement on the behavior that produced it. The main
difference I see between reinforcement theory and reorganization theory is
that reorganization theory does _not_ propose that any particular act is
strengthened, made more probable, etc. Instead, the occurrence of the
wanted input _reduces the tendency to change to a different behavior_. In
the limit, correcting the intrinsic error sufficiently well will leave the
behavior organized in a particular way _rather than changing it to some
other organization_. The particular behavior that results is simply
whatever was going on when the changes ceased.

The image of a search is the best description of reorganization. Something
in particular is wanted -- food, say. The _lack_ of food constitutes an
error which starts the search process. The search continues as long as
there is enough error to drive it. When the error is reduced or corrected,
the search slows or ceases. The search itself can be systematic or random,
depending on whether we're thinking of a higher-order learned system that
adjusts parameters according to some algorithm, or an E. coli-type search
where parameters are altered in random directions. Also, the search can be
thought of as physically moving around in an environment, or as altering
parameters inside a control system.
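
A minimal Python sketch of the E. coli-type search described above may make the idea concrete. Everything in it is an illustrative assumption rather than part of any published model: the names `ecoli_reorganize` and `intrinsic_error`, the two-parameter setup, the step size, and the rule that the rate of change is proportional to the current error. The only point it tries to show is that nothing gets "strengthened"; the parameters simply stop changing once the error has been corrected.

```python
import math
import random


def ecoli_reorganize(error_fn, params, steps=2000, step_size=0.05, seed=0):
    """E. coli-style search (illustrative sketch).

    Keep changing the parameters in the current direction while the error
    is falling; "tumble" to a new random direction whenever it fails to fall.
    """
    rng = random.Random(seed)

    def random_direction(n):
        # Random unit vector in n-dimensional parameter space.
        d = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in d)) or 1.0
        return [x / norm for x in d]

    direction = random_direction(len(params))
    prev_error = error_fn(params)
    for _ in range(steps):
        # Assumption: rate of change is proportional to the error, so the
        # search slows and effectively ceases as the error is corrected.
        rate = step_size * prev_error
        params = [p + rate * d for p, d in zip(params, direction)]
        error = error_fn(params)
        if error >= prev_error:
            direction = random_direction(len(params))  # tumble
        prev_error = error
    return params, prev_error


def intrinsic_error(params, wanted=(3.0, -4.0)):
    """Hypothetical error: distance of the parameters from a wanted setting."""
    return math.dist(params, wanted)


if __name__ == "__main__":
    final_params, final_error = ecoli_reorganize(intrinsic_error, [0.0, 0.0])
    print("final parameters:", final_params)
    print("remaining error:", final_error)
```

Note that the search never records which change "worked". Whatever organization exists when the error has gone is simply the one that remains.
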
As I understand the concept of reinforcement, the image to use is not one
of a search, but one of an instructor waiting to see what an organism does,
and when it does the right thing, saying "THERE! Keep doing that!" where
_that_ is whatever output the organism was producing at the time. So the
idea, as I understand it, is to encourage producing the _output_ that was
effective. This, of course, can be done, but it will not lead to control.

Under control theory, we can admit that no particular output will always
have the same effect, because of natural disturbances and variations in the
environment. So the reinforcement concept will work only when the
environment is protected from disturbances; then the same output will in
fact always have the same consequence. Of course a control system will also
work with such a protected environment; it will produce the same output
every time, too. Thus you can't distinguish a reinforcement-operated system
from a reorganizing control system if the controlled variable is protected
against disturbances (independent influences that can change its value).
When disturbances are allowed, the reinforcement-based system will fail,
but the control system will go on working.
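
The contrast drawn above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real organism: the controlled variable (CV) is taken to be simply output plus disturbance, the reinforcement-based system is idealized as one that repeats the single output value that worked during training, and the control system is a plain integrator with an arbitrary gain. All function and variable names are hypothetical.

```python
def fixed_output_system(learned_output, disturbances):
    """Repeats the output that was effective during training, whatever
    happens to the controlled variable. Returns the CV at each step."""
    return [learned_output + d for d in disturbances]


def control_system(reference, disturbances, gain=0.5):
    """Integrating control system: adjusts its output according to the
    difference between the reference and the perceived CV."""
    output = 0.0
    cv_trace = []
    for d in disturbances:
        cv = output + d                      # assumed environment
        cv_trace.append(cv)
        output += gain * (reference - cv)    # integrate the error
    return cv_trace


REFERENCE = 10.0
protected = [0.0] * 200                      # no disturbances at all
disturbed = [0.0] * 100 + [-4.0] * 100       # step disturbance halfway through

for label, env in (("protected", protected), ("disturbed", disturbed)):
    fixed_cv = fixed_output_system(REFERENCE, env)[-1]
    control_cv = control_system(REFERENCE, env)[-1]
    print(f"{label}: fixed output CV = {fixed_cv:.2f}, "
          f"control system CV = {control_cv:.2f}")
```

Under these assumptions both systems hold the variable at 10 in the protected environment and cannot be told apart. With the step disturbance, the fixed-output system settles at 6 while the control system returns to 10.
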
On the subject of discriminative stimuli: naturally the environment abounds
in them, but the particular ones needed to explain control as a set of
reinforced actions would be those that represent the presence of a
disturbance. It is
very easy to set up experiments in which such discriminative stimuli are
completely absent. In our computer-based tracking experiments, there is no
independent indication of the magnitude and direction of the disturbances
at any moment. Not only that, but when some accurate indication of the
magnitude and direction of the disturbance is presented on the screen,
performance gets _worse_ because the person's attention is drawn away from
the controlled variable. So I think we can refute any claim about the role
of discriminative stimuli in learning a control task.
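
For readers who have not seen such a tracking task, here is a rough simulation sketch. It is not the code used in the experiments mentioned above; the disturbance waveform, the gain, and all names are assumptions chosen for illustration. The simulated controller is given only the cursor position, never the disturbance, yet its handle movements end up mirroring the disturbance closely.

```python
import math


def disturbance(t):
    """Hypothetical slow, smooth disturbance; the controller never sees it."""
    return 5.0 * math.sin(0.02 * t) + 3.0 * math.sin(0.007 * t + 1.0)


def simulate_tracking(steps=2000, target=0.0, gain=0.5):
    """Compensatory tracking: cursor = handle + disturbance."""
    handle = 0.0
    handles, disturbances, cursors = [], [], []
    for t in range(steps):
        d = disturbance(t)
        cursor = handle + d                  # what appears on the screen
        handles.append(handle)
        disturbances.append(d)
        cursors.append(cursor)
        handle += gain * (target - cursor)   # only the cursor is sensed
    return handles, disturbances, cursors


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    handles, disturbances, cursors = simulate_tracking()
    print("handle vs. disturbance correlation:", pearson(handles, disturbances))
    print("largest cursor error after settling:",
          max(abs(c) for c in cursors[50:]))
```

In this sketch the handle is strongly negatively correlated with a disturbance that is never presented to the controller, so no stimulus signalling the disturbance is needed to produce the opposing action.
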
To sum up: when there are no disturbances of the controlled variable, the
observer cannot distinguish between a system that learns to produce a
particular output and a system that learns to vary its outputs to control a
particular input. Both kinds of system will produce the same behavior and
the same consequences. When disturbances of the input are applied, however,
only the control system will automatically vary its actions to oppose the
disturbances.

Best,
Bill P.