Varying Lever Pressure

[From Bill Powers (2000.11.15.0809 MST)]

Bruce Gregory (2000.1114.1252)--

Unless one holds a dualistic view of life, all learning requires a
modification of the existing neural structure.

Does this apply to

"I've just learned the outcome of the Florida voting."

"I learned your telephone number just today"

"I learned German in school"

"I learned to play the flute."

Does "a change in neural structure" include the memorization of past
perceptions to be used as future reference signals? Does it include
changing the supply of calcium in the vicinity of a neuron, or other
substances? Does it refer to the withering of synaptic connections and the
growth of new ones? Does it include using the same neural control system in
a new environment where the feedback connection is different, thus
resulting in different amounts and directions of action without any change
in the neurons or muscles (the phenomenon quaintly known as "response
generalization")?

If the phrase refers to all these things, then it's an overgeneralized
category, including phenomena with such different physical roots as to be
mutually irrelevant.

The simplest assumption that I can think of is that learning
takes place whenever the organization of the neural system is altered as a
result of its own operation. There are many neural network models of such
processes. I am reluctant to say that it is obvious that they are all worthless.

Of course. But are they all sufficiently the same to be referred to by a
single term?

Best,

Bill P.

[From Bruce Abbott (2000.11.15.1110 EST)]

Bruce Gregory (2000.1115.0932) --

Bruce Abbott (2000.11.14.1835 EST)

>Bruce Gregory (2000.1114.1600)

>My point was simply that in the absence of a model one explanation is hard
>to distinguish from another. EAB appears to be light on models and
>therefore encourages hand waving. Not that I have anything against hand
>waving, as long as you realize that is what you are doing.

One could say the same about Darwin's theory of evolution.

One could, but one would be wrong.

Yes, my point exactly. One would be wrong, and for the same reason, one
would be wrong to say that about a selectionist view of reinforcement. It
encourages hand waving no more than Darwin's theory of evolution does.

Bruce A.

[From Bruce Gregory (2000.1115.1133)]

Bruce Abbott (2000.11.15.1110 EST)

Yes, my point exactly. One would be wrong, and for the same reason, one
would be wrong to say that about a selectionist view of reinforcement. It
encourages hand waving no more than Darwin's theory of evolution does.

Among my examples are genetic algorithms. How about a few of your examples?

BG

[From Bruce Abbott (2000.11.15.1150 EST)]

Bruce Gregory (2000.1115.1133) --

Bruce Abbott (2000.11.15.1110 EST)

Yes, my point exactly. One would be wrong, and for the same reason, one
would be wrong to say that about a selectionist view of reinforcement. It
encourages hand waving no more than Darwin's theory of evolution does.

Among my examples are genetic algorithms. How about a few of your examples?

Among my examples also are genetic algorithms. What other examples do you have?

Bruce A.

[From Bruce Gregory (2000.1115.1214)]

Bruce Abbott (2000.11.15.1150 EST)

Among my examples also are genetic algorithms. What other examples do you
have?

You win. Bye.

BG

[From Bruce Abbott (2000.11.15.1250 EST)]

Bruce Gregory (2000.1115.1214) --

Bruce Abbott (2000.11.15.1150 EST)

Among my examples also are genetic algorithms. What other examples do you
have?

You win. Bye.

Win? I wasn't trying to "win" (whatever that means); I was trying to
persuade you to change your mind on this issue.

Bruce A.

[From Bruce Gregory (2000.1115.1330)]

Bruce Abbott (2000.11.15.1250 EST)

Win? I wasn't trying to "win" (whatever that means); I was trying to
persuade you to change your mind on this issue.

I cited genetic algorithms as an example of Darwinian models. If genetic
algorithms are also an example of EAB models, then I don't understand EAB
models. So I have to take your word for it.

BG

[From Rick Marken (2000.11.15.1100)]

Bill Powers (2000.11.14.0406 MST) --

The action of pressing with a certain force produces a food consequence,
which increases the probability that the rat will press with the same
force the next time. Thus it is the contingency set up in the environment
together with the reinforcer that selects the behavior of pressing with a
certain force.

Bruce Abbott (2000.11.14.1330 EST) --

Those who subscribe to this view believe that the selecting is being
done by mechanisms within the organism;

It seems to me that those who subscribe to this (reinforcement) view
do believe that the "selecting" is done by the environment. They just
mean something different by "selecting" than what we mean when we talk
about control systems "selecting" the results of their actions. When
reinforcement theorists (and, I might add, Darwinian evolutionists)
talk about "selecting" they are talking, it seems to me, about a passive
filtering process. An example of this kind of "selecting" is a strainer
with holes of such a size that, when a pot of water containing cooked
spaghetti is poured through the strainer, the holes are "selecting"
the spaghetti and rejecting the water. So a particular consequence
(spaghetti in the strainer) of action (pouring both spaghetti and water
into the strainer) is "selected" by the environment (the holes in the
strainer).

This passive kind of "selecting" is quite different than the active
"selecting" that is done by a control system. A control system selects
by setting a reference signal to a particular value in order to specify
the consequence it wants; and the system acts so as to make the
consequence match the reference specification for that consequence. If
the control system wants (specifies via a reference setting) spaghetti
sans water, the system could _use_ a strainer to produce this
consequence.
But the control system will vary its actions (for example, it will vary
which strainer it uses) as necessary to produce the desired consequence
(spaghetti sans water). If the holes in a strainer let _both_ the water
_and_ spaghetti through, the control system will change to a smaller
gauge strainer in order to produce the selected consequence.

So, in _active_ selecting -- the kind done by control systems --
the system itself gets the consequence that it selects. In _passive_
selecting -- the kind done by reinforcement and "natural selection" --
the system gets whatever consequence the passive filter happens to
select.

I think it would be very useful to try to develop experimental
approaches to distinguishing these two views of "selecting", perhaps
in the context of an experiment like the one done by Slifken et al
(1995). Apparently, Slifken et al believe that they have shown that
particular isometric forces are "selected" (in the passive filtering
sense) when these forces produce reinforcing consequences, like food.
The reinforcers are like the holes of the strainer, letting through
only those forces that produce the reinforcers. The PCT view would be
that these forces are selected (in the active setting of references
sense) as the means of producing the reinforcing consequences
themselves.
I believe there must be a way to experimentally distinguish these views
of "selecting", especially if these views are implemented as working
models so we can see under what circumstances the models behave
differently from one another.

I was hoping to get some suggestions from some experimentally inclined
people on how to go about designing an experiment to distinguish these
views of "selecting". I hope experimentalists like Bruce Abbott, Isaac
Kurtzer and Gary Cziko will take the time to post some suggestions to
the net. I think CSGNet would be an excellent forum in which to develop
such an experiment.

Best

Rick

···

--
Richard S. Marken Phone or Fax: 310 474-0313
MindReadings.com mailto: marken@mindreadings.com
www.mindreadings.com

[From Bruce Abbott (2000.11.15.1545 EST)]

Rick Marken (2000.11.15.1100) --

This passive kind of "selecting" is quite different than the active
"selecting" that is done by a control system. A control system selects
by setting a reference signal to a particular value in order to specify
the consequence it wants; and the system acts so as to make the
consequence match the reference specification for that consequence.

So, let's apply this to the case that started all this: the rat pressing a
lever to earn a food pellet, when the apparatus requires that the lever be
pressed with a force that falls within a narrow range or "window" of values,
initially unknown to the rat. You say that the control system "selects by
setting a reference signal." I'm a little troubled by this -- it sounds
like the control system is a little person in the head that knows what it
wants and knows how to get it. Can you be more specific? How does the
control system in question "know" what variable to control? How does it
"select" that variable for control? How does it "know" what reference value
to select for that variable? How does it "know" what actions to take to
control the variable?

Bruce A.

[From Rick Marken (2000.11.15.1420)]

Bruce Abbott (2000.11.15.1545 EST)--

You say that the control system "selects by setting a reference signal."
I'm a little troubled by this -- it sounds like the control system is
a little person in the head that knows what it wants and knows how to
get it.

The reference input to a lower level control system is determined by
the outputs of higher level control systems as the means of keeping
the perceptions controlled by these higher level control systems
under control. So higher level control systems are the ones that
"select" the (perceptual) consequences to be produced by lower level
control systems. This is the active "selecting" to which I referred.
For example, in the isometric force case the higher level systems
controlling for the perception of the reinforcement apparently
set the reference for perceived isometric force (pressure on the
paw, perhaps) at a level that produces the perception of reinforcement.

How does the control system in question "know" what variable to control?

I presume that the higher level, reinforcement control systems have
learned (through some kind of reorganization process) that it is
necessary to be able to select particular perceptions of force in
order to perceive the desired level of reinforcement.

How does it "select" that variable for control?

This is done by amplifying some set of higher order output (error)
signals into a scalar reference input to the system controlling the
perception of isometric force (the reference input calculations
for the level 1 and 2 systems in the spreadsheet hierarchy model at
http://home.earthlink.net/~rmarken/demos.html show one way this
can be done).

How does it "know" what reference value to select for that variable?

The same way you know which reference values to select for mouse
position when you are controlling a cursor by varying mouse position.
The "knowledge" is built into the control loop in the form of a
negative feedback relationship between "selected" values of output
and actual values of input. This negative feedback relationship is
created by "selecting" the value of output on the basis of the
_difference_ between input and reference (r). The "selection" equation
for mouse position (m) output when controlling a cursor position (c)
input, for example, is m = f (r-c).
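
To make this concrete, here is a rough sketch (in Python, with purely
illustrative numbers; it is not the spreadsheet model) of a loop whose
output is "selected" in just this way, as a function of the difference
between reference and input:

# Minimal sketch of a single control loop: the output m is "selected"
# from the error, m = f(r - c). Gain, slowing factor, and the constant
# disturbance are illustrative assumptions.

def simulate(steps=200, r=10.0, gain=5.0, dt=0.1):
    m = 0.0                  # output (e.g., mouse position)
    d = 3.0                  # disturbance acting on the cursor
    c = 0.0
    for _ in range(steps):
        c = m + d            # cursor position = output plus disturbance
        e = r - c            # error: reference minus perceived input
        m += gain * e * dt   # integrating output function: m = f(r - c)
    return c, m

c, m = simulate()
print(f"cursor ends near the reference of 10: c = {c:.2f}")
print(f"output ends wherever it must to oppose the disturbance: m = {m:.2f}")

No one "tells" the loop what output to produce; the output simply ends up
at whatever value cancels the disturbance and brings the input to the
reference.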

How does it "know" what actions to take to control the variable?

Same answer as the previous one: The "knowledge" is built into the
control loop as the simultaneous equations that define the loop.

Any ideas for experimental tests of the kind of "selecting" that
is going on in the isometric force experiment?

Best

Rick

···

--
Richard S. Marken Phone or Fax: 310 474-0313
MindReadings.com mailto: marken@mindreadings.com
www.mindreadings.com

[From Bill Powers (2000.11.15.0819 MST)]

Bruce Abbott (2000.11.14.1330 EST)--

Those who subscribe to this view believe that the selecting is being done by
mechanisms within the organism; the notion that the environment is doing the
selecting is simply a shorthand way of noting that the environmental
consequences of behavior, as perceived by the organism, result in a change
in what the organism is likely to do in the future under similar conditions.
Certain consequences have this effect because of the way the organism is
organized.

Why, if they believe that the organism initiates and does the selecting, do
they use terms which suggest (strongly) that it is the environment that
does the initiating and selecting? I am reluctant to believe that this is
merely a sloppy habit of speech. I have never seen it said in a JEAB
article that an organism determined its own behavior and controlled the
consequences of that behavior. Everything I have seen suggests the exact
opposite. Are you really speaking for the behaviorist community when you
say they mean that the organism is the controller and the environment only
a passive responder to the organism's actions?

I have pointed out that the apparent effects on particular behaviors will
be seen only when the behavior always has the same consequence -- the
no-disturbance case. In the general case in which disturbances are allowed,
it will be seen (according to PCT) that while the same consequence
continues to be produced, different behaviors are required to produce it;
hence it is not the behavior that is "conditioned." Instead, the
consequence is controlled by the organism which varies its behavior as
required. That explanation continues to hold up when there are no
disturbances, of course, and in that case it may seem that it is the
behavior that is related to environmental conditions. Have I made this
argument clear enough? I have a feeling that it is falling on deaf ears.

Best,

Bill P.

[From Bruce Abbott (2000.11.15.2000 EST)]

Rick Marken (2000.11.15.1420) --

Bruce Abbott (2000.11.15.1545 EST)

You say that the control system "selects by setting a reference signal."
I'm a little troubled by this -- it sounds like the control system is
a little person in the head that knows what it wants and knows how to
get it.

The reference input to a lower level control system is determined by
the outputs of higher level control systems as the means of keeping
the perceptions controlled by these higher level control systems
under control. So higher level control systems are the ones that
"select" the (perceptual) consequences to be produced by lower level
control systems. This is the active "selecting" to which I referred.
For example, in the isometric force case the higher level systems
controlling for the perception of the reinforcement apparently
set the reference for perceived isometric force (pressure on the
paw, perhaps) at a level that produces the perception of reinforcement.

How do they know what level will produce the perception of the reinforcer?
How do they know what to vary?

How does the control system in question "know" what variable to control?

I presume that the higher level, reinforcement control systems have
learned (through some kind of reorganization process) that it is
necessary to be able to select particular perceptions of force in
order to perceive the desired level of reinforcement.

Ah, so how do these systems learn that certain forces on the lever must
first be perceived if the reinforcer is to be perceived? Could it have
anything to do with what forces were being perceived during presses
immediately prior to previous reinforcer deliveries? Could it have anything
to do with what other force-perceptions during presses were NOT immediately
followed by reinforcer deliveries?

How does it "select" that variable for control?

This is done by amplifying some set of higher order output (error)
signals into a scalar reference input to the system controlling the
perception of isometric force (the reference input calculations
for the level 1 and 2 systems in the spreadsheet hierarchy model at
http://home.earthlink.net/~rmarken/demos.html show one way this
can be done).

What I mean is, how do the higher-level systems "know" what variables they
need to control in order to produce the reinforcer? The systems of which
you write seem to have magically come prewired to do just what is required.
I rather imagine that this is unlikely to be the case for the rat.

How does it "know" what reference value to select for that variable?

The same way you know which reference values to select for mouse
position when you are controlling a cursor by varying mouse position.
The "knowledge" is built into the control loop in the form of a
negative feedback relationship between "selected" values of output
and actual values of input. This negative feedback relationship is
created by "selecting" the value of output on the basis of the
_difference_ between input and reference (r). The "selection" equation
for mouse position (m) output when controlling a cursor position (c)
input, for example, is m = f (r-c).

I don't think I know anything about what reference values to select for
mouse position in order to control a cursor, until I've had an opportunity
to experiment with the mouse and observe the resulting cursor movements.
Now why do you suppose that is? As you say, the equation relating mouse
movement to cursor is built right in.

How does it "know" what actions to take to control the variable?

Same answer as the previous one: The "knowledge" is built into the
control loop as the simultaneous equations that define the loop.

You are assuming that which the explanation is supposed to explain. How
does the control loop come to have those equations such that control over
the variable is achieved? In a simulation, you can simply write the proper
equations, but the rat hasn't got a programmer to do this for it.

Any ideas for experimental tests of the kind of "selecting" that
is going on in the isometric force experiment?

No, but I can't say I've given it any thought yet, either. At this point
I'm not persuaded that the selecting being done to produce control systems
that "do the right thing" (as required by environmental relationships) in
order to achieve control over food delivery, is anything other than a
passive filtering process. Clearly, however, that process must end up
creating appropriate control processes with the right reference values for
perceived lever-press force.

Bruce A.

[From Rick Marken (2000.11.15.2200)]

Bruce Abbott (2000.11.15.2000 EST)

Ah, so how do these systems learn that certain forces on the
lever must first be perceived if the reinforcer is to be
perceived?

I don't know. According to PCT, however, this learning process
involves random variation in existing control organizations and
_active_ selection (by reducing the rate of change in existing
organizations) of those organizations that tend to improve
control. According to reinforcement theory, the learning involves
random variation in _responses_, not control organizations,
and _passive_ selection (by consequences that increase the strength
of responses that occur prior to the consequence) of those responses
that tend to produce particular consequences.

Could it have anything to do with what forces were being
perceived during presses immediately prior to previous
reinforcer deliveries?

Yes. Reinforcement theorists certainly think so. But control
theorists think it has more to do with the organism selecting
certain control organizations by delaying random changes in
these organizations as long as the current organization produces
the intended perception (the "reinforcer").

I think it's possible to experimentally discriminate between
these two views without getting too deeply into the details
of how each learning mechanism works. These two views differ
in terms of _what_ is learned. According to control theory what
is learned is a _control system_; a system that controls for the
"reinforcing" consequence. According to reinforcement theory, what
is learned is a response; the response that is followed by the
reinforcing consequence.

The experiments that would test between these two possibilities
would involve the application of disturbances to the consequences
that are either being actively selected (control theory) or doing the
passive selecting (reinforcement theory). If what is learned is how
to control the consequence then, when disturbances are applied to
the consequence, this same consequence will continue to be produced,
but by different responses. If, on the other hand, what is learned
is a response then, when disturbances are applied to the consequence,
different consequences will be produced by the same or different
responses.

I think this logic can be applied to the isometric force experiment.
It should be possible to show that, however learning might occur,
what is learned is either 1) how to control for a particular
consequence or 2) a particular response (force) that is passively
selected by its consequences.
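
Here is a rough sketch of the logic of that test (Python, with made-up
numbers; the "consequence" is simply modeled as response plus
disturbance). A control model varies its response to keep the consequence
at its reference; a learned-response model keeps emitting the response it
acquired and lets the consequence fall where it may:

# Illustrative comparison of the two views under a disturbance applied
# to the consequence. All numbers are assumptions for the sketch.

reference = 10.0          # consequence the control model is organized to produce
learned_response = 10.0   # response the reinforcement model is assumed to have learned

def control_model(disturbance, gain=20.0, dt=0.01, steps=100):
    response = 0.0
    for _ in range(steps):                      # let the loop settle
        consequence = response + disturbance
        response += gain * (reference - consequence) * dt
    return response, response + disturbance

def response_model(disturbance):
    return learned_response, learned_response + disturbance

for d in [0.0, 4.0, -4.0]:
    rc, cc = control_model(d)
    rr, cr = response_model(d)
    print(f"disturbance {d:+.1f}: control model -> response {rc:5.2f}, "
          f"consequence {cc:5.2f} | response model -> response {rr:5.2f}, "
          f"consequence {cr:5.2f}")

Under the disturbance the control model produces the same consequence by
means of different responses; the response model produces the same
response and therefore different consequences. That is the difference the
experiment would have to detect.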

So, what do you think? Any ideas for experiments that could
help us see what is being learned in the isometric force
experiment?

Best

Rick

···

--

Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: marken@mindreadings.com
mindreadings.com

[From Bill Powers (2000.11.16.0219 MST)]

Bruce Abbott (2000.11.15.1545 EST)--

So, let's apply this to the case that started all this: the rat pressing a
lever to earn a food pellet, when the apparatus requires that the lever be
pressed with a force that falls within a narrow range or "window" of values,
initially unknown to the rat. You say that the control system "selects by
setting a reference signal." I'm a little troubled by this -- it sounds
like the control system is a little person in the head that knows what it
wants and knows how to get it.

Yes, that is what Rick, and I, and everyone else who understands PCT are
saying -- all except for the "little person" part. There is a system inside
the rat that knows what it wants (it has a positive nonzero reference level
for food intake). In the case where a fully-formed control system already
exists, the system knows how to get it (send reference signals to
lower-level control systems that perform the acts that, as a side-effect,
produce the food). In a novel situation (novel for the individual rat)
there is no existing system that will produce the food, but there is
another kind of system in the rat that can institute processes that will
construct a control organization that will produce food. Either way, it is
the rat that determines what is wanted, and the rat that reorganizes itself
and finds the means of getting it. The environment merely provides the
setting in which the rat must find the means of getting food if it doesn't
already know how to get it.

Can you be more specific? How does the
control system in question "know" what variable to control? How does it
"select" that variable for control? How does it "know" what reference value
to select for that variable? How does it "know" what actions to take to
control the variable?

These are, of course, very basic questions, with which I tried to deal in
considerable detail in B:CP (and even before that). In fact, my answers to
these questions constitute what we now call perceptual control theory,
together with the theory of reorganization. I probably did not cover every
possible situation, nor did I come up with every possible answer. But the
answers I have proposed have seemed fairly general to many people, enough
to form a starting point for the replacement of previous concepts about
what behavior is, how it works, and how behavioral organization is acquired.

Most germane to your questions is the concept of a reorganizing system.
Let's consider, once again, how it is proposed to produce new organizations
that end up providing food in a novel situation where the rat has no
previously-learned method for causing food to appear.

First, we must find out or try to guess what effect of ingesting things the
rat is trying to control. A number of variables can be postulated, from a
general notion of "nutrition" to specifics like blood glucose
concentration, electrolyte balance, level of toxins in the blood, and so
forth. The specifics are, of course, what is actually controlled; general
concepts are only for the convenience of a theoretician.

Now, how does the rat know that it should control these variables? That's
a rather anthropomorphic way of asking the question, for it implies that
there is some central consciousness in the rat that does what we call
"knowing." We know nothing about a rat's consciousness, if there is any,
but we do know that rats inherit certain structures and that all rats that
stay alive through their own efforts, without exception, eat when certain
variables like blood glucose get out of range. So it makes sense to propose
that there are certain very basic reference levels built into every rat as
part of its inheritance, among them being a reference level for a certain
concentration of blood glucose.

So we have a basic perception, potentially controllable (remember that in
PCT, perception does not imply anything about consciousness: a thermostat
perceives temperature in the sense used in PCT). We have an inferred
reference level for the perception.

Deviations of the perception from the reference level are detected by an
inferred comparator, which produces an error signal as usual. But now we
have a problem, because (to stick to one example) the error in blood
glucose concentration can be cured only by performing unknowable acts in an
environment of unpredictable structure created at the whim of an experimenter.

In the worst case, there is no way in which previous experience could help
in deducing what the experimenter has decided that the rat must do in order
to produce food. We have to assume that the rat has certain capabilities
such as those of moving itself around, but what it has to find out is how
and where to move that will correct the basic glucose concentration error.
This implies that reorganization is going on at a specific level in a
partly-organized system. Since we're assuming the worst case, the only
solution I can see is for the rat to produce varied actions in variable
ways and variable places until either food appears to be eaten or it
starves to death. And of course when it has found one of the actions that
produces food, it must stop the random variations and continue doing what
it was doing when the food appeared.

And what is it that tells the rat it has done the right thing? The blood
glucose concentration error diminishes. This is how we get to the E. coli
method of reorganization, which I had defined 25 years before I knew
anything about E. coli (see the 1960 papers with Clark and MacFarland).

The diminution of blood glucose error, I have since found by using
simulations, must slow the rate of reorganization, not terminate it
abruptly, if a systematic approach to a final solution is to occur. So the
naive rat is not expected to stop searching the cage and start pressing the
lever the first time a lever-press results in a food pellet's being
delivered. Rather, the rat starts dwelling longer and longer in the
vicinity of the lever, as if it's not sure what it was doing while it was
there that produced the food (another anthropomorphism as well as a
metaphor). Only after many trials does it eliminate the irrelevant actions.
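
For anyone who wants to see the shape of this process, here is a
bare-bones sketch (in Python, with a single one-dimensional
"organization" parameter and invented numbers; it is not the 1960 model
itself). The parameter is nudged in a randomly chosen direction; the
direction is kept while intrinsic error is falling and re-randomized
(a "tumble") when it rises, and the size of each change shrinks as the
error shrinks, so reorganization slows down rather than stopping abruptly:

import random

def intrinsic_error(w, w_best=3.7):
    # stand-in for something like blood-glucose error: smallest when the
    # organization parameter w happens to produce good control
    return abs(w - w_best)

w = 0.0
direction = random.choice([-1.0, 1.0])
last_error = intrinsic_error(w)

for step in range(200):
    rate = 0.2 * last_error        # rate of reorganization proportional to error
    w += direction * rate          # keep changing in the current direction
    error = intrinsic_error(w)
    if error > last_error:         # things got worse: tumble to a new direction
        direction = random.choice([-1.0, 1.0])
    last_error = error

print(f"final organization parameter w = {w:.3f}, intrinsic error = {last_error:.4f}")

The parameter wanders at first, then dwells longer and longer near the
values that reduce the error, much as the rat dwells near the lever.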

So this model, after quite a few cycles of modification, now predicts more
or less correctly how a rat will behave in a Skinner box, at least in
general. And it answers one of your questions: how does the rat find out
what to do in order to get what it wants?

Notice that the rat gets no guidance from the environment to lead it to the
behavior that is needed. Of course if the needed behavior is something
bizarre like walking in a figure-8 pattern, there would be little chance of
finding it through reorganization in a single jump. However, if the
experimenter started with simple requirements such as turning left which
the animal would be likely to produce during a normal search process, the
amount of reorganization required to find it would be small. Then the
requirement could be changed so another small reorganization might come up
with the next approximation in a reasonable time, and so on through the
process known as "shaping." The need for a single large reorganization of
low probability is replaced by a series of small reorganizations of high
probability. In no case does the experimenter determine how the animal will
reorganize: that is strictly up to the animal. Shaping merely sets the
stage so that fewer reorganizations need to be done in order to
re-establish control of food intake during a given phase of shaping.

The relationship between the experimenter and the animal is one that can
easily be illustrated with the rubber bands. The experimenter varies what
does not matter to him, the organism's food intake, as a means of
controlling what does matter to him: the organism's behavior. And the
organism varies what does not matter to it, its behavior, as a means of
controlling what does matter to it, its food intake. Each party to the
relationship is controlling what matters to it, its own perception, by
varying what does not matter to it, its own behavior. This can be a
completely conflict-free interaction.

This is all standard PCT, with only minor changes from what I started
saying in print 40 years ago. I must have been explaining it very poorly,
for these questions to be arising again at this late date.

Best,

Bill P.

[From Bruce Gregory (2000.1116.1738)]

Bill Powers (2000.11.15.0809 MST)

If the phrase refers to all these things, then it's an overgeneralized
category, including phenomena with such different physical roots as to be
mutually irrelevant.

I guess that ends this exchange.

BG

[From Bill Powers (2000.11.18.0404 MST)]

Bruce Nevin (2000.11.18 0444 EDT)--

PCT focuses on the details of behavior. ...

Reinforcement theory focuses on the process of learning. ...

That's a nice symmetrical formulation, but it doesn't hold up. The problem
is that you can't define learning separately from behavior. The only way to
say whether learning has happened is to look for a change in performance,
in behavior. Therefore what you think behavior is has a great deal to do
with what you call learning.

As behaviorists imagine behavior, there are discrete happenings which are
followed by discrete consequences. The happenings are called responses, and
the consequences are called reinforcements. A reinforcement increases the
probability that a just-previous discriminative stimulus will be followed
by emission of an instance of a particular class of responses called an
operant. Learning, therefore, is measured as a change in this probability
of occurrence of discrete events. Probabilities can't be directly measured,
so they are converted into frequencies. The frequency with which an operant
is emitted increases (until satiation sets in) as the operants are followed
by reinforcers.

In control theory, behavior is characterized by a model in which some
variables are functions of other variables. The most general case is the
continuous function, although discrete functions can be handled as a
special limiting case. A function is defined not by any one input-output
event, but by the way the output varies as the input varies. In a linear
function, for example, if the input varies according to a sine-wave, the
output will also vary as a sine-wave of proportional size. If the input _to
the same function_ increases linearly with time, the output will also
increase linearly with time. In fact, any waveform of variation at the
input will result in an output waveform of the same shape and with a
proportional amplitude. This function is characterized not by a list of
input-output events, but by a single number, the constant of proportionality.

Other functions can be characterized by polynomials or other forms such as
exponentials. Still others require differential equations to express them.
In no case are the functions represented as _events_. Instead, they are
expressed mathematically so that one can predict how the output will behave
for _any_ time-course of variable input magnitudes. If, for the linear
function with a proportionality factor of k, an input appears briefly and
immediately disappears, we can predict that an output _k_ times as large
will appear and disappear in synchrony with the input.

A function is generally expressed as the sum of a set of input variables
with weightings or coefficients, for example: z = a*x + b*x^2 + c*y -
d*y^3. In that function there are two input variables, x and y, and one
output variable, z. For any combination of input variable values,
evaluating the formula produces the resulting value of z. As the input
variables change in value, the output variable z changes in value, and its
manner of change can be computed continuously from x and y by using the
formula that defines the function.

If the values of the coefficients, a..d, change, then the form of the
function will change. The same pattern of variations in x and y that
produced a certain pattern of changes in z will now produce a _different_
pattern of changes in z.
For example, if c and d changed to zero, then variations of z would depend
on variations of x alone rather than on both x and y.
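
A small numerical illustration may help (the coefficients and inputs are
chosen arbitrarily). The same formula computes z for any pattern of
inputs, and "learning" amounts to a change in the coefficients, so that
the same input pattern now yields a different output pattern:

def z(x, y, a=1.0, b=0.5, c=2.0, d=0.1):
    # the function z = a*x + b*x^2 + c*y - d*y^3 from the text
    return a * x + b * x**2 + c * y - d * y**3

inputs = [(0.0, 1.0), (1.0, 2.0), (2.0, -1.0)]   # an arbitrary input "waveform"

print("before learning (a, b, c, d = 1, 0.5, 2, 0.1):")
for x, y in inputs:
    print(f"  x = {x:4.1f}, y = {y:4.1f}  ->  z = {z(x, y):6.2f}")

print("after 'learning' sets c and d to zero (z now depends on x alone):")
for x, y in inputs:
    print(f"  x = {x:4.1f}, y = {y:4.1f}  ->  z = {z(x, y, c=0.0, d=0.0):6.2f}")

# The same formula also predicts z for an input combination it has never
# been given -- the generalization discussed below.
print(f"novel input x = 3.0, y = 0.5  ->  z = {z(3.0, 0.5):6.2f}")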

Now here is the difference between the behavioristic view of learning and
that of control theory. In behaviorism, variations in z are predicted on
the basis that a variation in x,y will be followed by a variation in z.
Learning consists of an increase in the probability that a z-event will
occur when an x,y-event occurs. In control theory, learning consists of
changes in the coefficients of the function that makes z depend on x,y. It
is a change of the input-output function from one form to another. This is
not because of control theory per se, but because control theory is cast in
terms of functional relationships, not sequences of events.

Notice that control theory, because of this use of functional
relationships, inherently predicts generalization. For any given input
event, a specific output event will be predicted, but if a new pattern of
input changes occurs, the same formula can be used to predict (in the
absence of further learning) the resulting new pattern of output changes.
Nothing in the behaviorist view predicts generalization -- it's a separate
and inexplicable phenomenon.

PCT is a theory of behavior. Reinforcement/EAB is a theory of learning.
This is why a crucial experiment to pit one against the other and
demonstrate a winner is difficult to design.

No, it isn't. All that is required is to introduce disturbances of the
consequence that behaviorists call reinforcement. If the consequence
continues to be produced as before, then clearly the behavior must be
varying in just the way needed to cancel the effects of the disturbances.
This would show that it was not a particular response that was learned, but
a process for VARYING behavior so as to control its consequence. When there
are no disturbances, of course, no variations in the behavior are needed,
so it seems that a particular behavior was learned. Introducing the
disturbance will show whether what was actually learned was a whole control
loop rather than just a particular action.

An example is what happens when the force required to depress a lever is
increased after a rat has learned to produce food by pressing it. The rat
will immediately increase the force until the apparatus clicks and the food
appears. Another example is found in the obesity experiments mentioned in
B:CP. Rats which have learned to press a lever to cause food to be
delivered directly to their stomachs will, when food is independently
delivered, immediately cease pressing the lever. They will not permit
themselves to be overfed as long as they can prevent it by ceasing to
behave. These experiments, and many more like them, show that animals and
people do NOT learn to produce particular responses. They learn to control
particular variables by _immediately and without practice_ varying their
responses as circumstances demand. That is not explainable under the
behavioristic concept of the nature of behavior.

So your contention that control theory investigates behavior while
behaviorism investigates learning does not bear up under close examination.
It is impossible to study learning except in the context of a model of
behavior. And the behaviorist's model of behavior is radically different
from the control theorist's.

Best,

Bill P.

[From Bruce Nevin (2000.11.18 0444 EDT)]

Rick Marken (2000.11.15.2200)--

I think it's possible to experimentally discriminate between
these two views without getting too deeply into the details
of how each learning mechanism works. These two views differ
in terms of _what_ is learned. According to control theory what
is learned is a _control system_; a system that controls for the
"reinforcing" consequence. According to reinforcement theory, what
is learned is a response; the response that is followed by the
reinforcing consequence.

PCT focuses on the details of behavior. The PCT account of learning is less well elaborated. It accounts for behavior (or promises to, at higher levels) so well that we are fairly comfortable with the relative inspecificity of its account of how perceptual input functions, reference input functions, and output functions are structured, how they change, and how new control loops are created and integrated into an existing control hierarchy. The general principle of reorganization covers a very broad territory with relatively little data and modelling. Problems of learning can all be worked out in the future; modelling behavior is the fundamental thing.

Reinforcement theory focuses on the process of learning. The EAB account of the details of behavior is less well elaborated. EAB gives a well-elaborated account of learning, with a linear-causation account of the details of behavior tacitly assumed in the background. The main enterprise, the account of learning, works so well that there seems little reason to give much attention to PCT's challenges respecting the details of behavior. It is well known and accepted that the specific details of behavior have to be managed statistically; modelling learning and change of behavior is the fundamental thing.

PCT is a theory of behavior. Reinforcement/EAB is a theory of learning. This is why a crucial experiment to pit one against the other and demonstrate a winner is difficult to design. PCT looks at the experiment and says the EAB behavioral outputs are wrong; EAB says so what, we're talking about learning. EAB looks at the experiment and asks where are the PCT data on learning and models of learning; PCT says so what, we're talking about behavior.

Neural nets may be a good model for those PCT functions that create one signal out of many -- output functions, reference input functions (these may be two sides of the same thing, except at the very top and very bottom of the hierarchy), and perceptual input functions. Is it possible to put EAB research into learning through a kind of intellectual centrifuge to separate valid results about learning from bogus assumptions about the linear causation of behavior?

That done, with PCT able to account for the same data about learning, it may be more effective to redirect attention to the need for a theory (PCT) that accounts for the details of the behavior that is learned specifically, rather than statistically.

         Bruce Nevin


[From Bruce Gregory (2000.1118.1149)]

Bill Powers (2000.11.18.0404 MST)

So your contention that control theory investigates behavior while
behaviorism investigates learning does not bear up under close examination.
It is impossible to study learning except in the context of a model of
behavior.

Another exchange bites the dust. Since learning is such an empty concept, I
wonder how he can be so certain. Never mind, he just is.

BG

[From Rick Marken (2000.11.18.1400)]

Bruce Nevin (2000.11.18 1234 EDT)--

My purpose was not to defend behaviorism but to figure out why
no one has proposed a crucial experiment (as somebody called it,
Popper?) of the sort Rick was asking about, and how to go about
doing so.

Why worry at all about why no one has proposed an experiment? Just
propose one yourself. I could have proposed one myself (I will if
no one chimes in soon) but I think it would be better if others --
especially others who are familiar with both reinforcement theory
and PCT -- started the ball rolling. I think the process of trying
to think up such an experiment would be the best lesson possible
in how to do behavioral research based on an understanding of PCT.
The experiment could be developed iteratively; it doesn't have to
be perfect as first proposed.

I would just like to see someone start the ball rolling by proposing
an experimental test to discriminate the reinforcement from the PCT
views of behavior. Then all interested parties on the net can make
suggestions for improvement. I bet we could have the design for
an acceptable experiment hammered out in a week or so.

It doesn't have to be an experiment that would be convincing to the
conventional psychological community, by the way. I am well aware of
the fact that it is impossible to design that kind of experiment
simply because (as you point out) the conventional psychological
community is already convinced that reinforcement theory makes sense.
What I would like to see us do is develop an experiment that would
convince _us_ -- those of us who are involved in developing the
experiment -- that the PCT view of behavior is (or is not) preferable
to the reinforcement view.

Best

Rick

···

--

Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: marken@mindreadings.com
mindreadings.com

[From Bruce Gregory (2000.1118.1736)]

Rick Marken (2000.11.18.1400)

It doesn't have to be an experiment that would be convincing to the
conventional psychological community, by the way. I am well aware of
the fact that it is impossible to design that kind of experiment
simply because (as you point out) the conventional psychological
community is already convinced that reinforcement theory makes sense.
What I would like to see us do is develop an experiment that would
convince _us_ -- those of us who are involved in developing the
experiment -- that the PCT view of behavior is (or is not) preferable
to the reinforcement view.

Great idea. However, it has already been done by Bourbon and Powers.
Nevertheless, those who cannot remember the past are welcome to repeat it.

BG