EAB and PCT

[From Bill Powers (960406.1640 MST)]

Bruce Abbott (960406.1255 EST) --

     Research by Sidman, by Dinsmoor, and by Herrnstein and Hineline
     strongly suggested that rats could somehow integrate events over
     time and respond on the basis of average shock frequency reduction

You're going to have to explain that one a bit further. What does it
mean to say that rats respond "on the basis of" something? Is this the
same as saying they respond _to_ something? Is what they are responding
to the average shock rate, or the average shock rate _reduction_? Or is
what is meant that they emit responses (i.e., act) _in order to_ reduce
shock rate? Or is your odd wording an attempt to _avoid_ saying that,
while still preserving the same meaning?

     Data from concurrent and concurrent chains schedules indicated that
     the proportion of responding emitted on one of two keys tended to
     match the relative rate of reinforcement associated with the
     schedule programmed on that key (for VI schedules, at least) ...

We've been through this before. I think that the interpretation of
"matching" phenomena comes as close to fudging the data as anything I've
seen in this field. Is exclusive preference "matching?" Why doesn't the
principle say that the rate of responding is proportional to the
schedule parameter, or some other such unequivocal proposition? Could it
be because there is no clear way to say what matching means that will
make the data fit the hypothesis? When you say that the behavior
"tended" to match, you also imply that it tended NOT to match. What
measure of matching is used? What would perfect matching be? How much
matching error must occur before you discard the idea of matching?


-----------------------------
Sorry, Bruce. I really don't want to get back into this. What we're
doing with your rat experiments is real science that will lead to solid
results, and we won't have to ambiguate our language to talk about them.
You have an investment in EAB, and want to present it in the best light,
but that's really your problem, not mine. I don't want to fall back into
EAB-bashing. And I find the defenses unconvincing. So let's turn to what
we agree about and keep on keeping on.
----------------------------------------------------------------------
Best,

Bill P.

[From Bruce Abbott (960407.1045 EST)]

Bill Powers (960406.1640 MST) --

Bruce Abbott (960406.1255 EST) --

    Research by Sidman, by Dinsmoor, and by Herrnstein and Hineline
    strongly suggested that rats could somehow integrate events over
    time and respond on the basis of average shock frequency reduction

You're going to have to explain that one a bit further. What does it
mean to say that rats respond "on the basis of" something? Is this the
same as saying they respond _to_ something?

What I was trying to get across is the notion that the organism is sensitive
to the rate of events over time, and can distinguish the reduction in rate
that occurs as a function of its behavioral output on the manipulandum. In
a reinforcement-based explanation, the reduction in shock frequency would
serve as the reinforcer for responding.

    Data from concurrent and concurrent chains schedules indicated that
    the proportion of responding emitted on one of two keys tended to
    match the relative rate of reinforcement associated with the
    schedule programmed on that key (for VI schedules, at least) ...

We've been through this before. I think that the interpretation of
"matching" phenomena comes as close to fudging the data as anything I've
seen in this field.

You must have some strange new definition of "fudging the data." In normal
scientific discourse, fudging the data means altering the numbers to make
them fit the theory. It's a serious charge and one I wouldn't be tossing
around lightly.

It's another thing entirely to conclude that some function _fits_ the
data when in fact the fit isn't all that good. This, I think, is what you
intended to convey, but what you actually said carries a much more
sinister implication than is warranted.

Is exclusive preference "matching?" Why doesn't the
principle say that the rate of responding is proportional to the
schedule parameter, or some other such unequivocal proposition? Could it
be because there is no clear way to say what matching means that will
make the data fit the hypothesis?

Matching as a description of performance was developed for concurrent VI-VI
schedules; later it was noted that "matching" _could_ be used to describe
performance on concurrent ratio schedules, but that it would be only
trivially true. So let's stick to the situation for which the description
was intended, for starters. The principle as initially formulated did apply
to the schedule parameters. However, where there were large differences in
VI values between keys, it was soon noticed that using the obtained rather
than scheduled rates provided a better fit. We both know why, but as a
_description_ of a relationship between variables matching still held.

When you say that the behavior
"tended" to match, you also imply that it tended NOT to match. What
measure of matching is used? What would perfect matching be? How much
matching error must occur before you discard the idea of matching?

As usual in any scientific work, fits can be evaluated only within the
limits of precision of the measure. Within the limits of precision in the
original studies, relative response rates matched relative reinforcement
rates. Perfect matching would have been identical values for relative
response rate and relative reinforcement rate, but one has to allow for
measurement error (even in physics). By now a large number of studies have
investigated matching. Measurement precision has improved, and it has been
shown that matching as a general principle does not hold.
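To pin down what is being compared here: Herrnstein's matching law for concurrent VI-VI schedules states that the relative response rate on one key equals the relative reinforcement rate obtained on that key. A minimal sketch of how a matching fit could be quantified (the session counts below are invented for illustration, not from any study cited in these posts):

```python
def relative_rate(x1, x2):
    """Proportion of the total allocated to alternative 1."""
    return x1 / (x1 + x2)

def matching_error(responses, reinforcers):
    """Absolute deviation between relative response rate and relative
    reinforcement rate, given (key1, key2) counts for each."""
    return abs(relative_rate(*responses) - relative_rate(*reinforcers))

# Hypothetical session: 1500 vs. 500 responses, 60 vs. 20 reinforcers.
print(matching_error((1500, 500), (60, 20)))  # 0.0 for these made-up numbers
```

"Perfect matching" in this sense is a zero deviation; how large a deviation counts as a failure of matching is exactly the question Bill raises above.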

Sorry, Bruce. I really don't want to get back into this. What we're
doing with your rat experiments is real science that will lead to solid
results, and we won't have to ambiguate our language to talk about them.
You have an investment in EAB, and want to present it in the best light,
but that's really your problem, not mine. I don't want to fall back into
EAB-bashing. And I find the defenses unconvincing. So let's turn to what
we agree about and keep on keeping on.

I didn't think I was presenting an argument -- certainly not an argument for
the reinforcement view. I was offering a bit of history to show how
empirical findings of the early 1960s led to a line of enquiry concerning
global principles like optimization and melioration. This presentation was
offered to show how these developed naturally from the data that were
developed at that time. You had asserted that going after such global
principles first puts the cart before the horse, that one should begin with
the data. I was showing how the theories actually emerged in the context of
examining the data. For some reason you wish to characterize this
contribution as reflecting my "investment" in EAB (a charge you also levied
against Chris Cherpas). Well, in my view, you wish to pontificate about
things of which you know little, making smug charges of "fudging data" and
the like (without evidence), and it does not make you happy to have your
nice little fantasies contradicted by someone more familiar with the facts.

Regards,

Bruce

[From Rick Marken (960407.1430)]

Bruce Abbott (960406.1255 EST) --

Research by Sidman, by Dinsmoor, and by Herrnstein and Hineline
strongly suggested that rats could somehow integrate events over time
and respond on the basis of average shock frequency reduction

Bill Powers (960406.1640 MST) --

You're going to have to explain that one a bit further. What does it
mean to say that rats respond "on the basis of" something? Is this the
same as saying they respond _to_ something?

Bruce Abbott (960407.1045 EST) --

What I was trying to get across is the notion that the organism is
sensitive to the rate of events over time, and can distinguish the
reduction in rate that occurs as a function of its behavioral output
on the manipulandum.

I think Bill's problem was the same as mine. You said that research
"strongly suggested" that rats "respond on the basis of" a perception.
Now I know, and Bill knows, that _you_ know that this is only what
this research "strongly suggests" to researchers who don't know that
animals are controlling. I'm sure neither Bill nor I would have been
inclined to say anything about your comments if you had just made it a
bit clearer that you were saying, not what _you_ thought the research
strongly suggested, but what the EAB researchers thought the research
strongly suggested. For example, it might have been clearer if you had
said something like:

Research by so and so strongly suggested that rats could perceive
events integrated over time; unfortunately, these researchers
mistakenly concluded that rats respond on the basis of these
perceptions. Because of the way the research was conducted, these
researchers failed to notice the possibility that these perceptions
were controlled by the rat.

Best

Rick

[From Bill Powers (970912.0400 MDT)]

I guess until this gets said I don't get back to sleep.

Some ideas are just so simple that there doesn't seem to be any way to
express them. The difference between EAB and PCT is one of these ideas.
It's so obvious and elementary -- or anyway seems so at the moment -- that
it's easy to find yourself going twice around the block just to go next door.

First, what is a contingency? It's a relation among physical variables and
nothing more. Every natural law is a contingency. Every dependency of one
environmental variable on another one is a contingency. It makes absolutely
no difference whether an organism's behavior or something else affects the
independent part of the contingency, and it makes absolutely no difference
whether the dependent part affects, is perceived by, or is otherwise
associated with an organism.

All that a contingency says is that if variable B is to change, one or more
of the variables (A1, A2... An) on which it depends must change. This is a
recognition that in the physical world, nothing happens spontaneously.
Nothing moves of itself. Any physical variable is a function of other
physical variables: B = f(A1, A2, ... An).

In an operant conditioning situation, the world is set up so that the only
way a certain consequence can be created is for certain antecedent events
to happen in some particular way. If the consequence does occur, then one
or more of the antecedents on which it depends must have occurred. If a
necessary antecedent depends on a particular behavior, then the behavior
must have occurred. So the consequence depends on at least one of the
necessary behaviors' having occurred.

What we observe is that some kinds of consequences, some of the time, are
made to occur to a degree higher than background by behavior when it is
possible for that to happen -- that is, when a contingency exists. Behavior
will keep changing until certain consequences are made to repeat at a
non-chance level.

Suppose we change the contingency: now a _different_ behavior will produce
the _same_ consequence. We observe the same thing: behavior will change
until it produces the same consequence as it did before. What we usually
find is that whatever behavior will produce certain consequences, that
behavior will eventually be produced.

Putting all these observations together, we can arrive at a clear
conclusion: the consequence produced by a given behavior has no particular
amount or direction of effect on behavior. We can show that the same
consequence will be produced by one behavior as by a different one, and
that it may even be produced by one behavior or its opposite.

The only factor that remains the same across these different conditions is
the consequence. Whatever the form of the contingency, the behavior will
change until the same consequence as before is produced by it. No matter
what behavior must be performed (within the limits of possibility), the
food pellets will be delivered and eaten.

Since the production of the consequence remains the same over a set of
different behaviors and contingencies, its occurrence cannot explain _any_
of the behaviors that produce it. If taken as a cause of one behavior that
does occur, it can't also be taken as the cause of a different behavior
that didn't occur this time but did occur at another time, or as the cause
of a behavior directed oppositely, which also didn't occur. Its causal
properties would have to change with each change of contingency, so as to
cause exactly the behavior needed to produce itself, no matter what that
behavior might be.

This renders the concept of the consequence as a cause vacuous. It is a
cause that has the property of being able to cause whatever effect is
observed, and that means it is not a cause at all. What we actually observe
is that behavior changes in such a way as to keep producing the same
consequence, whenever a change in environmental properties occurs.

The essential vacuousness here is not visible when one considers only one
contingency and one consequence, so there is only one behavior that can
produce the consequence. Nor is it visible when different consequences are
made to result from a given behavior. The critical case is where different
behaviors are necessary, because of different contingencies, to produce the
same consequence. Only when we consider the proposed effect of consequences
on behavior across different experiments related in this way can we see
that consequences actually have no causal relation to behavior.

We're talking, of course, about the basic phenomenon described by William
James, to no avail, 100 years ago: variable means achieving constant ends.
This observation was ignored, leaving the way open to behaviorism and to
Pavlov's concept of reinforcement.

Best,

Bill P.

[Hans Blom, 970915b]

(Bill Powers (970912.0400 MDT))

Some ideas are just so simple that there doesn't seem to be any way
to express them.

Ideas only _appear_ simple. No idea truly is. Explanations have no
bottom, regrettably.

First, what is a contingency? It's a relation among physical
variables and nothing more. Every natural law is a contingency.
Every dependency of one environmental variable on another one is a
contingency. It makes absolutely no difference whether an organism's
behavior or something else affects the independent part of the
contingency, and it makes absolutely no difference whether the
dependent part affects, is perceived by, or is otherwise associated
with an organism.

All that a contingency says is that if variable B is to change, one
or more of the variables (A1, A2... An) on which it depends must
change. This is a recognition that in the physical world, nothing
happens spontaneously. Nothing moves of itself. Any physical
variable is a function of other physical variables: B = f(A1, A2,
... An).

Strange. When I offered this same argument in the past -- that the
world shows law-like behaviors -- you rejected it, with the argument
that disturbances -- unlawlike behaviors -- are overwhelmingly the
rule. Now what is it, Bill? Are organisms controllers because they
effectively "reject" disturbances, or because they learn the laws of
the world? Or do the two go hand in glove?

Right, contingencies are relations. That is simple. But how come that
rats can get to know -- and use -- only some contingencies and not
others? Why is that different for pigeons or for humans? What are the
limitations, why do they exist, and why are they so strongly species-
specific? That's _not_ simple. Given the fact that we humans can
perceive a lot of contingencies that rats and pigeons cannot, one may
extrapolate and wonder how limited we humans are in this respect, and
how much lawfulness there is out there where we only believe we see
"disturbances". Why did it take millennia before someone discovered
the relation F=ma which we now find so obvious?

Suppose we change the contingency: now a _different_ behavior will
produce the _same_ consequence. We observe the same thing: behavior
will change until it produces the same consequence as it did before.
What we usually find is that whatever behavior will produce certain
consequences, that behavior will eventually be produced.

You skip a step. When the contingency is changed, the first thing
that is observed (both by the rat and the external observer) is that
the _same_ behavior produces a _different_ consequence. It must be
this perception (of failure) that prompts the animal to try different
behavior. Until -- at long last, after much exploration, maybe -- it
discovers some behavior that realizes the original desire. Or not: if
pushing the lever does not produce pellets anymore, no adequate
behavior is left, no control is possible any more. Discovering the
contingency precedes its use. Learning precedes using the knowledge
that is learned. Simple?

Putting all these observations together, we can arrive at a clear
conclusion: the consequence produced by a given behavior has no
particular amount or direction of effect on behavior. We can show
that the same consequence will be produced by one behavior as by a
different one, and that it may even be produced by one behavior or
its opposite.

But what _is_ constant? Two things. First, the rat will (within its
limitations) learn the contingency. Second: the rat will use its
knowledge of that contingency in "computing" its actions. If the
physical contingency is broken, the rat will soon learn that its
lever pushing actions are ineffective. As a result, it will stop
acting (this way). In your analysis, the lack of discrimination
between learning (getting to know the environment function) and
acting (using the acquired knowledge of the environment function) is
a great source of confusion.

Since the production of the consequence remains the same over a set
of different behaviors and contingencies, its occurrence cannot
explain _any_ of the behaviors that produce it.

You oversimplify things. Your analysis presupposes both that _some_
alternative way to achieve the goal is available, and that this
alternative can be discovered.

If taken as a cause of one behavior that does occur, it can't also
be taken as the cause of a different behavior that didn't occur this
time but did occur at another time, or as the cause of a behavior
directed oppositely, which also didn't occur. Its causal properties
would have to change with each change of contingency, so as to cause
exactly the behavior needed to produce itself, no matter what that
behavior might be.

Do you see now how "perception of a relation" -- and its deployment
in the control loop -- solves this apparent paradox?

This renders the concept of the consequence as a cause vacuous. It
is a cause that has the property of being able to cause whatever
effect is observed, and that means it is not a cause at all. What we
actually observe is that behavior changes in such a way as to keep
producing the same consequence, whenever a change in environmental
properties occurs.

You miss something. An analysis of what _does_ remain constant (or
gets "strengthened") might help...

Greetings,

Hans

[From Rick Marken (970915.0820)]

Hans Blom (970915) --

A loop gain of 1000 hardly ever occurs in practice

I'd guess that the loop gain of the typical biological control
system is in the 10^3 to 10^6 range. In our tracking experiments
the disturbance has only a fraction of its expected effect,
suggesting a loop gain of at least 10^3.

I've only encountered such high loop gains in simulations ;-).

You should try observing nature;-)

if one considers learning, that in the case of a high gain control
system (p = 0.999 r + 0.001 d) there is hardly any perception of the
disturbance.

There is _never_ any perception of the disturbance by the control
system that is controlling the "disturbed" perception; all the control
system perceives is the combined result of disturbances and its
own output. This is old, old news, Hans.
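The numbers in this exchange follow from the closed-loop algebra. Assuming a simple proportional loop with perception p = o + d and output o = G(r - p), the steady state is p = (G/(1+G))r + (1/(1+G))d; with G = 1000 this is Hans's p = 0.999r + 0.001d. A sketch (the loop form is my assumption; real tracking loops also have dynamics):

```python
def steady_state_perception(G, r, d):
    """Closed-loop steady state of p = o + d with o = G * (r - p):
    p = (G / (1 + G)) * r + (1 / (1 + G)) * d."""
    return (G / (1 + G)) * r + (1 / (1 + G)) * d

# With loop gain 1000, a disturbance of 5 shifts the perception from the
# reference by only about 0.005 -- a fraction 1/(1+G) of its open-loop effect.
print(steady_state_perception(1000, 10.0, 5.0))
```

This is why a disturbance having "only a fraction of its expected effect" in a tracking experiment implies a loop gain on the order of the reciprocal of that fraction.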

Hans Blom (970915b) --

Are organisms controllers because they effectively "reject"
disturbances, or because they learn the laws of the world? Or
do the two go hand in glove?

Organisms are controllers because they exist in a closed-loop,
negative feedback relationship with respect to their own sensory
input. If they were not controllers they would not have been
able to maintain themselves as organisms; they would just be
dust (which is what happens to collections of matter that are
organized "like a rock" rather than like a control system).

Because organisms are controllers they automatically act to resist
disturbances to the variables they are controlling. We know that
they can do this controlling without learning the laws of the
world because we can build artificial systems that control without
knowing anything about the laws of the world. Nevertheless, it is
possible that living systems do learn the laws of the world in
order to control, but, so far, experimental tests (see
http://home.earthlink.net/~rmarken/ControlDemo/OpenLoop.html) reject
this possibility.
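The claim that a system can control without knowing the laws of the world can be shown with a toy loop. In this sketch (the environment function, gains, and values are mine, purely illustrative), the controller acts only on its error signal; it contains no representation of `env` at all, yet the perception converges on the reference:

```python
def env(o, d):
    # Unknown to the controller: some environmental law linking output
    # to input, plus an additive disturbance.
    return 0.5 * o + d

def control(r, d, steps=2000, slow=0.1):
    """Error-driven controller with no model of the environment."""
    o = 0.0
    for _ in range(steps):
        p = env(o, d)          # perceive the result
        o += slow * (r - p)    # act on the error only
    return p

print(control(r=5.0, d=2.0))   # perception settles at the reference, 5.0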

Bill Powers said:

Since the production of the consequence remains the same over a set
of different behaviors and contingencies, its occurrence cannot
explain _any_ of the behaviors that produce it.

Hans Blom replies:

You oversimplify things. Your analysis presupposes both that
_some_ alternative way to achieve the goal is available, and
that this alternative can be discovered... Do you see now how
"perception of a relation" -- and its deployment in the control
loop -- solves this apparent paradox?

I don't. Could you give the solution, please?

Best

Rick


--
Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken

i.kurtzer (970915)

[Hans Blom, 970915b]

(Bill Powers (970912.0400 MDT))

Bill:

All that a contingency says is that if variable B is to change, one
or more of the variables (A1, A2... An) on which it depends must
change. This is a recognition that in the physical world, nothing
happens spontaneously. Nothing moves of itself. Any physical
variable is a function of other physical variables: B = f(A1, A2,

Hans:

Strange. When I offered this same argument in the past -- that the world
shows law-like behaviors -- you rejected it, with the argument that
disturbances -- unlawlike behaviors -- are overwhelmingly the rule. Now
what is it, Bill? Are organisms controllers because they effectively
"reject" disturbances, or because they learn the laws of the world? Or do
the two go hand in glove?

Are you suggesting that a monkey knows about tensile strength as it swings
on a vine? Also, controllers do not "reject", effectively or not,
disturbances.

In your analysis, the lack of discrimination
between learning (getting to know the environment function) and
acting (using the acquired knowledge of the environment function) is
a great source of confusion.

how did you just define learning as "getting to know the environmental
function"; did i miss something? was that a conclusion or a premise?

i.

[Hans Blom, 970916d]

(i.kurtzer (970915))

Strange. When I offered this same argument in the past -- that
the world shows law-like behaviors -- you rejected it, with the
argument that disturbances -- unlawlike behaviors -- are
overwhelmingly the rule. Now what is it, Bill? Are organisms
controllers because they effectively "reject" disturbances, or
because they learn the laws of the world? Or do the two go hand
in glove?

Are you suggesting that a monkey knows about tensile strength as it
swings on a vine?

Certainly. Although it may not know the (human) concept of tensile
strength, it knows enough _about_ the tensile strengths of vines to
be able to swing. When you swung on vines (or ropes) as a kid, did you
know the concept "tensile strength"? No, but you did acquire the
knowledge (which you would find very hard to express in words) of which
ones would most likely support you and which ones would not. Would
you maintain that it is important to know the concept of "tensile
strength" in order to swing on a vine? Do you need to know how a car
engine works before you drive a car? Do you need to know how the
light bulb lights up before you flip the switch? Being able to _use_
a relationship does not imply being able to consciously "understand"
or "explain" it.

Also, controllers do not "reject", effectively or not,
disturbances.

But you do understand what I intend to get across when I use these
words?

In your analysis, the lack of discrimination between learning
(getting to know the environment function) and acting (using the
acquired knowledge of the environment function) is a great source
of confusion.

how did you just define learning as "getting to know the
environmental function"; did i miss something? was that a
conclusion or a premise?

Hey, that would make an interesting discussion! Yes, you did miss
something: I've discussed extensively how a controller is only able
to control relative to the environment in which it finds itself. It
is the _whole_ loop, for instance, that determines what the loop gain
is. A controller cannot arbitrarily decide what its gain should be.
Suppose it picks the value 1000 but then it happens to find itself in
an environment whose gain is 0.000001: we wouldn't consider the
controller to be a controller at all...

But back to your remark. First it was a conclusion. Simply replace
the word "conclusion" by "higher level perception" and we're back to
a PCT discussion. Then I recognized the reliability of this
conclusion and I started to _use_ it -- as a premise, you could say
-- for the "computation" of my actions, e.g. in the design of control
systems. And with some success, I might say. But then, that "success"
might appear arbitrary to you, because it would be success relative
to _my_ goals, not yours.

"It's all perception", some say. I extend this -- except, maybe, for
the lowest levels of the HPCT hierarchy -- to "it's all conclusion".
Would that be wrong?

Greetings,

Hans

[From Bill Powers (970917.1535 MDT)]
Hans Blom, 970915b--

All that a contingency says is that if variable B is to change, one
or more of the variables (A1, A2... An) on which it depends must
change. This is a recognition that in the physical world, nothing
happens spontaneously. Nothing moves of itself. Any physical
variable is a function of other physical variables: B = f(A1, A2,
... An).

Strange. When I offered this same argument in the past -- that the
world shows law-like behaviors -- you rejected it, with the argument
that disturbances -- unlawlike behaviors -- are overwhelmingly the
rule.

Disturbances have nothing to do with contingencies or laws: they are
extraneous influences added to the consequences of behavior. We measure
laws under special conditions in which all extraneous influences have been
excluded, to the best of our ability. Outside the laboratory, no such
special conditions exist. However, once we are aware that disturbances do
exist, we can measure them, too, and see that a control system resists
their effects even without being able to sense or model them. We can then
see that most of behavior consists of adjusting outputs to oppose the
effects of unanticipated disturbances.

Now what is it, Bill? Are organisms controllers because they
effectively "reject" disturbances, or because they learn the laws of
the world? Or do the two go hand in glove?

I have not become more tolerant of this mode of argument than I was before.
I don't want to get involved.

Best,

Bill P.

[Hans Blom, 970918b]

(Bill Powers (970917.1535 MDT))

Disturbances have nothing to do with contingencies or laws: they are
extraneous influences added to the consequences of behavior.

Ah, I see: you take disturbances as being fundamentally additive,
whereas I prefer to not introduce this limitation as an a priori. You
say, for instance, that

  perception = f (action) + disturbance

whereas I pose that the more general model would be

  perception = f (action, disturbance)

Sometimes I even favor the model

  perception = f-disturbance (action)

where f-disturbance is a function (think of a function as a mapping)
with a random component. The latter two models are mathematically
equivalent; only the notation differs.

The former (and yours) allows the interpretation that the two terms
represent the two "causes" for my perceptions, (1) my actions, i.e.
me as actor, and (2) unpredictable vagaries, modeled as a second
"actor" on the world ("the hand of God"? "fate"?). The latter allows
the interpretation that it is only my actions that determine what I
perceive, but through a badly predictable or partly indeterministic
world; with the implication that if only f-disturbance could be
better known, I'd be in a situation with less noise, a higher signal-
to-noise ratio, and better control.

Psychologically speaking, I find the second interpretation far more
empowering -- and in line with the equally empowering view that
organisms are controllers ("you are in control!").

Moreover, mathematics shows that your model is the first order
(series expansion) approximation of mine. If higher orders can be
neglected (which we often do, for Occam's reasons of convenience and
simplicity), all three models are equivalent.
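The series-expansion claim can be made explicit. A sketch of the step Hans alludes to (my reconstruction, assuming f is smooth in d and expanding around d = 0):

```latex
p = f(a, d)
  = f(a, 0) + d \left.\frac{\partial f}{\partial d}\right|_{d=0} + O(d^2)
  \approx f(a) + k\,d,
\qquad k \equiv \left.\frac{\partial f}{\partial d}\right|_{d=0}
```

Absorbing the constant k into the disturbance variable recovers the additive form perception = f(action) + disturbance, which is the sense in which the additive model is a first-order approximation of the general one.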

Ah, models. Math says they're equivalent, yet interpretations may
vary widely. Isn't that strange?

Now what is it, Bill? Are organisms controllers because they
effectively "reject" disturbances, or because they learn the laws
of the world? Or do the two go hand in glove?

I have not become more tolerant of this mode of argument than I was
before. I don't want to get involved.

May I remind you -- once more -- that it is _the whole loop_ which
determines the characteristics of the control system? The environment
contributes just as much to the most basic properties of the control
system (e.g. loop gain, phase margin, stability, quality of control)
as the controller. Either we take it as an a priori given (a divine
miracle?) that organisms/controllers/humans just happen to be adapted
to their environment or we study the ways in which this adaptation
arose (in a species) and arises (in each individual).

But that is just _my_ opinion. If you're not involved, others will do
the job.

Greetings,

Hans

[From Bill Powers (970918.0837 MDT)]

Hans Blom, 970918b--

Ah, I see: you take disturbances as being fundamentally additive,
whereas I prefer to not introduce this limitation as an a priori.

Yes, additive disturbances are one kind that is very common in real
environments. They arise when the controlled variable is directly affected
by influences in the environment other than the actions of the control
system itself. You treat such additive disturbances as "noise," as if
nothing can be done about them. But within the bandwidth of control, a
control system of the PCT type can counteract the effects of additive
disturbances; only the frequency components above the bandwidth of control
remain unopposed. Even if the disturbance is "random," the components
within the bandwidth of control will be opposed by corresponding
fluctuations in the control system's action, so the lower-frequency effects
on the controlled variable will be suppressed.
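The bandwidth point can be illustrated numerically. In this sketch (the integrating controller, gain, time step, and test frequencies are my choices, not from the posts), the loop nearly cancels a sinusoidal disturbance well below its crossover frequency (roughly gain/2π ≈ 8 Hz here) while a disturbance far above that frequency passes through almost unopposed:

```python
import math

def simulate(freq_hz, gain=50.0, dt=0.001, t_end=10.0):
    """Peak |perception| while controlling to reference 0 against a
    unit sinusoidal disturbance at freq_hz, using an integrating output."""
    o = 0.0
    peak = 0.0
    for i in range(int(t_end / dt)):
        t = i * dt
        d = math.sin(2 * math.pi * freq_hz * t)
        p = o + d                   # perception = output + disturbance
        o += gain * (0.0 - p) * dt  # integrate the error
        if t > 1.0:                 # ignore the initial transient
            peak = max(peak, abs(p))
    return peak

slow = simulate(0.1)    # well inside the control bandwidth
fast = simulate(100.0)  # far above it
print(slow, fast)       # slow disturbance heavily attenuated; fast one not
```

Only the low-frequency components of the disturbance show up in the output fluctuations that oppose them; the high-frequency residue appears in the controlled variable, which is the "remainder" Bill refers to below for 1/f disturbances.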

By the way, don't tell me that _your_ model-based control system _can_ do
this. I have determined that your program actually relied on real-time
error information in opposing the disturbance; it was actually a
closed-loop PCT control system and it always needed real-time perceptions
to oppose the disturbance. If the perceptions were lost, your system would
instantly lose the ability to oppose arbitrary additive disturbances.

The other kind of disturbance is one that affects the form of the
environmental feedback function, which you call "the plant." To compensate
for this kind of disturbance, a model-based control system must adapt its
internal model to reduce the difference between the predicted and actual
perceptions. If any parameter of the plant is noisy, the effects of the
noise can be opposed only up to the bandwidth of the adaptation process. In
the PCT model, the effects of noisy parameters in the environmental
feedback function can be opposed up to the full bandwidth of control, if
the noise amplitude is not so great that it grossly affects the loop gain.

If the adaptation process is made to operate very fast, so as to match the
performance of a PCT model, the main reason for using model-based control
is lost: the ability to go on producing control-like outputs after the
real-time input is cut off. The system can't distinguish between a
disturbance and a total loss of input, and it will try to adapt as if a
parameter had changed, with the same undesirable effects that we see in a
PCT system when the input is cut off.

... the two terms
represent the two "causes" for my perceptions, (1) my actions, i.e.
me as actor, and (2) unpredictable vagaries, modeled as a second
"actor" on the world ("the hand of God"? "fate"?). The latter allows
the interpretation that it is only my actions that determine what I
perceive, but through a badly predictable or partly indeterministic
world; with the implication that if only f-disturbance could be
better known, I'd be in a situation with less noise, a higher signal-
to-noise ratio, and better control.

The PCT model can oppose the effects of even "hand-of-God" disturbances,
whether additive or multiplicative, up to the high-frequency limit of
control. Fortunately, in the real world, random effects tend to be of the
1/f sort; the higher the frequency, the smaller the effect of the
disturbance. The bandwidth of control in real organisms seems to be such
that all the large fluctuations that have effects important to the
organism are cancelled out by control action; the remainder have effects
too small to be of importance. There are exceptions, of course.

Moreover, mathematics shows that your model is the first order
(series expansion) approximation of mine. If higher orders can be
neglected (which we often do, for Occam's reasons of convenience and
simplicity), all three models are equivalent.

Except that in order to control in the absence of input for any appreciable
length of time, the model-based system must adapt correspondingly slowly,
so it loses the advantage of being able to counteract arbitrary
disturbances within the bandwidth of control. The PCT model then becomes
the model of choice, because it can nullify the effects of disturbances
without having to predict them.

Now what is it, Bill? Are organisms controllers because they
effectively "reject" disturbances, or because they learn the laws
of the world? Or do the two go hand in glove?

No, organisms do not, in general, and in ways other than metaphorically,
"learn the laws of the world." I have tried to point out that a PCT system
contains no literal model of the world, yet it controls perfectly well. Its
output function can be very simple, yet it can control very complex
processes. Your response has consistently been that if the PCT model does
control, it MUST contain something that can be SPOKEN OF as a model of the
world. This shows only that you have difficulty in distinguishing premises
from conclusions. Your model-based control examples have all contained a
literal simulation of the properties of the world lying between the
system's output and its input. That is, they contain a literal model of the
world. But the PCT model accomplishes the same degree of control without
any such simulation. If you want to say that "absence of a simulation" is,
through some rationalization, the same as "presence of a simulation," feel
free -- but don't expect me to accept the argument.

One last thing. I do wish that you'd give up your idea that the PCT model
is not concerned with adaptation or learning. It is. It's just that
adaptation and learning are dealt with as a separate problem from the
problem of describing and predicting the performance of a system when
adaptation and learning are complete. In the PCT model, the reorganizing
processes are attributed to a distinct set of functions that act ON a
control system. And it is recognized that most such adaptive processes take
place on a time scale that is much slower than the time scale for
performance of control.

Your conception tries to combine the performance and adaptation aspects
into a single system architecture, by using _one particular method_ of
adaptation, a very computation-intensive method. It's a clever method, but
it has drawbacks and it requires computations that are extremely unlikely
to exist anywhere but at the higher levels of control. It certainly can't
explain the way the spinal reflexes perform or adapt. You're confusing a
clever invention that has certain nice features with a model of a real
organism.

Best,

Bill P.

[Hans Blom, 970923b]

(Bill Powers (970918.0837 MDT))

Thanks for this post. It shows up a number of misunderstandings that
may need to be rectified. In a later post I'll return to the theme of
implicit versus explicit models.

... additive disturbances are one kind that is very common in real
environments.

I just wanted to remind you that the "additivity" of disturbances is
a _model_, a way of perceiving (and, in the process, simplifying) the
"world". It is both very common and very useful. If possible at all
(sometimes it is not), I employ this simplification as well.

You treat such additive disturbances as "noise," as if nothing can
be done about them. But within the bandwidth of control, a control
system of the PCT type can counteract the effects of additive
disturbances; only the frequency components above the bandwidth of
control remain unopposed.

We basically agree, even though we have a different "explanation".
Where we differ is that you prefer to talk in terms of the _frequency
domain_ and thus the _frequency response_ of a controller (where
bandwidth is a central notion) and that I prefer to talk in terms of
the _time domain_ and thus the _impulse response_ or _step
response_ of that controller (where bandwidth is not a meaningful
notion but settling time, for instance, is). Both notions are
mathematically equivalent: they are just transformations of each
other. So neither is "more correct" than the other; they just employ
a different terminology. Where you say "within the bandwidth of
control, a control system can counteract the effects of disturbances"
I would say "a control system can effectively counteract slowly
varying but not rapidly varying disturbances". But that is
essentially the same thing, looked at from different angles.
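
The equivalence being claimed here can be checked numerically for the simplest case (a toy integral controller with a hypothetical gain k, not either author's actual model): the 2% settling time after a step disturbance is fixed by the very same gain that sets the bandwidth k/2π, so the two specifications are two descriptions of one quantity.

```python
import math

k, dt = 50.0, 1e-4           # integral gain (rad/s) and time step
output, d, ref = 0.0, 1.0, 0.0
t, settle = 0.0, None
while t < 0.5:
    qi = output + d          # unit step disturbance applied at t = 0
    output += k * (ref - qi) * dt
    t += dt
    if settle is None and abs(qi - ref) < 0.02:
        settle = t           # measured 2% settling time

# Analytically qi(t) = exp(-k*t), so the 2% settling time is ln(50)/k,
# while the control bandwidth is k/(2*pi): same parameter, two views.
predicted = math.log(50) / k
print(settle, predicted)     # both come out near 0.078 s
```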

But note also, that one formalism can express some "intuitive truths"
much more easily than the other -- and the other way around. Why do I
pester you so much with the formalism that you're less acquainted
with rather than employing yours? Well, just for this reason. Using
yours, for instance, you see no "model" inside a PCT controller
except, maybe, in the most metaphorical manner. Using mine, however,
I insist that there must be one, however hidden. Mathematically,
there would be no "conflict"; just a very complex expression versus
another, very simple one. That, I guess, is the reason why I'm not a
single paradigm person...

Even if the disturbance is "random," the components within the
bandwidth of control will be opposed by corresponding fluctuations
in the control system's action, so the lower-frequency effects on
the controlled variable will be suppressed.

We also saw this in my theodolite demo, in quite an analogous fashion. An
immediate (very high frequency) change of the disturbance could not
be counteracted, but when the disturbance remained constant (_very_
low frequency), it was fully suppressed.

There is a basic difference, however, between controllers with
different criteria of what is "good" control. A frequency domain
specification says that control must be accurate within a certain
control bandwidth. A time domain specification may say, for instance,
that a step change of the disturbance must be counteracted as well as
possible. These are different criteria, and they will most probably
lead to different designs.

"But organisms were not designed", you might say. "But they were", I
might counter, "implicitly. Organisms that weren't sufficiently able
to cope with their environment just don't exist anymore. Those that
are still alive were effectively designed by their environment". And
who would be "right"?

By the way, don't tell me that _your_ model-based control system
_can_ do this [counteract a rapid disturbance, e.g. a step change].

Sure it can. Remember that it models the _impulse or step response_.
If it discovers a step-wise change of the disturbance, it "knows"
(through its model -- which may be more or less accurate) what the
future perceptual effects of this step-wise change will be, if
unopposed. And it is therefore able to oppose them with the right
countermeasures immediately after the step change has been perceived
-- which is, of course, always after the fact (if only one sample
period).
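
Hans's claim can be sketched as a toy discrete-time example (with a hypothetical, perfectly identified one-sample plant model, which is the best case for his argument): the step shows up in the perception for exactly one sample, after which the updated disturbance estimate cancels it.

```python
def model_based(d_steps, plant_gain=1.0):
    """Plant: x = plant_gain * u + d.  The internal model assumes the
    same gain (a perfectly identified model, the best case) plus a
    running estimate d_hat of the additive disturbance."""
    d_hat, u, trace = 0.0, 0.0, []
    for d in d_steps:
        x = plant_gain * u + d            # actual perception
        x_pred = plant_gain * u + d_hat   # model's prediction
        d_hat += x - x_pred               # attribute the surprise to d
        u = -d_hat / plant_gain           # next action cancels estimated d
        trace.append(x)
    return trace

# A disturbance that steps from 0 to 5 at the fourth sample:
out = model_based([0, 0, 0, 5, 5, 5, 5])
print(out)  # -> [0, 0, 0, 5, 0, 0, 0]: error lasts exactly one sample
```

Note that this still concedes Bill's point: the correction is issued only after the step has been perceived, one sample period "after the fact."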

I have determined that your program actually relied on real-time
error information in opposing the disturbance; it was actually a
closed-loop PCT control system and it always needed real-time
perceptions to oppose the disturbance. If the perceptions were lost,
your system would instantly lose the ability to oppose arbitrary
additive disturbances.

Of course. And, if you remember, I showed exactly that in a demo.
What you say is equivalent to saying that with eyes closed we don't
see. What you say is true for _any_ controller, whatever its type.
What is a PCT controller without its perception? Note that _being
without a perception_ is NOT the same as _having a perception of
value zero_ (a matter of great misunderstanding).

If the adaptation process is made to operate very fast, so as to
match the performance of a PCT model, the main reason for using
model-based control is lost: the ability to go on producing
control-like outputs after the real-time input is cut off.

Please enlighten me: how could there possibly be _any_ (let alone
"very fast") adaptation when the real-time input is cut off? When
there can be no adaptation/learning, we effectively use only what we
learned in the past, although we also ought to be less certain about
what we (think we) know. That's what an adaptive controller does.
That's also a pretty accurate description of someone returning to a
city that he hasn't seen or heard about for a long time: expect the
same, but don't be too certain about it...

By the way, how does the PCT model handle (un)certainty?

The system can't distinguish between a disturbance and a total loss
of input...

A system _ought to be able to_ make this distinction. An adaptive
controller must, but in practice the distinction is often hard-wired
in. Just like in us humans, who just seem to know that with our eyes
closed we don't obtain (much) information about the world out there.
There are, however, methods to implement this as a separate and extra
mechanism in adaptive controllers. Kludges, mostly, so far; no good
hard theory is available yet.

The PCT model can oppose the effects of even "hand-of-God"
disturbances, whether additive or multiplicative, up to the
high-frequency limit of control.

In the time-domain description, one would say that the "hand-of-God"
is pretty predictable (slow?).

Fortunately, in the real world, random effects tend to be of the
1/f sort; the higher the frequency, the smaller the effect of the
disturbance.

Yes, that helps to better predict things ;-).

... in order to control in the absence of input for any appreciable
length of time, the model-based system must adapt correspondingly
slowly, so it loses the advantage of being able to counteract
arbitrary disturbances within the bandwidth of control. The PCT
model then becomes the model of choice, because it can nullify the
effects of disturbances without having to predict them.

The PCT model becomes the model of choice _in the absence of input
for any appreciable length of time_? I'm flabbergasted: the PCT model
_cannot_. It always depends on input being there and being compared
to the reference. If there is no input (NOT input = 0!), there can be
no PCT control. At least not until you've actually implemented the
imagination mode!

No, organisms do not, in general, and in ways other than
metaphorically, "learn the laws of the world." I have tried to point
out that a PCT system contains no literal model of the world, yet it
controls perfectly well.

And I've tried to point out that even a PCT system _must_ contain
some "model" of the world, in whatever form. I don't care whether
that model is "literal" or not, whatever you mean by that.

Finally, since I recognize that _any_ model is a metaphor, I fully
agree with your sentiment that organisms only metaphorically "learn
the laws of the world". Agreement at last!

Greetings,

Hans

[From Bill Powers (941022.0815 MDT)]

Bruce Abbott (941020.1815) --

Demonstrating control over behavior in EAB means identifying variables
that, when manipulated, can be shown to alter the behavior in
predictable ways.

PCT says there are such variables and that by manipulating them one
can control behavior. The two classes of variables that can be used to
control behavior are the feedback function (as in the schedule of
reinforcement) and disturbances (not much used in EAB). By altering the
form of the feedback function you force the organism to alter its
behavior in order to maintain the former effect on its inputs; by
applying disturbances directly to the controlled inputs (for example by
arbitrarily adding or removing rewards), you again force a change in
behavior for the same reason. In all cases, the organism is organized to
maintain its own inputs at preferred levels; the behavior itself is
varied as required to do this. That is why the behavior can be
controlled by outside agencies; normally, the organism is not
controlling its behavior but varying it as necessary to control the
consequences of the behavior.
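
The point can be sketched numerically (a toy discrete loop with an arbitrary gain and reference, not a model of any particular experiment): whatever disturbance the experimenter applies to the controlled input, the organism's settled behavior is the mirror image of that disturbance about the reference, so the experimenter controls the behavior by choosing the disturbance.

```python
def organism(disturbances, k=0.5, ref=10.0):
    """Toy control loop: the organism varies its behavior b so that its
    input qi = b + d stays at the reference.  Returns the settled
    behavior observed for each disturbance value the experimenter sets."""
    b, record = 0.0, []
    for d in disturbances:
        for _ in range(200):          # let the loop settle at each d
            qi = b + d
            b += k * (ref - qi)
        record.append(b)
    return record

# The experimenter manipulates d and thereby determines the behavior:
print(organism([0.0, 4.0, -3.0]))  # settles near [10.0, 6.0, 13.0]
```

The behavior is fully predictable from the disturbance (b = ref - d), yet the organism is controlling its input, not its behavior, which is exactly why the behavior is controllable from outside.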

Usually the experimenter does not view himself or herself as
controlling; rather it is the environment that does the controlling.

This is a matter of language. In PCT, we say that controlling is a
process of varying actions to produce a preselected perceptual (and
actual) result. So if the experimenter manipulates variables and sees
that they produce a predicted result, this is control. Of course control
is best seen in the presence of disturbances. If some extraneous factor
like a flapping curtain or sticking of the key disturbs the behavior
that the experimenter is predicting, so it doesn't fit the prediction,
the experimenter will do something to oppose the effects of the
disturbance -- remove the cause, shield the organism from the effects,
or what have you. The experimenter will be satisfied when the observed
behavior again matches the reference level, the predicted behavior.

As Rick Marken points out, for the environment to "control" the
behavior, it would have to be pretty smart (as we use the term control
in PCT). Control is varying an action until its consequences are in, or
return to, a desired state. The controlling that the experimenter does
is mostly finished by the time the final experiment is carried out for
publication. The schedules of reinforcement have been chosen and tweaked
and the amount and quality of the reinforcers have been adjusted to
bring out the desired features of behavior. The animal has been brought
to the proper body weight by adjusting its diet outside the experimental
apparatus, and of course the animal has been placed in the apparatus.
Usually, as I understand it, there has been a period of "shaping" during
which the experimenter leads the organism into performing the behavior
that the experimenter wants to see.

Experimenter control is illustrated by your remarks to Tom about the
COD:

...the changeover delay prevents the pigeon from adopting a simple
strategy for earning reward from the two keys: simple alternation. On
concurrent VI-VI schedules the pigeon can maximize the rate of reward
only by responding on both keys. When it responds on one key the
schedule programmed on the other key just keeps running, so it becomes
increasingly likely as time passes that a reinforcer will have been set
up on that key. In reinforcement-theory terms, because switching keys
produces (in the absence of a COD) immediate reinforcement fairly
often, switching behavior is itself reinforced by the immediate payoff.
Imposing the COD prevents immediate reinforcement following a switch
and prevents reinforcement of simple alternation between the keys (peck
left, peck right, peck left, etc.). Because switching guarantees no
reinforcement for the period of the COD, it discourages pigeons from
switching as often as they otherwise would.

This is clearly control behavior on the part of the experimenter (as
well as the pigeon). For some reason (which perhaps you can explain),
the experimenter did not want to see the pigeon using a simple
alternation as a way of providing rewards for itself. In order to
prevent this strategy from succeeding, so that some other strategy would
have to be used, the experimenter altered the conditions until the
"simple alternation" would not work any more. So the experimenter had
the goal of doing away with that kind of behavior, and by manipulating
the conditions achieved it. That is certainly control in the PCT sense,
and it was certainly done by the experimenter, not the environment.

This is also a clear illustration of conflict between two control
systems. The experimenter wants to see a certain kind of behavior from
the pigeon and sets up a two-key situation which he thinks will produce
it. The experimenter does not care about the food rewards, and so is
willing to vary the experimental setup that produces them until the
desired behavior is observed (perceived). The pigeon, which does not
care how or where it pecks but cares very much about getting the food
reward, uses the keys to supply itself with the reward, eventually
discovering the (or a) most efficient method, which is to peck
alternately on the keys.

But this behavior does not match what the experimenter wanted to see --
perhaps it does not fit what the experimenter thinks of as "choice
behavior." So somehow the experimenter has to alter the setup to make
the simple alternation less effective. The experimenter reasons that if
a change-over delay is used, the pigeon will not be rewarded immediately
for pecking on the other key (that is, the pigeon's control of the
reward input will be made less effective), and perhaps the pigeon will
shift to some other strategy more like what the experimenter perceives
as making a choice. So the experimenter introduces the COD, and keeps
increasing its length until finally the pigeon does abandon simple
alternation. Now the experimenter sees what he wants to see, and the
pigeon finds a different action that will deliver what the pigeon wants,
the reward. However, the experimenter must be somewhat less than
satisfied because of having had to introduce an ad-hoc feature to the
experiment, and the pigeon is somewhat less than satisfied because more
effort is now required to obtain the same input.


--------------------------------

My guess is that it pecks on the key to which it has just switched:
there is not time to do much else (usually around 2 seconds) and the
COD contingency would work like a fixed-interval 2-s schedule on those
occasions when a reinforcer was already set up on the VI schedule
associated with the key.

The effect of varying the length of the COD has been investigated, and
there has even been some work trying to offer a unified account of
behavior on concurrent schedules with and without COD.

I have always been struck by the extreme complexity of EAB experiments.
What with mixes of schedules each with infinitely variable parameters,
adding CODs, adding arbitrary time-outs, varying the force-compliance
and direction of movement of keys and bars, changing the size and
gustatory qualities of rewards, introducing indirect effects (like the
changeover key), bringing in running wheels and access to mates,
adjusting the "richness" of the environment, and probably dozens more
adjustments of variables, I don't see how any coherent model of behavior
can be generated. If all these factors are important, then each one of
them must be explicitly varied in each experiment to see whether their
effects are interacting with the variables nominally being tested. If,
for example, reward size has been found to have an effect, then how can
you do an experiment in which the reward is "access" to food with no
control of how much food is ingested on each access, and indeed no
record being kept of the quantity ingested?

It seems to me that EAB researchers, like those in the rest of
psychology, don't take each others' findings very seriously. Each
investigator, or group, starts essentially from scratch, ignoring all
the variables that others have tested and not even reporting on their
status. What was the size of the cage and the lighting? Were there
potential mates or rivals within eye, ear, or nose range? How much work
was required to depress the key or bar? What was the nutritive content
of the food pellets actually swallowed? What was the free-feeding rate
of ingestion of the type of reward used for each individual? How were
the animals handled? Since all these factors have been studied, and many
more, and have been found to have enough effect to warrant a
publication, why were these factors not at least noted and recorded as
part of the experimental conditions? If they were not, are we to
conclude that these other effects have ceased to exist, or that the work
which established them has been discredited? Is it not possible that the
effect seen in a given experiment would disappear or change markedly if
some other factor, known but not accounted for, happened to change?

It's all much too complicated for my simple mind.
---------------------------------------------------------------------

Animals do not display matching under most two-key conditions, although
they tend to do well with VI-VI schedules.

Isn't this begging the question a bit? When you say they "tend to do
well" on VI-VI schedules, what this tells me is that they don't show
matching behavior -- some do, perhaps, but some don't, so the matching
law isn't a law at all. If you have a real law of behavior, it's got to
work all the time, with essentially all organisms to which it applies.
And why on earth would one talk of a "matching law" if "animals do not
display matching under most two-key conditions"? To me, that suggests
that this law has in fact been disproven: it's not a law. Why are we
even talking about it?
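
For what it's worth, strict matching does have an unambiguous quantitative form, which makes these questions answerable in principle: perfect matching means the relative response rate equals the relative reinforcement rate, and the deviation from that equality is a natural error measure. The numbers below are hypothetical session totals, not data from any study.

```python
def matching_error(b1, b2, r1, r2):
    """Deviation of the observed response proportion from the proportion
    the strict matching law predicts (0 = perfect matching).
    b = responses emitted on a key, r = reinforcers earned on it."""
    observed = b1 / (b1 + b2)    # relative response rate
    predicted = r1 / (r1 + r2)   # relative reinforcement rate
    return observed - predicted

# Hypothetical session totals for two keys:
print(matching_error(1500, 500, 60, 20))  # 0.75 - 0.75 = 0: perfect matching
print(matching_error(1900, 100, 60, 20))  # ~0.2: overmatching toward key 1
```

Note that exclusive preference (b2 = 0) gives an observed proportion of 1.0 regardless of the reinforcement proportion, so calling it "matching" requires exactly the kind of stretch the questions above are pressing on.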
----------------------------------------------------------------------

Judging from some films I've seen, pigeons working on these schedules
don't spend all their time pecking on the keys. They occasionally turn
away, perhaps to peck at the floor or as a way to space out responding.
On especially long intervals they may exhibit signs of emotionality--
flapping their wings briefly, for example. And they sometimes check out
the food magazine prematurely. Despite all this they manage to respond
at a fairly uniform rate on the keys and usually manage to collect most
reinforcers soon after they are set up.

-----------------------------------------------------------------------
I suspect that this accounts for the two curves we see on Fig. 7.18 in
Staddon's book. The curves show rate of responding versus rate of
rewards, for eight schedules. The two curves are very nearly congruent,
one being a magnified version of the other in BOTH dimensions. I beat my
brain to a pulp trying to find a model that would reproduce both curves
without requiring that most of the parameters be changed from one to the
other. Then I realized that if the animals were not spending all their
time pressing the bar, the differences in both dimensions would be
entirely accounted for without any change in the model.

The larger curve is for animals maintained at 80% of their free-feeding
body weight. This is like a 160-pound man being starved down to 128
pounds. My guess is that the animals would spend very little time away
from the bar and the feeding dipper. The smaller curve is for animals
maintained at 98% of their free-feeding weight. The curve has the same
shape, but the numbers on both axes are nearly halved. This is what you
would see if the animals really responded just as they do at 80% body
weight, but spent only half their time at the bar. The data are
presented in terms of total presses per session and total rewards per
session, so there is no way to tell what the pressing rates and reward
rates were while the animal was actually engaged in the task. This is
why I need Motherall's raw data. I doubt that it exists any more.

This guess is encouraged by the figure on the previous page, which also
has a large and a small curve of the same shapes, but in which the
difference in conditions was introduction of a running-wheel into the
cage.
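
The arithmetic behind this guess is simple (the rates below are hypothetical, chosen only for illustration): if the within-task pressing and reward rates are identical in the two conditions and only the fraction of the session spent at the bar differs, then both session totals scale by that fraction, shrinking the curve by the same factor on both axes.

```python
def session_totals(press_rate, reward_rate, session_min, frac_on_task):
    """Session totals when an animal presses at press_rate and collects
    rewards at reward_rate (both per minute) while engaged, but spends
    only frac_on_task of the session at the bar."""
    t = session_min * frac_on_task
    return press_rate * t, reward_rate * t

# Hypothetical within-task rates: 30 presses/min, 2 rewards/min, 60-min session.
hungry = session_totals(30.0, 2.0, 60.0, 1.0)  # 80% body weight: always at bar
sated = session_totals(30.0, 2.0, 60.0, 0.5)   # 98%: half the time at bar
print(hungry, sated)  # (1800, 120) vs (900, 60): same shape, halved both ways
```

This is why session-total data cannot distinguish "responding more slowly" from "responding the same but less often engaged," and why the raw event records are needed.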
-----------------------------
You say you don't work with pigeons. What _do_ you work with? Getting
the kind of data needed for PCT modeling from the literature has proven
a frustrating job; what's needed is the raw data, a record of every key-
press and every reward against time, as well as observations of what the
animal was really DOING all of the time. If you're doing experiments,
all this data must be available. Do you have anything (simple) we could
start with? I really hate measuring figures with a millimeter rule
trying to get the numbers, without knowing how accurately the points
were plotted and printed or what the data for the individuals were. From
my few results so far, I'm confident that we can come up with a PCT
model that will predict behavior quite well, but to do this we need to
know the raw data, not published summaries of averages. In the model
that fits Staddon's (Motherall's) data, I had to estimate each schedule
by finding the nearest whole number that would match the plotted (and
measured) rates of behaviors and rewards, and I get different numbers
each time I measure some of the rather tiny values from the position of
the points. So no matter how well the model fits, I don't trust it. You
must have some data we could work with, or perhaps you could even devote
some time to doing a real PCT experiment for a simple situation. What
about it?
-----------------------------------------------------------------------
Best,

Bill P.