reinforcement theory; Erling J. on Hans' model

[From Bill Powers (950609.0910 MDT)]

Bruce Abbott (950608.1930 EST) --

     I am claiming that current reinforcement [theory] provides a
     consistent and generally compelling framework for understanding
     behavior (which is why it has so many adherents) and is capable of
     handling the sort of data Bill P. asserts it cannot (the ratio
     data).

Compelling, yes, consistent within a narrow set of observations,
perhaps, but consistent with all we know, no.

Have you ever seen one of those lawn ornaments with a windmill on it,
and a little jointed man with his arms attached to a crank that turns
with the spinning of the windmill? It takes very little effort, and
initially, none, to see the little man as turning the crank to make the
windmill go. Based on appearances only, we could make a "compelling"
case for saying that the little man is turning the windmill, which is
causing the wind to blow. The little man turns the crank faster to make
more wind blow, turns it slower to make less wind, and stops to make the
wind stop.

Since that explanation seems to fit the facts as well as the other one,
why don't we believe it? Not because of what we observe the little man
doing, but because the explanation doesn't fit ALL the facts we know
about, or can find if we look. But first we have to be willing to look:
we must admit that seeing the little man as the cause is only one
possible interpretation and deliberately consider alternative
interpretations.


-----------------------------------
     I need to understand how the PCT model of the universe accounts for
     the apparent motions of the planets, even though those motions are
     only a side-effect of not being at the center of the universe.

Interesting point. The PCT model says that the only universe we know is
the one we perceive from our own point of view, and that this universe,
while probably related to a real universe, is not necessarily a literal
representation of the real one.

We observe planets apparently moving among the stars, with two of them
never getting very far from the sun, and three others following paths
that actually loop back on themselves every year, by different amounts.
If we simply believed our observations, we would have to come up with a
very complex explanation for these different irregular movements.
However, when we make a model of the physical solar system, looking for
underlying mechanisms that might produce these appearances, we come up
with a completely different picture that doesn't look at all like what
we see with our eyes. A few simple underlying principles such as the
inverse-square law of gravitation and conservation of angular momentum
lead to a model with planets in elliptical orbits, with us being on one
spinning planet and observing the others from a moving platform.

If we believe that the model gives a more correct picture than our
observations, we see that while the observations are consistent with the
model, certain of them are illusions -- the apparent yearly reversal of
movement of the outer planets, for example, is not real. The apparent
change of shape of the other planets, as well as our own Moon, is not
real. So we begin to understand that the world we perceive and control
may be quite different from the world that exists.

Glad you brought it up, although I may have taken the argument in a
different direction from what you had in mind.
-----------------------------------------------------------------------
Erling Jorgensen (950608.830 CDT) --

A lot of very interesting observations!

     ... imagination seems to work without much susceptibility to
     external disturbances. This seems to be one of the concerns both
     Rick and Bill have with your demo -- it doesn't always control very
     well. But if you're trying to model imagination, that isolation
     from certain disturbances could be a strength.

In B:CP I suggested that one use of imagination might be in feasibility
testing. A simple example is planning a chess move. In imagination you
can move any of the pieces any way you like, even the other player's
pieces. If you stick to the rules, you can try out different
developments from the current position without actually moving any real
pieces. The opponent never does anything unexpected, so there are no
disturbances.

But feasibility testing has another side. When you're planning some
outcome, the first thing you want to know about a plan is not whether
you could actually carry it out, but whether if you could achieve the
immediate result you have in mind, it would actually get you what you
want. Suppose your Rolex came off and skidded under a parked car. What
you want is to get the Rolex back. So in imagination, you use your left
hand to lift the front end of the car by its bumper, while reaching
underneath with your right hand to get the watch. Immediately, as you
try to imagine doing this, you realize that this plan won't work because
your arm isn't long enough to reach to where the watch is while you're
holding up the front end of the car. So in imagination, you move to the
side of the car and lift it by gripping it under the door. Now your arm
can reach the watch, and the plan is feasible.

So you can check out the feasibility of a plan in imagination, while
taking care of details just by imagining that they have happened. You
create a perception of the desired happening in imagination. Of course
when you try to execute the plan, controlling real-time perceptions, you
may find that something you have imagined as an essential step in the
plan can't actually be done: you can grip the car as imagined, but you
can't bring about the perception of holding the car up with your left
hand. So it's back to feasibility testing, this time taking more factors
into account.

     Doesn't a constructivist approach like PCT come very close to
     saying, our map (the one inside each one of us) _is_ our territory?
     Only a hypothetical (i.e., fictional!) outside observer who
     presumably sees both could say they are not the same.

It's interesting that while we can't verify beyond doubt that our maps
match the territory, we can often prove that they don't match it. When I
got across that 3000 volts, my map of the "off" transmitter was
immediately invalidated. But if my elbow had hit the other, grounded,
terminal of the capacitor I would have gone on assuming that the map was
correct.

     For another thing, imagination -- in addition to not being as
     susceptible to outside disturbances -- also seems to have this
     'full-correction-on-the-next-iteration' aspect. Isn't it the case
     that an imagined set of perceptions can be controlled (generated?)
     much quicker than the comparable "real world" perceptions?

This depends on your experience and knowledge. If I ask you to imagine
pushing a car with a dead battery into the garage where the recharger
is, do you imagine flicking the car into the garage, or do you imagine
leaning on it and pushing hard until it's moving, and then pulling back
on it to stop it before it hits the wall? How accurately we imagine
depends on how much detail we put into our mental models. This is often
the difference between the way the boss imagines carrying out a project,
and the way the engineers to whom he gives orders visualize it.

      Example: consider pretzel-neck giraffes. Until this moment you
     likely never thought about such animals, but probably you now have
     an imagined picture of a giraffe with its neck in the shape of a
     pretzel. It is still "perception", whether it comes via
     imagination, memory, or analog transforms of the environment.

A layman and a biologist would imagine this wonderful image quite
differently (I had to stop laughing before I could think about this).
The biologist knows that there are only seven long vertebrae in the neck
of the giraffe (the same number we have), so the "pretzel" he sees would
be angular, if not impossible because of the bones that would have to be
broken. If you know enough, there are certain things your brain has
difficulty imagining because the imagined perceptions create errors in
other control systems much as if they were real. Try to imagine sucking
on a stick coated with ants. Some would imagine this happily, others
wouldn't. I could have mentioned worse things.

     Is the greatly diminished richness of imagination due to poor
     mimicking of the perceptual input from lower-level systems?

I think it's due more to the level in the hierarchy where the
imagination connection is closed. You can remember _that_ you ate
breakfast without imagining the sights and tastes. If the imagination
connection is closed at a high level, the lower-level perceptions are
missing from the result, so you get an idea of something happening, but
it has no visual, tactile, auditory, etc. components. People are very
different in the level at which they imagine and remember. Bill Leach
reports being a non-visualizer -- except for one experience in which the
loop was closed at such a low level that he actually saw, felt, and so
forth. So Bill knows what it is like to have both kinds of imaginary
experiences. People who imagine ONLY at high levels often express
disbelief that anyone can actually re-experience the sensations involved
in a memory or an imagined scene.

Here's a puzzler, Bill: when you now recall that experience, do you
recall THAT you had it, or do you recall IT?

Incidentally, Hans asked

     Why necessarily an "external" helper? It might be an INTERNAL
     helper, one that we are genetically endowed with, or it might be a
     different internal system, either at a higher hierarchical level or
     running in parallel.

All I meant was that the helper is external to the model that was
offered. There might be another subsystem, not modeled, that took care
of the switch-throwing. The actual model offered, however, could not
carry out this vital operation unassisted.

Hans also commented,

     The parameters of the calibration curve can be thought of as
     "higher level perceptions".

And you replied,
     This sounds different. I thought the world-model and eventually
     the controller received a Kalman-filtered version of the y-
     perception, not a direct sensing of the parameters a, b, and c. Is
     the controller of your diagram (950503) structured differently from
     the comparator of a standard PCT control loop?

You have good modeling instincts. To "know about" parameters implies
_perceiving_ the parameters, and that requires perceptual functions
sensitive to parameters rather than signals. While such can be imagined,
they introduce a new mode of perception, perceiving _relationships among
signals_ as opposed to the signals themselves. Fortunately, that level
of perception is already in the HPCT model!

     From Bill's equations (950531.0845), I notice:
     >> u = (xopt - c - a*x) / b vs.
     >> o = o + Ko*(qopt - qi)

     Are these forms of the controller compatible?

I talked about this in a very long post, and it made the post so long I
deleted it. The answer is that these controllers are somewhat similar,
as a little algebraic manipulation shows:

u = (xopt - c - a*x)/b

u = (xopt - x + x - c - ax)/b

u = [(xopt - x) + (1-a)x - c]/b

If you add a modeled disturbance d(t), we get

u = (1/b)(xopt - x) + (1-a)x/b - c/b - d(t)/b

So we have a parallel with gain times the error signal (1/b)(xopt - x),
plus some other terms which would be unnecessary in a negative feedback
control system. Also, Hans' output function is not an integrator, so we
do not have u = u + (the rest) but just u = (the rest). The reason for
the form of Hans' model is that the calculation of u is precisely what
is needed to make x = xopt exactly, regardless of the values of the
parameters in the model and the magnitude of the modeled disturbance (if
any) or the constant c.
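That property can be checked numerically. The sketch below assumes a plant of the form x_next = a*x + b*u + c + d, which is the form implied by Hans's control law; the parameter values are made up purely for illustration:

```python
# Numerical check of the rearrangement above. The plant form
# x_next = a*x + b*u + c + d is inferred from Hans's control law;
# all parameter values here are hypothetical.
a, b, c = 0.8, 0.5, 1.2
d = 2.0           # modeled (constant) disturbance
xopt = 10.0       # reference value

def plant(x, u):
    return a * x + b * u + c + d

def u_direct(x):
    # Hans's form: algebraic inverse of the plant model
    return (xopt - c - a * x - d) / b

def u_rearranged(x):
    # Powers' rearrangement: gain times error, plus the extra terms
    return (1 / b) * (xopt - x) + (1 - a) * x / b - c / b - d / b

x = 3.0
assert abs(u_direct(x) - u_rearranged(x)) < 1e-9  # same output either way
x_next = plant(x, u_direct(x))
# x_next lands on xopt in a single step (up to floating-point rounding),
# regardless of a, b, c, or d -- no integration, no iteration.
```

This makes concrete why no integrator is needed in Hans's model: the output is computed so that x = xopt exactly on the next step.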

     Can the controller be standardized for both models?

No. In Hans's model there are added terms that do not depend on the
error (xopt - x). They are needed to make x = xopt under all conditions.

     Is there anything incompatible with inserting Kalman filters at
     multiple layers of a PCT hierarchy? Or would that obstruct
     standard PCT control (which corrects for errors _without_ knowing
     anything distinct about the disturbance)?

Good question. In the HPCT hierarchy, the properties of the world that a
Kalman filter would try to reflect would include the properties of the
lower-level control systems. Since those properties simplify the
interactions with the world considerably, the Kalman filter might also
prove to be much simpler when there is one filter per control system.
But somehow I think that the Kalman approach is too elaborate for the
requirements of an elementary control system in the hierarchy; there
must be a simpler way to achieve the same result.

Hans:

     You aren't getting this yet. A non-zero-average disturbance
     component is EASY to model and EASY to control.

You:
     But isn't the point that Bill's form of controller _doesn't have_
     to model the disturbances in order to control for them.

Hans is only saying that a _constant_ disturbance is easy to model. Just
add a constant to the Kalman filter, which it can do itself if you
provide a place to enter it. A constant disturbance, however, is a
relative idea: it might be constant on one time-scale but variable on
another. Actually the constant c, the white noise, and the arbitrary
low-frequency disturbance should just be lumped together into one
disturbance. They all boil down to a fluctuating value of an independent
variable that can affect the controlled variable independently of the
output of the control system. And as you say, "my" form of controller
doesn't have to model the disturbance to cancel its effects on the
controlled variable just as accurately as Hans's model can.

And before Hans objects to that, I should point out that if he can
postulate a physical system capable of creating outputs with infinite
precision in zero time, I am surely permitted to postulate a negative
feedback control system with infinite loop gain.
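As a minimal illustration of that point, with made-up numbers: a pure-integrator output function with finite (not even very high) loop gain drives the controlled variable to its reference against a constant disturbance that appears nowhere in the controller.

```python
# Sketch: a PCT-style integrating output function, o = o + Ko*(qopt - qi),
# canceling a constant disturbance without modeling it. The plant
# qi = o + d and all values are hypothetical.
Ko = 0.2          # integration factor per iteration
qopt = 5.0        # reference level
d = 3.0           # unmodeled constant disturbance
o = 0.0           # output quantity

for _ in range(200):
    qi = o + d                # controlled quantity, disturbed by d
    o = o + Ko * (qopt - qi)  # integrate the error

qi = o + d
# qi ends essentially equal to qopt (error shrinks by a factor of
# 1 - Ko each iteration), yet the controller contains no term for d.
```

The error decays geometrically, so even modest loop gain per iteration yields near-exact correction given time; only *instantaneous* exact correction would demand infinite gain.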
----------------------------------------------------------------------
Rick Marken (950608.2130) --

     Well, I'd say it's time for you to show how reinforcement theory
     handles the ratio data, pardner. I'm talking about a workin'
     model, friend; no curve fitten'. So it won't do ya' no good ta draw
     =|;-)

Let's make it simpler. Here are two experimental points:

Ratio   Reinforcement rate   Behavior rate

  1            210                210
 40             90               3000

These are estimates from a printed graph, representing the average for
four rats over an entire session. Same old graph, same old source.

If an increase in reinforcement causes an increase in behavior, how come
a decrease in reinforcement rate of 120 causes an increase in behavior
rate of 2790 (both per session)?

From the same graph, we have

Ratio   Reinforcement rate   Behavior rate

 40             90               3000
160            <10               1200

Now, for a decrease in reinforcement rate of 80+, we get a decrease in
behavior rate of 1800. This is the "right" relationship. Again,
these are estimates from a printed graph, but the errors are not likely
to be more than 10% or so.

Notice that the "right" relationship holds only for the lower 90/210, or
about 43%, of the range of reinforcement rates, and that the relationship
between the extremes is

Ratio   Reinforcement rate   Behavior rate

  1            210                210
160            <10               1200

It certainly looks as if the most general rule is that the less
reinforcement there is, the more behavior there is. Only if you keep the
received reinforcement rate in the lower range will the opposite or
"right" relationship be seen in terms of incremental changes.
-----------------------------------------------------------------------
Best to all,

Bill P.

<[Bill Leach 950609.20:45 U.S. Eastern Time Zone]

[From Bill Powers (950609.0910 MDT)]

     Here's a puzzler, Bill: when you now recall that experience, do you
     recall THAT you had it, or do you recall IT?

Now at least, all I recall is that I did have the experience. It seems
that parts of the experience would "almost" reoccur or a couple of short
segments would "replay" when I thought about it in times past. Even now,
I feel that I "almost" get an image of the instructor in front of the
class teaching.

-bill