Saccharine; selecting behavior

[From Bill Powers (941220.0930 MST)]

Bruce Abbott (941216.1940 EST)--

Yes, you said that the saccharine is being in some sense "mistaken" for
a nutrient sugar. But saccharine, while sweet and attractive to rats,
does nothing to correct an intrinsic reference level, if by that you
mean something needed for survival.

If the reorganizing system worked according to what is _factually_ good
for the organism, then nobody would smoke, drink, do drugs, climb cliffs
and 29,000-foot mountains, overeat, or watch television. But the
reorganizing system is just a control system like any other: all it can
control are its perceptions. Evolution has provided it with sensors
which represent certain bodily states, and it is the output of those
sensors, combined with intrinsic reference signals, that determines when
reorganization will happen. These sensors sample the state of the body
(or whatever) according to their physical nature. And that means that
more than one kind of input can produce the same sensory signal. A given
input might be factually bad for the organism, but if it creates a
sensory signal that matches the reference signal, no reorganization will
occur.

Of course controlling for the taste of sugar may well be learned, not
intrinsic. I can imagine that a more basic control system would monitor
the glucose concentration in the bloodstream and report a low
concentration as an error calling for reorganization. The result of that
reorganization might reasonably be thought to be behavioral control
systems that control for the taste of things that result in increasing
blood sugar. So the taste of sugar would come to be a learned controlled
variable.
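
To make that chain of reasoning concrete, here is a toy sketch (in
Python; the variables, numbers, and the E. coli-style "change at random
when things aren't improving" rule are purely illustrative, not a claim
about the physiology): a persistent intrinsic glucose error keeps
altering the gain of a sweetness-controlling behavioral loop, and the
alterations stop once eating driven by that loop holds glucose near its
intrinsic reference.

    import random

    random.seed(0)
    glucose_ref = 1.0        # intrinsic reference for blood glucose
    sweet_gain = 0.0         # parameter of the behavioral (sweetness) loop
    delta = random.choice((-0.05, 0.05))

    def intrinsic_error(gain):
        eating_rate = gain * 1.0                     # sweetness error held at 1.0 for simplicity
        glucose = 0.2 + 0.8 * min(eating_rate, 1.0)  # eating sweet food raises blood glucose
        return abs(glucose_ref - glucose)

    for _ in range(200):
        if intrinsic_error(sweet_gain) < 0.05:       # intrinsic error gone: stop reorganizing
            break
        if intrinsic_error(sweet_gain + delta) > intrinsic_error(sweet_gain):
            delta = random.choice((-0.05, 0.05))     # getting worse: try a new random change
        sweet_gain += delta

    print("gain after reorganization:", round(sweet_gain, 2))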

Once that control system is established, it, too, controls a perception,
not the reality. If behavior produces the taste of sugar, fine -- that
means the organism is eating something that will raise blood glucose.
But it may be eating something that produces the same perception without
raising blood glucose -- namely, saccharine. So the organism would
control for a saccharine input (the same perception), but the intrinsic
error would continue. I suppose that if the animal had a choice between
saccharine-containing substances and sugar-containing substances, it
might eventually learn to pick the right substance on the basis of some
perception other than sweetness. Many people who have tried saccharine
(myself included) have gone back to sugar because they "don't like the
aftertaste" -- meaning a perception that is not associated with an
increase in well-being, a perception which sugar doesn't provide. There
would have to be some perceptual basis other than sweetness for making
the choice.

> And the rats will work for saccharine even when they have become
> satiated on actual nutrient foods.

That's neat! I presume this means they will work for saccharine even if
they've been eating sweet stuff and have "satiated" (reached the
reference level). To understand this, however, we'd have to know what
the real (i.e., higher-order) controlled variable is. If it's something
other than taste -- fullness, or certain levels of food constituents --
the eating of saccharine solution wouldn't drive any of the other
controlled variables over their reference levels. Satiation has to be
understood as a collection of perceptions matching their reference
signals, not just one. I notice that in eating macaroni and cheese, I
get very "satiated" with the taste and don't want any more, but can
still manage to finish my peas without discomfort, and even put down a
piece of apple pie. Burp.

Another thing to keep in mind is the one-way control system. There's no
reason to set up a two-way control system for something that is so hard
to get that it seldom gets up to the reference level no matter how hard
we behave (like money). A one-way control system contains only one sign
of error signal, the opposite sign being equivalent to zero error. In
the case of sugar, it may be that sweetness is hard enough to get that
our control systems develop to achieve _at least_ a certain amount of
sweetness, but don't respond to _too much_ sweetness by rejecting it.
So other control systems that keep scarfing down the sweet stuff (for
other reasons) aren't turned off by a "supernormal stimulus."
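
A one-way system is easy to show in a line or two of code; here is a
toy version (Python, arbitrary numbers) in which only a shortfall of
sweetness produces any action, and an excess is simply treated as zero
error:

    def one_way_output(reference, perception, gain=2.0):
        # Only one sign of error drives output; the other sign counts as zero.
        error = reference - perception
        return gain * max(error, 0.0)

    print(one_way_output(1.0, 0.3))   # too little sweetness: act to get more
    print(one_way_output(1.0, 2.5))   # "too much" sweetness: no rejection, output 0.0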

> So what triggers the reorganization necessary for the rat to acquire
> lever-pressing in this situation? Put another way, is reorganization
> theory a kind of "drive reduction" explanation for learning?

Don't assume that every behavior that works in new circumstances
involves reorganization. If the rat already knows how to produce an
input by lever-pressing, it's already got the basic control system. But
if a naive rat really will learn lever-pressing to get saccharine when
it is already satiated, we have to assume that it isn't satiated with
respect to whatever perception is affected by saccharine. I wouldn't
want to guess too much -- after all, the point is to start with the
observations and _then_ find a model that explains them.

> Approval for my research protocol arrived this morning, so I am clear
> to order some rats. However, another roadblock has appeared ...

Oy, isn't it ever thus?

----------------------------------------------------------------------
Bruce Abbott (941216.1500 EST)-- (replying to Peter Burke)

> ... control systems, once brought into being, usually exhibit only a
> limited set of behavioral responses. I agree with you that it needs to
> be shown "exactly how the error signal accomplishes this proper
> selection of behavior."

It's important to distinguish between producing an amount of a specific
kind of behavior and choosing behaviors that are qualitatively
different. When you change from one FR schedule to another, the kind of
behavior remains the same -- repetitive pressing of a bar -- but the
amount of the behavior changes -- how rapidly the bar is pressed. So a
single control system with a single error signal can handle the changes
in behavior that occur when the FR ratio changes.

In the PCT model, we assume that for each controlled variable there is
just one kind of behavior involved, so that all the error signal has to
do is to cause more or less of it (where "less" can mean reversing the
sign of the behavior when negative values make sense, as in pushing
versus pulling). If you want to control a different kind of variable,
you need another physically distinct control system.
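
To make the "amount of one kind of behavior" point concrete, here is a
toy loop (Python; the constants are arbitrary) with a single error
signal and an integrating output function. Changing the environmental
feedback factor, which stands in here for changing the FR ratio,
changes only how much output is produced, not what kind, and the
perception ends up at the reference either way:

    def run(reference, feedback_gain, steps=500, dt=0.01, output_gain=50.0):
        perception, output = 0.0, 0.0
        for _ in range(steps):
            error = reference - perception           # single error signal
            output += output_gain * error * dt       # integrating output: more or less pressing
            perception = feedback_gain * output      # environment: pressing -> input
        return output, perception

    for fb in (1.0, 0.25):                           # richer vs. leaner feedback
        out, perc = run(reference=1.0, feedback_gain=fb)
        print("feedback %.2f: output %.2f, perception %.2f" % (fb, out, perc))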

Suppose that there are two bars that produce two different perceptions:
one of food, and one of water. Now we have the same lower-level control
system involved in controlling rate of pressing (whichever bar is being
used), but we have a second control system that controls where the
pressing takes place. And we have a third and a fourth control system,
because there are two different controlled variables: food and water.
These all share the
use of the same lower-level motor control systems.

Clearly, we have to have a second level of control. It varies the
reference signals determining where pressing takes place, and it selects
the control system for producing food or the one for producing water.
The reference level for food or water is set by other systems that
monitor the somatic state of the organism (not necessarily the
reorganizing system), so when the food-control system is selected (after
changing the location of pressing) the rate of pressing is determined by
the food error signal, and so forth. The program I sent out yesterday
(re-sent to you, by the way) illustrates how this selection might occur
through gating the output function on and off.
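
The following is not that program, just a toy sketch of the gating idea
in Python (names and numbers invented for illustration): two loops, one
for food and one for water, share the pressing output, and a higher
system enables whichever loop currently has the larger error while
gating the other's output function off.

    def lower_outputs(perceptions, references, gates, gain=0.5):
        presses = {}
        for var in ("food", "water"):
            error = references[var] - perceptions[var]
            presses[var] = gain * error if gates[var] else 0.0   # gated output function
        return presses

    perceptions = {"food": 0.0, "water": 0.0}
    references = {"food": 1.0, "water": 1.0}
    for _ in range(30):
        # Higher system: enable the loop with the larger error, gate the other off.
        errors = {v: references[v] - perceptions[v] for v in perceptions}
        selected = max(errors, key=errors.get)
        gates = {v: (v == selected) for v in perceptions}
        presses = lower_outputs(perceptions, references, gates)
        for v in perceptions:                        # pressing the selected bar
            perceptions[v] += 0.5 * presses[v]       # feeds back on that perception
    print({v: round(p, 2) for v, p in perceptions.items()})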

We can imagine a higher-level system that controls for some generalized
perception by sending reference signals to many lower systems. One or a
few of the lower-level systems actually do most of the work, but if the
environmental constraints change, the others may do most of the work.
This would look like selecting among lower behaviors but would involve
no actual selection; all the lower control systems would be active all
of the time.
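
Here is a toy two-level sketch of that arrangement (Python; the
structure and numbers are mine, purely for illustration): a higher loop
controls a generalized perception, the sum of two lower perceptions, by
sending one reference signal to two always-active lower loops. Nothing
is ever selected; when one environmental path stops working, the other
loop simply ends up producing the whole result.

    def settle(env_gains, higher_ref=1.0, steps=400):
        r = 0.0                              # reference sent to both lower loops
        outputs = [0.0, 0.0]
        for _ in range(steps):
            percepts = [g * o for g, o in zip(env_gains, outputs)]
            r += 0.1 * (higher_ref - sum(percepts))        # higher loop
            outputs = [o + 0.3 * (r - p)                   # lower loops, always active
                       for o, p in zip(outputs, percepts)]
        return [round(g * o, 2) for g, o in zip(env_gains, outputs)]

    print(settle([1.0, 1.0]))   # both paths work: contributions [0.5, 0.5]
    print(settle([0.0, 1.0]))   # first path blocked: contributions [0.0, 1.0]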

But at some level, the error signal of a control system might actually
map onto a set of outputs that select from an array of lower-level
systems, actually turning them on and off. For small errors you use one
lower control system, for larger errors (or errors of the other sign)
you use qualitatively different control systems, etc.
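
As a toy illustration of that kind of mapping (Python; the categories
are invented, not taken from any experiment), the error signal's size
and sign could index qualitatively different lower systems:

    def select_system(error):
        if error > 1.0:
            return "locomote toward it"    # large shortfall: gross action
        if error > 0.1:
            return "reach for it"          # small shortfall: fine action
        if error < -1.0:
            return "push it away"          # error of the other sign: different system
        if error < -0.1:
            return "lean away from it"
        return "do nothing"                # error too small to act on

    for e in (2.5, 0.4, -0.2, -3.0, 0.05):
        print(e, "->", select_system(e))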

All these possibilities can be modeled. The question is, which of them
do we need to account for a specific kind of behavior? To find the right
model, we have to test them against experiments, and vary the
experimental conditions to see which models continue to predict
correctly. Each model will suggest ways of changing conditions that
would lead to predicted changes in behavior, which we can then look for.

My point is that there isn't just one "PCT model." PCT is a set of
principles that can be embodied in many different specific models. The
way I have envisioned the hierarchy is to assume a single controlled
variable for each control system, which would be wasteful except that
this permits each control system to be extremely simple. But maybe there
are other ways of imagining the setup that would work better without the
proliferation of specialized systems. I've only just begun, tentatively,
to suppose that higher systems can gate the output functions of lower
systems on and off, as well as varying reference signals for lower
systems. There may be some pitfalls in this gating concept, but we'll
never know until we try it out. There's a lot of room for creative
modeling here, and a corresponding need for inventing experimental tests
that will tell us which ideas work and which fail.

Interesting point you make about Guthrie -- that he had a very
reorganization-like concept of reinforcement.

> My earlier question (as yet unanswered) about how I am able to quickly
> compensate for changes in the environmental feedback function (I used
> reversal as an example) was directed toward exploring this issue.

Reversal is different from changes in the proportionality factor, which
are easily handled. To handle reversals quickly you need a second-level
system that perceives, for example, the partial derivative of handle
position with respect to the controlled variable. If moving the handle
makes the perceptual signal change the wrong way, the sign of the output
function in the lower system needs to be reversed. So the higher system
would perceive this relationship, compare it with the desired
relationship, and act to flip the sign of the connection in the lower
system to keep the relationship right. From Rick Marken's experiments
with reversals, it seems that there is such a higher system, and that it
takes about 480 milliseconds to effect the reversal.
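
Here is a toy version of that two-level arrangement (Python; the
constants are invented and have nothing to do with the actual tracking
experiments or their timing): a lower loop with a signed output
function tracks a reference, and a higher system watches whether each
output change moves the perception in the error-reducing direction,
flipping the lower system's sign when it doesn't.

    def run_with_reversal(steps=400, dt=0.05, gain=10.0):
        sign, output, perception, reference = 1.0, 0.0, 0.0, 1.0
        env_gain = 1.0
        for t in range(steps):
            if t == 200:
                env_gain = -1.0                      # the environment reverses
            error = reference - perception
            output += sign * gain * error * dt       # lower loop, signed output function
            new_perception = env_gain * output
            # Higher system: did that change move the perception the right way?
            if (new_perception - perception) * error < 0:
                sign = -sign                         # no: flip the lower system's sign
            perception = new_perception
        return perception

    print(round(run_with_reversal(), 2))             # ends back near the reference of 1.0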
------------------------------------------------------------------------
I'll be off to Boulder tomorrow morning, so may be off the net for a day
or three. Merry Christmas to all who take the sentiment appropriately.

Bill P.