A Stunning Artifact

[From Rick Marken (941117.1615)]

Bruce Abbott (941116.1330) --

Now you've got me explaining subtraction. Let's say dNut = 3.5 before a
tumble and dNut = 2.5 (still positive) after a tumble. Then:

    2.5 - 3.5 = -1.0

Try THAT definition and see what happens to your probabilities...

Bravo! You seem to have found a reinforcement model that works -- that is,
controls. If you consider the difference between dNut before the tumble
(dNutB) and dNut after the tumble (dNutA) to be the reinforcer (and it is,
indeed, a consequence of tumbling) then dNutA-dNutB will tend to be negative
(punishing) when you have been moving up the gradient and positive
(rewarding) when you have been moving down it. So the reinforcer,
dNutA-dNutB, can be used to alter the probability of tumbling when going up
or down the gradient; the result will be to delay the tumbles appropriately,
depending on whether E. coli is going up or down the gradient; so there will
be control -- that is, E. coli will make it to (and remain at) the maximum
point of the nutrient gradient.
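
Just so we're picturing the same mechanism, here is a minimal sketch of the
kind of rule I take your model to be using. This is my own illustration, not
your actual code: the names, the learning constant k, and the two-state
bookkeeping are all assumptions.

    # Hypothetical sketch of the reinforcement rule (Python).
    # Separate tumble probabilities are kept for the two conditions:
    # "up" means dNutB > 0, "down" means dNutB <= 0.
    p_tumble = {"up": 0.5, "down": 0.5}
    k = 0.05   # learning rate -- an assumed value

    def update_after_tumble(dNutB, dNutA):
        """Use the reinforcer dNutA - dNutB to adjust the probability of
        tumbling again in the state (up or down) the tumble was emitted in.
        Negative (punishing) consequences lower that probability; positive
        (rewarding) consequences raise it."""
        state = "up" if dNutB > 0 else "down"
        reinforcer = dNutA - dNutB
        p = p_tumble[state] + k * reinforcer
        p_tumble[state] = min(1.0, max(0.0, p))

Run long enough with equiprobable tumble angles, p_tumble["up"] drifts down
and p_tumble["down"] drifts up -- which is exactly the "delay the tumbles
appropriately" behavior described above.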

You have apparently done something that I claimed was impossible -- you seem
to have implemented a control system using reinforcement theory. If your
model is indeed a reinforcement model AND it controls, then I was wrong to
say that PCT and reinforcement theory are NOT alternative theories of the
same phenomenon -- control.

I am prepared to concede that your model is a reinforcement model. Now the
question is: does it really control? That is, does it keep the time integral
of the perceived gradient (dNut) at some default reference level despite
disturbances? One disturbance that might cause problems for your model is a
change in the conditional probability of the angular result of a tumble given
the gradient before the tumble (dNutB). Your model works when all angles are
always equally probable after a tumble; that is, when p(angle|dNutB>0) =
p(angle|dNutB<=0) = 1/360. However, it doesn't work (it produces a random
walk instead of control) when p(angle>180|dNutB>0)=1.0 and
p(angle>180|dNutB<=0) = 0. In this case, a tumble that occurs when you
are moving up the gradient (dNutB>0) is guaranteed to be between 180 and 360
degrees; a tumble that occurs when you are moving down the gradient (dNutB<=0)
is guaranteed to be between 0 and 180 degrees. [Other dependencies between
dNutB and angle after a tumble work just as well -- that is, they eliminate
control.] This happens because the angular bias in tumbling eliminates a
"mathematical artifact" from your reinforcement model. The geometry of the
situation (damn geometry!) makes it true that, when all directions of
tumbling are equiprobable, the change in gradient is more likely to be
"worse" after a tumble when going up the gradient and "better" after a tumble
when going down it (assuming that the reference gradient is directly "up").
When the angles after a tumble are not equiprobable, this artifact no longer
exists.
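
In case it helps to be concrete, the disturbance I'm describing amounts to
replacing the uniform tumble with something like this (my own sketch; the
function names are hypothetical):

    import random

    def tumble_angle(dNutB):
        """Angular result of a tumble, made conditional on the gradient
        before the tumble: p(angle > 180 | dNutB > 0) = 1.0 and
        p(angle > 180 | dNutB <= 0) = 0."""
        if dNutB > 0:
            return random.uniform(180.0, 360.0)   # moving up: angle in 180-360
        else:
            return random.uniform(0.0, 180.0)     # moving down: angle in 0-180

    def tumble_angle_uniform(dNutB):
        """The plain-vanilla case -- all angles equally probable regardless
        of dNutB -- which is where the artifact shows up."""
        return random.uniform(0.0, 360.0)

Substituting the biased version for the uniform one is all it takes to turn
the reinforcement model's apparent control into a random walk.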

The fact that the reinforcement model no longer controls when the angular
consequences of a tumble are NOT equiprobable shows that your reinforcement
model is not a control model -- it does not control. An actual control model
(and a human control system) has no problem dealing with the changes in
angle probability following a tumble that I described. The control model
simply tumbles when necessary in order to keep dNut positive. Both control
model and person do this even though the probability that a tumble results in
a "worse" gradient is the same whether you tumble when going up or down the
gradient (similarly, the control model and person control even though the
probability that a tumble results in a "better" gradient is the same whether
you tumble when going up or down the gradient).
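
For comparison, the control-model logic is nothing more than this (again, a
bare sketch of mine, with the reference set to zero to express "keep dNut
positive"):

    def control_model_should_tumble(dNut, reference=0.0):
        """Control-model logic: compare the perceived gradient dNut to the
        reference and tumble whenever it falls short (i.e., whenever dNut
        stops being positive). Note that no probabilities of angular
        outcomes appear anywhere in this rule."""
        error = reference - dNut
        return error > 0    # True means: tumble now

Because nothing in this rule depends on the statistics of what a tumble does
to the angle, biasing those statistics does not break it; the system just
keeps tumbling until dNut is positive again.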

So why did the reinforcement model seem to produce control in the first
place? The appearance of control was the result of the "mathematical
artifact" - - dNutA- dNutB tends to be, say, "worse" (negative) after a
tumble when going up the the gradient because _when all directions are
equally likely_ there are simply more "wrong" directions than "right" one's
that are available to be taken. This artifact was responsible for the fact
that the probability of a tumble was correctly increased (in response to a
negative gratient) and decreased (in response to a positive one). The result
was the _appearance_ of purposeful motion up the gradient. Once the artifact
is removed, however, the appearance of purpose disappears. It was not the
reinforcement model that was producing control; it was an artifactual
characteristic of the environment that was producing the _appearance_ of
control. Once the artifact is removed, the appearance of control disappears.
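
If you want to see the artifact by itself, stripped of any learning at all,
here is a little numerical check (my own construction; it assumes, purely for
illustration, that dNut is proportional to the cosine of the heading relative
to the gradient direction):

    import math, random

    def p_worse_after_uniform_tumble(heading_deg, trials=100000):
        """Estimate the probability that dNut is "worse" (smaller) after a
        tumble whose resulting heading is drawn uniformly from 0-360 degrees.
        dNut is modeled here as cos(heading relative to the gradient) -- an
        illustrative assumption, not the demo's actual computation."""
        dNutB = math.cos(math.radians(heading_deg))
        worse = sum(
            math.cos(math.radians(random.uniform(0.0, 360.0))) < dNutB
            for _ in range(trials))
        return worse / trials

    print(p_worse_after_uniform_tumble(10))    # heading nearly up: about 0.94
    print(p_worse_after_uniform_tumble(170))   # heading nearly down: about 0.06

The asymmetry is pure geometry: it is there before any reinforcement rule is
applied, and it is exactly what the biased-angle disturbance takes away.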

Although your reinforcement model does not control, Bruce, it did ruin my day
anyway; it showed that the plain vanilla E. coli demo cannot rule out a
reinforcement model. You found a way of computing reinforcement (dNutA-dNutB)
that does SEEM to produce control (rather than a random walk) in the E. coli
situation (with equiprobable angles after a tumble). This leads me to suspect
that there is probably no one demo -- no experimentum crucis -- that can deal
a death blow to the idea that reinforcement theory is an alternative to
control theory. With some changes you could probably come up with a
reinforcement model that gives the appearance of controlling even in this new
version of E. coli, where the probability of different angles after a tumble
depends on whether you are moving up or down the gradient.

I think I'll just leave this reinforcement discussion to Bill Powers for
a while -- especially since he has ruled out my favorite kind of argument --
insult and abuse;-). Bill seems to be as patient as you are. Ultimately, you
will see (when you look carefully at how the models work) that a control
model does something that is completely different than what a reinforcement
model does -- a control model controls. Thus far, I have not seen a
reinforcement model that does that (though you gave me quite a scare
there).

Best

Rick