[From Rick Marken (941117.1615)]

Bruce Abbott (941116.1330) --

Now you've got me explaining subtraction. Let's say dNut = 3.5 before a tumble and dNut = 2.5 (still positive) after a tumble. Then: dNutA - dNutB = 2.5 - 3.5 = -1.0.

Try THAT definition and see what happens to your probabilities...

Bravo! You seem to have found a reinforcement model that works -- that is, controls. If you consider the difference between dNut before the tumble (dNutB) and dNut after the tumble (dNutA) to be the reinforcer (and it is, indeed, a consequence of tumbling) then dNutA-dNutB will tend to be negative (punishing) when you have been moving up the gradient and positive (rewarding) when you have been moving down it. So the reinforcer, dNutA-dNutB, can be used to alter the probability of tumbling when going up or down the gradient; the result will be to delay the tumbles appropriately, depending on whether E. coli is going up or down the gradient; so there will be control -- that is, E. coli will make it to (and remain at) the maximum point of the nutrient gradient.
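In code, the rule described above might look like the following minimal sketch. This is my own reconstruction, not Bruce's actual program: the names (reinforce, p_tumble) and the learning rate are assumptions; the only thing taken from the discussion is that the reinforcer is dNutA - dNutB and that it raises or lowers the tumble probability for the state (moving up vs. down the gradient) in effect at the tumble.

```python
def clip(p, lo=0.01, hi=0.99):
    """Keep a probability inside valid bounds."""
    return max(lo, min(hi, p))

def reinforce(p_tumble, dNutB, dNutA, rate=0.05):
    """Update the tumble probability for the state that held when the
    tumble occurred, using the reinforcer dNutA - dNutB: a positive
    difference (reward) raises it, a negative one (punishment) lowers it."""
    state = 'up' if dNutB > 0 else 'down'
    reinforcer = dNutA - dNutB
    p_tumble[state] = clip(p_tumble[state] + rate * reinforcer)
    return p_tumble

p = {'up': 0.5, 'down': 0.5}
reinforce(p, dNutB=3.5, dNutA=2.5)  # the example above: 2.5 - 3.5 = -1.0
# p['up'] is now 0.45: tumbling while moving up the gradient was punished,
# so tumbles get delayed when the gradient is positive
```

Since going up the gradient tends to make the reinforcer negative, tumbling while going up is gradually suppressed, which is exactly the "delay the tumbles appropriately" behavior described above.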

You have apparently done something that I claimed was impossible -- you seem to have implemented a control system using reinforcement theory. If your model is indeed a reinforcement model AND it controls, then I was wrong to say that PCT and reinforcement theory are NOT alternative theories of the same phenomenon -- control.

I am prepared to concede that your model is a reinforcement model. Now the question is "does it really control"? That is, does it keep the time integral of the perceived gradient (dNut) at some default reference level despite disturbances?

One disturbance that might cause problems for your model is a change in the conditional probability of the angular result of a tumble given the gradient before the tumble (dNutB). Your model works when all angles are always equally probable after a tumble; that is, when p(angle|dNutB>0) = p(angle|dNutB<=0) = 1/360. However, it doesn't work (it produces a random walk instead of control) when p(angle>180|dNutB>0) = 1.0 and p(angle>180|dNutB<=0) = 0. In this case, a tumble that occurs when you are moving up the gradient (dNutB>0) is guaranteed to be between 180 and 360 degrees; a tumble that occurs when you are moving down the gradient (dNutB<=0) is guaranteed to be between 0 and 180 degrees. [Other dependencies between dNutB and angle after a tumble work just as well -- that is, they eliminate control.] This happens because the angular bias in tumbling eliminates a "mathematical artifact" from your reinforcement model. The geometry of the situation (damn geometry!) makes it true that, when all directions of tumbling are equiprobable, the change in gradient is more likely to be "worse" after a tumble when going up the gradient and "better" after a tumble when going down it (assuming that the reference gradient is directly "up"). When the angles after a tumble are not equiprobable, this artifact no longer exists.
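The geometric claim is easy to check numerically. In the sketch below (my own illustration, not the demo's actual code) the gradient is assumed to point along heading 0, so dNut is proportional to cos(heading); with all post-tumble headings equiprobable, tumbles taken while moving up the gradient usually make dNut worse, and tumbles taken while moving down usually make it better.

```python
import math, random

def frac_worse(going_up, trials=100_000, seed=1):
    """Fraction of equiprobable-angle tumbles that make dNut worse,
    given the state (moving up or down the gradient) before the tumble.
    Assumes the gradient points along heading 0, so dNut ~ cos(heading)."""
    rng = random.Random(seed)
    worse = 0
    for _ in range(trials):
        # draw a pre-tumble heading consistent with the given state
        while True:
            dNutB = math.cos(rng.uniform(0, 2 * math.pi))
            if (dNutB > 0) == going_up:
                break
        dNutA = math.cos(rng.uniform(0, 2 * math.pi))  # all angles equiprobable
        if dNutA < dNutB:
            worse += 1
    return worse / trials

print(frac_worse(going_up=True))   # well above 0.5: tumbles usually hurt
print(frac_worse(going_up=False))  # well below 0.5: tumbles usually help
```

This asymmetry is the "mathematical artifact": the reinforcement model's statistics depend on it, and a biased post-tumble angle distribution removes it.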

The fact that the reinforcement model no longer controls when the angular consequences of a tumble are NOT equiprobable shows that your reinforcement model is not a control model -- it does not control. An actual control model (and a human control system) has no problem dealing with the changes in angle probability following a tumble that I described. The control model simply tumbles when necessary in order to keep dNut positive. Both control model and person do this even though the probability that a tumble results in a "worse" gradient is the same whether you tumble when going up or down the gradient (similarly, the control model and person control even though the probability that a tumble results in a "better" gradient is the same whether you tumble when going up or down the gradient).
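A toy version of that control rule can make the point concrete. This is a sketch under my own assumptions (gradient along heading 0, dNut ~ cos(heading), a small random drift standing in for a disturbance), not the actual E. coli demo: the model tumbles whenever dNut falls to or below its reference of zero, and keeps tumbling until dNut is positive again. Note that the biased branch only ever draws from the dNutB<=0 case, because the control rule never tumbles while dNut is positive.

```python
import math, random

def run(biased, steps=20_000, seed=2):
    """Average dNut over a toy run of the control rule: tumble whenever
    dNut <= 0. Assumes the gradient points along heading 0, so
    dNut ~ cos(heading); random drift in heading acts as a disturbance."""
    rng = random.Random(seed)
    heading = math.pi                  # start moving straight down-gradient
    total = 0.0
    for _ in range(steps):
        heading = (heading + rng.gauss(0, 0.05)) % (2 * math.pi)  # disturbance
        if math.cos(heading) <= 0:     # error: perceived gradient not positive
            if biased:
                # the disturbance described above: given dNutB <= 0, the
                # post-tumble angle is confined to (0, 180) degrees
                heading = rng.uniform(0, math.pi)
            else:
                heading = rng.uniform(0, 2 * math.pi)
        total += math.cos(heading)
    return total / steps

print(run(biased=False))  # average dNut clearly positive
print(run(biased=True))   # still positive: the control rule survives the bias
```

The average perceived gradient stays positive in both cases, which is the sense in which the control model is indifferent to the angle statistics that break the reinforcement model.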

So why did the reinforcement model seem to produce control in the first place? The appearance of control was the result of the "mathematical artifact" -- dNutA-dNutB tends to be, say, "worse" (negative) after a tumble when going up the gradient because _when all directions are equally likely_ there are simply more "wrong" directions than "right" ones that are available to be taken. This artifact was responsible for the fact that the probability of a tumble was correctly increased (in response to a negative gradient) and decreased (in response to a positive one). The result was the _appearance_ of purposeful motion up the gradient. Once the artifact is removed, however, the appearance of purpose disappears. It was not the reinforcement model that was producing control; it was an artifactual characteristic of the environment that was producing the _appearance_ of control. Once the artifact is removed, the appearance of control disappears.

Although your reinforcement model does not control, Bruce, it did ruin my day anyway; it showed that the plain vanilla E. coli demo cannot rule out a reinforcement model. You found a way of computing reinforcement (dNutA-dNutB) that does SEEM to produce control (rather than a random walk) in the E. coli situation (with equiprobable angles after a tumble). This leads me to suspect that there is probably no one demo -- no experimentum crucis -- that can deal a death blow to the idea that reinforcement theory is an alternative to control theory. With some changes you could probably come up with a reinforcement model that gives the appearance of controlling even in this new version of E. coli, where the probability of different angles after a tumble depends on whether you are moving up or down the gradient.

I think I'll just leave this reinforcement discussion to Bill Powers for a while -- especially since he has ruled out my favorite kind of argument -- insult and abuse ;-). Bill seems to be as patient as you are. Ultimately, you will see (when you look carefully at how the models work) that a control model does something that is completely different from what a reinforcement model does -- a control model controls. Thus far, I have not seen a reinforcement model that does that (though you gave me quite a scare there).

Best

Rick