R. coli: A new challenge to reinforcement theory

[From Rick Marken (950615.0900)]

Here is a new version of the E. coli task that I will call R. coli because (I
think) all consequences of responses are completely _R_andom.

The task is (surprise) to keep a cursor aligned with a target. The subject
affects the cursor by pressing the mouse button (the response). The
consequence of this response is that the cursor moves to one of five
randomly determined positions on the screen. One of these five positions is
the target position.

So the result of any response is one of five randomly determined cursor
positions. The cursor remains in this randomly determined position until
another response occurs or until the position is changed (randomly) by an
external disturbance.

Here is the HyperCard script for the basic experiment:

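repeat with n = 1 to 500
    -- press is set to 1 elsewhere when the mouse button is clicked
    if press=1 then
        -- response: jump to one of five random positions
        put random(5)*50 into y
        put 0 into press
    else
        -- no response: apply a disturbance if the interval has expired
        if n mod distint = 0 then
            put random(2)*100 into y
            put random(80)+80 into distint
        end if
    end if
    -- count iterations on target and show the running proportion
    if y = target then put on+1 into on
    set the loc of card button "coli" to x,y
    put on/n into card field "control"
end repeat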

On each iteration of the repeat loop the program looks to see whether or not
there has been a response (GetPress). If there has been a response (the
variable press has been set to 1) then a new random position is selected for
the cursor (in the y dimension). If there was no response the program checks
to see if the interval between disturbances (distint) has expired; if so, the
cursor is randomly moved to one of two non-target positions and a new
interval between disturbances is selected.

If the cursor position matches the target position (y = target) then the on
target counter (on) is incremented. A running indication of the proportion
of time (iterations of the repeat loop) during which the cursor is on target
(on/n) is printed out as the variable "control".
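
For readers without HyperCard, the loop described above can be sketched in
Python. This is a rough port, not the original program. The target value
(150), the starting position (100), the initial disturbance interval and the
names run, policy and never are assumptions; the post does not give them.

```python
import random

POSITIONS = [50, 100, 150, 200, 250]  # random(5)*50 in the script
TARGET = 150                          # assumed; must differ from 100 and 200


def run(policy, seed, n_iter=500):
    """Return the proportion of iterations with the cursor on target.

    policy(n, y) returns True when a press occurs on iteration n.
    """
    rng = random.Random(seed)
    y = 100                          # assumed off-target starting position
    distint = rng.randint(81, 160)   # random(80)+80, assumed initial value
    on = 0
    for n in range(1, n_iter + 1):
        if policy(n, y):
            # response: one of five random positions
            y = rng.choice(POSITIONS)
        elif n % distint == 0:
            # disturbance: one of two non-target positions
            y = rng.choice([100, 200])
            distint = rng.randint(81, 160)
        if y == TARGET:
            on += 1
    return on / n_iter


def never(n, y):
    """A subject who never presses."""
    return False


print(run(never, seed=1))  # 0.0
```

With no presses the cursor only ever visits the two disturbance positions,
so the proportion on target is 0.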

When I do this experiment using the parameters shown I am able to keep the
cursor on target for about 60% of the 500 trials ("control" = .6). When I
don't respond at all, the cursor is on target for 0% of the trials
("control" is 0.0). If I replay the pattern of presses I made in one run into
a different run (where the random function produces different random
consequences and different disturbance intervals) the cursor stays on
target for about 9% of the trials ("control" = .09). I think this latter
figure is the best comparison for evaluating the level of control achieved
in the experiment. Responding in this R. coli experiment keeps the cursor on
target about 6 to 7 times longer (.60/.09) than would be the case if the same
responses were made without regard to the cursor's position.

I think it would be very easy to write a control model of the R. coli task.
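
One way such a control model might look: press whenever the cursor is off
target, otherwise do nothing. The sketch below is in Python rather than
HyperCard, and it also replays the model's press pattern into a run with a
different random seed, as in the replay comparison described above. The
target value (150), the starting position (100), the initial disturbance
interval and the names run and replay are assumptions. The model can press on
every iteration, which a person cannot, so it should score well above .6.

```python
import random

POSITIONS = [50, 100, 150, 200, 250]  # random(5)*50 in the script
TARGET = 150                          # assumed; must differ from 100 and 200


def run(seed, replay=None, n_iter=500):
    """Return (proportion on target, list of presses per iteration).

    replay=None  : closed loop, press whenever the cursor is off target.
    replay=[...] : open loop, use the recorded presses regardless of y.
    """
    rng = random.Random(seed)
    y, on, presses = 100, 0, []      # assumed off-target start
    distint = rng.randint(81, 160)   # random(80)+80, assumed initial value
    for n in range(1, n_iter + 1):
        press = (y != TARGET) if replay is None else replay[n - 1]
        presses.append(press)
        if press:
            y = rng.choice(POSITIONS)
        elif n % distint == 0:
            y = rng.choice([100, 200])
            distint = rng.randint(81, 160)
        on += (y == TARGET)
    return on / n_iter, presses


control, pattern = run(seed=1)                # closed-loop control model
replayed, _ = run(seed=2, replay=pattern)     # same presses, new random events
print(control, replayed)
```

In the replayed run each press still lands on target with probability 1/5,
but the presses no longer depend on where the cursor is, so the expected
proportion on target cannot exceed .2.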

But the real question now is whether one can develop a reinforcement model
of this task. This is clearly a reinforcement situation: responses (presses)
have consequences (new position of the cursor). Some consequences are good;
the cursor ends up at the target position (this is presumably a reinforcer)
and some are not (these are presumably punishers). These reinforcing and
punishing consequences of responses are (as far as I can tell) completely
random. Nevertheless, the net result of responses is that the cursor stays on
target.

I believe (but, given my experience with E. coli, I am not certain) that
there is no way for a reinforcement model to explain the results of this
R. coli experiment. All consequences of responding seem (to me) to be
completely random; there seems to be no systematicity. But I'm the one
who wants to show that reinforcement theory cannot explain control so I may
not be looking carefully enough. So I hope Bruce Abbott will take a shot at
developing a reinforcement model of R. coli. Don't worry Bruce; if
reinforcement theory can handle this one I'll go right back to the drawing
board again. I'm a glutton for punishment (and reinforcement).