Waiting for Ecoli4

[From Bill Powers (941104.2110 MST)]

Bruce Abbott (941104.1315 EST)--

Bruce, Rick tells me that there is an Ecoli4 on the way; it hasn't
arrived here yet. He tells me it's getting more complicated. While I
wait, here is something to ponder.

It seems to me that the basic fact that needs to be faced here is that
the value of a delay before a tumble does not predict the value of dNut
after the tumble. If you keep a record of multiple trials, you will find
that delay is related to dNut after the tumble ending the delay, in the
long run, as follows:

                      + |
                     ^ |
           next dNut | 0////////////////////////////////////
                     v | length of previous delay --->
                      - |

In short, the most probable value of dNut following any length of delay
is zero.

The E. coli control model does not use any information about previous
relationships between delay and consequent values of dNut. It makes no
predictions about future values of dNut. It simply makes the best
possible use of information about the current value of dNut in
determining the length of the current delay period, a strategy that
keeps dNut in the desired state for as long as possible, and that allows
it to be in an undesired state for as little time as possible.
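
For anyone who wants to see both points at once, here is a minimal Python sketch. It is not the Ecoli program under discussion; everything in it is an assumption made for illustration: a 2-D plane in which nutrient rises linearly along x (so dNut per step is the cosine of the heading), a tumble that picks a uniformly random new heading, and hypothetical delays of 20 steps when dNut is positive, 1 step otherwise, and 10 steps for the fixed-delay comparison.

```python
import math
import random

def run(steps, controlled, seed=0):
    """Return (final x, dNut values observed just after each tumble)."""
    rng = random.Random(seed)
    x = 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    timer = 0
    post_tumble_dnut = []
    for _ in range(steps):
        dnut = math.cos(heading)   # assumed: nutrient rises linearly along x
        x += dnut
        timer += 1
        # Control strategy: long delay while dNut > 0, short delay otherwise.
        # The comparison strategy uses one fixed delay regardless of dNut.
        if controlled:
            delay = 20 if dnut > 0 else 1
        else:
            delay = 10
        if timer >= delay:
            heading = rng.uniform(0.0, 2.0 * math.pi)   # tumble: random new heading
            post_tumble_dnut.append(math.cos(heading))
            timer = 0
    return x, post_tumble_dnut

x_ctrl, post = run(20000, controlled=True)
x_fixed, _ = run(20000, controlled=False)
print("controlled:", round(x_ctrl), " fixed delay:", round(x_fixed))
print("mean dNut just after a tumble:", sum(post) / len(post))
```

In this sketch the mean dNut right after a tumble stays near zero whatever the preceding delay was, yet the version that sets its delay from the current dNut still climbs the gradient, while the fixed-delay version only wanders.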


A few posts ago you said that in speaking of reinforcement as
controlling behavior, behaviorists were simply expressing a regular
relationship between an independent variable and a dependent variable. I
don't know why I let that slip past, except that perhaps I'm getting
burned out with reiterating the same statement: the controlled variable
is not an independent variable.

In the case of a reinforcement, the rate of reinforcement is not an
independent variable. It depends strictly on the rate of behavior and
the form of the schedule function, and on whatever independent
disturbances are applied to it directly. In an operant conditioning
experiment, the experimenter has no control over the rate of
reinforcement. Only the organism has control over it, if anything does.

An excerpt from Scotty's seminar on control theory:

In the control-system diagram, there are two independent variables: the
disturbance and the reference signal. All other variables in the loop
are dependent variables; each one of them can be expressed as a function
of system parameters, the disturbance, and the reference signal, with
none of the other variables included in the expression.

Refer to the notation in the diagram in my CTBASIC program.

p = Ki*qi

e = r - p

qo = Ko*e

qi = Kf*qo + Kd*d

Solve for qi (equivalent to obtained reinforcement rate):

qi = Kf*(Ko*(r - Ki*qi)) + Kd*d, or

       KfKo*r + Kd*d
qi = ------------------- or
         1 + KoKfKi

       KiKoKf
qi = ---------- [ r/Ki + Kd*d/(KiKoKf)]
     1 + KfKoKi

Solve for qo (equivalent to observed behavior rate):

qo = Ko*[r - Ki*(Kf*qo + Kd*d)], or

        Ko*r - KoKiKd*d
qo = ------------------- or
         1 + KoKiKf

        KiKoKf
qo = ----------- (r/(KiKf) - Kd*d/Kf)
      1 + KiKoKf

So qo and qi, individually, are functions of r and d. Both behavior rate
and reinforcement rate are dependent variables.
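
The two closed-form solutions can be checked numerically. The following Python sketch is not part of CTBASIC; the function names and the parameter values are made up for illustration. It computes qi and qo from r and d alone, then substitutes them back into the four loop equations to confirm that nothing is left over.

```python
def steady_state(r, d, Ki, Ko, Kf, Kd):
    """Closed-form steady-state qi and qo as functions of r and d only."""
    G = Ki * Ko * Kf                               # loop gain
    qi = (Kf * Ko * r + Kd * d) / (1.0 + G)
    qo = (Ko * r - Ko * Ki * Kd * d) / (1.0 + G)
    return qi, qo

def residuals(r, d, qi, qo, Ki, Ko, Kf, Kd):
    """How far (qi, qo) are from satisfying the original loop equations."""
    p = Ki * qi                  # perceptual signal
    e = r - p                    # error signal
    return qo - Ko * e, qi - (Kf * qo + Kd * d)

# Arbitrary illustrative values, not taken from any experiment.
params = dict(Ki=2.0, Ko=50.0, Kf=0.5, Kd=3.0)
qi, qo = steady_state(10.0, 5.0, **params)
print(qi, qo, residuals(10.0, 5.0, qi, qo, **params))
```

Both residuals come out at rounding-error size, so each variable really is determined by r, d, and the system parameters, with no other loop variable needed.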

By using appropriate units for signals inside the control system, it is
always possible to define Ki as 1, with the units being the conversion
factor from external physical units to signal units. Likewise, it is
possible to choose physical units for measuring qi and qo such that Kf
is 1. This leaves Ko to absorb all of the loop gain KiKoKf. If Ko is a
large number, then

KiKoKf/(1+KiKoKf) ==> 1.

With Ki = 1 and Kf = 1 as above, we can for convenience also scale the
disturbance (d in the equations above, qd below) so that Kd = 1.

This leads to the useful approximations

qi = r and

qo = r - d.

This shows that the input quantity is very nearly determined by the
reference signal, and that the output quantity is determined jointly by
the disturbance and the reference signal. This means that the
reinforcement rate, which is qi, is determined by the reference signal
inside the organism, and that the behavior rate, qo, depends both on the
organism and on the environment, in this case equally.

This approximation holds only when the product KiKoKf is much larger
than 1 (say, 10 to 100 or more). For lower values of the product, the
exact equations have to be used.
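
To get a feel for how good the approximations are, here is a small Python sketch (an illustration only, with Ki = Kf = Kd = 1 and arbitrary values of r and d) that compares the exact expressions with qi = r and qo = r - d for several values of Ko.

```python
def approximation_error(r, d, Ko):
    """Exact minus approximate qi and qo, assuming Ki = Kf = Kd = 1."""
    qi_exact = (Ko * r + d) / (1.0 + Ko)
    qo_exact = Ko * (r - d) / (1.0 + Ko)
    return qi_exact - r, qo_exact - (r - d)

# Illustrative values: r = 10, d = 4.
for Ko in (1.0, 10.0, 100.0, 1000.0):
    err_qi, err_qo = approximation_error(10.0, 4.0, Ko)
    print(Ko, err_qi, err_qo)
```

Both errors shrink in proportion to 1/(1 + Ko), which is why a loop gain of 1 makes the approximations useless and a loop gain of 100 makes them good to within a percent or so of r - d.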

The algebraic equations represent the steady-state solutions of the
differential equations that describe the control system, provided the
system is stable. If the output function is a leaky integrator of the
form

qo := qo + g*e - decay*qo,

then the equivalent steady-state output gain is g/decay.
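
A minimal Python sketch of that claim follows. It is not the CTBASIC code; the function name, the update order within one iteration, and the values g = 0.1 and decay = 0.001 are assumptions chosen so that the discrete loop is stable. It iterates the loop with the leaky-integrator output and compares the result with the algebraic solution for Ko = g/decay.

```python
def simulate(r, d, g=0.1, decay=0.001, Ki=1.0, Kf=1.0, Kd=1.0, steps=2000):
    """Iterate the loop with a leaky-integrator output function."""
    qo = 0.0
    for _ in range(steps):
        qi = Kf * qo + Kd * d          # input quantity
        p = Ki * qi                    # perceptual signal
        e = r - p                      # error signal
        qo = qo + g * e - decay * qo   # leaky integration of the error
    return Kf * qo + Kd * d, qo

r, d = 10.0, 4.0
qi_sim, qo_sim = simulate(r, d)

Ko = 0.1 / 0.001                       # equivalent steady-state gain g/decay
qi_alg = (Ko * r + d) / (1.0 + Ko)     # algebraic solution with Ki = Kf = Kd = 1
qo_alg = Ko * (r - d) / (1.0 + Ko)
print(qi_sim, qi_alg)
print(qo_sim, qo_alg)
```

With these values each iteration multiplies the remaining transient by 1 - g - decay = 0.899, so the simulated values settle onto the algebraic ones well within the 2000 steps.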

You can set up the CTBASIC model with specific parameters and check the
predictions of the above equations. You will find that for zero
disturbance (qd = 0) and high loop gain, the perceptual signal will very
nearly equal the reference signal; you can alter the reference signal
between plots and see the perceptual signal tracking it in the steady
state. Varying qd will have little effect on this tracking if KiKoKf is
100. If Ki is 1, you will see the input quantity qi also tracking the
reference signal. If Ki is other than 1, then qi will behave as p/Ki.

For a zero reference signal, varying the disturbance will result in an
almost constant qi (near zero), and a value of qo equal and opposite to
the disturbance. If Kf and Kd are not equal, then you will observe very
nearly

qo = -(Kd/Kf)*qd
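
The zero-reference case can be tried outside CTBASIC as well. This Python sketch uses the exact steady-state expressions with illustrative gains (Ki = 1, Ko = 1000, Kf = 2, Kd = 3, all assumed) and sweeps the disturbance. Because the output is "equal and opposite", qo should come out close to the disturbance scaled by Kd/Kf with its sign reversed.

```python
def steady_state(r, qd, Ki=1.0, Ko=1000.0, Kf=2.0, Kd=3.0):
    """Exact steady-state qi and qo; default gains are illustrative only."""
    G = Ki * Ko * Kf
    qi = (Kf * Ko * r + Kd * qd) / (1.0 + G)
    qo = (Ko * r - Ko * Ki * Kd * qd) / (1.0 + G)
    return qi, qo

# Zero reference signal, several disturbance values.
for qd in (-5.0, 0.0, 5.0, 10.0):
    qi, qo = steady_state(0.0, qd)
    print(qd, qi, qo, -(3.0 / 2.0) * qd)
```

The printed qi stays within a hundredth or so of zero across the sweep, while qo follows the last column closely.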

In the present incarnation of CTBASIC, you have to change a variable and
plot again in order to see what happens when it changes. The replot
happens automatically when you hit the Enter key.

Bill (Scotty) P. down in the engine room.