E. coli calculations

[Martin Taylor 960223 11:40]

Bill Powers (960223.0630 MST)

Some questions. In 2 dimensions, there seems to be a tendency to move
away from the target up to a radius of 2 units. But this makes sense
only if we include the case where the object moves _past_ the target,
its final position being measured not at its closest approach but at the
end of an integer number of steps. Consider this case:

       start end
         X ------------|-----------|
                    :
                    :
                    :
                    :
                    :
                    o target

The first step brings the object closer to the target, but the next step
takes the object farther from the target than the starting position. So
this would appear to be a move _away_ from the target, because of the
discrete nature of the steps, even though the initial step brought the
object closer to the target.

Exactly so. But as they say, "it's not a bug, it's a feature." We assume
that the seeker cannot know in advance whether the next step will continue
the improvement, and when it finds out that the next step makes matters
worse, it can't backtrack to where it was before. It just has to go on
and try to improve from where it arrived. It's rather like saying that
I would have made more on the stock market if I had sold yesterday, at
the peak. I would have, but unfortunately I didn't know it was the peak
until today, and nobody will now buy it yesterday.

I don't know if this will make much difference in your final results,
but you really should treat the continuous case to find the actual point
of closest approach.

That's what we did in our earlier series of studies. And so long as my retired
friend keeps an interest, we will go back to it later. Now we are looking
to see what effect a finite step size has, and how it makes a difference
whether the "seeker" can determine infinitely fast and precisely whether it
is going in an "improving" direction. Since making the step discrete does
affect the results in a quite dramatic way, the next stage is probably to
treat the situation with a continuous move, but with disturbances and/or
imprecise measurement by the seeker of the current distance to the target.

If a real e-coli were to change directions when it determined it was going
further from the target, it couldn't do so after moving an infinitesimal
distance. It couldn't because its sensitivity to the criterion variable
isn't infinitely precise, and because there are likely to be real fluctuations
(random disturbances) to its CEV. We know that it is wrong to assume that
the seeker can always tell the difference without error after a unit step
but not after any lesser step, but the "truth" is probably somewhere
between this and the results for a continuous infinitely precise seeker.

Approximating the continuous case involves reducing the size of a step
in relation to a unit of distance in the space. If you made the step one
tenth of the size above, the overshoot would be only one tenth of a unit
of distance, and the same move as above would end with the object closer
to the target instead of further away.

Exactly so. The step size is the scale unit of the space. If you say that the
unit step represents 1 meter, then at 16 dimensions the seeker will come
to an equilibrium distance of about 5 m. If you say that the unit step
represents 1 nanometer, then the equilibrium distance is roughly 5 nm.
By making the steps infinitesimally small, you approximate the continuous
case as closely as you want.

I'd like to understand how you pick a new direction at random. The way I
would do it would be to pick a random delta-x[i] for each x[i], then
multiply the distance moved in each dimension by

         delta-x[i]
       ---------------
      sqrt(sum(delta-x[i]^2))
            i

To get the change in position on each axis.

Is this what you did?

Yes, having just looked at the code for the first time. The code isn't
exactly that, but the effect is the same. The code is:

  for (i=0; i<dim; i++){
    c[i] = 2.*random()/MAXINT - 1.;
    csq += c[i]*c[i];
  }
  csq = sqrt(csq); /* csq now is a normalizing factor making the c[i] vector
                      be of unit length */

The c[i] components are added to the corresponding components of the current
location vector and the distance to the origin compared with the previous
distance. If it is worse, the trial terminates, but if it is better, the
same c[i] vector is added again, and so forth.

More work on it this afternoon, but no promises of new results. That depends
on the success in writing and debugging code.

Actually, I think one of the really interesting results of yesterday's
work is the incredibly low probability of getting even a two unit move to
arrive closer to the target in a high-dimensional space when the target
distance is small (at infinite distance it's always a probability of 0.5,
and at a distance of 1 the probability is zero, so it's at intermediate
distances that the question is interesting).

Martin