More e-coli results

[Martin Taylor 950220 18:50]

In a previous message, I reported on the probability that an N-dimensional
e-coli jump would be closer to the optimum than the current e-coli position,
given that the jump was of length no greater than M. Since then, we have done
two more simulation runs, both using jumps of length exactly M. The first
asked the question "What is the probability that the jump will reduce
the distance to the optimum" and the second "What is the probability
that the jump will reduce the distance to the optimum by at least 5%."

The first run gives answers practically identical to the earlier, as follows
(I'll try to write them in a way easier to understand). In all cases,
the step is one unit, and the optimum is M units away. The entries in the
table are 100 x the probability that a step will lead to a position closer
to the optimum by any amount, however small (blank areas not tested, 0 means
less than 0.5%).

Dim M=1.5 2 2.5 3 3.5 4 5 6 8 10 12 14 16 20 24

4 33 38 40 42 43 44
8 23 31 35 37 39 40
16 11 21 27 31 34 36
32 0 9 17 23 27 30
64 0 0 3 11 17 21 27 31 36 39 41 42 43 44 45
128 9 17 23 30 34 36 38 40 42 43
256 0 4 11 21 27 31 34 36 39 40
512 0 0 0 9 17 23 27 30 34 36
1024 0 0 0 0 4 11 17 21 27 31
2048 0 0 0 0 0 0 3 9 17 23
4096 0 0 0 0 0 0 0 0 4 11
8192 0 0 0 0 0 0 0 0 0 0

No matter what the dimensionality, there is a reasonable (>5%) probability
of some improvement with the jump, provided the optimum is far enough away.

The situation is quite different for the probability that the improvement is
at least 5%. For one thing, such a large improvement is impossible if the
optimum is more than 20 units away (or less than 0.95 units). For another,
at high dimensionality, to get within that "small" an angle--the angle of
the cone that is tangent to a hypersphere of radius 0.95 from a point at
distance 1.0--is a very low probability event. So there is an optimum
distance from the target, at which the probability of an improvement of
at least 5% is maximum--and for all tested dimensions, that optimum is
at a distance around 3 units! Here are the results:

Dim M=1 2 3 4 5 6 8 10 12 14 16
4 18 32 35 34 33 31 26 19 12 6 2
8 4 22 25 25 23 12 4 1 0 0 0
16 0 9 14 13 9 5 1 0 0 0 0
24 0 1 5 4 1 0 0 0 0 0 0
32 0 0 1 1 0 0 0 0 0 0 0

So the probability of getting much improvement from a jump gets to be very
small for even fairly low dimensionality. I think we'll try for a 1%
improvement next.

However, do not despair for e-coli in high-dimensional spaces. The real
question is the probability that the direction is within a certain cone
whose surface passes adequately close to the optimum, and we haven't tested
that question yet. The zeros for large values of M will not come into
play for this question. I suspect it has a much easier analytic answer, and
tomorrow I'll ask.

This all has to be finished before March 31, as my mathematical colleague
will be retiring then.

[Martin Taylor 960222 18:30]

We ran some more trials today with the e-coli seeker that can move only in
steps of size one unit. This time, we started the seeker at some fixed
radius from the target and let it make exactly one move. A move was defined
as one step if the direction led the seeker further away from the target,
and more than one step in the same direction if the first step led
toward the target. A same-direction move toward the target stopped after
the first step that no longer brought the seeker nearer than the last
step had done. To clarify, see the diagram in 2-D; the vertical bars
indicate succesive steps:

    Start Stop
      X---|---|---|---|---|---X
                           .
                           .
                           O Target

We ran 10,000 trials at each of the radii 2^n where n ranged from 0 to 6,
and at each dimensionality 2^n where n ranged from 1 to 8. For each set of
10,000 moves we found the expected amount by which the resulting point
moved away from the target (a negative expectation is toward the target).
We also computed the standard deviation of the radius change. Equilibrium
is at the radius where the expected change of radius is zero. These data
provide a much cleaner result than my by-eye results previously reported.

The equilibrium distance R for a number of dimensions D is very close to
a straight line in log-log space for dimensionality less than 128. To
within nearly the thickness of my fine pen-point log2(R) = (7/4)*log2(D).
At 256 dimensions, this is high by 0.2 log2 units. Numerically the line
isn't far from my by-eye value (which I won't repeat for fear of confusion,
since these data are very much more precise).

Here are two matrices of test results. The numbers in the first matrix are the
expected values of outward movement, one unit being the length of the step.
Remember, inward movements can be very long if the direction is close to
the target, but outward movements are always one step, though most of them
move outward far less than one unit. Negative values mean inward moves.
The limiting value for infinite dimensionality at a start radius of 1 should
be sqrt(2) -1, or 0.414, since almost none of the moves are inward. At a
start radius of 2, it should be sqrt(5) -2, or 0.235. These can serve as
check points for the extreme data points.

                       Start Radius
Dim 1 2 4 8 16 32 64
2 0.549 0.233 -0.226 -1.003 -2.467 -5.268 -11.011
4 0.525 0.300 0.052 -0.286 -0.888 -2.056 -4.314
8 0.444 0.309 0.135 -0.061 -0.344 -0.918 -1.932
16 0.404 0.306 0.166 0.045 -0.115 -0.403 -0.904
32 0.407 0.264 0.166 0.075 -0.017 -0.158 -0.424
64 0.410 0.233 0.162 0.083 0.024 -0.056 -0.194
128 0.411 0.235 0.142 0.085 0.038 -0.006 -0.081
256 0.414 0.235 0.123 0.084 0.044 0.012 -0.028

Here are the numbers of moves out of 10,000 that actually went inward.
The effects of the surface curvature at short radii and high dimensionality
are very clear.

                Start Radius
Dim 2 4 8 16 32 64
2 3618 4380 4692 4941 4852 4972
4 2153 3765 4397 4648 4926 4916
8 769 3113 4158 4522 4772 4895
16 16 2069 3590 4327 4731 4817
32 0 849 3002 3963 4470 4796
64 0 0 2207 3574 4314 4598
128 0 0 879 3018 3869 4528
256 0 0 0 2067 3558 4243

Well, those are the numbers. Make what you will of them.

Martin