[Martin Taylor 950213 11:00]
Bill Powers (950210.1900 MST)
On the simulation of multidimensional e-coli optimization.
The way you report your results is somewhat hard to figure out. Let's
see if I understand the following:Dim Minimum distance
for p>0.05 p>.20 p>.301024 10.0 14.5 23
I take it that this means that if the steps are of size M, with 1024
dimensions, there is better than a 1 in 20 chance of an improvement per
step if the distance to the goal value is 10M, better than 1 in 5 if the
distance is 14.5M, and better than 1 in 3.3 if the distance is 23M.
Correct, except that the results I reported are for steps of size no
greater than M. Results for steps of size M exactly are currently being
obtained. At high dimensionality, there won't be much difference.
And another way to say it is that if we are making random steps in a
1024-dimensional world, and find that we get an improvement only about
one time in 3, the goal must be about 23 steps away (in some direction).
OK?
Well, remember that the world of the simulation is the same in all directions,
in the sense that the gradient is radially symmetric about the optimum.
This won't be the case in the real world. Otherwise I guess you are right.
The e. coli principle is just a little different, but I think critically
so, from the way you've set up the problem. In the e. coli method there
is an underlying constant velocity vector.
The result you have so far (and the upcoming one with a constant step
size) takes us part of the way by showing that there is a useful
probability that a direction selected at random will be favorable. But
to finish the analysis, we need to know what will happen if a velocity
is established in that direction, and the line in that direction is
extended until finally further progress would start to take us away from
the center again.
I don't think that the difference is interesting, in that the problem is
set up to deal with ONE step. If you make two steps in a given direction,
you can use the data to see what would have been the probability you would
have of improving the situation after one step of double the size. In other
words, you can see how far in a given direction you can go, maintaining
improvement over the initial state. Half that distance (I'm pretty sure) is
the distance at which the improvement will stop as your e-coli keeps taking
consecutive steps in the initial direction.
Better numbers for making that calculation will come from the simulation
in which all the steps are the same length, M.
A full analysis of the problem requires taking the reference signal into
account. When e. coli has a positive nonzero reference signal for the
rate of increase of the nutrient, this is equivalent to defining an
angle to left and right of the optimum direction that is less than 90
degrees each way, or in 3 dimensions, a cone. Tumbles will occur when
the result is a direction lying outside the angle or cone, even if the
resulting rate is still somewhat positive. This makes for more tumbling,
but more progress toward the goal when a favorable result does occur.
That would be an interesting complementary study. It might not require
much extra programming, but it would sure require a lot of extra computer
time, since there's another dimension to be examined. As matters stand,
our HP-9000 takes much of the weekend to do the higher-dimensional tests.
I'll enquire.
Martin