Ecoli2 & 4; Tai Chi; >; Beam in Bin paper

[From Bill Powers (941113.0530 MST)]

Bruce Abbott (941111.2100 EST)--
RE: Ecoli4

Enough of these verbal arguments! ... Run ECOLI4 as often, for as long
as you like. E. coli will occasionally find the goal, and sometimes it
will stay there a long time, but it will not consistently go there and
it will not consistently stay there when it does. If it does, I'll buy
you a steak dinner. If it doesn't, you buy me one. Deal?

No deal, you're right. I was fooled by a chance period in which the spot
stayed near the target.

···

----------------------------------
RE: Ecoli2:

As the two limits were intended to keep Speed BETWEEN 0.3 and 2.0 (with
Speed being driven up or down by the consequences of moving), it did
not compute with me when you said the system was trying to keep Speed
below 0.3 or above 2.0.

You may have INTENDED for the program to keep Speed between 0.3 and 2.0,
but the program you WROTE makes Speed AVOID the region between 0.3 and
2.0. If the first system is acting, any speed above 0.3 is reduced,
iteration by iteration, toward 0.3. If the second one is acting, any
speed less than 2.0 is progressively raised toward 2.0.
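
For concreteness, here is the kind of pair of statements that behaves
that way (a hypothetical reconstruction from the description above, NOT
your actual code):

{ Hypothetical reconstruction -- not the actual ECOLI2 source. Each
  statement, when active, empties the band between 0.3 and 2.0 toward
  one of its edges instead of holding speed inside it. }

{ first system: any speed above 0.3 is driven down toward 0.3 }
if speed > 0.3 then speed := speed - LearnRate;

{ second system: any speed below 2.0 is driven up toward 2.0 }
if speed < 2.0 then speed := speed + LearnRate;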

Here is a program statement that will maintain speed between 0.3 and
2.0:

if speed >= 2.0 then
  speed := speed - LearnRate   { too fast: bring speed down to the band }
else if speed <= 0.3 then
  speed := speed + LearnRate;  { too slow: bring speed up to the band }

Now if speed is ever caused (by other program steps) to be outside the
range from 0.3 to 2.0, LearnRate will be applied as needed to restore it
to that range. Within that range it is not controlled -- that is,
perturbations tending to change Speed without putting it outside the
limits will not be resisted. This is a control system with a large dead
zone.
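
To see the dead zone in action, here is a minimal test program (my own
illustration, not part of ECOLI2; the starting speed and LearnRate are
arbitrary):

program DeadZoneDemo;
const
  LearnRate = 0.05;    { arbitrary correction per iteration }
var
  speed: real;
  i: integer;
begin
  speed := 3.0;        { start outside the 0.3..2.0 band }
  for i := 1 to 30 do
  begin
    if speed >= 2.0 then
      speed := speed - LearnRate
    else if speed <= 0.3 then
      speed := speed + LearnRate;
    writeln(i:3, speed:8:2);  { drifts down to the 2.0 edge, then
                                sits just inside the band }
  end;
end.

Any perturbation that leaves speed between 0.3 and 2.0 goes completely
uncorrected, which is what makes the zone "dead."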

If you had actually succeeded in writing a program that ALWAYS keeps
speed
between 0.3 and 2.0, there would never have been a tumble, because
tumbles occur only when speed is less than or equal to 0.3.
-------------------------------

On more important matters, I've completed my research protocol for the
animal care and use committee on the operant schedules project, and
expect to have approval to order rats in about a month or so.

Excellent. Some time in January, then.

I've also been playing with Operant3. I'll have some comments when
I've had a bit more time to explore its behavior under various
parameters.

Good. There's lots to explore.

Regarding the learning model of E. coli, I've had some thoughts of my
own concerning how to handle this; when I get some "play" time I'll try
to model both approaches and see which works best.

I think that such a learning model is possible, but that to make it work
we would have to make the learning EXTREMELY slow. I played around with
it a little, and realized that you can't get a good measure of squared
error without averaging over a very long time, during which the organism
might well get into regions of such low concentration gradient that the
hill-climbing effect would become vanishingly weak. The fact that the
basic operation of the system is already a statistical procedure makes
it impractical, I think, to add a higher level of statistical procedure.
Of course you might prove me wrong, but if I'm right a lot of time could
be wasted trying to make this learning work. E. coli can behave properly
in gradients of over 20 kinds of attractants and repellents, but I don't
know of any evidence that an individual learns to do this (although
what's an individual in an organism that reproduces by dividing?).
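
For what it's worth, here is the kind of slow error-averaging I had in
mind above -- a sketch of mine, not code from any posted model. The
leaky integrator needs on the order of 1/AvgRate iterations to settle,
which is why any learning built on top of it would have to be extremely
slow:

program SlowAverage;
const
  AvgRate = 0.0001;    { assumed smoothing coefficient; smaller means
                         steadier but slower }
var
  sqErrAvg, error: real;
  i: longint;
begin
  randomize;
  sqErrAvg := 0.0;
  for i := 1 to 100000 do
  begin
    error := random - 0.5;   { stand-in for a noisy error signal }
    { leaky integrator: converges to the mean squared error only
      after roughly 1/AvgRate = 10000 iterations }
    sqErrAvg := sqErrAvg + AvgRate * (error * error - sqErrAvg);
  end;
  writeln('estimated mean squared error: ', sqErrAvg:8:5);
end.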
-----------------------------------------------------------------------
Avery Andrews (a day or two ago):

So the Tai Chi masters support my hypothesis that the locus of
reorganization follows the locus of awareness?
-----------------------------------------------------------------------
Phil Runkel (direct post) --

... whether I can put in the greater-than signs I do not know.

Try holding down the shift key and pressing the key two places to the
right of the M key. That's how I do it.
------------------------------------------------------------------------
Ray Allis (941111.1630 PST)--

RE: "Beam in the Bin" paper.

I finally got all 216K of the paper downloaded. To expand it, I
discovered (eventually) that I needed gzip124.zip, which is on Simtel
mirror sites. At the Washington University site it is in
systems/ibmpc/simtel/compress.

When expanded, the file proved to be a 1.2 MB PostScript file, which
took half an hour to print on my dot-matrix printer using Ghostscript.
Thanks A Lot.

As Mary says, this paper by Bessiere et al. is an example of a man with
a hammer going around looking for nails. In this case, the hammer is
used to drive thumbtacks, and there is a much easier way to do at least
the main part of what his "robot" does.

Basically, the robot consists of a motor turning a shaft on the end of
which is a side-looking photocell. This shaft extends vertically into a
"green plastic bin" with a light-bulb mounted in one side of it. As the
shaft turns, the photocell sees and reports a varying light intensity,
some light coming directly from the bulb and some from reflections
inside the bin.

The purpose of the robot is described as follows:

    We want our device to accomplish the following tasks:
     - learn "internal representation" of its environment;
     - predict value of the sensory variable given the value of the
       control variable;
     - generate motor command to reach a sensory situation;
     - recognize situations (different environments);
     - recognize novelty (non previously learned environment);
    and finally, behave consistently to exploit and explore changing
    environments.

The authors are obviously thinking of this robot as a very simple case
of a much more complex process. This is probably why they overlook some
of the very simple ways of doing most of the same things. In fact, they
indulge in "overinterpretation," a word I ran across recently in a
similar application and which filled a hole in my vocabulary for
describing needlessly complex descriptions of simple processes. Now to
methods:

    We explore this "environment" by ... drawing at random angles q,
    moving the robot [shaft] to this position [angle] and measuring the
    intensity of light on the photoelectric cell. Following this
    protocol we may build a histogram of the couples of values (q,i)
    observed as shown on Fig. 4.

This approach required making lots of random samples and building up the
histogram. Of course it would be much easier just to step the shaft
through 360 degrees and record the intensity in each bin of an array.
However, once you have statistics on the brain the point of the exercise
becomes that of using the tool, not of solving the problem in a simple
way. So the robot builds up the histogram the slow hard way.
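
The easy way, in program form, would amount to something like this
(StepMotor and ReadCell are hypothetical hardware routines, stubbed
here for illustration; nothing of the sort appears in the paper):

program ScanBin;
var
  intensity: array[0..359] of real;
  angle: integer;

{ hypothetical hardware interface, stubbed for illustration }
procedure StepMotor(a: integer);
begin
  { turn the shaft to angle a }
end;

function ReadCell: real;
begin
  ReadCell := 0.0;   { return the sampled photocell intensity }
end;

begin
  for angle := 0 to 359 do
  begin
    StepMotor(angle);
    intensity[angle] := ReadCell;  { one sweep maps the whole bin }
  end;
end.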

    Using "probability as logic" as basic cognitive theory for sensory-
    motor agent implies that the relevant form for a state of knowledge
    of a rational agent should be a distribution of probability.
    Consequently, we want now to derive a probability distribution (a
    state of knowledge of our robot) from the above histogram (the set
    of sensory-motor experiences).

And of course they proceed to do so. Then:

    The next step is to generate motor commands to reach a wished
    sensory situation. It is the inverse problem relatively to the
    previous one.... In our obviously much simpler problem, it means
    being able to generate a motor command q [angle] that gives a good
    chance to observe a wished i [light intensity]. This is given by:

      [An expression deriving one conditional probability as a function
    of three other conditional probabilities]

What is nice about this is that the authors recognize the behavioral
problem as one of producing a desired perceptual situation. What is not
nice is that they think this has to be done by deriving the output
action that is predicted to produce the desired perception. It does not
occur to them that they could set a reference intensity i*, compare the
current intensity of light to it, and rotate the motor until the
difference disappears. If they did that, they would _guarantee_ that the
robot would always end up experiencing exactly the desired light
intensity whenever that intensity can be observed in some direction.
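
In case the simplicity of this is not obvious, here is a sketch of such
a control system (my own illustration; the cosine intensity profile and
the gain are assumptions standing in for the real bin and bulb):

program IntensityControl;
const
  iRef = 0.7;       { reference intensity i* }
  Gain = 10.0;      { degrees of rotation per unit of error }
var
  angle, i, error: real;
  step: integer;

{ assumed environment: bright facing the bulb, dim facing away }
function Intensity(a: real): real;
begin
  Intensity := 0.5 + 0.5 * cos(a * pi / 180.0);
end;

begin
  angle := 170.0;   { start almost facing away from the bulb }
  for step := 1 to 200 do
  begin
    i := Intensity(angle);          { perceive the light level }
    error := iRef - i;              { compare with the reference }
    angle := angle + Gain * error;  { rotate to reduce the error }
  end;
  writeln('settled at angle ', angle:7:2,
          ' with intensity ', Intensity(angle):6:3);
end.

If the bin is rotated so that a different angle now yields i*, the same
loop simply settles at the new angle; no histogram, and no detection of
the change, is required.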
----------------------------
It occurred to me while reading this paper that a conditional
probability distribution is just a way of describing a noisy
proportionality. If the distribution P(y|x) is centered on the line
y = Ax, then on the average, y = Ax. So a lot of expressions
calculated in terms of conditional probabilities reduce to linear
algebraic relationships, with some uncertainty thrown in. Could it be
that all these conditional probabilities would reduce similarly? If so,
this is surely the hard way to get a simple answer.
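
In symbols (my notation, not the paper's): if the conditional density
is, say, Gaussian noise centered on a line through the origin, the
average recovers the proportionality exactly:

  p(y \mid x) = \mathcal{N}\!\left(y;\, A x,\, \sigma^2\right)
  \quad\Longrightarrow\quad
  E[y \mid x] = \int y\, p(y \mid x)\, dy = A x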
----------------------------
As to recognizing different environments (plastic bin turned to
different angles relative to the light bulb), given a histogram this
could be done in a number of ways, among which the calculation of
probabilities may be a quick one. However, as far as achieving the
primary task (achieving a desired perception of light level) is
concerned, it isn't necessary to know when the environment is changed.
If you set the reference signal to i* and turn the motor until i = i*,
it doesn't matter whether the environment has changed. Either there is
some position that will bring about the desired value of i, or there
isn't. If there is, it will be achieved by the control system. If not,
it won't be achieved by any method.

In fact, to do this it isn't even necessary to build up the histogram
first, or ever.

The failure to see this simple solution tells me the authors are
probably making the same mistake we saw in a recent critique of control
theory in the literature (forgot who -- Meyerson?). The objection was
that in a two-level control system, there were many level-one situations
that would satisfy the second-level reference signal. Achieving a
specified second-level goal didn't specify a unique set of values of the
first-level variables.

When control of perception is involved, the uniqueness of the lower-
order situation is irrelevant. The purpose of the control process is to
make the second-level perception match the second-level reference
signal, and that is all. If more than one lower-level situation will
have that result, then it doesn't matter to the higher system which
lower situation results in any given instance of control.

Here we have a robot given the task of creating a specific amount of a
perception of light intensity. The authors interpret this task as that
of finding a specific output act that has a good probability of
generating the specified light intensity. So only some subset of all
possible outputs is admissible; if the situation changes so that some
other output is needed to produce the same light intensity, this
approach will not find that output because it can only look at its
histogram to see which outputs produced that light intensity in the
past.

This is the only reason that the authors add the capability of detecting
a change in the environment's statistical distributions. Detecting such
a change is a way of discovering that a different histogram must be
used. And if NO histogram contains a suitable action going with the
desired intensity, then the system has to be able to recognize the
"novelty" and construct a new histogram. All of this is very logical and
plausible, but in terms of the stated primary task, unnecessary. The
authors are worrying about things that can't possibly matter to a robot
that controls perceived light intensity by means of rotating the shaft
of a motor. The robot couldn't care less about the uniqueness of its
solution, in terms of the actual situation in the green plastic bin.
Distinguishing environments is a red herring.

Actually a perfectly ordinary control system will work to accomplish the
task if it can be accomplished at all. As the motor turns, the light
intensity will rise and fall. Depending on the sign of the connection
from the error signal to the direction of rotation of the motor, the
specified light intensity will be found either on the clockwise-
advancing
side of the intensity-position curve, or on the counterclockwise-
advancing side. When the correct intensity is found on the wrong side,
the feedback will be positive and the motor will skip past that position
to the nearest position where feedback is negative again (there will
always be such a position if there is a positive-feedback position). And
in all cases, the motor will stop turning when the intensity matches the
specified reference intensity. The only requirement is that there be
_some_ angle of the shaft at which the intensity rises slightly above
the reference intensity. If that condition is met, it doesn't matter how
the environment changes in other respects -- such as its statistical
distribution of intensities.
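
To make the sign argument concrete, here is a variation on the earlier
sketch (same assumed cosine environment; everything here is my
illustration, not the paper's robot). Flipping the sign of the
error-to-motor connection just moves the resting position to the
opposite flank of the intensity curve:

program SignDemo;
const
  iRef = 0.7;
  Gain = 10.0;
var
  angle, error: real;
  step, sign: integer;

function Intensity(a: real): real;
begin
  Intensity := 0.5 + 0.5 * cos(a * pi / 180.0);
end;

begin
  for sign := -1 to 1 do
    if sign <> 0 then
    begin
      angle := 10.0;   { start near the intensity peak }
      for step := 1 to 500 do
      begin
        error := iRef - Intensity(angle);
        { the connection sign determines which flank is stable; the
          wrong flank gives positive feedback and is skipped past }
        angle := angle + sign * Gain * error;
      end;
      writeln('sign ', sign:2, ': settled at angle ', angle:8:2,
              ', intensity ', Intensity(angle):6:3);
    end;
end.

With either sign, the motor stops where the perceived intensity matches
the reference; only the angle at which that happens differs.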
----------------------------------------------------------------------
Best to all,

Bill P.