Adaptive System

[From Bruce Abbott (950615.1220 EST)]

Bill Powers (950614.1812 MDT)
   Bruce Abbott (950614.1510 EST)

Thanks for the diagram: it does make explicit some of the missing elements
of mine. However, I don't think either of us is completely happy with
either version. Either there are limits to what a simple diagram can
represent and still be comprehensible, or we haven't yet identified the
right way to do it.

    As a suggestion, I think a "cleaner" example of this concept could
    be developed based on the standard E. coli control model, with the
    adaptation mechanism operating on the control system gain factor.
    To function properly, the adaptation control loop must be sluggish
    compared to the system it adjusts.

If I were designing an adaptive E. coli system from scratch, I'd use a
two-level control system and not the reinforcement model.

Me too. Evidently I wasn't communicating well. I was using the word
"standard" to refer to the familiar control model and not to the
reinforcement-based one (note also the reference to the "control system gain
factor"). Sorry this wasn't clear. Note also the reference to the
"adaptation control loop," an outer loop that sets the parameters of
the inner loop. The system I'm proposing would, indeed, be a two-level
control system.

    It isn't really necessary to use information about the
    change in S across a tumble.

No, but what I am proposing is that using information about the relationship
between the output action and the consequent change in the error vector,
integrated over some period, might prove to be a very efficient way to
establish the system parameters of the lower-level loop, compared to simply
choosing parameters at random as a result of large, sustained error
(E. coli-style random reorganization).
    Come to think of it, your approach isn't really like Hans' model,
    because there's no world-model in it -- no model of the link between the
    output action and the controlled variable Q. If you cut off the input to
    your model, it would not go on producing the same output as before.

Yes, it isn't a "model-based" controller, but then again, I didn't say it
was. When I said it was like Hans' model, what I had in mind was that it
adjusts system parameters on the basis of experience with the results of its
attempts to control. As in Hans' adaptive controller, the system diagram
contains a box labeled "system parameters," and a control mechanism is put
into place which adjusts those parameters on the basis of specific
information about system performance.

    The reinforcement model has one deficiency, which is that it will adapt
    only to attractants. In order to make it adapt as appropriate for an
    attractant or a repellent, something would have to detect the kind of
    input that is being sensed, and change the definition of S+. You can
    manually change the definitions, but that's not "adaptation." As it
    stands, the model doesn't know whether S+ is a good thing or a bad
    thing. It assumes it's good. If you say that a positive dNut is to be
    avoided, you have to redo the model, don't you?

S+ and S- are only names denoting discriminably different stimulus
conditions, such as red and green. The designations as S+ or S- simply
denote whether those stimuli are consistently associated with relatively
attractive versus relatively aversive conditions (i.e., given that two
stimuli were associated with different nutrient concentrations, S+ would be
the one associated with the greater concentration, S- with the lesser). If
instead of nutrient, we substitute something toxic, S+ would be the stimulus
associated with the relatively more favorable conditions, the lower
concentration of the toxin, and S- with the greater.

There is no need to "manually change the definitions" of S+ and S- as these
definitions have no functional significance in the model. Red and green are
still red and green. If you change what is associated with them, then WE
might want to change what we call them (S+ or S-) in order to be consistent
with the definitions, but the model does not change.

What does change as you substitute a toxin for a nutrient is that increases
in toxin are expected to punish, whereas increases in nutrient reinforce,
and decreases in toxin are expected to reinforce, whereas decreases in
nutrient punish. The sign changes from positive to negative, from
attractive to repulsive. To handle a toxin gradient, the "valence" of the
substance located at the source in the environment, as perceived by the
organism, must be changed from positive to negative. This can be done by
hand in the present model by reversing the signs of the effects on PTS+ and
PTS-, but a better approach would be to designate a sensed valence for the
environmental substance and have the model use that valence as a multiplier
in the equations that modify the two probabilities. To model the organism's
behavior in a nutrient or toxin environment, we need only set the valence to
+ or -, respectively.
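One way the suggested valence multiplier might enter the probability-update equations is sketched below. This is an assumption-laden illustration, not the model's actual equations (which aren't given here): the function name, the update form, and the rate constant are all invented; only the roles of PTS+, PTS-, dNut, and the +/- valence come from the discussion above.

```python
def clamp(p, lo=0.0, hi=1.0):
    # Keep a probability inside [0, 1].
    return max(lo, min(hi, p))

def update_tumble_probs(pts_plus, pts_minus, d_nut, valence, rate=0.05):
    """Adjust the tumble probabilities in S+ (pts_plus) and S- (pts_minus).
    valence is +1 for an attractant and -1 for a repellent; it multiplies
    the sensed change, so the same equations handle both cases."""
    effect = valence * d_nut      # signed "goodness" of the sensed change
    # A favorable change makes tumbling in S+ less likely and tumbling
    # in S- more likely; an unfavorable change does the reverse.
    return (clamp(pts_plus - rate * effect),
            clamp(pts_minus + rate * effect))

# Attractant: rising concentration pushes PTS+ toward 0 and PTS- toward 1.
p_plus, p_minus = update_tumble_probs(0.5, 0.5, d_nut=0.1, valence=+1)

# Repellent: flipping the valence reverses the adjustment, with no other
# change to the equations.
q_plus, q_minus = update_tumble_probs(0.5, 0.5, d_nut=0.1, valence=-1)
```

The point of the multiplier is that switching from a nutrient to a toxin requires changing one signed constant, not rewriting the update rules.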

So how do we know what the "valence" of a substance is? The same way we
know, in the control model, what the reference is: through observation (we
set the reference so that the model behaves as the organism being modeled is
observed to behave), or by definition (if the substance is stated to be an
attractant, it should be given that status in the model, by setting a
positive valence, or by setting a positive reference).

    I'm not sure we should even call the reinforcement model, as it stands,
    an "adaptive" model, because it can produce only the one "adaptation":
    PTS+ -> 0 and PTS- -> 1. If it could automatically detect the difference
    between repellents and attractants it would be properly adaptive.

With the slight modification suggested above, the reinforcement model can
handle both repellents and attractants, and would detect the difference as
"automatically" as the control model does.

But all this is rather academic. The E. coli situation is not one in which
reinforcement principles actually apply beyond the simple notions of
attraction and repulsion. There is no learning going on, only behavior
determined by the internal structure of the organism and the properties of
its environment. I think most reinforcement theorists would simply suggest
that there is no reinforcement here, only the functioning of a hard-wired
adaptive mechanism for seeking nutrients--an instinctive or reflexive
mechanism rather than one operating according to reinforcement principles.
For this reason I believe that it would be fruitless to demonstrate that
reinforcement principles cannot "handle" some restrictive situation, such as
this one, that the control model works well in. Reinforcement theorists
would be more than happy, in that case, to grant that the control model
works, but would declare that the failure of the reinforcement model in this
situation would say nothing at all about its validity, as it would not be
expected to operate under those conditions. Reinforcement theorists do not
claim that reinforcement is the ONLY source of "control over behavior."