[Martin Taylor 970411 10:11]
[Bruce Gregory (970410.2030 EST)]
Martin Taylor 970410 12:00
> [Bruce Gregory (970409.1015 EST)]
>
> 1. The only way we learn is by trial and error-elimination.
I think this is a misconception about reorganization. It refers only to
the class of learning methods that involve random relinkage of elementary
control units, random changes in the weight structure of the hierarchy,
or random generation of new control units. It does not refer to gradient
(e-coli) learning, in which there may be a small shift in a random
direction followed by progressive continuation in the same direction
(refinement or tuning might be words for this).
It seems to me that you are assuming that the trials must involve random
relinkage, but that is not necessarily true of all trial and
error-elimination learning. In particular, I think the e-coli learning is
an example of trial and error-elimination.
Then we understand e-coli learning quite differently. I used it as the
prime example of _non_ trial and error-elimination, and as the method
that is most commonly used. Random relinkage probably happens much less.
Here's how I think e-coli learning works. I'm sure Bill or Rick will
correct me if I am wrong. And I imagine you must disagree with me, so it's
worth setting out, anyway.
(1) We have to assume that there is some kind of a criterion of good
control. That may be error in some intrinsic variable, for example. Or
it may be something like the local RMS error in a single control system,
or it may be something else. It doesn't matter, so long as changes in the
weight structure of whatever is doing the learning (either inputs to
a Perceptual Input Function or the distribution to lower levels of the
output) change the criterion smoothly over most of their range of variation.
And there exists some optimum point (call it a minimum squared error if
you want an example).
(2) There is a parameter set, called "weights" in the preceding paragraph.
Variations in this parameter set affect the criterion. The parameters can
be considered as basis vectors in a high-dimensional space, and the values
of the parameters determine the location of a point in this space. The
criterion is a single-valued smooth function of location in this space.
(3) Parameter values can be varied by small or large amounts, and the
effect of the variation on the criterion evaluated.
(4) Procedure:
(a) Evaluate the criterion at the current location.
(b) Make a small change by moving the location of the point in a random
direction.
(c) If the criterion gets worse, return to (b).
(d) If the criterion gets better, make another change in the same direction
again (some variants increase the size of the change, some don't).
That's it. I see no trial and error-elimination in that, unless there is
some meaning to the term as you use it that covers this rather precise
tuning procedure. Or, do you see the e-coli procedure as different from
this?
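The procedure in (a)-(d) is simple enough to sketch in a few lines. Here
is a minimal Python version, a toy rather than a claim about how any real
reorganizing system is implemented; the step size, the quadratic
criterion, and the particular "optimum" vector are all my own choices for
illustration:

```python
import math
import random

random.seed(0)  # fixed seed so the illustration is reproducible

def ecoli_step(point, criterion, step=0.05, max_tries=1000):
    """One tumble-and-run cycle of the procedure above:
    (a) evaluate; (b) move a little in a random direction;
    (c) if the criterion got worse, pick a new direction;
    (d) if it got better, keep going the same way until it stops
    improving."""
    best = criterion(point)                       # (a)
    for _ in range(max_tries):
        d = [random.gauss(0.0, 1.0) for _ in point]
        norm = math.sqrt(sum(x * x for x in d))
        d = [x / norm for x in d]                 # random unit direction
        trial = [p + step * x for p, x in zip(point, d)]
        value = criterion(trial)
        if value >= best:
            continue                              # (c) worse: tumble again
        while value < best:                       # (d) better: same direction
            point, best = trial, value
            trial = [p + step * x for p, x in zip(point, d)]
            value = criterion(trial)
        return point
    return point                                  # no improving direction found

# A hypothetical criterion: squared error from some optimum weight vector.
optimum = [1.0, -2.0, 0.5]
def criterion(w):
    return sum((wi - oi) ** 2 for wi, oi in zip(w, optimum))

weights = [0.0, 0.0, 0.0]
for _ in range(100):
    weights = ecoli_step(weights, criterion)
```

Note that nothing in the loop remembers which directions failed: a bad
direction is simply abandoned, and the same direction may well be drawn
again on a later tumble.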
Notice that this procedure never ends. When the criterion reaches its
optimum value, the e-coli still makes random small changes. If external
circumstances alter the location of the optimum, our e-coli will move
to the new optimum, provided that the old one is not a local optimum
in the new configuration. E-coli can get caught in local optima, but
the higher the dimensionality of the space, the less likely it is that
a point is an optimum in all directions if it isn't a true optimum. There's
usually a valley somewhere that leads downhill, though in high-dimensional
spaces narrow valleys can be exceedingly hard to find, so it may take
a while for the e-coli to get out of a local constriction. It will
eventually, though.
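That the procedure never ends is easy to demonstrate: let a toy version
settle, then move the optimum and watch it settle again. Here is a
self-contained sketch (again my own construction, with an arbitrary step
size and a target that plays the role of the external circumstances):

```python
import math
import random

random.seed(1)  # fixed seed so the illustration is reproducible

def ecoli(point, criterion, steps=2000, step=0.05):
    """Run the tumble-and-run procedure for a fixed number of steps.
    It never 'finishes': it just keeps trying random directions."""
    best = criterion(point)
    direction = None
    for _ in range(steps):
        if direction is None:                     # tumble: new random direction
            d = [random.gauss(0.0, 1.0) for _ in point]
            n = math.sqrt(sum(x * x for x in d))
            direction = [x / n for x in d]
        trial = [p + step * x for p, x in zip(point, direction)]
        value = criterion(trial)
        if value < best:
            point, best = trial, value            # run: keep the same direction
        else:
            direction = None                      # no memory of failed directions
    return point

target = [2.0, 2.0]                               # hypothetical optimum location
crit = lambda w: sum((a - b) ** 2 for a, b in zip(w, target))

point = ecoli([0.0, 0.0], crit)                   # settles near (2, 2)
target[:] = [-1.0, 3.0]                           # circumstances move the optimum
point = ecoli(point, crit)                        # tracks the new optimum
```

Because the random small changes keep coming even at the optimum, the
second call needs no restart or special signal that the world has changed.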
I just thought of a way you might see this as "error elimination." Perhaps
you think that e-coli won't again try a direction that didn't work on an
earlier random try. That's not true. It won't try a direction that proved
to be bad once it finds a good direction, because once it finds a good
direction it just keeps going in that direction until it's gone too far.
But until then, and again at the next "turning point", it may well retry
a direction that was bad. The procedure (in its pure form) has no memory
beyond keeping going so long as the criterion keeps improving.
Your turn.
My point is that learning is an active process and the learner
must act without knowing the results of that action (at least initially).
If that's your point, I couldn't agree more. I didn't see it in your
language. But that's a hazard we all face, isn't it? :-(
Martin