[Martin Taylor 2002:03:30 23:45]
[From Rick Marken (2002.03.21.1630)]
One thing I was wondering about was whether there is any
effect on the level of skill achieved of the level of disturbance in the
environment in which the training occurs... What does the "learning"
version of the Little Man model
say? Does the model learn differently (in terms of things like the time it
takes to reach maximum proficiency -- measured as gain, perhaps -- or the level
of maximum proficiency) depending on the magnitude of the disturbances in
the environment in which it learns?
I think you have to ask first what "learning" means in PCT.
There are different kinds of learning (quite apart from the
conventional distinction between memory for facts and events and
memory for skills). One split, in PCT, is between learning within an
Elementary Control Unit (ECU-- one scalar perception being
controlled) and learning involving an entire control system, which
involves the coordination of a network of ECUs at several levels.
First, learning within a single ECU.
Within a single ECU there are at least three ways in which skill can
be improved, and I would imagine they are differentially susceptible
to the level of disturbance during learning. The three ways, in the order
in which they depend on one another, are:
(1) Alignment of perceptual and output vectors in the environment
descriptor space.
(2) Loop transport lag reduction
(3) Gain increase.
The latter two are probably obvious: reduction in transport lag allows
increased gain without loss of stability, and in itself increases
tracking accuracy. The first needs a little explanation, which I'll get
to after a small sketch of the lag/gain trade-off.
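To make the latter two concrete, here is a toy simulation -- my own
illustration, not the Little Man model or anything from the literature:
a single ECU with an integrating output function, tracking a moving
reference through a pure transport delay, with no disturbance. The
gains, delays, and time step are arbitrary assumptions; the point is
only that a short lag tolerates a high gain (and tracks well), while
the same gain behind a long lag goes unstable.

import numpy as np

def rms_tracking_error(gain, lag_steps, n_steps=2000, dt=0.005):
    """One ECU with an integrating output function and a pure transport delay.

    Environment: the controlled quantity is simply the output delayed by
    lag_steps samples (no disturbance).  Returns the RMS difference between
    the reference and the perception over the second half of the run.
    """
    t = np.arange(n_steps) * dt
    reference = np.sin(2 * np.pi * 0.5 * t)        # pursuit-style moving reference
    output = np.zeros(n_steps)
    error = np.zeros(n_steps)
    for k in range(1, n_steps):
        perception = output[k - lag_steps] if k >= lag_steps else 0.0
        error[k] = reference[k] - perception
        # integrating output function, clipped so unstable runs don't overflow
        output[k] = np.clip(output[k - 1] + dt * gain * error[k], -1e6, 1e6)
    return float(np.sqrt(np.mean(error[n_steps // 2:] ** 2)))

for lag in (2, 20):                                # short vs. long transport lag
    for gain in (5, 50, 100):
        err = rms_tracking_error(gain, lag)
        print(f"lag={lag:2d} steps  gain={gain:3d}  RMS error={err:10.3f}")

With the short lag, raising the gain steadily shrinks the tracking
error; behind the long lag, the same higher gains make the loop blow up.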
An ECU is usually drawn simply, as a loop in which a reference signal
value is compared to the value of a perceptual signal, the difference
feeds an output function that sends a signal through the environment
where a disturbance is added, and the result is fed back to the
perceptual function that produced the perceptual signal in the first
place. That diagram is nice and simple, but it hides two important
facts, both hidden in the word "function" which is used twice.
What normally happens is that the output affects a great many
variables in the environment, not just one signal on a wire. At the
same time, the perceptual function takes its input from many
variables in the environment, but not necessarily the same ones as
are affected by the output function. In other words, the output
creates side-effects, and the perceptual signal includes
environmental effects that cannot be influenced by the output.
"Alignment of the perceptual and output vectors" means that when
alignment is exact, every variable that influences the perceptual
signal is affected by the output to exactly the degree that it
affects the perceptual signal.
If you look at "learning" to track in this way, you can see that
there ought to be a three-stage development of skill. The first stage
involves learning to perceive what is to be tracked and learning to
affect what is to be tracked without wasted motion (which means both
the perceptual and the output vector must separately be aligned with
some vector the experimenter defines in the environment).
If there are extrinsic disturbances (large or small) in the initial
stages of learning, the initial alignment is likely to be
problematic--remember that in a high-dimensional space, any two
random vectors are highly likely to be nearly orthogonal. This means
that initially, the learner is casting around to find what is
supposed to be tracked, as well as how to influence it when it has
been discovered. Of course, in practice, verbal descriptions go a
long way to eliminating this initial random search, and provide an
approximate alignment from which real skill training can be developed.
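To put a number on "alignment", here is a rough sketch, with
assumptions entirely of my own choosing: treat the output function's
effect on the environment and the perceptual function's input weighting
as two vectors over the same set of environmental variables, and take
the cosine of the angle between them as the alignment. It also shows
the near-orthogonality point: two random vectors in a 50-dimensional
space hardly overlap at all.

import numpy as np

rng = np.random.default_rng(0)
dim = 50                                     # many environmental variables

def alignment(output_weights, perceptual_weights):
    """Cosine of the angle between the two vectors (1.0 = exact alignment)."""
    return float(np.dot(output_weights, perceptual_weights)
                 / (np.linalg.norm(output_weights) * np.linalg.norm(perceptual_weights)))

# An untrained system: both vectors are effectively random, hence nearly
# orthogonal -- most of the output is side-effect, and most of what feeds
# the perception is beyond the output's influence.
o_raw = rng.standard_normal(dim)
p_raw = rng.standard_normal(dim)
print("before learning: cos =", round(alignment(o_raw, p_raw), 3))

# After the first stage of learning, the output vector has rotated so that
# it points (mostly) at the variables the perceptual function reads.
o_trained = p_raw + 0.1 * rng.standard_normal(dim)
print("after alignment: cos =", round(alignment(o_trained, p_raw), 3))

The "after" vector is simply constructed to be aligned; how the system
actually rotates it is what the rest of this discussion is about.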
Once the output and perceptual vectors are sufficiently well aligned
that the output affects the perceptual signal appreciably more than
the disturbance does, I would expect that a
moderate disturbance would aid learning, because it would assist the
learner to discriminate those aspects of the environment that are not
part of the "to-be-tracked" (TBT) vector from those that are, and to
discriminate those parts of the environment that are affected by the
output from those that are not.
(Parenthetically, this seems to involve a monitoring control system,
but I don't think it does in practice. When I say "discriminate" I
mean in the same sense that the Artificial Cerebellum comes to match
the loop dynamics if they are non-white, and indeed, that specific
learning may be important in phases 2 and 3 of the learning process,
lag reduction and gain increase).
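Here is a speculative sketch of that discrimination, again entirely my
own toy rather than a model of the real adaptive process: with an
independent disturbance jostling every environmental variable, the
variables the output really influences are the ones that stay
correlated with the output, so they stand out from everything else.
The explicit correlation here is just a stand-in for whatever implicit
adaptation (Artificial-Cerebellum-like or otherwise) does the job in
practice.

import numpy as np

rng = np.random.default_rng(1)
steps, dim = 5000, 10

output_activity = rng.standard_normal(steps)        # exploratory output variation
true_output_weights = np.zeros(dim)
true_output_weights[:3] = [1.0, 0.5, 0.25]           # the output really affects only 3 variables

disturbance = rng.standard_normal((steps, dim))      # independent disturbance on every variable
environment = np.outer(output_activity, true_output_weights) + disturbance

# Correlating each environmental variable with the output separates the
# variables the output can influence (indices 0-2) from the ones it cannot.
correlations = np.array([np.corrcoef(output_activity, environment[:, i])[0, 1]
                         for i in range(dim)])
print(np.round(correlations, 2))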
The bottom line here is that one might expect learning initially to be
best with no disturbance, but after only a very short
get-acquainted stage, learning should be enhanced by introducing
disturbance at a level that increases as skill improves.
Obviously what I say here can't hold for compensatory tracking ("hold
the target at a fixed reference value") because if there were no
disturbance, there would be no varying perceptual signal and nothing
for the learner to learn. The way Rick posed the question suggests
that pursuit tracking is the problem, and in that case the reference
signal to the ECU varies as the "cursor" follows a separately
perceived "target." In that case, it makes sense to talk about zero
incremental disturbance.
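For concreteness, the two task types differ only in where the variation
comes from; in the toy terms used above (the names and signal shapes
are my own assumptions):

import numpy as np

def make_task(kind, n_steps=1000, dt=0.01, disturbance_amplitude=0.0):
    """Return (reference, disturbance) arrays for a toy tracking task."""
    t = np.arange(n_steps) * dt
    disturbance = disturbance_amplitude * np.sin(2 * np.pi * 0.3 * t + 1.0)
    if kind == "compensatory":
        reference = np.zeros(n_steps)             # hold the cursor at a fixed value
    elif kind == "pursuit":
        reference = np.sin(2 * np.pi * 0.2 * t)   # follow a separately moving target
    else:
        raise ValueError(kind)
    return reference, disturbance

# With zero disturbance there is nothing for a compensatory tracker to do,
# but the pursuit tracker still has a varying reference to follow.
for kind in ("compensatory", "pursuit"):
    reference, disturbance = make_task(kind)
    print(f"{kind:13s} reference variance={np.var(reference):.3f} "
          f"disturbance variance={np.var(disturbance):.3f}")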
------------------------------------------
But a control system consists not of one ECU, but of many, each of
which must learn how best to interact with its neighbours. Not only
is each capable of learning through the three phases, but also the
system performance changes as the interconnection strengths change.
That's a whole new ballgame, which I don't want to get into. (There's
some relevant material at
<http://www.mmtaylor.net/PCT/Mutuality/index.html> but what is there
is far from the whole story.)
I talked about another aspect at CSG '93, when I called it "The Bomb
in the Machine". This latter aspect derived from the necessary
non-linearity of the individual ECUs, which makes the network
susceptible to a cascading avalanche kind of reconfiguration. Such
avalanches, which derive from momentary or sustained loss of control,
may be the way important things are learned, as opposed to
incremental skill improvements. One way of looking at them is that
they have the potential to provide an "Aha" experience.
When you have the potential for large and small avalanches, it is
hard to be sure of the effects of changing disturbance magnitude. At
the right moment, a tiny disturbance can cause a big reconfiguration.
But on the whole, it is more probable that bigger disturbances cause
more frequent and larger reconfigurations. So (guessing madly) I would
expect that for most effective learning, intermediate disturbances
would have the potential for triggering reconfigurations until one
yields effectively controlled performance. At that point, the system
should be able to withstand and to profit from somewhat larger
disturbances, which are likely to be controlled by the newly
reconfigured system, and to increase the skills of the ECUs that were
involved in the new configuration.
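To give the guess some shape, here is a toy avalanche model -- mine,
and not the CSG '93 "Bomb in the Machine" model itself: each unit
carries a standing error, a unit whose error exceeds its tolerance
"reorganizes", and reorganizing jolts its neighbours, possibly pushing
them over their own tolerance in turn. The numbers are arbitrary; the
point is only the qualitative trend that bigger disturbances tend to
produce more frequent and larger cascades, while even a small one
occasionally sets off a big cascade.

import numpy as np

rng = np.random.default_rng(2)

def avalanche_size(disturbance, n_units=100, tolerance=1.0, coupling=0.4):
    """Inject one disturbance into a chain of units; count how many reorganize."""
    error = rng.uniform(0, tolerance, n_units)          # each unit's standing error
    error[rng.integers(n_units)] += disturbance         # the disturbance hits one unit
    reorganizations = 0
    while True:
        over = np.flatnonzero(error > tolerance)
        if over.size == 0:
            break
        for i in over:
            reorganizations += 1
            spill = coupling * error[i]
            error[i] = rng.uniform(0, 0.5 * tolerance)   # unit reorganizes, its error drops
            for j in (i - 1, i + 1):                     # ...but its neighbours get jolted
                if 0 <= j < n_units:
                    error[j] += spill / 2
    return reorganizations

for d in (0.2, 1.0, 3.0):
    sizes = [avalanche_size(d) for _ in range(300)]
    print(f"disturbance {d:.1f}: mean cascade {np.mean(sizes):5.2f}, largest {max(sizes)}")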
All speculation, but based on what ECUs do, singly and in concert.
Probably not helpful to Rick--but maybe it is. I hope so.
Martin