Good Morning Bill,
Hope you have a great day. Happy Birthday Bill. I had breakfast with David Goldstein this morning. His birthday is 9/1/10.
Carter
···
— On Sat, 8/28/10, Bill Powers powers_w@FRONTIER.NET wrote:
From: Bill Powers powers_w@FRONTIER.NET
Subject: Re: hPCT Learning - Trying Again
To: CSGNET@LISTSERV.ILLINOIS.EDU
Date: Saturday, August 28, 2010, 12:23 PM

[From Bill Powers (2010.08.28.0740 MDT)]

Ted Cloak (2010.08.27.1638 MDT) –
TC: The “data”, if I may call it that, is my observations of a kitten growing into an adolescent cat. I’ve watched Sidi (short for Obsidian) practice running, jumping, stalking, pouncing – hour after hour, day after day. Each day (well, week) I observed her getting more and more expert at those tasks. I’m trying to explain this natural (as opposed to laboratory) learning which is, of course, common to all vertebrates and probably all animal species.

BP: So am I. Reorganization is my proposed basic method of learning, which I think may be the natural way common to all animals that learn.
However, I’m finding it difficult to communicate just how PCT reorganization works, so this conversation is turning out to be useful. If I keep working on making it clearer, perhaps eventually it will become comprehensible. Evidently you as well as others already have a concept
of learning in mind, and you’re assuming that the reorganization approach works the same way. But it doesn’t.
The main fact is that PCT reorganization does not work by trying out different reference signals or different behaviors. The best way to see it in action (other than watching cats grow up) is to look at the demos in LCS3.
Demo 7-2 shows a simple system with just three controllers and three environmental variables. It starts out with all three control systems sensing variables constructed in three different ways from all three environmental variables. The input weights of the three systems, nine in all, are set at random at the start of a run and remain the same from then on; adding input reorganization is a project for the future. The output functions of the three systems are each connected to all three environmental variables through adjustable weights, which are initially set to zero. In this demo, the three reference signals
vary in a repeating pattern of magnitudes that traces out a Lissajous pattern in three dimensions. This pattern of reference signals never changes. But control is so poor at first that the controlled perceptions change very differently from the way the reference signals change.
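[A fixed, repeating three-dimensional reference pattern of the kind described could be sketched like this; the particular frequencies and phases are illustrative choices, not the actual demo's values:]

```python
import math

# Illustrative sketch (not the actual LCS3 demo code): three reference
# signals whose magnitudes trace a Lissajous figure in three dimensions.
# The pattern never changes; it simply repeats with period 2*pi.
def reference(t):
    """Return the three reference magnitudes at time t."""
    return (math.sin(1.0 * t),
            math.sin(2.0 * t + math.pi / 4),
            math.sin(3.0 * t + math.pi / 2))
```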
The only variables being altered by reorganization are the three output weights in each of the three systems. For each system, there is (in addition to the three weights) a set of three auxiliary numbers that are set at random to values between -1 and 1. These “speed” numbers are added (multiplied by a very small constant like 1/100,000) to the output weights on every iteration of the program. The output weights therefore begin to change at a constant speed, the speeds being proportional to the “speed” numbers. This corresponds to the “swimming” phase of E. coli’s behavior.
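[The “swimming” phase just described can be sketched in a few lines of Python; the names and the exact constant are my own, chosen to match the description above:]

```python
import random

# Illustrative sketch of the "swimming" phase: each output weight drifts
# at a constant speed given by a random "speed" number between -1 and 1,
# multiplied by a very small constant (here 1/100,000, as in the text).
RATE = 1e-5

weights = [0.0, 0.0, 0.0]                               # output weights start at zero
speeds = [random.uniform(-1.0, 1.0) for _ in range(3)]  # random "speed" numbers

def swim_step(weights, speeds, rate=RATE):
    """One iteration of the program: add rate * speed to every output weight."""
    return [w + rate * s for w, s in zip(weights, speeds)]

for _ in range(1000):
    weights = swim_step(weights, speeds)
```

After any number of iterations each weight has simply moved in a straight line at its own constant speed, which is what makes the later “tumble” a change of direction rather than a jump.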
The three control systems start life in a very crude form. The loop gain is zero because all the output weights are zero. Soon after the start, all the output weights are nonzero, so each of the three control systems is trying to affect all three environmental variables. The weights, however, are very small, so the amount of action generated is small at first, and the weights are not adjusted appropriately, so the control is very bad. The control systems interfere with each other because each one affects not only the variable it is supposed to be controlling, but variables in the other two systems as well, and it may not be affecting its own variable enough or in the right direction.
This beginning situation is similar to what is found by neurologists looking at motor systems in neonates. There is a crude general input-output arrangement connecting senses to muscles, so the control systems are sort of sketched in, thanks to evolution. But there are far more connections from sensory to motor nuclei than are needed, and most
of them are not the right connections. As maturation and practice proceed, the number of connections is gradually reduced, a process they call “pruning.” In the end, the wrong connections and superfluous connections are pruned away (I would say, the weights are reduced close to zero), leaving only the right connections. And what doesn’t show in the crude observations possible in living brains is that the remaining weights continue to change so the control systems become more stable and more skillful at keeping errors very small.
So at first, we have the weights changing at some rate, a different rate for each of the three output weights. If the running average of squared error of a control system is decreasing as a result, because control is getting better even though still not very good, that change continues. Eventually, however, the three weights will be as close to the right amounts as possible, and then the changes will start making the
error larger. As soon as that happens, there is a tumble. A reorganizing control system sitting off to one side senses the increase in absolute error, and produces an output that changes the auxiliary “speed” variables to new values between -1 and 1 – at random. This starts the output weights changing, iteration by iteration of the program, in different proportions, so if we plotted the weights in three dimensions, the three-dimensional direction in which the resultant is moving would be different. That change of direction is a tumble.

There is just as much chance that this random change will leave the error still increasing, or increasing faster, as there is that it will make the error start to decrease; in the former case another tumble will occur immediately. If the tumbles come close together, the weights will not change by very much. Eventually, a tumble will set the weights changing in a direction that makes the error decrease, and the tumbles will cease. The
weights will go on changing in the new proportions as long as the error keeps getting smaller.

Clearly, this principle should make all the weights approach the values they must have for the error in each control system to get as small as it can get. I have found that multiplying the effects of the speed variables by a number proportional to the absolute amount of error produces more efficient convergence, making the speed of change approach zero as the error approaches zero.
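[The whole swim-and-tumble cycle can be put together in a toy sketch. The “control task” here is only a stand-in: error is measured as the distance of the three output weights from a hypothetical best set of weights, rather than coming from a real simulated control loop, and all names and numbers are illustrative:]

```python
import random

# Toy sketch of E. coli reorganization: swim while error decreases,
# tumble (pick new random speeds) when error increases, and scale the
# speed of change by the current error so change stops as error -> 0.
RATE = 1e-3
BEST = [0.7, -0.3, 0.5]  # hypothetical weights giving the best control

def error(weights):
    """Stand-in for a system's running average of absolute error."""
    return sum((w - b) ** 2 for w, b in zip(weights, BEST)) ** 0.5

weights = [0.0, 0.0, 0.0]
speeds = [random.uniform(-1.0, 1.0) for _ in range(3)]
prev_err = error(weights)

for _ in range(200_000):
    err = error(weights)
    if err > prev_err:
        # tumble: new random speed numbers, i.e. a new direction of change
        speeds = [random.uniform(-1.0, 1.0) for _ in range(3)]
    # change proportional to error: approaches zero as error approaches zero
    weights = [w + RATE * err * s for w, s in zip(weights, speeds)]
    prev_err = err
```

Run long enough, the weights drift into the neighborhood of the best values and stay there, with tumbles becoming both rarer and less consequential as the error shrinks.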
It must be understood that all during this process, the reference signals for the three systems in the demo are varying in a fixed pattern that never changes. In other demos, the change in reference signal is made random, and random disturbances are added, and the reorganizing process still converges to the best control possible to achieve by slowly adjusting output weights. The actual behavior patterns, in that case, may never repeat, and the reference values may
never repeat their patterns, either, yet control will continue to improve over time. I have shown all these effects in various demonstrations, though only the simplest of them are in LCS3.

Demo 8-1 shows this same principle with random disturbances and with the reference signals varying in a fixed pattern, for an arm with 14 degrees of freedom. You can see the movements becoming more regular as time goes on, starting with clumsy flailings and ending with a smooth, regular Tai Chi exercise pattern. The Tai Chi reference pattern remains exactly the same from beginning to end of the demo, but the ability to control the arm to match it while resisting the effects of disturbances continually improves.
TC: I do think we need to understand better how a control system (CS) learns how to control its output to obtain/maintain the input demanded by its reference signal, and I’ve got a suggestion for that, below.

BP: I hope you can see now that reorganization theory provides the explanation you’re looking for, and the demos do exactly what you describe.
TC: But being in a hierarchy, a CS also needs to learn how to help the CS which addresses it obtain/maintain the input demanded by that CS’s reference signal.

BP: I disagree. All that a control system in the hierarchy has to learn to do is adjust its outputs to lower systems in a way that keeps its own perceptions matching whatever reference signal it is given, as quickly and accurately as possible. When that is achieved, a still-higher control system can use the lower one as its means of controlling its own, higher-order, perception. Adjusting the reference signal in the lower system will quickly and accurately make the lower system’s perceptual signal follow the changing values of that reference signal. That perceptual signal then provides one input (among many) that the higher perceptual input function needs to produce the desired amount of the higher-order perception.
I’ll leave it at that for now.

Best,
Bill P.