reorganization, operant project

[From Bill Powers (941229.1500 MST)]

Martin Taylor (941229.1430) --

     I thought I was trying to solve the problem you pose. Yes, there is
     a "goal in the hierarchy." It is to keep the intrinsic variables at
     their reference levels.

But what system is it that perceives and controls intrinsic variables? A
system that controls the relationship between a target and a cursor
can't ALSO be concerned with controlling intrinsic variables, can it? I
don't get what you mean by saying that a hierarchical system can have
the goal of keeping intrinsic variables at their reference levels.

I have been thinking that the only system that can control the states of
intrinsic variables is one that perceives them, compares the perception
with a reference level, and acts on the basis of the difference. What
are you talking about?

     The primary thing that is varied is, in the separate reorganizing
     structure, the various signal paths (and the related I/O functions
     for the different ECUs). In the localized system, the primary thing
     that is varied is the set of reference signal LEVELS. Paths do
     change, but as a consequence of failure to control, locally.

I must be particularly dense this afternoon. In a localized system,
changing reference _levels_ for lower systems is the normal mode of
operation, isn't it? A reorganizing aspect of a local system in the
hierarchy would have to do things like varying the output gain or the
weightings of signals sent to lower systems -- but the _values_ of the
signals reaching lower systems have to be determined (via the output
function) by the error signal if we're to have a behavioral control
system of the usual sort. You can't have a higher system learning to
send a particular _value_ of reference signal to lower systems, can you?
If it did that, how would it counteract disturbances, or bring its
perception to a new reference value received from above? I think I must
be missing your point here.
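
A minimal sketch of the question about reference _values_, assuming a
single loop with an integrating output function and an environment in
which the perception is simply output plus disturbance. The name
run_loop, the gain, and the time step are hypothetical choices made for
illustration only; none of them comes from a model discussed in this
thread.

```python
def run_loop(reference, disturbance, gain=50.0, dt=0.01, steps=2000,
             fixed_output=None):
    """Simulate one control loop and return the final perception.

    Assumed environment: perception = output + disturbance.
    If fixed_output is given, the system ignores its error signal and
    always sends that one learned value downward.
    """
    output = 0.0
    perception = 0.0
    for _ in range(steps):
        perception = output + disturbance
        error = reference - perception
        if fixed_output is None:
            # Usual arrangement: the output is driven by the error signal.
            output += gain * error * dt
        else:
            # Hypothetical "learned value": the error has no effect.
            output = fixed_output
    return perception


# Error-driven output brings the perception to the reference (10)
# despite a disturbance of 3.
controlled = run_loop(reference=10.0, disturbance=3.0)

# A fixed value of 10 would have been right with no disturbance, but
# the disturbance now passes straight through to the perception.
uncontrolled = run_loop(reference=10.0, disturbance=3.0, fixed_output=10.0)
```

In this sketch the error-driven loop ends with a perception close to
10, while the fixed-value version ends at 13, the fixed output plus the
uncorrected disturbance.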

···

------------------------------
RE: Little Baby

     Any variation in the hierarchy that can be imposed by the separate
     reorganizing system is assumed to be available to the localized
     system. In the Little Baby experiment, we plan to test three kinds
     of variation, together or separately: (1) variation of linkage
     weights (as you suggest below, but not as I previously thought you
     accepted), (2) variation of perceptual function (changing WHAT is
     perceived by any particular ECU), and (3) generation of new ECUs
     using Genetic Algorithms to create the new ECU with characteristics
     combined from two parents.

(I hope this indenting is sufficient to mark off your posts -- I get
tired of deleting hard returns and adding all those ">" marks).

Sounds good to me. I have speculated that one kind of preorganization
that would be inheritable would be arrays of comparators, because in at
least the canonical model, all comparators are exactly alike at any
level. This doesn't violate the principle of assuming that the neonate
knows nothing of the external world.

The third (Genetic Algorithm) method will be interesting to see. I've
always wondered how the characteristics of real parents are "combined"
to produce new characteristics in the offspring. If one parent control
system has a gain of 100 and the other a gain of 20, does the child
control system have a gain of 60? Or if one parent perceives a color
that is 0.3*red + 0.6*green + 0.8*blue, and the other perceives a
different weighting, how do we combine these to create the child's
perceptual function? Or if one parent writes right-handed and the other
writes left-handed, how does the child write?
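
To make the question concrete, here is a sketch of two standard ways a
genetic algorithm can combine numeric parameters: averaging (blend) and
uniform crossover. Which of these, if either, the Little Baby project
will use is not stated here, so treat both as assumptions. The function
names and the second parent's color weights are invented for the
example.

```python
import random


def blend(parent_a, parent_b):
    """Averaging crossover: each child parameter is the parents' midpoint."""
    return [(a + b) / 2 for a, b in zip(parent_a, parent_b)]


def uniform_crossover(parent_a, parent_b, rng):
    """Uniform crossover: each parameter is copied whole from one parent."""
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]


# Gains of 100 and 20: averaging gives the child a gain of 60.
child_gain = blend([100.0], [20.0])[0]

# Perceptual weights for (red, green, blue). The first parent is the
# example in the text; the second parent is hypothetical.
parent_a = [0.3, 0.6, 0.8]
parent_b = [0.9, 0.2, 0.1]

averaged_child = blend(parent_a, parent_b)
mixed_child = uniform_crossover(parent_a, parent_b, random.Random(0))
```

Averaging yields a perceptual function that neither parent had, while
uniform crossover keeps each individual weight intact but mixes them
across parents. Neither operator has an obvious answer for a discrete
trait such as handedness.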

Even more interesting will be to see how levels of control are handled.
It seems to me that to speak of inheritability of control
characteristics, it's necessary to assume that the hierarchy itself is
inherited in complete form rather than learned through experience with
the world. So if Ms. Torvill has learned to control relationships
involving a talented skating partner who is not her husband, her
children by her non-skating husband would inherit a relationship-
controlling system that is something like Torvill's, and something like
the husband's (non-existent) control system for that relationship. And
when would we see this new control system being used? At birth? And with
whom? Dean?

Perhaps your experiments with GA will shed some light on just what it is
about behavioral organization that is inheritable and what is clearly
not.
-------------------------
     You suggest that combining input and output variation doesn't work
     well. It would be interesting to find out why not.

It might work; I just couldn't get convergence with my way of doing it.

     Without having worked out the detail, my assumption is that the
     initial stage is that the ECUs whose perceptual signals correspond
     to the intrinsic variables exist initially, together with some
     means of affecting the outer world. That, and the reorganizing
     mechanism, is all.

Right, that's my supposition about my reorganizing system, too --
except that the initial control systems don't affect the outside world
at all. They affect only the mass of neurons that is going to become the
hierarchy of control that DOES interact with the outside world. This
_is_ the reorganizing mechanism as I see it. If you're proposing some
other reorganizing mechanism, how about describing it?

The delta would apply to the weighting assigned to each path,
increasing or decreasing it.

     Oh, fine. I thought, as I was writing the questions, that if I
     introduced the concept of a variable weighting between output and
     lower reference input, you would accuse me of adding something new
     to the structure of PCT. In at least one earlier discussion, you
     talked about the difficulties associated with the transients
     involved in flipping the sign of a link, and I guess I assumed that
     you were restricting the weights to 1, 0, -1.

Yes, there were difficulties and I gave up on that for the reason you
cite. However, continuous variation of the weights doesn't cause those
problems.

     But that still doesn't apply to delinking from L and relinking to
     Q, unless these are seen as bringing one weight down to zero (and
     stopping there) and independently bringing another up from zero.

Yes, that's what I had in mind. If both a positive and a negative weight
worsen control, the weight will gravitate toward zero. Of course this
assumes that there is some set of possible paths already physically in
existence, so we choose among them by varying weights.
--------------------------
     ... there's a bit of a discrepancy between this statement and the
     results that Jeff Hunter found when he tried increasing the
     dimensionality. Maybe you were using a more efficient algorithm,
     but Jeff found that the time to a solution increased so rapidly
     that overnight computation was needed for more than about 10
     dimensions.

It probably depends on exactly how the algorithm is set up, but I can
accept that steep rise with number of dimensions. I still have one demo
set up with 10 systems and 10 environmental variables in which the
reference signals and perceptual signals are shown as "meters" on the
screen. After 5000 iterations there have been about 1200
reorganizations, and all the systems are showing reasonable amounts of
control (a slowly varying reference signal for each system can be turned
on and off, as well as a set of disturbances). This one organizes input
weights, but I cheated by making output weights of 1 or -1 always
correct for negative feedback -- this could in principle be done without
knowing what is being controlled, but it's ugly. It isn't necessary to
go all the way to a perfect solution to see quite reasonable degrees of
control. I don't know why I kept this model; it wasn't the neatest one.
I guess I should reconstruct the best model.

I think I still have a lot to learn about reorganization.
-----------------------------

Remember that wrong directions lead immediately to another
reorganization, while right directions allow progress all the way to
the next point where the error rate goes positive again. A succession
of consecutive reorganizations has essentially no effect
on the parameter values.

     No problem with this. Except for that word "immediately." In any
     control loop there is a finite loop delay, and any step change is
     followed by a transient that lasts at least as long as the loop
     delay.

I thought "immediately" was sufficiently vague. If the direction is
wrong, another reorganization occurs as soon as the wrongness can be
detected and another change can be made. OK now?
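
The rule under discussion can be sketched as follows. This is an
illustration under stated assumptions, not the program behind any demo
mentioned in this post: the "error" is simply the squared distance of
the parameters from a hypothetical target, "error rate positive" is
taken to mean that the error did not fall since the previous step, and
the loop delay is ignored. All names and constants are invented.

```python
import math
import random


def random_direction(n, rng):
    """Return a random unit vector with n components."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_error(params, target):
    """Stand-in for intrinsic error: squared distance from the target."""
    return sum((t - p) ** 2 for t, p in zip(target, params))


def reorganize(target, steps=5000, step_size=0.01, seed=1):
    """Random-walk reorganization of a parameter vector.

    Parameters keep drifting in the current direction while the error
    falls. When the error stops falling, a new random direction is
    chosen (one "reorganization").
    """
    rng = random.Random(seed)
    n = len(target)
    params = [0.0] * n
    direction = random_direction(n, rng)
    last_error = squared_error(params, target)
    reorganizations = 0
    for _ in range(steps):
        params = [p + step_size * d for p, d in zip(params, direction)]
        error = squared_error(params, target)
        if error >= last_error:
            # Wrong direction: change it at the very next step.
            direction = random_direction(n, rng)
            reorganizations += 1
        last_error = error
    return params, reorganizations, last_error


initial_error = squared_error([0.0, 0.0, 0.0], [1.0, -2.0, 0.5])
params, count, final_error = reorganize([1.0, -2.0, 0.5])
```

Each wrong direction costs only one small step before it is replaced,
which is why a run of consecutive reorganizations moves the parameters
very little, while a right direction is followed for many steps.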
----------------------------------------------------------------------
Bruce Abbott (941229.1530 EST) --

     In the bad ol' days I used to run all stations simultaneously on a
     single PDP-8F (all with a whopping 32K of magnetic core memory),
     but every time the computer fried something (which was fairly
     often), the whole lab went down.

I'm one up on you. I used a PDP-8S to run an observatory: telescope,
low-light-level TV, storage tube, and display, for a semi-automatic
supernova search that went on for two years. Down time was approximately
equal to up time. But we discovered over 50 supernovae during the rising
part of their light-curves.

I do think your current setup will be adequate for anything we'll be
doing in the next couple of years. Even the game ports aren't bad for
reading potentiometer positions; I tested a number of them and they're
sufficiently accurate and stable for tracking studies. Just use good
low-noise cermet pots.

     The chambers are equipped with standard pellet feeders that deliver
     standard 45 mg pellets, so we needn't be concerned with "access to
     reinforcement" as in the Motheral experiments.

Do you think anyone will complain because we used a different
reinforcement method from Motheral's?

Can the setup deliver 1, 2, or 3 pellets per reinforcement? It would be
nice to check the part of the model that predicts an effect of
reinforcement size.

     Changing the ratio requirement is itself a form of disturbance to
     which the rat must adjust. However, changing the ratio alters
     several perceptual variables at once, including (a) the effort that
     must be expended per pellet, (b) the count, i.e. number of presses
     per pellet delivery, and (c) the delay to next pellet delivery.

Good point about the variable ratio schedule being a disturbance. The
model should handle that with no change. Points (a), (b), and (c) may be
significant in another theory, but I don't think they'll matter in a
control model. The control model proposes only that total delivered
reinforcement per unit time is under control. If that's right, the other
variations won't matter. The same model that predicts behavior on a
fixed-ratio schedule should also predict for a variable-ratio schedule,
at least to first order.
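
A first-order sketch of that proposition, assuming a continuous
approximation in which reinforcement rate equals press rate times
pellets per reinforcement divided by the ratio, and an integrating
output that sets press rate. The reference level, gain, time step, and
the name run_schedule are hypothetical; this is not the model being
fitted to the Motheral data.

```python
def run_schedule(ratio_per_step, reference=2.0,
                 pellets_per_reinforcement=1, gain=20.0, dt=0.01):
    """Control of reinforcement rate on a ratio schedule.

    ratio_per_step holds the ratio requirement in force at each time
    step, so a change of ratio enters purely as a disturbance.
    Returns a list of (press_rate, reinforcement_rate) pairs, where the
    reinforcement rate is the one perceived at the start of the step.
    """
    press_rate = 0.0
    history = []
    for ratio in ratio_per_step:
        reinforcement_rate = press_rate * pellets_per_reinforcement / ratio
        error = reference - reinforcement_rate
        # Integrating output; press rate cannot go below zero.
        press_rate = max(0.0, press_rate + gain * error * dt)
        history.append((press_rate, reinforcement_rate))
    return history


# The ratio requirement doubles from 2 to 4 halfway through the run.
schedule = [2] * 3000 + [4] * 3000
history = run_schedule(schedule)

press_before, _ = history[2999]
press_after, rate_after = history[-1]
```

With these assumed numbers the press rate settles near 4 under the
ratio of 2 and near 8 under the ratio of 4, while the reinforcement
rate returns to the reference of 2. The same code runs unchanged on any
sequence of ratios, and multiplying pellets_per_reinforcement by some
factor divides the settled press rate by that factor.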

I hope we can pre-record the variations in ratio so we can use the same
patterns for all test subjects, and for the model, too.

     ... an increased ratio requirement produces longer delays between
     pellet collection and lever-press resumption, without much effect
     on the rate of responding maintained once pressing has resumed. No
     doubt this occurs because rats have even more trouble understanding
     PCT than I do and, of course, are strongly motivated to please
     their Skinnerian masters by producing data consistent with
     reinforcement theory. (;->

This is where our ability to see where the rats are will help a lot. If
the explanation turns out to be that on the higher ratios the animals
spend more time searching elsewhere for food, then ANY theory that
explains the delays in terms of behavior at the bar is spurious.

Also, I suspect that the rats behave quite independently of the
explanations their masters fit to the observations. >:(

     Identifying the relevant perceptual variables and figuring out how
     the several control systems involved interact to produce the
     observed changes will require additional research. For example, it
     is possible to introduce the additional delay to pellet delivery
     without changing the effort required or the count, and to change
     the effort required without changing the count or the delay. And
     that's just the beginning. Whoever said rats were simple?

On the other hand, the ways in which they are complex could easily be
different from the ways they seem complex under the wrong interpretation
of their behavior.
------------------------------------------------------------------------
Best to all,

Bill P.