alternate reinforcement model

_Martin_Taylor · January 3, 1995, 5:55pm

[Martin Taylor 950103 11:30]
Bill Powers (941230.0740 MST)

Happy New Year to All (and Sundry, too).

The picture you're developing here, aside from being full of unexplained
ad-hoc processes, is diverging more and more from the basic PCT model
and the basic reorganizing system.

I find it hard to come up with ways of writing so that you interpret it as I
intend. The localized reorganizing system is different from the separate
reorganizing system, yes. It is an attempted answer to the question "Is
it absolutely necessary at this stage of understanding PCT to assume that
there exists a separate system devoted to reorganization?" It is necessarily
different from the separate reorganizing system in that it suggests that
the answer to the question might be "No, it is not absolutely necessary
to assert that reorganizing is done by a system separate from the main
hierarchy." It is not an attempt to supplant or revise the notion of
reorganization, which is unchanged in the proposal.

In order to make the proposal as straightforward as possible, I suggested
a reorganizing system in which everything was the same as in the "standard"
system except for its localization. I proposed that the outputs of the
control systems for the intrinsic variables contribute to reference signals
in the main hierarchy, rather than causing changes in the structure of the
main hierarchy. This seems to me to be reducing the number of ad-hoc
processes, not increasing them.

Another place in which the number of ad-hoc assumptions is reduced is that
in the localized proposal, there is only one signal that reorganization
should be occurring rather than many such signals. That signal is the
output of a control unit whose input is based on the error signal(s) in
one (localized reorganization) or many (separate reorganizing system) ECU(s)
in the main perceptual hierarchy.

Each ECU has such a set.

Perceptual input function: (sum of squared inputs + derivative of same)
Output function: (leaky integral, time-constant tau)

("Set" = control structure)

As you know, I proposed this only because it is the function you already
had proposed for the globalized error variable. If I have mis-stated it,
I'll substitute whatever function you want to use in the separate system.
The output of THIS ECU, in both proposals, is what drives the reorganization
mechanism (but in the separate system there are additional outputs from the
control systems of the physiological intrinsic variables, which also
contribute to driving the reorganization mechanism).
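
As a rough sketch of the control unit described above, the following
Python fragment forms a perception from the sum of squared inputs plus
the derivative of that sum, and passes the error through a leaky
integrator with time constant tau. The name make_error_monitor, the
time step dt, the gain, the zero reference, and the sign convention
(perception minus reference, so that persistent error gives a positive
output) are all assumptions made for illustration; neither proposal
specifies them.

```python
def make_error_monitor(tau, dt, reference=0.0, gain=1.0):
    """Build a step function for a hypothetical error-monitoring unit."""
    state = {"prev_sq": None, "output": 0.0}

    def step(errors):
        # Perceptual input function: sum of squared inputs
        # plus the time derivative of that sum (finite difference).
        sq = sum(e * e for e in errors)
        if state["prev_sq"] is None:
            d_sq = 0.0  # no derivative available on the first sample
        else:
            d_sq = (sq - state["prev_sq"]) / dt
        state["prev_sq"] = sq
        perception = sq + d_sq

        # Comparator (assumed sign: perception minus reference).
        err = perception - reference

        # Output function: leaky integral with time constant tau,
        # d(out)/dt = (gain * err - out) / tau, stepped by Euler's method.
        state["output"] += dt * (gain * err - state["output"]) / tau
        return state["output"]

    return step
```

In the localized version, step would be fed the error signal of the one
ECU the unit belongs to; in the separate version, the error signals of
many ECUs. The returned value stands for the signal that drives the
(unspecified) reorganization mechanism.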

There is an unexplained process in the localized reorganization structure,
but it is EXACTLY the same unexplained process as exists in the separate
reorganization structure, and if you come up with an explanation for either,
it will apply (probably) to the other. And that is: "What is the mechanism
whereby weights, linkages and functions change, and whereby new ECUs are
constructed?" We don't know the answer to that. I've suggested the
Genetic Algorithm as a possible part of such a mechanism.

I have considerable trouble with the idea that the perceptual input
functions of all ECUs at all levels are basically identical.

So would I. I haven't seen any such proposal made by anyone on CSG-L.
Where did you get this straw man? Not, surely, from my proposal about
how each localized reorganizing unit uses the error variable in "its"
ECU?

But....

On the other hand, it seems to me quite unnecessary to propose that there
is a different type of function in order to explain the different types
of perception at the different levels of control. If you look at the
outputs at different levels of, say, a multilayer perceptron, you (the
external observer) can quite easily see that different kinds of external
structures are being analyzed. For example, in his inaugural lecture at
the University of Toronto, Geoff Hinton used as an illustration a three(?)
layer perceptron in which the input was the sibling and parental relationships
among a smallish group of people (two-by-two). If I remember rightly, the
training input would be three items, two names and the name of a relationship,
such as Maria, sister, Antonio, suitably coded. One of the outputs at the
top level was identifiable as Italian vs English (i.e. the output was high
if the (single) person presented at the input was Italian, and low if the
person was English, and another unit had the reverse). At an intermediate
level the output could be considered as degree of membership in an extended
family, of which there were several, interlinked. Italian and English, and
family, were not part of any training criterion--I forget whether the
structure was self-trained or whether there was a teacher. I've probably
got the details wrong (it was many years ago), but not the principle, which
is that the kinds of perception output at the different levels were quite
different.

But the point is that these different kinds of perception were all generated
by the same kind of perceptual function, and the linkages among them were
the same as are posited to exist among the perceptual functions in a
control hierarchy. Each unit in the neural net could, in principle, have
been the perceptual input function for some ECU in a control hierarchy,
controlling for different kinds of perception at different levels.

The levels of perception I think I see in human behavior (and my own
experience) seem to involve quite different computations as we go up the
levels. It's hard for me to see how a perceptual function that computes
sequence uses the same basic operations as the perceptual function that
computes tension in a tendon.

I reserve judgment on this. In my own musings, I have persuaded myself
that there is a need for three different kinds of perceptual function
(six, if taking a derivative is asserted to be different from taking
a static value). I have not persuaded myself that three (or six) is any
kind of limit.

For the curious, the three different kinds of perceptual function I have
mulled over are: (1) functions that combine incoming values with some
kind of weighting function, possibly with mutual multiplication, and
apply a non-linear saturating function to the result; (2) functions
like the first, but in which some of the inputs are reciprocal connections
to other similar perceptual functions (i.e. the output of A is an input to
B, and the output of B is an input to A, possibly indirectly); (3) functions
that perform logical operations that include IF...THEN...ELSE.
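
The three kinds can be sketched in Python as follows. This is only an
illustration under stated assumptions: tanh as the saturating function,
a fixed number of relaxation steps for the reciprocal pair, a threshold
test for the logical condition, and all function and parameter names
are hypothetical. The optional mutual multiplication of inputs is left
out for brevity.

```python
import math

def weighted_saturating(inputs, weights):
    # Kind 1: weighted combination of inputs, then a saturating
    # non-linearity (tanh is an assumed choice).
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)))

def reciprocal_pair(x_a, x_b, w_ab, w_ba, steps=50):
    # Kind 2: two kind-1 functions, A and B, each taking the other's
    # output as one of its inputs. w_ab weights A's output into B,
    # w_ba weights B's output into A. Both are updated together.
    a = b = 0.0
    for _ in range(steps):
        a, b = (weighted_saturating([x_a, b], [1.0, w_ba]),
                weighted_saturating([x_b, a], [1.0, w_ab]))
    return a, b

def logical(condition, then_value, else_value, threshold=0.5):
    # Kind 3: an IF...THEN...ELSE operation on incoming signals.
    return then_value if condition > threshold else else_value
```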

I hope that the foregoing paragraph doesn't lead to another round of claims
that I am introducing unnecessary new stuff into PCT. As I said, it's
what is mulling around in my head, not a cut-and-dried proposal.

I also have considerable trouble with the _particular_ form of perceptual
function you propose for all ECUs.

Correction: ... that you (I hope) proposed for the error-related variable
that is the perceptual function of the ECU whose output is part of the
reorganization signal (in the separate system), or that IS the reorganization
signal (in the localized system).

It's hard to see the separation between cursor and target, in a tracking
experiment, being perceived by a function that computes the square of
the sum of weighted input signals, with or without derivatives.

Yes, it beats me why you would have wanted to try, other than that you
find it necessary to assume I am trying to change PCT in ways that I
explicitly and repeatedly say I am not.

=============

On reorganization mechanism suggestions:

If the higher system affects the lower only by varying the value of its
reference signal (first paragraph), how does it also vary its linkages
and functions and cause new ECU's to grow (second paragraph)? You need
to give each ECU capacities for doing those operations as well as
varying the values of reference signals for lower systems, don't you?

I don't suppose it is worthwhile restating it yet again. But, YES. Each
ECU in the localized reorganizing scheme is allotted the capacities that
are allocated to the separate reorganizing system in that scheme, but they
apply only to its OWN local environment.

And by what operations would an ECU be able to "induce" the growth of
new ECUs when control of existing ECUs fails? How is it able to perceive
that control has failed? What I'm concerned about is that you're going
to put a lot of program into the model that isn't really part of the
model: it just makes things happen when the programmer needs them to
happen, like "inducing" (by magic) the generation of new ECUs. Isn't the
whole problem where these new ECUs come from and what gives them
particular forms?

Yes, that's the same problem of mechanism that has to be addressed in
either structure. The same solution should apply to both proposals.

Anyway, all this sounds quite incompatible with the Genetic Algorithm
approach, in which new ECUs are developed not by experience with the
current environment or failure to control in that environment, but by
combining ECUs from the parents.

Huh!?!

What kind of dormitive principle is this you are espousing? New ECUs developed
"by experience with the current environment or failure to control in that
environment"? How, pray?

Or are you thinking of the GA as an
alternative to the entire scheme above?

Not an alternative. A possible way in which new ECUs might be constructed
when the existing ones in a neighbourhood fail to control well during their
experience with the current environment. Along with all the other ways
that reorganization may happen--changes of link weights and of input
and output functions.
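
To make the idea concrete, here is a minimal sketch of one way a new
ECU might be built from two existing neighbours by a Genetic Algorithm
step. It assumes, purely for illustration, that an ECU can be
summarized by a list of link weights, that combination is uniform
crossover, and that mutation is optional Gaussian noise. The function
name and parameters are hypothetical, and nothing here is claimed to be
the mechanism of either proposal.

```python
import random

def crossover(parent_a, parent_b, rng, mutation_sd=0.0):
    """Combine two parents' weight lists into a child weight list."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of weights")
    child = []
    for w_a, w_b in zip(parent_a, parent_b):
        # Uniform crossover: each weight comes from either parent
        # with equal probability.
        w = w_a if rng.random() < 0.5 else w_b
        # Optional mutation: small Gaussian perturbation.
        if mutation_sd > 0.0:
            w += rng.gauss(0.0, mutation_sd)
        child.append(w)
    return child

# Example: parents chosen (by some unspecified rule) from the
# neighbourhood in which control has been poor.
rng = random.Random(0)
new_weights = crossover([0.1, 0.2, 0.3], [1.0, 2.0, 3.0], rng)
```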

================

I hold out little hope that this will be understood any better than my
previous attempts; but little hope is better than none, and in the spirit
of the New Year, I will maintain that hope.

Take as the watchword in understanding it: Everything about the reorganizing
system is the same in the two proposals, EXCEPT its distribution or
localization.

Martin

William_T_Powers2 · December 31, 1994, 7:31pm

[From Bill Powers (941230.0740 MST)]

Martin Taylor (941228.1620) --

     Yes. In both versions of reorganization, when an intrinsic
     variable is far from its reference, the control system for which it
     is a perceptual signal is producing output. In the "localized"
     version, that output changes the value of reference signals for
     other ECUs in the hierarchy, eventually causing action that affects
     the intrinsic variable, with luck in the direction to reduce the
     error in the intrinsic variable.

Martin Taylor (941229.1430)--

     As the embryo grows, its environment changes, and control systems
     that earlier were sufficient are no longer enough to permit the IV
     [Intrinsic Variable] ECUs to maintain control--and even the
     originally grown new ones may not be able to control. So further
     ECUs will be grown both between the IV ECUs and the first "new"
     layer, and between the first "new" layer and the outer world. And
     so on. In the localized system, when ANY ECU fails to control, it
     varies its linkages and functions, and possibly induces the
     generation of new ECUs below it. (And above it???)

If the higher system affects the lower only by varying the value of its
reference signal (first paragraph), how does it also vary its linkages
and functions and cause new ECU's to grow (second paragraph)? You need
to give each ECU capacities for doing those operations as well as
varying the values of reference signals for lower systems, don't you?
Or, if this capacity is not included in the design of each ECU, then you
need to posit some process outside all ECUs that can accomplish this.
And by what operations would an ECU be able to "induce" the growth of
new ECUs when control of existing ECUs fails? How is it able to perceive
that control has failed? What I'm concerned about is that you're going
to put a lot of program into the model that isn't really part of the
model: it just makes things happen when the programmer needs them to
happen, like "inducing" (by magic) the generation of new ECUs. Isn't the
whole problem where these new ECUs come from and what gives them
particular forms?

Anyway, all this sounds quite incompatible with the Genetic Algorithm
approach, in which new ECUs are developed not by experience with the
current environment or failure to control in that environment, but by
combining ECUs from the parents. Or are you thinking of the GA as an
alternative to the entire scheme above?

Martin Taylor (941229.1845)--

Each ECU has such a set.

Perceptual input function: (sum of squared inputs + derivative of same)
Output function: (leaky integral, time-constant tau)

I have considerable trouble with the idea that the perceptual input
functions of all ECUs at all levels are basically identical. The levels
of perception I think I see in human behavior (and my own experience)
seem to involve quite different computations as we go up the levels.
It's hard for me to see how a perceptual function that computes sequence
uses the same basic operations as the perceptual function that computes
tension in a tendon. I also have considerable trouble with the
_particular_ form of perceptual function you propose for all ECUs. It's
hard to see the separation between cursor and target, in a tracking
experiment, being perceived by a function that computes the square of
the sum of weighted input signals, with or without derivatives. Offhand,
I can't think of a single perception that could be represented by the
function you propose.

The picture you're developing here, aside from being full of unexplained
ad-hoc processes, is diverging more and more from the basic PCT model
and the basic reorganizing system. If you were coming up with increased
explanatory power I would have no objection. Maybe you're right. Maybe
when you finally turn Little Baby on and let it run, we will all be
wiser. I guess we will all just have to wait for that.

---------------------------------------------------------------------
Best,

Bill P.