I include the original two posts in their entirety at the end for your perusal.
My claims were that neither man’s position has changed in the past 10 years, and that Bill’s reply was a fallacy meant to deflect the thrust of Martin’s criticisms.
Martin did the same thing to me in this post. He made my ‘honesty’ the issue and never countered my claims.
What Martin is referring to is that, in order to distinguish Bill’s words from Martin’s in the dialogue, I added chevrons in front of what one of them said. The reason is very simple, as you will see when you look at the originals. Bill simply put a dashed line between his response and Martin’s, and if you look closely at the header of Bill’s reply you will see he made a mistake and did not distinguish between himself and Martin. I made no changes to anything that was said by either party, nor did I take anything ‘out of context’.
Since bringing up old posts seems to be in vogue these days, I’ll help contribute. I actually went to the archives to look up something for Ely and started coming across all this great stuff.
Here is a wonderful example of a typical CSGnet exchange. In it Martin Taylor is trying to point out to Powers that his ideas for input functions will not work as anticipated.
Bill's response is both a red herring and a straw man. He never answers Martin's criticisms; instead he changes the argument and throws it back into Martin's lap. The issue is no longer whether Bill's equations are adequate but Martin's attempt to 'mix' theories together, and Martin's points are all but forgotten.
If you are going to reformat archived messages to make such a point, it’s more honest to keep the attributions of who said what the same as in the original message. Changing the attribution completely alters the implications of the interchange, and is unfair to all parties concerned.
Date: Tue Jan 03, 1995 10:17 am PST
Subject: Re: alternate reinforcement model
>[Martin Taylor 950103 11:30]
>Bill Powers (941230.0740 MST)
Happy New Year to All (and Sundry, too).
>The picture you’re developing here, aside from being full of unexplained
>ad-hoc processes, is diverging more and more from the basic PCT model
>and the basic reorganizing system.
I find it hard to come up with ways of writing so that you interpret as I
intend. The localized reorganizing system is different from the separate
reorganizing system, yes. It is an attempted answer to the question "Is
it absolutely necessary at this stage of understanding PCT to assume that
there exists a separate system devoted to reorganization." It is necessarily
different from the separate reorganizing system in that it suggests that
the answer to the question might be "No, it is not absolutely necessary
to assert that reorganizing is done by a system separate from the main
hierarchy." It is not an attempt to supplant or revise the notion of
reorganization, which is unchanged in the proposal.
In order to make the proposal as straightforward as possible, I suggested
a reorganizing system in which everything was the same as in the “standard”
system except for its localization. I proposed that the outputs of the
control systems for the intrinsic variables contribute to reference signals
in the main hierarchy, rather than causing changes in the structure of the
main hierarchy. This seems to me to be reducing the number of ad-hoc
processes, not increasing them.
Another place in which the number of ad-hoc assumptions is reduced is that
in the localized proposal, there is only one signal that reorganization
should be occurring rather than many such signals. That signal is the
output of a control unit whose input is based on the error signal(s) in
one (localized reorganization) or many (separate reorganizing system) ECU(s)
in the main perceptual hierarchy.
** Each ECU has such a set.**
** Perceptual input function: (sum of squared inputs + derivative of**
** same) Output function: (leaky integral, time-constant tau)**
(“Set” = control structure)
As you know, I proposed this only because it is the function you already
had proposed for the globalized error variable. If I have mis-stated it,
I’ll substitute whatever function you want to use in the separate system.
The output of THIS ECU, in both proposals, is what drives the reorganization
mechanism (but in the separate system there are additional outputs from the
control systems of the physiological intrinsic variables, which also
contribute to driving the reorganization mechanism).
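The error-monitoring unit described above (perceptual input: sum of squared inputs plus the derivative of the same; output: leaky integral with time constant tau) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the time step `dt`, the zero reference level, and the Euler integration are all hypothetical choices, not anything specified in the posts.

```python
def make_error_monitor(tau, dt, reference=0.0):
    """Build a step function for a hypothetical error-monitoring control unit.

    tau       -- time constant of the leaky-integrator output function
    dt        -- simulation time step (an assumption, not from the posts)
    reference -- reference level for the error-based perception (assumed 0)
    """
    prev_sq = None   # previous sum of squares, used for the derivative
    output = 0.0     # state of the leaky integrator

    def step(error_signals):
        nonlocal prev_sq, output
        # Perceptual input function: sum of squared inputs ...
        sq = sum(e * e for e in error_signals)
        # ... plus a finite-difference estimate of its derivative.
        d_sq = 0.0 if prev_sq is None else (sq - prev_sq) / dt
        prev_sq = sq
        perception = sq + d_sq
        # Output function: leaky integral of (perception - reference),
        # advanced by one Euler step. The sign convention is an assumption.
        drive = perception - reference
        output += dt * (drive - output) / tau
        return output

    return step
```

Fed a constant set of error signals, the output settles toward the sum of their squares; in the localized proposal the inputs would be the error signal of one ECU, in the separate system the error signals of many.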
There is an unexplained process in the localized reorganization structure,
but it is EXACTLY the same unexplained process as exists in the separate
reorganization structure, and if you come up with an explanation for either,
it will apply (probably) to the other. And that is: "what is the mechanism
whereby weights, linkages and functions change, and whereby new ECUs are
constructed." We don’t know the answer to that. I’ve suggested the
Genetic Algorithm as a possible part of such a mechanism.
>I have considerable trouble with the idea that the perceptual input
>functions of all ECUs at all levels are basically identical.
So would I. I haven’t seen any such proposal made by anyone on CSG-L.
Where did you get this straw man? Not, surely, from my proposal about
how each localized reorganizing unit uses the error variable in “its”
On the other hand, it seems to me quite unnecessary to propose that there
is a different type of function in order to explain the different types
of perception at the different levels of control. If you look at the
outputs at different levels of, say, a multilayer perceptron, you (the
external observer) can quite easily see that different kinds of external
structures are being analyzed. For example, in his inaugural lecture at
the University of Toronto, Geoff Hinton used as an illustration a three(?)
layer perceptron in which the input was the sibling and parental relationships
among a smallish group of people (two-by-two). If I remember rightly, the
training input would be three items, two names and the name of a relationship,
such as Maria, sister, Antonio, suitably coded. One of the outputs at the
top level was identifiable as Italian vs English (i.e. the output was high
if the (single) person presented at the input was Italian, and low if the
person was English, and another unit had the reverse). At an intermediate
level the output could be considered as degree of membership in an extended
family, of which there were several, interlinked. Italian and English, and
family, were not part of any training criterion–I forget whether the
structure was self-trained or whether there was a teacher. I’ve probably
got the details wrong (it was many years ago), but not the principle, which
is that the kinds of perception output at the different levels were quite
But the point is that these different kinds of perception were all generated
by the same kind of perceptual function, and the linkages among them were
the same as are posited to exist among the perceptual functions in a
control hierarchy. Each unit in the neural net could, in principle, have
been the perceptual input function for some ECU in a control hierarchy,
controlling for different kinds of perception at different levels.
>of perception I think I see in human behavior (and my own experience)
>seem to involve quite different computations as we go up the levels.
>It’s hard for me to see how a perceptual function that computes sequence
>uses the same basic operations as the perceptual function that computes
>tension in a tendon.
I reserve judgment on this. In my own musings, I have persuaded myself
that there is a need for three different kinds of perceptual function
(six, if taking a derivative is asserted to be different from taking
a static value). I have not persuaded myself that three (or six) is any
kind of limit.
For the curious, the three different kinds of perceptual function I have
mulled over are: (1) functions that combine incoming values with some
kind of weighting function, possibly with mutual multiplication, and
apply a non-linear saturating function to the result; (2) functions
like the first, but in which some of the inputs are reciprocal connections
to other similar perceptual functions (i.e. the output of A is an input to
B, and the output of B is an input to A, possibly indirectly); (3) functions
that perform logical operations that include IF…THEN…ELSE.
I hope that the foregoing paragraph doesn’t lead to another round of claims
that I am introducing unnecessary new stuff into PCT. As I said, it’s
what is mulling around in my head, not a cut-and-dried proposal.
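As a rough illustration of the three kinds of perceptual function listed above, here is a minimal sketch. The choice of tanh as the saturating function, the fixed number of settling iterations, and all function and parameter names are assumptions made for the example; the optional "mutual multiplication" of inputs is left out.

```python
import math

def weighted_saturating(inputs, weights):
    """Kind (1): weighted sum passed through a saturating non-linearity."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)))

def reciprocal_pair(x_a, x_b, w_ab, w_ba, steps=50):
    """Kind (2): two kind-(1) units, each taking the other's output as input.

    w_ab weights A's output as an input to B; w_ba weights B's output
    as an input to A. The pair is iterated a fixed number of times.
    """
    a = b = 0.0
    for _ in range(steps):
        a, b = math.tanh(x_a + w_ba * b), math.tanh(x_b + w_ab * a)
    return a, b

def logical(condition, then_value, else_value):
    """Kind (3): a logical IF...THEN...ELSE operation on signal values."""
    return then_value if condition > 0 else else_value
```

The first two kinds always return values between -1 and 1 because of the saturating function; the third simply passes one of its inputs through.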
> I also have considerable trouble with the
>particular form of perceptual function you propose for all ECUs.
Correction: … that you (I hope) proposed for the error-related variable
that is the perceptual function of the ECU whose output is part of the
reorganization signal (in the separate system), or that IS the reorganization
signal (in the localized system).
>hard to see the separation between cursor and target, in a tracking
>experiment, being perceived by a function that computes the square of
>the sum of weighted input signals, with or without derivatives.
Yes, it beats me why you would have wanted to try, other than that you
find it necessary to assume I am trying to change PCT in ways that I
explicitly and repeatedly say I am not.
On reorganization mechanism suggestions:
>If the higher system affects the lower only by varying the value of its
>reference signal (first paragraph), how does it also vary its linkages
>and functions and cause new ECU’s to grow (second paragraph)? You need
>to give each ECU capacities for doing those operations as well as
>varying the values of reference signals for lower systems, don’t you?
I don’t suppose it is worthwhile restating it yet again. But, YES. Each
ECU in the localized reorganizing scheme is allotted the capacities that
are allocated to the separate reorganizing system in that scheme, but they
apply only to its OWN local environment.
>And by what operations would an ECU be able to “induce” the growth of
>new ECUs when control of existing ECUs fails? How is it able to perceive
>that control has failed? What I’m concerned about is that you’re going
>to put a lot of program into the model that isn’t really part of the
>model: it just makes things happen when the programmer needs them to
>happen, like “inducing” (by magic) the generation of new ECUs. Isn’t the
>whole problem where these new ECUs come from and what gives them
Yes, that’s the same problem of mechanism that has to be addressed in
either structure. The same solution should apply to both proposals.
>Anyway, all this sounds quite incompatible with the Genetic Algorithm
>approach, in which new ECUs are developed not by experience with the
>current environment or failure to control in that environment, but by
>combining ECUs from the parents.
What kind of dormitive principle is this you are espousing? New ECUs developed
"by experience with the current environment or failure to control in that
environment"? How, pray?
>Or are you thinking of the GA as an
>alternative to the entire scheme above?
Not an alternative. A possible way in which new ECUs might be constructed
when the existing ones in a neighbourhood fail to control well during their
experience with the current environment. Along with all the other ways
that reorganization may happen–changes of link weights and of input
and output functions.
I hold out little hope that this will be understood any better than my
previous attempts; but little hope is better than none, and in the spirit
of the New Year, I will maintain that hope.
Take as the watchword in understanding it: Everything about the reorganizing
system is the same in the two proposals, EXCEPT its distribution or
localization.
Bill’s reply follows:
Date: Wed Jan 04, 1995 12:34 pm PST
Subject: Re: reorganization models
[From Bill Powers (950104.0600 MST)]
Martin Taylor (950103.1130) –
** The localized reorganizing system is different from the separate**
** reorganizing system, yes. It is an attempted answer to the**
** question "Is it absolutely necessary at this stage of understanding**
** PCT to assume that there exists a separate system devoted to**
** reorganization." It is necessarily different from the separate**
** reorganizing system in that it suggests that the answer to the**
** question might be "No, it is not absolutely necessary to assert**
** that reorganizing is done by a system separate from the main**
** hierarchy."**
in B:CP, p. 182:
"This reorganizing system may some day prove to be no more than a
convenient fiction; its functions and properties may some day prove to
be aspects of the same systems that become organized. That possibility
does not reduce the value of isolating these special functions and
thinking about them as if they depended on the operation of some
So my answer to your question was given 21 years ago: no, it is not
necessary, but it is a conceptual convenience to see the properties of
“the” reorganizing system as being different from those of the hierarchy
of control systems “it” brings into being.
** I proposed that the outputs of the control systems for the**
** intrinsic variables contribute to reference signals in the main**
** hierarchy, rather than causing changes in the structure of the main**
** hierarchy. This seems to me to be reducing the number of ad-hoc**
** processes, not increasing them.**
I had commented:
>>If the higher system affects the lower only by varying the value of
>>its reference signal (first paragraph), how does it also vary its
>>linkages and functions and cause new ECU’s to grow (second paragraph)?
>>You need to give each ECU capacities for doing those operations as
>>well as varying the values of reference signals for lower systems,
>>don’t you?
and your reply was:
** I don’t suppose it is worthwhile restating it yet again. But, YES.**
** Each ECU in the localized reorganizing scheme is allotted the**
** capacities that are allocated to the separate reorganizing system**
** in that scheme, but they apply only to its OWN local environment.**
You seem to forget your own words within one page of having uttered
them. You want to “simplify” the process of reorganization by
eliminating the capacity to alter the hierarchy structurally, but you
then claim that your model "… is allotted the same capacities that are
allocated to the separate reorganizing system…".
By eliminating structure change as a consequence of reorganization, you
eliminate the fundamental property that I gave it in order to account
for the development of the hierarchy. By making reorganization work only
through adjustments of reference signals, you assume that there are
already places to send reference signals, so the hierarchy must already
be in existence in order for reorganization to work. This assumption, of
course, is what requires you to think of the initial reorganizing
process as equivalent to a highest level of control in the hierarchy
which works initially by sending reference signals directly to the motor
output systems. And having made that assumption, you must then propose
that new ECUs are “inserted” between the highest level and the lowest
(this somehow happening only through sending reference signals to lower
systems which do not yet exist, and without structural changes).
By proposing that reorganization works only through adjustments of
reference signals, you add to the reorganizing process an effect that I
did not put in it. In my proposal, changes in reference signals come
about through structural changes in the systems that are generating the
reference signals. It would make no sense for reorganization to set
reference signals directly, because in general there is no one “right”
reference signal. All reference signals must be adjusted by higher
systems on the basis of current disturbances and higher reference
signals, not be set to particular values. In my system, reorganization
does not produce any particular reference signals; it alters the
relationship between the reference signal in one system and the error
signals in higher systems. That is a structural change, not a change in
In short, I proposed a reorganizing system that works strictly through
making structural changes, not through manipulating signals in the
hierarchy. You have proposed one that works exclusively through
manipulating signals in the hierarchy, not through making structural
>>Anyway, all this sounds quite incompatible with the Genetic Algorithm
>>approach, in which new ECUs are developed not by experience with the
>>current environment or failure to control in that environment, but by
>>combining ECUs from the parents.
** What kind of dormitive principle is this you are espousing? New**
** ECUs developed "by experience with the current environment or**
** failure to control in that environment"? How, pray?**
When the current environment is such as to cause intrinsic errors, or
renders existing control systems ineffective so they contain large and
persistent error signals, reorganization starts, altering the
organization of existing control systems and, especially during
development, creating new control systems out of “uncommitted neurons.”
The Genetic Algorithm, however, creates new control systems only by
combining the properties of control systems that existed in the parents.
This proposal attributes an entirely different origin to the control
systems in the offspring. It says that the newborn child contains
control systems in a hierarchy that resembles the hierarchy in the
parents, except that the properties of each new control system are some
combination of the properties of the parent’s equivalent control
systems. Under the Genetic Algorithm approach, there is no need for a
reorganizing system except perhaps to tune the properties of the
inherited hierarchy of control. The inherited hierarchy contains all the
levels and connections that are in the adult parents, from the very
beginning, complete with the organization of perceptual functions and
output functions as well as the interconnections from level to level.
This is necessary because a single ECU has no meaning in isolation; it
perceives only through lower perceptual processes, and it acts only
through lower control systems, so its properties must be appropriate to
the properties of all the systems below it. If one ECU is inherited,
then necessarily all ECUs below it that are related to it must also be
In my proposal, all that is inherited in a human being is a set of
levels of brain function, each one containing the basic materials from
which control systems of a particular type can be constructed, but with
no other pre-organization, or very little. The capacity to adapt
behavior for controlling a given environment is inversely related to the
amount of inherited organization in a system. I have chosen as a
starting point the assumption that all structural organization in the
human brain arises through reorganization and that none is inherited.
I will no doubt have to relax that rather extreme assumption some day,
but the more we can account for without relying on help from genetics,
the less burden we place on genetics to account for the events of a
I hope I have made my position clearer.
** Take as the watchword in understanding it: Everything about the**
** reorganizing system is the same in the two proposals, EXCEPT its**
** distribution or localization.**
Then whether it is distributed or localized, the reorganizing system
operates strictly through making structural changes in the brain, and
not through manipulation of the same signals that flow in the hierarchy
of behavioral control systems. The variables it controls are not the
variables that the hierarchy controls, but intrinsic variables at a
level of which the hierarchy knows nothing. If this is what you mean,
then I can accept the above “watchword.” Somehow, I doubt that this is
what you mean.
As to the three types of perceptual function you have been mulling over,
I don’t think you have considered how they would work as input functions
in a control system.
** For the curious, the three different kinds of perceptual function I**
** have mulled over are: (1) functions that combine incoming values**
** with some kind of weighting function, possibly with mutual**
** multiplication, and apply a non-linear saturating function to the**
** result;**
You seem to be forgetting that a control system controls its perceptual
signal, not the inputs to the perceptual function. If the perceptual
signal is being made to track a changing reference signal, then the
inputs to the perceptual function must vary as the inverse perceptual
function of the reference signal (or one of the inverses). A saturating
input function would thus require the inputs to become extremely large
when the reference signal demands a perceptual signal near (or above!)
the saturation point.
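The point about the inverse perceptual function can be made concrete with a small sketch. It assumes, purely for illustration, a single-input perceptual function p = tanh(x) that saturates at 1; the function name is hypothetical. For the perceptual signal to match a reference r, the input must be x = atanh(r), which grows without bound as r approaches the saturation level and has no finite value at or above it.

```python
import math

def required_input(reference):
    """Input needed for a tanh perceptual function to output `reference`.

    Returns a signed infinity when the reference is at or beyond the
    saturation level, since no finite input can produce that perception.
    """
    if abs(reference) >= 1.0:
        return math.copysign(math.inf, reference)
    return math.atanh(reference)

# Input demanded as the reference approaches the saturation point.
for r in (0.5, 0.9, 0.99, 0.999999, 1.0):
    print(r, required_input(r))
```

Whether this rules out saturating input functions in practice depends on how close to saturation reference signals actually get, which the sketch does not address.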
All you’re doing here is adopting (loosely) the architecture that others
have proposed for adaptive neural nets. This architecture is designed to
produce categorical outputs on the basis of varying inputs; it is not
designed to create a perceptual signal whose variations are proportional
to variations in some external physical variable. This kind of input
function could not be used to model tracking behavior; its presence in a
control system would lead to behavior that we do not observe.
Your other two types of input functions are even worse-adapted to the
requirements of control, as you would find out if you tried to build a
simulation based on them.
Don’t forget that in the original attempts to get a Little Baby going,
your team was unable to make even a single control system operate using
the proposed integrating sigmoid input function until I persuaded them
to try a simple linear input function. That was at least two years ago.
What makes you think the nonlinear saturating input function is going to
work any better now?
A general comment based not only on the above interchanges but on papers
I’ve been receiving from various people:
When an attempt is made to bring together the latest or most influential
thinking to produce a model of behavior, the result – even not
considering PCT – impresses me as a mess, not an advance. The problem
is that reinforcement theory, Freudian theory, information theory, goal-
setting theory, genetic theory, quantum theory, neural network theory,
fractal theory, dynamic systems theory, classical conditioning theory,
Gibsonian theory, and all those other theories were never designed to
work together. Each one was developed from specific assumptions (both
covert and overt) and from observations that the others did not take
into account. A composite of all these theories is not an improvement on
any one of them. It is just a confusion of random ideas pulling in
When you try to mix PCT with all these other theories, the confusion
just becomes worse. If you stir a Big Mac, fries, and milkshake into a
big bowl with breakfast, tea, and dinner, the result is not an
improvement in the Big Mac lunch but garbage. Even the Big Mac becomes
just more garbage. That is what happens when PCT is combined with all
these older theories: it turns into garbage. It may not have been the
ultimate gourmet repast to begin with, but it is not improved by trying
to combine it with a mishmash of other ideas thrown together at random.
PCT covers a lot of ground and it was constructed to hang together
without internal contradictions. It is based on a fundamental phenomenon
that none of the other theories it keeps getting mixed in with took into
account, and in many cases it offers alternative explanations of
behavior that contradict the explanations offered by other theories. For
essentially every other theoretical explanation, PCT leads to a
different and usually incompatible explanation. You can’t just start
plugging other theories into PCT, or vice versa, and expect to get some
magical synergistic (oh, yeah, synergism theory) whole that is greater
than its parts.
I think it may have been Dag Forssell who came up with an analogy:
building a car by using the best features of all the automobile
manufacturers. Combine the wheels of a Mercedes, the carburetor of a
Maserati, the engine of a Rolls Royce, the suspension of a Volkswagen,
the body of a Chevrolet truck, and the electronics of an Infiniti, and
what do you get? A nonfunctional monstrosity.
Best to all,