Reinforcement Learning

[From Rupert Young (2017.10.08 13.10)]

[Rick Marken (2017.10.07.1000)]

Sure. But, as far as I can see, it is mostly conceptual. The only

formal definition I’m aware of is Bill’s arm reorg example in LCS3,
of gradual weight adjustments of output links, which improves
control performance. I don’t see anything on how perceptual
functions are learned. Or how memory (which is learning) fits into
reorganisation.

I don't think anyone on here is proposing RL, but to rule it out it

is necessary to understand it. And some people probably don’t care
about a technique’s biological validity as long as it works, in
which case we’d need to show that a PCT approach is either better or
simpler.

Ok, but I am looking for details. How do discrete states and actions

map on to the concepts of PCT? What is actually being reorganised in
these scenarios? How can the scenarios be represented?

These would be the equivalent to "states" in traditional approaches?

And the available moves equivalent to “actions”? How would you
actually represent these perceptions in a computer simulation? If
your goal state is XXX (tic-tac-toe), how do you choose among the
discrete sub-goals (actions) available to you in order to reduce the
error from your current state? How do you learn that one sub-goal is
better than another next time you play? What is being “reorganised”?
I’m thinking aloud, but open to suggestions.

Surely that is what we spend our early years, of development, doing;

learning perceptual functions (as well as improving performance).
This seems to be backed up by research such as the formation of
neural columns of visual “feature detectors” (Hubel and Wiesel?),
the kittens restricted to environments of vertical lines and the
kitten pairs where only one was allowed to move. I can’t see that a
newborn baby would have any concept of something like “control of
the center”.

I would have thought that the major function or activity of

reorganisation was the learning of perceptual functions. In
principle, I don’t see why this shouldn’t be the same reorganisation
process, of adjusting the links and weights to input systems. We
“just” need a formal definition and some models that implement it.

Where would the robots get their perceptual functions from? They

don’t exist unless they are manually constructed; not sustainable.

I'm not necessarily expecting answers to these points; I'm just

pointing out that learning is an area that is not fully formed in
the theory. I think that, if it were, it would represent a
significant progression for PCT, but we (I) are unable to move PCT
forward in this area unless we have a formal definition of learning
that enables us to model and demonstrate the theory.

Rupert
···
            RY: One reason I am interested in

this is that I think artificial PCT systems will need to
embody learning to progress beyond fairly basic systems.
This requires a formal definition of PCT learning,
which, unfortunately, is currently lacking. I think it
will also be important for PCT to be taken seriously by
the mainstream.

RM: Reorganization is “learning” in PCT.

            RY: Although, it may be correct

that learning methods, such as RL, have substantial
drawbacks, they do have a formal definition. They also
appear to be quite successful in some domains of
constrained states and discrete actions, such as games.

          RM: But they are based on what PCT shows to be an

incorrect understanding of how control works. And what is
called RL in robotics is often not actually RL, as is the
case in the pendulum learning paper (*Learning to
Control an Inverted Pendulum Using Neural Networks*
by Charles W. Anderson).

          RM: I agree that it's important to include a capability

to learn (in terms of improving performance – control –
or learning how to control something that the system was
unable to control before). But I believe that a PCT-based
approach to robotics should start by ruling out RL as an
approach to learning and explain why it does so: it’s
because RL is based on an S-R concept of how behavior
(control) works.

            RY: It would be useful to

understand why they are successful. It would also be
great if we could formulate a definition of PCT learning
not only for continuous variable control systems, but
also for the above game-type scenarios.

          RM: Reorganization should work for all "scenarios",

which I take to mean it should work for learning to
control all types of variables, including the higher level
variables (like programs and principles) that are
controlled in games.

            RY: So, any thoughts on how PCT

could be applied to learning, or even just playing,
games would be of great interest.

          RM: I think the higher level perceptions to be

controlled in the game – perceptions like “control of the
center”, “maintain material advantage”, “protect queen and
king” in a chess game – would be “givens”.

          I agree with previous discussions that suggested that

the types of perceptions we control have developed through
evolution; our brains have evolved the ability to perceive
the world in terms of principles like “control of the
center”, for example. I think it’s highly unlikely that
perceptual functions that can perceive these complex
variables could be developed within the time frame in
which we typically learn things – hours, days or even
weeks.

          So robot learning would consist mainly of reorganizing

the output functions that provide the references to the
systems that control lower level perceptions that are the
means of controlling these higher level perceptions.

[From Rupert Young (2017.10.08 13.40) ]

[Bruce Abbott (2017.01.02.0945 EDT)]

RY: Do you think it is valid, in terms of PCT, as it is introducing a perception by the back door?

BA: Not as currently implemented. But I don’t want to discuss this issue too much at this point, as I want to give others a chance at guessing the solution.

RY: I think that perception could be taken out into its own control system, which would then switch between two lower systems, in a similar way with the brake/throttle scenario?

BA: Yes, certainly, but at the cost of increased complexity, of course. Would we really need two nearly identical lower systems that differ only in you-know-what? Or could the higher system manipulate the lower system’s you-know-what directly? What perception would the higher system be controlling for?

Now that the reveal has been done we can come back to this question.

Although the positive feedback switch is a neat computational solution, I think a PCT solution could (actually, would) involve negative feedback as per usual; I've just tried it in my kitchen with a spanner and a wooden spoon. As you say, it would be more complex. At some point it involves controlling the perception of the bob rising up, which involves controlling the perception of the bob swinging back and forwards, which involves controlling the perception of the hand (cart) moving left and right, which is achieved by varying the speed of movement of the hand.

At a higher level that might be a sequence perception consisting of control of bob level (ref of 0.8) and then the balance control.

There's probably other arrangements that would work too.
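That kind of cascade, where a higher system's output sets the reference for a lower system, can be sketched generically. The following is my own toy illustration (a point mass rather than the pendulum; gains and time step are arbitrary assumptions), showing a position-controlling system that acts only by setting the reference for a velocity-controlling system:

```python
class ControlSystem:
    """One elementary PCT unit: output is proportional to the error
    between its reference and its perception."""

    def __init__(self, gain):
        self.gain = gain
        self.reference = 0.0

    def output(self, perception):
        return self.gain * (self.reference - perception)


# Two-level hierarchy: the position system's output becomes the
# velocity system's reference; the velocity system's output is the
# force applied to the environment (a unit point mass).
position_sys = ControlSystem(gain=1.0)
velocity_sys = ControlSystem(gain=2.0)
position_sys.reference = 1.0     # "be at position 1"

x, v, dt = 0.0, 0.0, 0.05
for _ in range(400):
    # higher level sets the lower level's reference
    velocity_sys.reference = position_sys.output(x)
    force = velocity_sys.output(v)
    v += force * dt              # simple Euler physics
    x += v * dt
# x settles near the position reference, with v near zero
```

Note that neither system computes a trajectory; the position settles because each level just keeps reducing its own error.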

Rupert

[From Rick Marken (2017.10.09.0915)]

···

Rupert Young (2017.10.08 13.10)

RY: Sure. But, as far as I can see, it is mostly conceptual. The only

formal definition I’m aware of is Bill’s arm reorg example in LCS3,
of gradual weight adjustments of output links, which improves
control performance. I don’t see anything on how perceptual
functions are learned. Or how memory (which is learning) fits into
reorganisation.

RM: The arm reorg example in LCS3 seems to be more than conceptual; it actually works. I think the question of how perceptual functions are learned has to be preceded by research aimed at determining whether perceptual functions are learned. I now incline toward the idea that they are not; that the types of perceptual functions we have are built up by evolution. So, for example, I believe our brains come pre-wired to perceive, say, programmatic structures, such as the grammaticality of a sequence of words. The brain just has to learn to perceive the grammar of the language into which one is born. I imagine this would involve using one’s built-in program perception functions to perceive grammatical structure and learning to set references for the perceptions of grammatical structure that one intends to produce in order to be understood.

RM: As far as how memory fits into reorganization, I believe there is some speculation about this in the Memory chapter of B:CP. In terms of learning, reorganization would select the memory locations for the lower level references for the perceptions these systems should control as the means of controlling a higher level perception.

RM: The whole idea of reinforcement learning seems to be completely inconsistent with the PCT model of behavior. Reinforcement increases the probability of certain actions rather than others. PCT shows that repetition of the same action will not produce a consistent (intended) result in a disturbance-prone world. So what has to be learned is how to vary actions appropriately in order to produce intended results. So some version of a reorganization model, which varies the parameters of control functions rather than the strength of particular outputs produced by these functions, and does so as the means of improving a control system’s ability to control a perceptual variable, seems like the best approach to control system learning.
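As a minimal sketch of that idea (my own toy illustration, not code from any PCT demo), reorganization can be implemented as E. coli-style variation of a control parameter, here the loop gain, driven only by whether the system's own error got better or worse:

```python
import random

def run_trial(gain, steps=200):
    """Mean squared error of a simple control loop: the system perceives
    qi = output + disturbance and adjusts its output to keep qi at the
    reference. Higher loop gain (up to the stability limit) means
    tighter control and lower error."""
    output, total, reference = 0.0, 0.0, 1.0
    for t in range(steps):
        disturbance = 0.5 * ((t % 40) / 40.0 - 0.5)  # slowly varying
        perception = output + disturbance
        error = reference - perception
        output += 0.1 * gain * error     # integrating output function
        total += error * error
    return total / steps

# E. coli reorganization: keep changing the gain in the same direction
# while control keeps improving; "tumble" to a new random direction
# when the error grows.
random.seed(0)
gain, delta = 0.5, 0.2
last = best = run_trial(gain)
for _ in range(100):
    gain = max(0.0, gain + delta)
    err = run_trial(gain)
    if err >= last:
        delta = 0.4 * (random.random() - 0.5)        # tumble
    last = err
    best = min(best, err)
```

Nothing here strengthens a particular action; the only thing varied is a parameter of the control function, and the only feedback is the change in the system's own error.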

RM: I would also suggest that the first thing to do when building a PCT-based robot is to figure out the “givens” of the system – the types of perceptual variables to control and the hierarchical arrangement of these variables – before trying to figure out how the robot learns to control better or how it learns to control new perceptions. These “givens” would be equivalent to the physical and neural structures that exist when an organism first enters the world.

RM: Another possibility is just to try building a reorganization system into a simple robot that continuously tunes up its existing control systems. This would be a good exercise in developing a reorganization system that works in a system dealing with the real world and not just a software model of that world, as in the arm demo in LCS3.

Best

Rick



Richard S. Marken

"Perfection is achieved not when you have nothing more to add, but when you
have nothing left to take away."
                --Antoine de Saint-Exupery


[From Rick Marken (2017.10.12.1730)]

Hierarchical Control1.pdf (262 KB)

···

Rupert Young (2017.10.11 10.00)

RY: Are we talking about the same thing, how would an agent learn to

control a new perception?

RM: I think we learn to control new perceptions, not new types of perceptions. For example, we learn to control new word perceptions, where the specific words we learn depend on the language community into which we are born. What I believe is built in by evolution is the perceptual functions that let us learn to perceive and, thus, control these new word-type perceptions.

RM: An example of the type of perceptual function that might be built-in to perceive words is shown in Figure 11.3 in B:CP. In that figure the circuit that implements the perceptual function produces a specific word perception (“juice”). What I think is built-in is the sequence of reverberation loops that make it possible to perceive any new word. What would be learned is which phoneme-type input perceptions should go into such a sequence perceiving network to produce a new word perception.
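A loose software analogy (not the neural circuit of Figure 11.3) may make the division of labor concrete: the sequence-perceiving structure, a chain that advances only when its inputs arrive in the right order, is fixed, and only *which* inputs feed the chain is the learnable part:

```python
def make_sequence_detector(seq):
    """Perceptual function for one specific sequence: a chain that
    advances a stage only when the next expected input arrives, and
    fires when the whole sequence has been seen in order."""
    i = 0

    def step(token):
        nonlocal i
        if token == seq[i]:
            i += 1
        elif token == seq[0]:
            i = 1          # a fresh start of the sequence
        else:
            i = 0
        if i == len(seq):
            i = 0
            return 1.0     # the sequence perception fires
        return 0.0

    return step


# A "juice"-like example; the token list is the learned part, the
# chain mechanism itself is the given.
juice = make_sequence_detector(["j", "oo", "s"])
signals = [juice(t) for t in ["j", "oo", "s", "k", "j", "s"]]
# → [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

The detector outputs 1.0 only at the moment the complete sequence has occurred, which is the sense in which the function perceives the word rather than its parts.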


RY: Are you aware of any specific research to

support this?

RM: Not really. The notion that the world of experience is made up of a hierarchy of different types of perceptual variables is, as far as I can tell, unique to the PCT model of behavior. But there is some “naturalistic” evidence that people come equipped with certain types of built-in perceptual functions that are used as the basis of learning new perceptions of that type. Perhaps the most obvious example is language, where people learn to perceive and control the words and grammatical structure of the language into which they are born – a language with very different words and grammar than the others the kid might have had to learn. Chomsky, I believe, suggested that this was because all humans come with the mental capacity – called a language acquisition device (LAD) – that allows them to learn the particular language of their group. In PCT, this LAD is the built-in sequence (word) and program (grammar) type perceptual functions.


RY: Are "types" really anything more than a useful way for

an observer to classify perceptions (akin to “races”), which are
just forms of the general principle of output of perceptual
functions?

RM: Perhaps. But they are a central hypothesis of the PCT model, a hypothesis that people might go out and start testing if they could get over their inclination to study controlling as though it were the behavior of a sequential-state S-R device. I have done some research testing the notion of hierarchical levels of different types of perceptual variables. Some of it is described in the attached paper; you can demonstrate it to yourself in this demo: http://www.mindreadings.com/ControlDemo/Hierarchy.html


RY: Anyway, it would still be necessary to learn specific perceptual

functions within those types wouldn’t it?

RM: I think what would mainly have to be learned is what inputs go into the function. I just can’t believe that a function that produces, say, the perception of a grammatical sentence in some language, can be constructed from scratch in a few years. I think the functional connections are there (as in the sequence detecting function in Figure 11.3 of B:CP); what must be learned are the inputs to the functions.


RY: For example, is a baby

born with the perceptual function which provides the ability to
perceive the word “rambunctious”?

RM: Yes, I think so.


RY: With usual reorganisation, changes to the system parameters move

the system closer to being able to control more efficiently
(intrinsic error reduces). But what would be changed in this case?
If the parameter being changed is the address then the change could
result in a different memory being accessed that has nothing to do
with what was being controlled? So how would reorganisation work in
this case?

RM: Actually, if you look at the diagram of the memory addressing system proposed in B:CP you will see that what is being addressed is the reference signal to a lower level system that is part of the means used by the higher level system – the one sending the address signal – to control its perception. So reorganizing the way the reference for the lower level system is addressed is functionally equivalent to reorganizing the parameters of the output function of a control system as the means of getting it to control better.

RY: As I understand Bill’s arm reorg system it is about the
strength of the output of functions. It starts off with random
weights (gains) on output connections to 14 lower systems. Through
reorganisation the strengths of those weights change with 13
reducing relative to the one that has effects that result in better
control.

RM: I believe that in that model the 14 weights were the weights of an impulse response function that constitutes the output function of the control system being reorganized. The values of all 14 weights were varied randomly based on the size of the error in the control system. The result was a nice negative-exponential-shaped impulse response function that was continuously convolved with the error signal to produce the output that yields the best control (lowest time-varying error). Reinforcement (strengthening) would not have worked and was not involved.
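A sketch of that kind of output function (my reconstruction, not Bill's Pascal): the weights form a finite impulse response, and the output is the convolution of those weights with the recent error history:

```python
def output_from_error(errors, weights):
    """FIR output function: convolve the reorganized weights with the
    recent error history (errors[-1] is the newest sample)."""
    return sum(w * e for w, e in zip(weights, reversed(errors)))


# The kind of negative-exponential impulse response that reorganization
# settles on (the 0.5 and 0.7 constants are illustrative assumptions).
weights = [0.5 * 0.7 ** k for k in range(14)]

# A unit error arriving at the current time step: the response is
# dominated by the first (largest) weight.
errors = [0.0] * 13 + [1.0]
response = output_from_error(errors, weights)   # weights[0] * 1.0 = 0.5
```

The point of the shape is that the output responds most strongly to recent error and progressively less to older error, which is what gives good, stable control; no individual output is ever "strengthened".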

RY: Therein lies the rub! I don't see that this can be done manually

except for some basic functions, so some form of learning would be
required.

RM: The “givens” are presumably the different types of perceptual variables that would be found, via PCT research, to be the types that are controlled by humans. Powers hypothesized what these “given” types would be in B:CP. Of course, once these types are identified it’s going to be a difficult job trying to figure out how to build the perceptual functions that implement them.


RY: I'm also not convinced that "type" is a meaningful term

except to the observer. After all, what makes one function different
from another apart from the variable it is controlling?

RM: OK, so you are not convinced of the correctness of the PCT model of behavior as the control of a hierarchy of different types of perceptual variables. And you shouldn’t be since it is still almost pure speculation and has hardly been tested at all. And it won’t get tested until a lot more researchers quit studying control systems as S-R devices and start studying them as what PCT says they are: perceptual control systems.

RY: I did that a couple of years ago here, https://www.youtube.com/watch?v=QF7K6Lhx5C8

RM: That’s terrific. Could you send me the code for that when you get a chance?

RY: But that is on the output side. I think similar learning is also

required on the input side, to learn perceptual functions.

RM: I agree that we learn to perceive things; I just believe that the structures that allow us to learn the new perceptions are given. But I think you can build a simple demonstration of perceptual learning using what may be one of the simplest perceptual function “givens”: a weighted linear combination of inputs. So how about this: build a perceptual function for the balancing robot that is a linear combination of two inputs, orientation to gravity (the gyro sensor) and orientation to visual upright (if you can get it). Have the system reorganize the weights of this perceptual function until control of this perception is as good as you can get it. See whether the result is that the perception becomes all gyro, all visual or some proportional combination of both.

RY: Do you

see any reason why that is not practical within living systems,
given that we have years of development available to us to build up
these perceptual functions? To learn something new we first need to
learn to perceive it before (or at the same time) we can improve the
performance.

RM: Of course, we learn new perceptions; I just don’t think we learn the perceptual functions that create these perceptions. The simple perceptual function I suggested is the kind of “given” I am imagining: a weighted sum like p = ag + bv. So the learning involves varying the weights, a and b, of the existing perceptual function. I think the existence of a functional architecture like this is even more essential to learning things like programs (grammar) and principles (of good writing style, for example). I think it’s unlikely that the nervous system could develop, through random trial and error, the neural architecture for a perceptual function that would perceive the degree, for example, to which you have “control of the center” in chess in the few years it takes for a reasonably bright child to learn to perceive this principle.
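The proposed experiment can be sketched in simulation (the sensor noise levels and all other numbers below are my assumptions): reorganize the weights a and b of p = a*g + b*v until the perception tracks the underlying tilt as well as possible. The weights should drift toward the less noisy sensor:

```python
import random

# Hypothetical sensor data: true tilt plus sensor-specific noise.
# The gyro is assumed low-noise; the visual channel is noisier.
random.seed(1)
samples = []
for _ in range(2000):
    theta = random.uniform(-1.0, 1.0)            # true tilt
    gyro = theta + random.gauss(0.0, 0.1)
    visual = theta + random.gauss(0.0, 0.5)
    samples.append((theta, gyro, visual))

def perception_error(a, b):
    """Mean squared gap between the perception p = a*g + b*v and the
    true tilt, over the fixed sample set."""
    return sum((a * g + b * v - t) ** 2
               for t, g, v in samples) / len(samples)

# E. coli reorganization of the input weights a and b: keep moving the
# weights the same way while error falls, tumble when it rises.
a, b, da, db = 0.5, 0.5, 0.05, 0.05
last = best = perception_error(a, b)
best_ab = (a, b)
for _ in range(300):
    a, b = a + da, b + db
    err = perception_error(a, b)
    if err >= last:                              # worse: tumble
        da = 0.1 * (random.random() - 0.5)
        db = 0.1 * (random.random() - 0.5)
    if err < best:
        best, best_ab = err, (a, b)
    last = err
```

Here the "given" is the linear form of the perceptual function; only its weights are reorganized, which is the distinction being argued in the surrounding text.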

Best


Rick



[Martin Taylor 2017.10.12.23.45]

[From Rick Marken (2017.10.12.1730)]

RM:
… The notion that the world of experience is made up of a
hierarchy of different types of perceptual variables is, as far as
I can tell, unique to the PCT model of behavior.

That's an interesting comment. On what, I decline to speculate.

I first came across "*the notion that the world of experience is

made up of a hierarchy of different types of perceptual variables*",
in my Children’s Encyclopedia (Chambers? Cassels?) in the late
1940s, where it was simply stated as a fact in language suitable for
children. If I remember correctly, only seven different types/levels
were mentioned. The idea was, after all, at that time fairly new,
having been proposed only in 1868 as a result of research in much
the same spirit as your “levels of perception” demo (though of
course not using control theory, so I guess it doesn’t count as real
research).

But I guess it takes time for news to reach La La Land.

Martin

[From Rick Marken (2017.10.13.1050)]

···

Martin Taylor (2017.10.12.23.45)–

  RM:

… The notion that the world of experience is made up of a
hierarchy of different types of perceptual variables is, as far as
I can tell, unique to the PCT model of behavior.

MT: That’s an interesting comment. On what, I decline to speculate.

RM: Gee, Martin. So catty! I thought Canadians were supposed to be nice. So much for stereotypes;-)

MT: I first came across "*the notion that the world of experience is
made up of a hierarchy of different types of perceptual variables*",
in my Children’s Encyclopedia (Chambers? Cassels?) in the late
1940s, where it was simply stated as a fact in language suitable for
children. If I remember correctly, only seven different types/levels
were mentioned. The idea was, after all, at that time fairly new,
having been proposed only in 1868 as a result of research in much
the same spirit as your “levels of perception” demo (though of
course not using control theory, so I guess it doesn’t count as real
research).

RM: I think you’re talking about the Donders “mental chronometry” research, which presumed to measure the “stages” of mental processing, not the levels of perception. But there is research that is consistent with the notion that the world of experience is made up of a hierarchy of different types of perceptual variables: the single-cell studies of Hubel and Wiesel, who found that cells at higher levels of the visual system detect visual features of greater complexity than those detected by cells at lower levels, with the more complex features being functions of the simpler ones.


MT: But I guess it takes time for news to reach La La Land.

RM: Gee, catty and snarky, too;-)

RM: Actually, now that I have you here, I seem to recall you saying that there was a rebuttal to our “Power law of movement” paper that was being reviewed (I presume for submission to Experimental Brain Research). My co-author feels that we should have been informed of that but we haven’t been. I would appreciate it if you could let me know the title and author of the rebuttal so I can contact the journal about either being a reviewer or publishing a rebuttal of our own.

Best

Rick



[From Rupert Young (2017.10.18 18.10)]

[Rick Marken (2017.10.12.1730)]

I presume you mean that the baby is born with the perceptual-function-type

structure from which the perceptual function will form that,
after learning, will provide the ability to perceive the word
“rambunctious”, rather than being able to perceive it at birth
(which was what I meant)?

I think I am saying the same as the first part of this. By "what is

being addressed is the reference signal to a lower level system" do
you mean a specific value of the reference signal? What do you mean
by “reorganizing the way the reference for the lower level system is
addressed”; what is the parameter being changed?

Normally, if it were gain being changed from 100 to 101, say, then

this would have a small effect on the residual error. If the error
reduced you’d keep changing in this direction, and a change to 102
would have a similar effect. So, the error might change from 0.10 to
0.09, for example. I.e. there is a correspondence between the degree
of change in the parameter (gain) and the degree of change in the
error.

However, if it is the address being changed, a change from 100 to

101 (the number of the box where the value of the reference signal
is located) may result in a wildly different value being retrieved,
resulting in much different control (better or worse). So, in this
case there is no correspondence between the degree of change in
the parameter (address) and the degree of change in the error. Or
perhaps you mean something different?
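This contrast can be made concrete with a toy model (both error functions below are invented purely for illustration, not taken from any PCT code): a one-step change in gain moves the error a little, while a one-step change in address can retrieve an unrelated stored value:

```python
import random

random.seed(2)
# Hypothetical memory: 200 stored reference values in arbitrary "boxes".
memory = [random.uniform(-10.0, 10.0) for _ in range(200)]

def error_with_gain(gain):
    """Toy model in which residual error shrinks smoothly as loop
    gain rises: a continuous parameter-to-error relationship."""
    return 10.0 / (1.0 + gain)

def error_with_address(address):
    """Retrieving a neighboring memory location can yield an unrelated
    reference value, so error as a function of address is discontinuous.
    (Desired reference value of 5.0 is an arbitrary assumption.)"""
    return abs(5.0 - memory[address])

# A one-step change in gain changes the error only slightly...
smooth_change = abs(error_with_gain(101) - error_with_gain(100))

# ...while a one-step change in address can change it by any amount,
# because memory[101] need have nothing to do with memory[100].
jumpy_change = abs(error_with_address(101) - error_with_address(100))
```

This is exactly the objection in the text: gradient-following reorganization presupposes the smooth kind of relationship, and an address parameter does not provide it.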

I don't think that is quite right. It is not the weight value that

is varied randomly, but the change in the weight. As can be
seen in this code excerpt, if the error has increased then "tumble"
the change to the weight (a new random value between -1 and 1),
otherwise keep changing the weight by the same amount (in the
same direction).

    for I := 1 to MaxMatrix do              // for each control system
      begin
        if ErSq[I] >= LastErSq[I] then      // "tumble" that control system
          for J := 1 to MaxMatrix do        // for each weight: change direction
            InputCorrect[I, J] := 2.0 * (0.5 - Random);
        LastErSq[I] := ErSq[I];
      end;

where that change (InputCorrect) is applied to the weight

      Wo[I, J] := Wo[I, J] + Emax * GainReorg * InputCorrect[I, J];

the result is that the weights that have the greatest effect on reducing

the error become larger (stronger) and those that don’t become
smaller (weaker). This is shown in the weight matrix which shows the
stronger weights (~1.0) as light squares and the weaker weights
(~0.0) as dark squares.

[Image: the reorganized weight matrix, light squares for strong weights, dark for weak]

Yes, exactly. It would be better if this could be done by some

learning process.

I don't have access to the actual code used and can't quite remember

which form of gradient descent I used, ecoli or hillclimb, but it
was something like this,

if error has increased

    delta = -delta;                    // hillclimb, or
    delta = 2 * (Math.random() - 0.5); // ecoli

correction_to_gain = learningrate * delta;

and the error was root mean square over a period of iterations

error_response = Math.sqrt(errorSum / period);

I may have used the size of error within the delta calculation.
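Putting the two variants together, here is a minimal runnable sketch (in Python rather than the original Pascal/JavaScript; the function and parameter names are illustrative, not from Rupert's code) of adapting a single gain by either rule, using the RMS error over a test period as the reorganization criterion:

```python
import math
import random

def reorganize_gain(control_error, mode="ecoli", iterations=2000,
                    learning_rate=0.05, period=100, seed=1):
    """Adapt one gain parameter by hillclimb or E. coli reorganization.

    control_error(gain) stands in for running the control loop: it
    returns the squared error accumulated over one test period for the
    given gain.
    """
    random.seed(seed)
    gain, delta = 0.0, 1.0
    last_response = float("inf")
    for _ in range(iterations):
        error_sum = control_error(gain)
        error_response = math.sqrt(error_sum / period)  # RMS error over the period
        if error_response >= last_response:  # error got worse: change course
            if mode == "hillclimb":
                delta = -delta               # reverse direction
            else:                            # "ecoli": tumble to a random direction
                delta = 2 * (random.random() - 0.5)
        last_response = error_response
        gain += learning_rate * delta        # keep moving while error keeps falling
    return gain

# Toy error landscape whose minimum is at gain = 10.
best = reorganize_gain(lambda g: 100 * (g - 10.0) ** 2)
```

The only difference between the two rules is what happens on a worsening error: hillclimb reverses the direction of change, e-coli picks a new random direction; both keep the current direction while the error keeps falling.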

I think "orientation to visual upright" would be difficult, but I'll

think of something similar.

I think I will leave the question of the degree to which the

functional architecture is learned through evolution or development
up to the neuroscientists.

Regards,

Rupert
···
            RY: For example, is a baby born

with the perceptual function which provides the ability
to perceive the word “rambunctious”?

RM: Yes, I think so.

            RY: With usual reorganisation,
changes to the system parameters move the system
closer to being able to control more efficiently
(intrinsic error reduces). But what would be changed in
this case? If the parameter being changed is the address
then the change could result in a different memory being
accessed that has nothing to do with what was being
controlled? So how would reorganisation work in this
case?

          RM: Actually, if you look at the diagram of the memory

addressing system proposed in B:CP you will see that what
is being addressed is the reference signal to a lower
level system that is part of the means used by the higher
level system – the one sending the address signal – to
control its perception. So reorganizing the way the
reference for the lower level system is addressed is
functionally equivalent to reorganizing the parameters of
the output function of a control system as the means of
getting it to control better.

RY: As I understand Bill’s arm reorg system it is
about the strength of the output of functions. It starts
off with random weights (gains) on output connections to
14 lower systems. Through reorganisation the strengths
of those weights change with 13 reducing relative to the
one that has effects that result in better control.

          RM: I believe that in that model the 14 weights were
the weights of an impulse response function that
constitutes the output function of the control system
being reorganized. The values of all 14 weights were varied
randomly based on the size of the error in the control
system. The result was a nice negative exponential shaped
impulse response function that was continuously convolved
against the error signal to produce the output that
produces the best control (lowest time varying error).
Reinforcement (strengthening) would not have worked and
was not involved.

RM: So what has to be learned is how to vary
actions appropriately in order to produce
intended results. So some version of a
reorganization model, which varies the
parameters of control functions rather than
the strength of particular outputs produced
by these functions, and does so as the means
of improving a control system’s ability to
control a perceptual variable, seems like
the best approach to control system
learning.

          RM: The "givens" are presumably the different types of

perceptual variables that would be found, via PCT
research, to be the types that are controlled by humans.
Powers hypothesized what these “given” types would be in
B:CP. Of course, once these types are identified it’s
going to be a difficult job trying to figure out how to
build the perceptual functions that implement them.

          RY: I did that a couple of years ago

here, https://www.youtube.com/watch?v=QF7K6Lhx5C8

          RM: That's terrific. Could you send me the code for
that when you get a chance?

            RY: But that is on the output side.

I think similar learning is also required on the input
side, to learn perceptual functions.

          RM: I agree that we learn to perceive things; I just

believe that the structures that allow us to learn the new
perceptions are given. But I think you can build a simple
demonstration of perceptual learning using what I think
may be one of the simplest perceptual function “givens”; a
weighted linear combination of inputs. So how about this;
build a perceptual function for the balancing robot that
is a linear combination of two inputs; orientation to
gravity (the gyro sensor) and orientation to visual
upright (if you can get it). Have the system reorganize
the weights of this perceptual function until control of
this perception is as good as you can get it. See whether
the result is that the perception becomes all gyro, all
visual or some proportional combination of both.

            RY: Do you see any reason why that

is not practical within living systems, given that we
have years of development available to us to build up
these perceptual functions? To learn something new we
first need to learn to perceive it before (or at the
same time) we can improve the performance.

          RM: Of course, we learn new perceptions; I just don't

think we learn the perceptual functions that create these
perceptions. The simple perceptual function I suggested is
the kind of “given” I am imagining: a weighted sum like p
= ag+bv. So the learning involves varying the weights, a
and b, of the existing perceptual function. I think the
existence of a functional architecture like this is even
more essential to learning things like programs (grammar)
and principles (of good writing style, for example). I
think it’s unlikely that the nervous system could develop,
through random trial and error, the neural architecture
for a perceptual function that would perceive the degree,
for example, to which you have “control of the center” in
chess in the few years it takes for a reasonably bright
child to learn to perceive this principle.

[From Rick Marken (2017.10.20.1030)]


···

Rupert Young (2017.10.18 18.10)

RY: I presume you mean that the baby is born with perceptual function

type structure from which will form the perceptual function that,
after learning, will provide the ability to perceive the word
“rambunctious”, rather than being able to perceive it at birth
(which was what I meant)?

RM: Yes, exactly!

RY: I think I am saying the same as the first part of this. By "what is

being addressed is the reference signal to a lower level system" do
you mean a specific value of the reference signal? What do you mean
by “reorganizing the way the reference for the lower level system is
addressed”; what is the parameter being changed?

RM: Since the references for lower level variables in all the models I’ve built have been continuous variables, I would call the outputs of the higher level systems the “address” signals for those references. Maybe it works that way all the way up the hierarchy so that “addressing” is really just the quantitative output of higher level systems. But maybe not. If not, I don’t really know what parameter changes for such an addressing scheme would look like.


RY: Normally, if it were gain being changed from 100 to 101, say, then

this would have a small effect on the residual error. If the error
reduced you’d keep changing in this direction, and a change to 102
would have a similar effect. So, the error might change from 0.10 to
0.09, for example. I.e. there is a correspondence between the degree
of change in the parameter (gain) and the degree of change in the
error.

RY: However, if it is the address being changed, a change from 100 to

101 (the number of the box where the value of the reference signal
is located) may result in a wildly different value being retrieved,
resulting in much different control (better or worse). So, in this
case there is no correspondence between the degree of change in
the parameter (address) and the degree of change in the error. Or
perhaps you mean something different?

RM: Actually, I’m just describing the ideas Bill described in B:CP. I don’t really know how to implement a reorganization of an addressing scheme if the addresses are like computer addresses – that is, arbitrarily related to their contents.

RY: I don’t think that is quite right.

RM: Yes, sorry, we’re talking about different things.


RY: It is not the weight value that
is varied randomly, but the change in the weight. As can be
seen in this code excerpt, if the error has increased then tumble
the change to the weight (to a random value between -1 and 1),
otherwise keep changing the weight by the same amount (in the same
direction).

for I := 1 to MaxMatrix do // for each control system
  begin
    if ErSq[I] >= LastErSq[I] then // "tumble" that control system
      for J := 1 to MaxMatrix do // for each weight: change direction
        InputCorrect[I, J] := 2.0 * (0.5 - Random);
    LastErSq[I] := ErSq[I];
  end;



where that change (InputCorrect) is applied to the weight



      Wo[I, J] := Wo[I, J] + Emax * GainReorg * InputCorrect[I, J];



the result is that the weights that have the greatest effect on reducing
the error become larger (stronger) and those that don’t become
smaller (weaker). This is shown in the weight matrix which shows the
stronger weights (~1.0) as light squares and the weaker weights
(~0.0) as dark squares.

RM: I think what this matrix shows is essentially the correlation between the output weights and input weights of each system. The whiter the cell in the matrix, the higher the correlation between system output and input weights. The matrix shows that the reorganization scheme is essentially orthogonalizing the systems so that the outputs of each system mainly affect the inputs it controls. I don’t think there is any “strengthening” of connections going on. Again, the perceptual functions are not being reorganized; just the connection of outputs to inputs.

RY: I think I will leave the question of the degree to which the

functional architecture is learned through evolution or development
up to the neuroscientists.

RM: Probably a good idea ;-)

Best

Rick

Regards,

Rupert


Richard S. Marken

"Perfection is achieved not when you have nothing more to add, but when you
have nothing left to take away."
                --Antoine de Saint-Exupery

            RY: For example, is a baby born

with the perceptual function which provides the ability
to perceive the word “rambunctious”?

RM: Yes, I think so.

          RM: Actually, if you look at the diagram of the memory

addressing system proposed in B:CP you will see that what
is being addressed is the reference signal to a lower
level system that is part of the means used by the higher
level system – the one sending the address signal – to
control its perception…

RY: As I understand Bill’s arm reorg system it is
about the strength of the output of functions. It starts
off with random weights (gains) on output connections to
14 lower systems. Through reorganisation the strengths
of those weights change with 13 reducing relative to the
one that has effects that result in better control.

          RM: I believe that in that model the 14 weights were
the weights of an impulse response function that
constitutes the output function of the control system
being reorganized. The values of all 14 weights were varied
randomly based on the size of the error in the control
system. The result was a nice negative exponential shaped
impulse response function that was continuously convolved
against the error signal to produce the output that
produces the best control (lowest time varying error).
Reinforcement (strengthening) would not have worked and
was not involved.

RM: So what has to be learned is how to vary
actions appropriately in order to produce
intended results. So some version of a
reorganization model, which varies the
parameters of control functions rather than
the strength of particular outputs produced
by these functions, and does so as the means
of improving a control system’s ability to
control a perceptual variable, seems like
the best approach to control system
learning.

          RM: Of course, we learn new perceptions; I just don't

think we learn the perceptual functions that create these
perceptions…

[From Rupert Young (2017.10.22 21.10)]

(Rick Marken (2017.10.20.1030)

Yep, it is an area that lacks detail as the theory currently

stands. I think it (learning in general) would be an important
thing to work out, probably the most important way in which PCT
could be progressed.

I don't see any indication of this (correlation) in the write-up;

besides all the input weights are 1 (or 0). According to p130 of
LCS3 the squares in the matrix represent the output weights, which
shows which environmental variable is affected by the weight.

As it also says, the "white square shows that the ...  control

system’s output is having a relatively large effect on the …
environmental variable … The darker squares in the same row show
there is little effect on the others", p131. I don’t think it makes
any difference whether we use the term “larger effect” of a
connection or “stronger effect” of a connection, it’s just
terminology. It also doesn’t mean that reinforcement is involved.

Regards,

Rupert


···

RM: Actually, I’m just describing the ideas Bill
described in B:CP. I don’t really know how to implement a
reorganization of an addressing scheme if the addresses
are like computer addresses – that is, arbitrarily
related to their contents.

            the result is that

the weights that have greatest effect on reducing the
error become larger (stronger) and those that don’t
become smaller (weaker). This is shown in the weight
matrix which shows the stronger weights (~1.0) as light
squares and the weaker weights (~0.0) as dark squares.

          RM: I think what this matrix shows is essentially the

correlation between the output weights and input weights
of each system. The whiter the cell in the matrix, the
higher the correlation between system output and input
weights. The matrix shows that the reorganization scheme
is essentially orthogonalizing the systems so that the
outputs of each systems mainly affect the inputs it
controls. I don’t think there is any “strengthening” of
connections going on. Again, the perceptual functions are
not being reorganized; just the connection of outputs to
inputs.

[From Rick Marken (2017.10.23.1145)]


···

Rupert Young (2017.10.22 21.10)

RY: I don't see any indication of this (correlation) in the write-up;

besides all the input weights are 1 (or 0). According to p130 of
LCS3 the squares in the matrix represent the output weights, which
shows which environmental variable is affected by the weight.

RM: Yes, the input weights are either 0 or 1. Each of the perceptual functions of the 14 control systems is a 14 element vector where each element is the weight given to the input from each of 14 environmental variables – joint angles in this case. Rather than randomly assigning weights to these 14 environmental variables, as in the MultiControl demo, each perceptual function gets a weight of 1 in its vector that corresponds to a unique one of the 14 joint angles and a 0 for all the other angles. So each of the 14 control systems perceives one unique environmental variable (joint angle).

RM: The outputs of these 14 control systems are also a 14 element vector where each element of the output vector is a weight that determines the effect of that element on the corresponding joint angle. The simulation begins with the elements of the output vectors on all 14 control systems set to 0. E. coli type reorganization then changes the weights of these elements continuously between a value of 0 and 1. A perfect “solution” to the reorganization would result in the output vector of each control system matching its input vector. For example, the input vector (perceptual function) for control system 1 is [1,0,0,0,0,0,0,0,0,0,0,0,0,0], meaning that it perceives only shoulder roll angle. The perfect reorganization solution output vector would look just like the input vector, meaning that the output of system 1 would affect only shoulder roll angle. But the best solution reorganization can come up with might be [.8,.02,.01,0,.2,0,0,.1,0,0,.3,.005,0,.15].

RM: The matrix on p. 130 of LCS III (the one copied above) measures what is essentially the correlation between the input and output vectors for each system during each iteration of the reorganization process. If the input and output vectors are identical, the correlation between them is 1.0. The closer an element of the matrix is to white, the closer the correlation between input and output vectors is to 1.0. If each system’s output vector were identical to only its own input vector then the diagonal elements would be white and all other elements would be black. Negative correlations would be blackest.

RM: In fact, Bill didn’t use correlation to measure the similarity of input to output vectors for each system. What he did is described on p. 130. It’s a little arcane but it amounts to the same thing as measuring the correlation between the output and input vectors for each system: the whiteness of each element in the matrix is a measure of what is essentially the correlation between each output and input vector.

RY: As it also says, the "white square shows that the ...  control

system’s output is having a relatively large effect on the …
environmental variable …

RM: Yes, large relative to the effect of its output on the other environmental variables (joint angles) that are not controlled by that system. For example, if the output vector for system 1 (shoulder roll angle) is [.8,.02,.01,0,.2,0,0,.1,0,0,.3,.005,0,.15] then even though the weight for the angle being controlled is not 1.0 (it’s .8) it is much larger than the weights for the effect of this output on the other systems. The correlation between this output vector and the input vector for system 1 – [1,0,0,0,0,0,0,0,0,0,0,0,0,0] – is .9, almost white.
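That ≈.9 figure can be checked directly. A short sketch (Python; the helper name is invented) computing the Pearson correlation between the two vectors quoted above:

```python
import math

def correlation(u, v):
    """Pearson correlation between two equal-length weight vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mu) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (sd_u * sd_v)

# Input vector for system 1 (perceives only shoulder roll angle) and the
# "best solution" output vector quoted above.
input_vec = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
output_vec = [.8, .02, .01, 0, .2, 0, 0, .1, 0, 0, .3, .005, 0, .15]

r = correlation(input_vec, output_vec)  # ≈ 0.90, i.e. an almost-white cell
```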


RY: The darker squares in the same row show

there is little effect on the others", p131. I don’t think it makes
any difference whether we use the term “larger effect” of a
connection or “stronger effect” of a connection, it’s just
terminology. It also doesn’t mean that reinforcement is involved.

RM: Right. It works the way it works. But I do think it is misleading to call it reinforcement learning because reinforcement learning implies that the result of the reorganization is “selected” by external events (consequences of action) when, in fact, the result of reorganization is selected by the reorganization system trying to reduce intrinsic error. The appearance that external events select behavior is another example of a behavioral illusion.

          RM: I think what this matrix shows is essentially the

correlation between the output weights and input weights
of each system. The whiter the cell in the matrix, the
higher the correlation between system output and input
weights. The matrix shows that the reorganization scheme
is essentially orthogonalizing the systems so that the
outputs of each systems mainly affect the inputs it
controls.

Best

Rick

Richard S. Marken

"Perfection is achieved not when you have nothing more to add, but when you
have nothing left to take away."
                --Antoine de Saint-Exupery

[From Rupert Young (2017.10.23 21.50)]

(Rick Marken (2017.10.23.1145)]

Yep, so the output connection to the joint angle which has the most

effect on the corresponding input becomes stronger, and the others
become weaker.

Who is calling it reinforcement learning?

I think the language can be a bit confusing, since a consequence of
the action is that the intrinsic error reduces. So, just to be clear,
do you mean that reinforcement learning implies that the changes in
weight are due to the presence of specific external events? Whereas,
in reorganisation, there is no direct link between the event and the
weight change, but there is between the error and the weight change.

Rupert
···

RM: The outputs of these 14 control systems are also a
14 element vector where each element of the output vector
is a weight that determines the effect of that element on
the corresponding joint angle. The simulation begins with
the elements of the output vectors on all 14 control
systems set to 0. E. coli type reorganization then changes
the weights of these elements continuously between a value
of 0 and 1. A perfect “solution” to the reorganization
would result in the output vector of each control system
matching its input vector. For example, the input vector
(perceptual function) for control system 1 is
[1,0,0,0,0,0,0,0,0,0,0,0,0,0], meaning that it perceives
only shoulder roll angle. The perfect reorganization
solution output vector would look just like the input
vector, meaning that the output of system 1 would affect
only shoulder roll angle. But the best solution
reorganization can come up with might be
[.8,.02,.01,0,.2,0,0,.1,0,0,.3,.005,0,.15].

      RM: Right. It works the way it works.

But I do think it is misleading to call it reinforcement
learning …

      RM: Right. It works the way it works.

But I do think it is misleading to call it reinforcement
learning because reinforcement learning implies that the
result of the reorganization is “selected” by external events
(consequences of action) when, in fact, the result of
reorganization is selected by the reorganization system trying
to reduce intrinsic error. The appearance that external events
select behavior is another example of a behavioral illusion.

[From Bruce Nevin (2017.10.24.16:33 ET)]

Rick Marken (2017.10.12.1730) –

Not really. The notion that the world of experience is made up of a hierarchy of different types of perceptual variables is, as far as I can tell, unique to the PCT model of behavior.

Unique to the PCT model of behavior, yes, but as discussed a number of times on CSGnet the perceptual hierarchy as a model of *perception* is not at all unique to PCT.

But there is some “naturalistic” evidence that people come equipped with certain types of built-in perceptual functions that are used as the basis of learning new perceptions of that type. Perhaps the most obvious example is language, where people learn to perceive and control the words and grammatical structure of the language into which they are born – a language with very different words and grammars than the others that the kid might have had to learn. Chomsky, I believe, suggested that this was because all humans come with the mental capacity – called a language acquisition device (LAD) – that allows them to learn the particular language of their group. In PCT, this LAD is the built-in sequence (word) and program (grammar) type perceptual functions.

I have been inveighing against Chomsky’s LAD and its congeners for 26 years on this forum. Universal Grammar, innate language acquisition device, and kindred notions are neither necessary nor sufficient in an account of grammar, and too powerful to support a competent theory, akin to saying “God did it”. For a glimpse of the absurdities in Chomsky’s views, consider pp. 4-5 of the review at

http://brooklynrail.org/2016/09/field-notes/understanding-the-labyrinth-noam-chomskys-science-and-politics

All that is biologically necessary is the cognitive capacity to control complex dependencies among different kinds of perceptions (some of them those constituting language), and the physiological capacity to control a sufficiently large system of phonological contrasts (differences of sound/articulation that make a difference between one word and another). Higher primates may have the former, but they lack the latter. No one has yet discerned regularities in the utterances of cetaceans that might function as words. I emphasized the phrase “biologically necessary” above because language is a product (and a means) of collective control over many generations of people living in communities.

As to Bill’s “juice” example in B:CP, our alphabetic habits prejudice us. The requirements for written records, even for phonemic writing, are not the same as the requirements for speech. It is not at all clear that the perceptions that are controlled as means of uttering a word like “juice” correspond to letters or to segmental phonemes, and there is a considerable body of work contradicting that presumption.

···

On Thu, Oct 12, 2017 at 5:33 PM, Richard Marken rsmarken@gmail.com wrote:

[From Rick Marken (2017.10.12.1730)]

Rupert Young (2017.10.11 10.00)

RY: Are we talking about the same thing, how would an agent learn to

control a new perception?

RM: I think we learn to control new perceptions, not new types of perceptions. For example, we learn to control new word perceptions, where the specific words we learn depend on the language community into which we are born. What I believe is built in by evolution is the perceptual functions that let us learn to perceive and, thus, control, these new word type perceptions.

RM: An example of the type of perceptual function that might be built-in to perceive words is shown in Figure 11.3 in B:CP. In that figure the circuit that implements the perceptual function produces a specific word perception (“juice”). What I think is built-in is the sequence of reverberation loops that make it possible to perceive any new word. What would be learned is which phoneme-type input perceptions should go into such a sequence perceiving network to produce a new word perception.
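One way to picture this division of labour: a generic sequence-detecting chain whose stages are wired in advance (the "given"), where learning only selects which phoneme inputs feed the stages. A toy sketch (Python; this is not Bill's Figure 11.3 circuit, the names are invented, and real reverberation loops would decay over time rather than latch):

```python
def make_word_detector(phonemes):
    """Build a perceptual function for one word from a generic
    sequence-detecting chain. The chain structure is the inherited
    'given'; the phoneme list is what learning would supply."""
    def perceive(stream):
        stage = 0  # which element of the sequence we are waiting for
        for p in stream:
            if p == phonemes[stage]:
                stage += 1          # this stage fires, arming the next one
                if stage == len(phonemes):
                    return True     # whole sequence perceived: the word
        return False
    return perceive

juice = make_word_detector(["j", "u", "s"])
juice(["j", "u", "s"])   # True: phonemes arrive in order
juice(["s", "u", "j"])   # False: same phonemes, wrong order
```

The point of the sketch is that the same fixed chain, fed different learned inputs, yields a detector for any word; only the connections into it change.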


RY: Are you aware of any specific research to

support this?

RM: Not really. The notion that the world of experience is made up of a hierarchy of different types of perceptual variables is, as far as I can tell, unique to the PCT model of behavior. But there is some “naturalistic” evidence that people come equipped with certain types of built-in perceptual functions that are used as the basis of learning new perceptions of that type. Perhaps the most obvious example is language, where people learn to perceive and control the words and grammatical structure of the language into which they are born – a language with very different words and grammars than the others that the kid might have had to learn. Chomsky, I believe, suggested that this was because all humans come with the mental capacity – called a language acquisition device (LAD) – that allows them to learn the particular language of their group. In PCT, this LAD is the built-in sequence (word) and program (grammar) type perceptual functions.


RY: Are "types" really anything more than a useful way for

an observer to classify perceptions (akin to “races”), which are
just forms of the general principle of output of perceptual
functions?

RM: Perhaps. But they are a central hypothesis of the PCT model, a hypothesis that people might go out and start testing if they could get over their inclination to study controlling as though it were the behavior of a sequential state S-R device. I have done some research testing the notion of hierarchical levels of different types of perceptual variables. Some of it is described in the attached paper; you can demonstrate it to yourself in this demo: http://www.mindreadings.com/ControlDemo/Hierarchy.html


RY: Anyway, it would still be necessary to learn specific perceptual
functions within those types, wouldn’t it?

RM: I think what would mainly have to be learned is what inputs go into the function. I just can’t believe that a function that produces, say, the perception of a grammatical sentence in some language, can be constructed from scratch in a few years. I think the functional connections are there (as in the sequence detecting function in Figure 11.3 of B:CP); what must be learned are the inputs to the functions.


RY: For example, is a baby

born with the perceptual function which provides the ability to
perceive the word “rambunctious”?

RM: Yes, I think so.


RY: With usual reorganisation, changes to the system parameters move
the system closer to being able to control more efficiently
(intrinsic error reduces). But what would be changed in this case?
If the parameter being changed is the address then the change could
result in a different memory being accessed that has nothing to do
with what was being controlled? So how would reorganisation work in
this case?

RM: Actually, if you look at the diagram of the memory addressing system proposed in B:CP you will see that what is being addressed is the reference signal to a lower level system that is part of the means used by the higher level system – the one sending the address signal – to control its perception. So reorganizing the way the reference for the lower level system is addressed is functionally equivalent to reorganizing the parameters of the output function of a control system as the means of getting it to control better.

RY: As I understand Bill’s arm reorg system it is about the
strength of the output of functions. It starts off with random
weights (gains) on output connections to 14 lower systems. Through
reorganisation the strengths of those weights change with 13
reducing relative to the one that has effects that result in better
control.

RM: I believe that in that model the 14 weights were the weights of an impulse response function that constitutes the output function of the control system being reorganized. The value of all 14 weights were varied randomly based on the size of the error in the control system. The result was a nice negative exponential shaped impulse response function that was continuously convolved against the error signal to produce the output that produces the best control (lowest time varying error). Reinforcement (strengthening) would not have worked and was not involved.

RY: Therein lies the rub! I don't see that this can be done manually

except for some basic functions, so some form of learning would be
required.

RM: The “givens” are presumably the different types of perceptual variables that would be found, via PCT research, to be the types that are controlled by humans. Powers hypothesized what these “given” types would be in B:CP. Of course, once these types are identified it’s going to be a difficult job trying to figure out how to build the perceptual functions that implement them.


RY: I'm also not convinced that "type" is a meaningful term

except to the observer. After all what makes one function different
from another apart from the variable it is controlling?

RM: OK, so you are not convinced of the correctness of the PCT model of behavior as the control of a hierarchy of different types of perceptual variables. And you shouldn’t be since it is still almost pure speculation and has hardly been tested at all. And it won’t get tested until a lot more researchers quit studying control systems as S-R devices and start studying them as what PCT says they are: perceptual control systems.

RY: I did that a couple of years ago here, https://www.youtube.com/watch?v=QF7K6Lhx5C8

RM: That’s terrific. Could you send me the code for that when you get a chance?

RY: But that is on the output side. I think similar learning is also

required on the input side, to learn perceptual functions.

RM: I agree that we learn to perceive things; I just believe that the structures that allow us to learn the new perceptions are given. But I think you can build a simple demonstration of perceptual learning using what I think may be one of the simplest perceptual function “givens”; a weighted linear combination of inputs. So how about this; build a perceptual function for the balancing robot that is a linear combination of two inputs; orientation to gravity (the gyro sensor) and orientation to visual upright (if you can get it). Have the system reorganize the weights of this perceptual function until control of this perception is as good as you can get it. See whether the result is that the perception becomes all gyro, all visual or some proportional combination of both.

RY: Do you

see any reason why that is not practical within living systems,
given that we have years of development available to us to build up
these perceptual functions? To learn something new we first need to
learn to perceive it before (or at the same time) we can improve the
performance.

RM: Of course, we learn new perceptions; I just don’t think we learn the perceptual functions that create these perceptions. The simple perceptual function I suggested is the kind of “given” I am imagining: a weighted sum like p = ag + bv. So the learning involves varying the weights, a and b, of the existing perceptual function. I think the existence of a functional architecture like this is even more essential to learning things like programs (grammar) and principles (of good writing style, for example). I think it’s unlikely that the nervous system could develop, through random trial and error, the neural architecture for a perceptual function that would perceive, for example, the degree to which you have “control of the center” in chess, in the few years it takes for a reasonably bright child to learn to perceive this principle.

Best

          RM: I think the question of how perceptual functions are

learned has to be preceded by research aimed at
determining whether perceptual functions are learned. I
now incline toward the idea that they are not; that the
types of perceptual functions we have are built up by
evolution.

RM: So what has to be learned is how to vary
actions appropriately in order to produce intended
results. So some version of a reorganization model, which
varies the parameters of control functions rather than the
strength of particular outputs produced by these
functions, and does so as the means of improving a control
system’s ability to control a perceptual variable, seems
like the best approach to control system learning.

          RM: I would also suggest that the first thing to do

when building a PCT-based robot is to figure out the
“givens” of the system – the types of perceptual
variables to control and the hierarchical arrangement of
these variables –

          RM: Another possibility is just to try building a

reorganization system into a simple robot that continuously
tunes up its existing control systems. This would be a
good exercise in developing a reorganization system that
works in a system dealing with the real world and not just
a software model of that world, as in the arm demo in LCS3.

Rick


Richard S. Marken

"Perfection is achieved not when you have nothing more to add, but when you
have nothing left to take away."
                --Antoine de Saint-Exupéry

[From Bruce Nevin (2017.10.24.17:28 ET)]

In my (2017.10.24.16:33 ET) post just now I said pp. 4-5, but the linked review has no page numbers. Refer to pp. 4-5 in the attached PDF, or at
http://brooklynrail.org/2016/09/field-notes/understanding-the-labyrinth-noam-chomskys-science-and-politics

search for the phrase “language organ” and read the ensuing three paragraphs.

Field.Notes.pdf (590 KB)


On Tue, Oct 24, 2017 at 5:19 PM, Bruce Nevin bnhpct@gmail.com wrote:

[From Bruce Nevin (2017.10.24.16:33 ET)]

Rick Marken (2017.10.12.1730) –

Not really. The notion that the world of experience is made up of a hierarchy of different types of perceptual variables is, as far as I can tell, unique to the PCT model of behavior.

Unique to the PCT model of behavior, yes, but as discussed a number of times on CSGnet the perceptual hierarchy as a model of *perception *is not at all unique to PCT.

But there is some “naturalistic” evidence that people come equipped with certain types of built-in perceptual functions that are used as the basis of learning new perceptions of that type. Perhaps the most obvious example is language, where people learn to perceive and control the words and grammatical structure of the language into which they are born – a language with words and grammar very different from the others that the kid might have had to learn. Chomsky, I believe, suggested that this was because all humans come with a mental capacity – called a language acquisition device (LAD) – that allows them to learn the particular language of their group. In PCT, this LAD is the built-in sequence (word) and program (grammar) type perceptual functions.

I have been inveighing against Chomsky’s LAD and its congeners for 26 years on this forum. Universal Grammar, the innate language acquisition device, and kindred notions are neither necessary nor sufficient in an account of grammar, and too powerful to support a competent theory – akin to saying “God did it”. For a glimpse of the absurdities in Chomsky’s views, consider pp. 4-5 of the review at

http://brooklynrail.org/2016/09/field-notes/understanding-the-labyrinth-noam-chomskys-science-and-politics

All that is biologically necessary is the cognitive capacity to control complex dependencies among different kinds of perceptions (some of them those constituting language), and the physiological capacity to control a sufficiently large system of phonological contrasts (differences of sound/articulation that make a difference between one word and another). Higher primates may have the former, but they lack the latter. No one has yet discerned regularities in the utterances of cetaceans that might function as words. I emphasized the phrase “biologically necessary” above because language is a product (and a means) of collective control over many generations of people living in communities.

As to Bill’s “juice” example in B:CP, our alphabetic habits prejudice us. The requirements for written records, even for phonemic writing, are not the same as the requirements for speech. It is not at all clear that the perceptions that are controlled as means of uttering a word like “juice” correspond to letters or to segmental phonemes, and there is a considerable body of work contradicting that presumption.

/Bruce

On Thu, Oct 12, 2017 at 5:33 PM, Richard Marken rsmarken@gmail.com wrote:

[From Rick Marken (2017.10.12.1730)]

Rupert Young (2017.10.11 10.00)

RY: Are we talking about the same thing, how would an agent learn to

control a new perception?

RM: I think we learn to control new perceptions, not new types of perceptions. For example, we learn to control new word perceptions, where the specific words we learn depend on the language community into which we are born. What I believe is built in by evolution is the perceptual functions that let us learn to perceive and, thus, control these new word-type perceptions.

RM: An example of the type of perceptual function that might be built in to perceive words is shown in Figure 11.3 in B:CP. In that figure the circuit that implements the perceptual function produces a specific word perception (“juice”). What I think is built in is the sequence of reverberation loops that makes it possible to perceive any new word. What would be learned is which phoneme-type input perceptions should go into such a sequence-perceiving network to produce a new word perception.
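The functional idea in that figure (a fixed, order-sensitive structure whose inputs are what get learned) can be sketched in a few lines. The sketch below is only a toy stand-in for the reverberation-loop circuit, not a model of it, and the phoneme tokens and target word are invented:

```python
# Toy sequence-detecting perceptual function. The order-sensitive
# "wiring" is the fixed given; learning would consist of choosing
# which lower-level inputs feed it.

def make_sequence_detector(target):
    """Return a perceptual function whose output goes to 1.0 only
    when the inputs arrive in the target order."""
    state = {"pos": 0}
    def perceive(phoneme):
        if phoneme == target[state["pos"]]:
            state["pos"] += 1
            if state["pos"] == len(target):
                state["pos"] = 0
                return 1.0          # whole sequence perceived
        else:
            # restart; a stray first element still counts as a start
            state["pos"] = 1 if phoneme == target[0] else 0
        return 0.0
    return perceive

juice = make_sequence_detector(["j", "u", "s"])  # crude stand-in phonemes
signals = [juice(p) for p in ["j", "u", "s", "u", "j", "u", "s"]]
```

Run on the stream above, the perception fires twice, once per completed “juice”; swapping in a different target list is the analogue of learning a new word without new wiring.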


RY: Are you aware of any specific research to

support this?

RM: Not really. The notion that the world of experience is made up of a hierarchy of different types of perceptual variables is, as far as I can tell, unique to the PCT model of behavior. But there is some “naturalistic” evidence that people come equipped with certain types of built-in perceptual functions that are used as the basis of learning new perceptions of that type. Perhaps the most obvious example is language, where people learn to perceive and control the words and grammatical structure of the language into which they are born – a language with words and grammar very different from the others that the kid might have had to learn. Chomsky, I believe, suggested that this was because all humans come with a mental capacity – called a language acquisition device (LAD) – that allows them to learn the particular language of their group. In PCT, this LAD is the built-in sequence (word) and program (grammar) type perceptual functions.


RY: Are "types" really anything more than a useful way for

an observer to classify perceptions (akin to “races”), which are
just forms of the general principle of output of perceptual
functions?

RM: Perhaps. But they are a central hypothesis of the PCT model, a hypothesis that people might go out and start testing if they could get over their inclination to study controlling as though it were the behavior of a sequential-state S-R device. I have done some research testing the notion of hierarchical levels of different types of perceptual variables. Some of it is described in the attached paper; you can demonstrate it to yourself in this demo: http://www.mindreadings.com/ControlDemo/Hierarchy.html


RY: Anyway, it would still be necessary to learn specific perceptual

functions within those types wouldn’t it?

RM: I think what would mainly have to be learned is what inputs go into the function. I just can’t believe that a function that produces, say, the perception of a grammatical sentence in some language can be constructed from scratch in a few years. I think the functional connections are there (as in the sequence-detecting function in Figure 11.3 of B:CP); what must be learned are the inputs to the functions.


RY: For example, is a baby

born with the perceptual function which provides the ability to
perceive the word “rambunctious”?

RM: Yes, I think so.


RY: With usual reorganisation, changes to the system parameters move

the system closer to being able to control more efficiently
(intrinsic error reduces). But what would be changed in this case?
If the parameter being changed is the address then the change could
result in a different memory being accessed that has nothing to do
with what was being controlled? So how would reorganisation work in
this case?

RM: Actually, if you look at the diagram of the memory addressing system proposed in B:CP you will see that what is being addressed is the reference signal to a lower level system that is part of the means used by the higher level system – the one sending the address signal – to control its perception. So reorganizing the way the reference for the lower level system is addressed is functionally equivalent to reorganizing the parameters of the output function of a control system as the means of getting it to control better.

RY: As I understand Bill’s arm reorg system it is about the
strength of the output of functions. It starts off with random
weights (gains) on output connections to 14 lower systems. Through
reorganisation the strengths of those weights change with 13
reducing relative to the one that has effects that result in better
control.

RM: I believe that in that model the 14 weights were the weights of an impulse-response function that constitutes the output function of the control system being reorganized. The values of all 14 weights were varied randomly based on the size of the error in the control system. The result was a nice negative-exponential-shaped impulse response function that was continuously convolved with the error signal to produce the output that yields the best control (lowest time-varying error). Reinforcement (strengthening) would not have worked and was not involved.
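For readers without LCS3 to hand, that scheme can be caricatured in a few lines: the output function is a set of impulse-response weights applied to recent error, and the weights drift in a random direction that is kept while RMS error falls and replaced (a “tumble”) when it rises. The disturbance shape, step size, and episode length here are illustrative assumptions, not the arm model’s actual parameters:

```python
import math
import random

# E. coli-style reorganization of an output function's 14
# impulse-response weights (illustrative parameters throughout).

N = 14
rng = random.Random(0)              # fixed seed for repeatability

def run_episode(weights, steps=200):
    """Run the control loop with fixed weights; return RMS error."""
    hist = [0.0] * N                # recent errors, newest first
    sq = 0.0
    for t in range(steps):
        d = math.sin(2 * math.pi * t / 100)            # slow disturbance
        o = sum(w * e for w, e in zip(weights, hist))  # convolve with error
        o = max(-10.0, min(10.0, o))                   # clip: keep loop bounded
        e = 0.0 - (o + d)                              # p = o + d, reference = 0
        hist = [e] + hist[:-1]
        sq += e * e
    return math.sqrt(sq / steps)

weights = [0.0] * N
direction = [rng.uniform(-1, 1) for _ in range(N)]
initial = best = run_episode(weights)
for _ in range(40):
    trial = [w + 0.01 * u for w, u in zip(weights, direction)]
    rms = run_episode(trial)
    if rms < best:                  # error fell: keep drifting this way
        weights, best = trial, rms
    else:                           # error rose: tumble to a new direction
        direction = [rng.uniform(-1, 1) for _ in range(N)]

print(f"RMS error: {initial:.3f} -> {best:.3f}")
```

Note that no individual output is strengthened by success; the whole weight set is perturbed, and the perturbation is judged only by its effect on control error.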

RY: Therein lies the rub! I don't see that this can be done manually

except for some basic functions, so some form of learning would be
required.

RM: The “givens” are presumably the different types of perceptual variables that would be found, via PCT research, to be the types that are controlled by humans. Powers hypothesized what these “given” types would be in B:CP. Of course, once these types are identified it’s going to be a difficult job trying to figure out how to build the perceptual functions that implement them.


RY: I'm also not convinced that "type" is a meaningful term

except to the observer. After all, what makes one function different
from another apart from the variable it is controlling?

RM: OK, so you are not convinced of the correctness of the PCT model of behavior as the control of a hierarchy of different types of perceptual variables. And you shouldn’t be, since it is still almost pure speculation and has hardly been tested at all. And it won’t get tested until a lot more researchers quit studying control systems as S-R devices and start studying them as what PCT says they are: perceptual control systems.

RY: I did that a couple of years ago here, https://www.youtube.com/watch?v=QF7K6Lhx5C8

RM: That’s terrific. Could you send me the code for that when you get a chance?

RY: But that is on the output side. I think similar learning is also

required on the input side, to learn perceptual functions.

RM: I agree that we learn to perceive things; I just believe that the structures that allow us to learn the new perceptions are given. But I think you can build a simple demonstration of perceptual learning using what I think may be one of the simplest perceptual function “givens”: a weighted linear combination of inputs. So how about this: build a perceptual function for the balancing robot that is a linear combination of two inputs – orientation to gravity (the gyro sensor) and orientation to visual upright (if you can get it). Have the system reorganize the weights of this perceptual function until control of this perception is as good as you can get it. See whether the result is that the perception becomes all gyro, all visual, or some proportional combination of both.
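A sketch of that experiment, with the robot reduced to a one-line tilt model: the controlled perception is p = a*g + (1-a)*v (constraining b = 1-a keeps the perception calibrated), and each candidate weighting is scored by how well the actual tilt, the intrinsic variable, stays near zero. The noise levels, the loop gain, and the grid search standing in for reorganization are all assumptions for illustration:

```python
import math
import random

# Score candidate perceptual functions p = a*gyro + (1-a)*visual by the
# control they yield over the ACTUAL tilt. Illustrative parameters.

rng = random.Random(1)
STEPS = 2000
# Shared noise streams so every candidate faces the same world.
gyro_noise = [rng.gauss(0, 0.02) for _ in range(STEPS)]  # gyro: low noise
vis_noise = [rng.gauss(0, 0.30) for _ in range(STEPS)]   # vision: noisy
drift = [rng.gauss(0, 0.05) for _ in range(STEPS)]       # disturbance

def rms_tilt(a):
    """Closed-loop RMS of actual tilt when perceiving p = a*g + (1-a)*v."""
    tilt, sq = 0.0, 0.0
    for t in range(STEPS):
        g = tilt + gyro_noise[t]                # gyro reading
        v = tilt + vis_noise[t]                 # visual-upright reading
        p = a * g + (1 - a) * v                 # the perceptual function
        tilt += -0.5 * p + drift[t]             # act to drive p toward 0
        sq += tilt * tilt
    return math.sqrt(sq / STEPS)

# Stand-in for reorganization: keep the weighting that leaves the
# least intrinsic error.
best_a = min((i / 10 for i in range(11)), key=rms_tilt)
print(f"best gyro weight a = {best_a:.1f}")
```

With these noise levels the gyro-heavy weightings should win, i.e. the perception becomes mostly gyro, which is the kind of outcome the proposed reorganization run would reveal.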

RY: Do you

see any reason why that is not practical within living systems,
given that we have years of development available to us to build up
these perceptual functions? To learn something new we first need to
learn to perceive it before (or at the same time) we can improve the
performance.

RM: Of course, we learn new perceptions; I just don’t think we learn the perceptual functions that create these perceptions. The simple perceptual function I suggested is the kind of “given” I am imagining: a weighted sum like p = ag + bv. So the learning involves varying the weights, a and b, of the existing perceptual function. I think the existence of a functional architecture like this is even more essential to learning things like programs (grammar) and principles (of good writing style, for example). I think it’s unlikely that the nervous system could develop, through random trial and error, the neural architecture for a perceptual function that would perceive, for example, the degree to which you have “control of the center” in chess, in the few years it takes for a reasonably bright child to learn to perceive this principle.

Best

          RM: I think the question of how perceptual functions are

learned has to be preceded by research aimed at
determining whether perceptual functions are learned. I
now incline toward the idea that they are not; that the
types of perceptual functions we have are built up by
evolution.

RM: So what has to be learned is how to vary
actions appropriately in order to produce intended
results. So some version of a reorganization model, which
varies the parameters of control functions rather than the
strength of particular outputs produced by these
functions, and does so as the means of improving a control
system’s ability to control a perceptual variable, seems
like the best approach to control system learning.

          RM: I would also suggest that the first thing to do

when building a PCT-based robot is to figure out the
“givens” of the system – the types of perceptual
variables to control and the hierarchical arrangement of
these variables –

          RM: Another possibility is just to try building a

reorganization system into a simple robot that continuously
tunes up its existing control systems. This would be a
good exercise in developing a reorganization system that
works in a system dealing with the real world and not just
a software model of that world, as in the arm demo in LCS3.

Rick


Richard S. Marken

"Perfection is achieved not when you have nothing more to add, but when you
have nothing left to take away."
                --Antoine de Saint-Exupéry

[From Rick Marken (2017.10.27.1030)]


Bruce Nevin (2017.10.24.16:33 ET)–

But there is some “naturalistic” evidence that people come equipped with certain types of built-in perceptual functions that are used as the basis of learning new perceptions of that type. Perhaps the most obvious example is language, where people learn to perceive and control the words and grammatical structure of the language into which they are born – a language with words and grammar very different from the others that the kid might have had to learn. Chomsky, I believe, suggested that this was because all humans come with a mental capacity – called a language acquisition device (LAD) – that allows them to learn the particular language of their group. In PCT, this LAD is the built-in sequence (word) and program (grammar) type perceptual functions.

BN: I have been inveighing against Chomsky’s LAD and its congeners for 26 years on this forum. Universal Grammar, the innate language acquisition device, and kindred notions are neither necessary nor sufficient in an account of grammar, and too powerful to support a competent theory – akin to saying “God did it”. For a glimpse of the absurdities in Chomsky’s views, consider pp. 4-5 of the review at

http://brooklynrail.org/2016/09/field-notes/understanding-the-labyrinth-noam-chomskys-science-and-politics

BN: All that is biologically necessary is the cognitive capacity to control complex dependencies among different kinds of perceptions (some of them those constituting language), and the physiological capacity to control a sufficiently large system of phonological contrasts

RM: That’s basically what I said: In order to be able to control complex dependencies (as in grammars) among different types of perceptions, you have to be able to perceive these complex dependencies; that is, in order to control programs, like grammars, you have to be able to perceive programs.

RM: I believe that the neural architectures that make it possible to perceive programs are built in by evolution. I don’t believe it is possible to develop these architectures in the 3 or 4 years that it takes for kids to learn to talk. So I agree with Chomsky that humans almost certainly come equipped with an innate ability to do language. He calls this innate ability a language acquisition device (LAD), but this says little more than that humans have an innate ability to learn language.

RM: PCT is a bit more specific about what it is that is innate in humans that makes it possible for them to learn language. PCT says it’s the ability to perceive programs – particular networks of contingencies between lower-level perceptions. That’s why I say that the ability to perceive (and therefore control) programs is the PCT equivalent of the LAD. The PCT version is a bit more specific about what makes it possible for humans to do (control) language and why even our cousin primates can’t do it (not very well, anyway).

RM: I believe PCT would say that the evolutionary brain change that made complex language communication possible was the development of the brain structures that made it possible for humans to perceive (and thus control) programs. Of course, the ability to perceive programs makes it possible to perceive and control any program, not just language. So I believe that once these program perception structures evolved, humans became capable of not only doing language – any language – but also complex activities that require the ability to control for the occurrence of a particular network of contingencies, such as controlling for the logistics (a network of contingencies) involved in hunting in bands or building skyscrapers.
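One way to make “a particular network of contingencies” concrete: represent the program as a table of state transitions, and let the perceptual function report whether a stream of lower-level perceptions conforms to it. The three-event “program” below is invented purely for illustration:

```python
# A program perception as conformance to a contingency network.
# States and events below are made up for illustration.

PROGRAM = {
    "start": {"greet": "greeted"},
    "greeted": {"request": "requested", "greet": "greeted"},
    "requested": {"thank": "done"},
}

def perceives_program(events):
    """Return 1.0 if the event stream follows the contingency
    network from 'start' to 'done', else 0.0."""
    state = "start"
    for ev in events:
        nxt = PROGRAM.get(state, {}).get(ev)
        if nxt is None:
            return 0.0              # a contingency was violated
        state = nxt
    return 1.0 if state == "done" else 0.0

ok = perceives_program(["greet", "request", "thank"])   # conforms
bad = perceives_program(["request", "greet"])           # violates
```

Learning which events feed which contingencies would change the table’s entries; the capacity to run such a table at all is the candidate “given”.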

Best

Rick


(differences of sound/articulation that make a difference between one word and another). Higher primates may have the former, but they lack the latter. No one has yet discerned regularities in the utterances of cetaceans that might function as words. I emphasized the phrase “biologically necessary” above because language is a product (and a means) of collective control over many generations of people living in communities.

As to Bill’s “juice” example in B:CP, our alphabetic habits prejudice us. The requirements for written records, even for phonemic writing, are not the same as the requirements for speech. It is not at all clear that the perceptions that are controlled as means of uttering a word like “juice” correspond to letters or to segmental phonemes, and there is a considerable body of work contradicting that presumption.

/Bruce


Richard S. Marken

"Perfection is achieved not when you have nothing more to add, but when you
have nothing left to take away."
                --Antoine de Saint-Exupéry


[Martin Taylor 2017.10.27.13.41]

[From Rick Marken (2017.10.27.1030)]

Unsurprisingly, I take a more nuanced view, but opposing one belief

to another is not very scientific, so I won’t bother to go further.
You might be able to guess my view from what follows.

I'll leave Bruce to comment on whether the Language Acquisition

Device (Is that the correct expansion of LAD?) is identical to the
ability to perceive programs. My contribution here is based on
research I had to review when I was on the Psychology Grants
Committee of the Canadian National Science and Engineering Research
Council a few decades ago. Vincent DiLollo submitted a proposal that
intrigued me then and still does. Like Plooij, DiLollo found in
individuals a sequence of plateaus of development that tended to
occur at around the same age in almost every “normal” person. (As
far as I remember there were 14 levels for people who reached very
high academic or intellectual levels – not the same thing.) DiLollo
attributed the plateaus to the acquisition of new abilities, each
requiring one new function that had not been present earlier in
life. I see his levels as
analogous to (and perhaps being) the development of new kinds of
controllable perception.

The problem for this analogy and the requirement for language

processing to be dependent on program level perceptual control is
that DiLollo found that the ability to process “If-then” structures
arrives much later than the ability to communicate by speech.
Obviously there is very little we would call “grammar” in a child’s
first two-word utterances, but something very like grammar does
emerge soon thereafter. The difference between “Give me” and “Me
give” is not learned late in life, but is easily transformed into
“Baby give me” as distinct from “Me give Baby”.

Is program-level perceptual control ever required for normal

conversational speech? Perhaps it is, in the sense of “If I
say it this way, then she will see it the way I want her to
see it, else if I say it that way she will be annoyed.”
Instead, I think what needs program-level perceptual control is the
ability to puzzle through a sequence of steps needed to solve
puzzles. Some birds and some animals can do this with simple
puzzles, and some may be able to tell others of their species how to
do the same kind of puzzle (I think of porpoises).

That ability to communicate about program-level perceptions rather

than about the current circumstances (Me have dolly, not Baby) is
the distinctive feature that enables the development of complex
technique by human societies rather than by individual humans
starting from scratch on individual puzzles. To use recursion in
language may be the tipping point: “If you want to do that then if
you get this, then if you …” allows a teacher to pass an algorithm
for making something to a student, who may then improve the
algorithm and pass it further along.

I don't remember enough of DiLollo's work to be able to tell you his

methods, and it wouldn’t matter to Rick anyway, since all work not
based in PCT must be discarded, but from what I remember I think his
findings were pretty clear. Together with the way I think PCT
actually predicts the development of language ability, I would agree
with Rick (and DiLollo) that evolution having provided our brains with
enough functional capacity to make explicit and communicate
program-level perceptions may be the distinguishing feature that
separates us from (most?) other living species. I hope some day
someone learns enough porpoise to know if they do the same.

Martin
···
              Bruce Nevin

(2017.10.24.16:33 ET)–

                    <RM> But there

is some “naturalistic” evidence that people come
equipped with certain types of built-in
perceptual functions that are used as the basis
of learning new perceptions of that type. Perhaps the most
obvious example is language, where people
learn to perceive and control the words and
grammatical structure of the language into
which they are born – a language with very
different words and grammars than the others
that the kid might have had to learn. Chomsky,
I believe, suggested that this was because all
humans come with the mental capacity – called
a language acquisition device (LAD) – that
allows them to learn the particular language
of their group. In PCT, this LAD is the built-in
sequence (word) and program (grammar) type
perceptual functions.

                  BN: I have been

inveighing against Chomsky’s LAD and its congeners
for 26 years on this forum. Universal Grammar,
innate language acquisition device, and kindred
notions are neither necessary nor sufficient in an
account of grammar, and too powerful to support a
competent theory–akin to saying “God did it”. For
a glimpse of the absurdities in Chomsky’s views,
consider pp. 4-5 of the review at

http://brooklynrail.org/2016/09/field-notes/understanding-the-labyrinth-noam-chomskys-science-and-politics

BN: All that is *biologically* necessary is the cognitive capacity to control
complex dependencies among different kinds of
perceptions (some of them those constituting
language), and the physiological capacity to
control a sufficiently large system of
phonological contrasts.

          RM: That's basically what I said: In order to be able

to control complex dependencies (as in grammars) among
different types of perceptions you have to be able to
perceive these complex dependencies; that is, in order to
control programs, like grammars, you have to be able to
perceive programs.

          RM: I believe that the neural architectures that make

it possible to perceive programs are built in by
evolution. I don’t believe it is possible to develop
these architectures in the 3 or 4 years that it takes for
kids to learn to talk.

          So I agree with Chomsky that humans almost certainly

come equipped with an innate ability to do language. He
calls this innate ability a language acquisition device
(LAD), but this says little more than that humans have an
innate ability to learn language.

          RM: PCT is a bit more specific about what it is that is

innate in humans that makes it possible for them to learn
language. PCT says it’s the ability to perceive programs
– particular networks of contingencies between lower
level perceptions. That’s why I say that the ability to
perceive (and therefore control) programs is the PCT
equivalent of the LAD. The PCT version is a bit more
specific about what makes it possible for humans to do
(control) language and why even our cousin primates can’t
do it (not very well, anyway).
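[The notion of a program perception as a "network of contingencies between lower level perceptions" can be sketched very roughly in code. The toy contingency below, which symbol is expected next depends on a test on what has been seen so far, is entirely hypothetical; the point is only that the perceiver reports a single value for whether the observed sequence follows the program.]

```python
def program_perceiver(sequence):
    """Report 1.0 if the observed sequence follows a toy contingency
    network, else 0.0. The (made-up) program: after each 'a', expect
    'b' if the count of a's so far is even, otherwise expect 'c'."""
    a_count = 0
    expect = None                       # symbol required next, if any
    for sym in sequence:
        if expect is not None and sym != expect:
            return 0.0                  # contingency violated
        expect = None
        if sym == 'a':                  # the if-then test in the network
            a_count += 1
            expect = 'b' if a_count % 2 == 0 else 'c'
    return 1.0
```

[A higher-level system could then control this perception, e.g. act on the sequence until the perceiver reports 1.0.]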

[From Rick Marken (2017.10.27.1255)]

···

Martin Taylor (2017.10.27.13.41)–

[From Rick Marken (2017.10.27.1030)]

MT: I'll leave Bruce to comment on whether the Language Acquisition

Device (Is that the correct expansion of LAD?) is identical to the
ability to perceive programs.

RM: I think the LAD is just a term Chomsky used to refer to an innate ability to learn language. That’s what I understand it to mean, anyway. I don’t think Chomsky described any particular mechanism by which this LAD might work. I think the idea that perceptual functions are required to perceive and thus control the grammatical aspects of language is just a step toward specifying what it is about the ability to learn language that might be innate.


MT: My contribution here is based on

research I had to review when I was on the Psychology Grants
Committee of the Canadian National Science and Engineering Research
Council a few decades ago. Vincent DiLollo submitted a proposal that
intrigued me then and still does. Like Plooij, DiLollo found in
individuals a sequence of plateaus of development that tended to
occur at around the same age in almost every “normal” person. He
identified these plateaus (as far as I remember there were 14 levels
for people who reached very high academic or intellectual levels –
not the same thing). DiLollo attributed the levels to the
acquisition of new abilities that each required one new function
that had not been present earlier in life. I see his levels as
analogous to (and perhaps being) the development of new kinds of
controllable perception.

RM: Sounds interesting, though I would bet that the levels were described as output rather than input capabilities. But, still, interesting.

MT: The problem for this analogy and the requirement for language

processing to be dependent on program level perceptual control is
that DiLollo found that the ability to process “If-then” structures
arrives much later than the ability to communicate by speech.
Obviously there is very little we would call “grammar” in a child’s
first two-word utterances, but something very like grammar does
emerge soon thereafter. The difference between “Give me” and “Me
give” is not learned late in life, but is easily transformed into
“Baby give me” as distinct from “Me give Baby”.

RM: That’s why we need research (like DiLollo’s). I said language depends on the ability to perceive “programs” because program perception is one of the types of perception Powers hypothesized to be controlled and I’ve demonstrated that people can, indeed, control programs. But I see great research possibilities for determining the exact nature of the perceptions being controlled when people control for producing grammatical utterances, for example.


MT: Is program-level perceptual control ever required for normal

conversational speech?

RM: A good question for empirical research.

MT: I don't remember enough of DiLollo's work to be able to tell you his

methods, and it wouldn’t matter to Rick anyway, since all work not
based in PCT must be discarded,

RM: Not true. My work on object interception, for example, is based on work that was not based on PCT. But it was very relevant since the researchers were essentially looking for controlled variables without explicitly thinking of it that way. It sounds to me like DiLollo’s work might be the same kind of thing. When I get a chance I’ll see if I can find some reference to his work. It’s only a “large segment” of the conventional research literature that has to go into the waste basket.


MT: but from what I remember I think his

findings were pretty clear. Together with the way I think PCT
actually predicts the development of language ability, I would agree
with Rick (and DiLollo) that evolution having provided our brains with
enough functional capacity to make explicit and communicate
program-level perceptions may be the distinguishing feature that
separates us from (most?) other living species. I hope some day
someone learns enough porpoise to know if they do the same.

RM: Yes, and that’s the only point I was trying to make: The ability to use language exists because humans have evolved the perceptual capabilities that allow them to control for the perception of grammatical utterances. It’s a long way from there to finding out what those perceptual capabilities are – a road that is paved with research (aimed at testing for the variables people control when they use language)!

Best

          RM: PCT is a bit more specific about what it is that is

innate in humans that makes it possible for them to learn
language. PCT says it’s the ability to perceive programs
– particular networks of contingencies between lower
level perceptions. That’s why I say that the ability to
perceive (and therefore control) programs is the PCT
equivalent of the LAD. The PCT version is a bit more
specific about what makes it possible for humans to do
(control) language and why even our cousin primates can’t
do it (not very well, anyway).

Rick

Richard S. Marken


[From Bruce Nevin (2017.10.28.16:15 ET)]

Rick Marken (2017.10.27.1030) re my (2017.10.24.16:33 ET)

That’s basically what I said: In order to be able to control complex dependencies (as in grammars) among different types of perceptions you have to be able to perceive these complex dependencies; that is, in order to control programs, like grammars, you have to be able to perceive programs.

Yes. I was disagreeing with Chomsky, not with you.

However, now that you have elaborated a bit, I do have a point of disagreement with what you have said. Natural language grammar does not require control of program perceptions. (In this, I agree with Martin.) Formal grammars are programs. The canonical and seminal examples of formal grammars are the various grammars of logic (Carnap, Tarski, and the rest). A formal grammar has a finite alphabet of abstract symbols; ‘rules of formation’ for combining these symbols into strings; ‘rules of transformation’ for inferring other strings, which can involve combinations and deformations of strings; and ‘rules of interpretation’ for mapping such strings onto intelligible sentences of a natural language. Chomsky’s theories of grammar have the same form, except that the ‘semantic interpretation’ is a mapping onto logical or quasi-logical expressions whose natural language interpretation is tacitly assumed according to established conventions.

Martin Taylor 2017.10.27.13.41

Martin, we are in substantial agreement.

To use recursion in language may be the tipping point: “If you want to do that then if you get this, then if you …” allows a teacher to pass an algorithm for making something to a student, who may then improve the algorithm and pass it further along.

Recursion is a property of formal grammars. In a formal grammar, “I realized that Rick thought I was attacking him” involves recursion. To illustrate the point, here is a simple (and old fashioned) pair of rewrite rules of the sort invented by Emil Post and taken up by Chomsky:

S → NP VP

VP → Vs that S

(S = “sentence”, NP = “noun phrase”, VP = “verb phrase”, V = “verb” and Vs is a subset of V.) The first rule in the derivation of a sentence structure must have “initial symbol” S as its input (to the left of the arrow). Recursion is where a rule can apply recursively to part of its own output–a symbol on the left of the arrow can also occur on the right of the arrow. The recursion here is where S also occurs in the output of a rule.
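[The recursion in these two rewrite rules can be sketched in a few lines of Python. The depth bound and the extra non-recursive rule VP → V NP are assumptions added so that the derivation terminates; the output mirrors the structure of “I realized that Rick thought I was attacking him”.]

```python
# Post-style rewrite rules. 'S' appears on the right-hand side of a VP
# rule as well as being the initial symbol, which is what makes this
# grammar recursive.
RULES = {
    "S":  [["NP", "VP"]],
    "VP": [["Vs", "that", "S"], ["V", "NP"]],  # second rule added to terminate
}

def derive(symbols, depth):
    """Expand nonterminals left to right; 'depth' bounds how many times
    the recursive VP expansion (Vs that S) may be taken."""
    out = []
    for sym in symbols:
        if sym in RULES:
            # take the recursive VP expansion while depth remains, else the flat one
            choice = RULES[sym][0] if (sym != "VP" or depth > 0) else RULES[sym][-1]
            out.extend(derive(choice, depth - 1 if sym == "VP" else depth))
        else:
            out.append(sym)
    return out

sentence = derive(["S"], 2)
# e.g. NP Vs that NP Vs that NP V NP,
# the shape of "I realized that Rick thought I was attacking him"
```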

A grammar of word entry and reductions does not involve recursion. Omitting considerable detail,

attack enters on the pair (I, him)

think enters on the pair (Rick, attack)

realize enters on the pair (I, think)

There are no symbols that occur both in the input and in the output of rules.
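[For contrast, the word-entry representation above can be sketched like this. The `enter` and `linearize` helper names are hypothetical; the point is that nesting lives in the data (an entry’s argument may itself be an entry), not in any rule whose output contains its own input symbol.]

```python
# Word-entry grammar sketch: each word enters on a pair of arguments.
def enter(word, args):
    return (word, args)

# "I realized that Rick thought I was attacking him", built bottom-up:
attack  = enter("attack",  ("I", "him"))
think   = enter("think",   ("Rick", attack))
realize = enter("realize", ("I", think))

def linearize(entry):
    """Flatten an entry into a word list (order of entry, not of speech)."""
    word, args = entry
    out = [word]
    for a in args:
        out.extend(linearize(a) if isinstance(a, tuple) else [a])
    return out
```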

Yes, Chomsky introduced the LAD acronym for Language Acquisition Device in the 1960s. More recently, it’s been a Language Organ.

Chomsky has had to reduce his supposed biologically innate LAD and Universal Grammar to the claim that recursion is unique to language and unique to humans. In his view, “a biologically innate ‘language organ’ did not evolve but appeared suddenly due to a single mutation in a single individual, perhaps caused by a “strange cosmic ray shower”…, conferring a uniquely human capacity for recursion. (Recursion is a property of symbolic rules when a symbol in the input to a rule also occurs in its output, so that it can re-apply to its own output. This is relevant for a theory of language that employs symbol-manipulating rules.) According to this tale, this lone super-hominid was thereby endowed with rule-governed symbolic information-processing that other animals did not enjoy. It could ‘think’ or ‘talk to itself’ mentally. Obviously there were no others to talk to, but according to Chomsky it wouldn’t have conversed even if there were, because although the superior capacity of its brain to manipulate abstract concepts conferred reproductive advantage and was inherited by its descendants, two or three thousand generations of such super-hominids used this private ‘mentalese’ only to engage in interior monologues, thinking to themselves, until, perhaps some 50,000 years later, for unspecified reasons that are of no interest to Chomsky, the social use of language for communication emerged.” [Quoted from the review that I cited, q.v.]

Like Plooij, DiLollo found in individuals a sequence of plateaus of development that tended to occur at around the same age in almost every “normal” person. He identified these plateaus (as far as I remember there were 14 levels for people who reached very high academic or intellectual levels – not the same thing). DiLollo attributed the levels to the acquisition of new abilities that each required one new function that had not been present earlier in life. I see his levels as analogous to (and perhaps being) the development of new kinds of controllable perception.

I think it is very important to replicate this work. We know that Plooij & Plooij’s results are not the end of the line. Cognitive development obviously continues beyond early childhood into adolescence and adulthood–as you say, Martin, with the development of new perceptual input functions and control loops, rather than additional levels of the hierarchy.

The ability to use language exists because humans have evolved the perceptual capabilities that allow them to control for the perception of grammatical utterances. It’s a long way from there to finding out what those perceptual capabilities are – a road that is paved with research (aimed at testing for the variables people control when they use language)!

There is a name for the basic methods for identifying what perceptions are controlled in a language: Descriptive Linguistics. I attempted to demonstrate some of these methods, with kind assistance of Christine Forssell, at our meeting in 2003 in LA (IIRC). Properly understood and conducted, these are experimental methods involving systematic disturbance to identify controlled variables, not merely methods of observation and classification as some have claimed. By extending the substitution tests that identify phonemic contrasts, morphemes, and words to larger domains and by testing for relative acceptability (or for restrictions of context for optimum acceptability) a grammar of word-entry and reduction has been reached. Whereas a formal grammar aims to account for ‘all and only the sentences of a language’, the grammar of word dependencies and reductions also accounts easily for the fragments and midstream recastings that are normal for ordinary, unedited human discourse.

···


[From Rick Marken (2017.10.29.0745)]

···

Bruce Nevin (2017.10.28.16:15 ET)

RM: That’s basically what I said: In order to be able to control complex dependencies (as in grammars) among different types of perceptions you have to be able to perceive these complex dependencies; that is, in order to control programs, like grammars, you have to be able to perceive programs.

BN: Yes. I was disagreeing with Chomsky, not with you.

BN: However, now that you have elaborated a bit, I do have a point of disagreement with what you have said. Natural language grammar does not require control of program perceptions. (In this, I agree with Martin.) Formal grammars are programs.

RM: So what type of perception do you think we control when we produce natural language grammar? I’m not married to the idea of programs being the type of perception controlled when we speak grammatically. As I said, I used “program” as a description of the type of perception controlled when producing grammatical utterances because that’s the perceptual type Bill proposed that seemed most relevant to control of grammar. But the only way to find out what types of perception are controlled when speaking a language is to test for the variables that are controlled.

RM: The ability to use language exists because humans have evolved the perceptual capabilities that allow them to control for the perception of grammatical utterances. It’s a long way from there to finding out what those perceptual capabilities are – a road that is paved with research (aimed at testing for the variables people control when they use language)!

BN: There is a name for the basic methods for identifying what perceptions are controlled in a language: Descriptive Linguistics. I attempted to demonstrate some of these methods, with kind assistance of Christine Forssell, at our meeting in 2003 in LA (IIRC). Properly understood and conducted, these are experimental methods involving systematic disturbance to identify controlled variables, not merely methods of observation and classification as some have claimed.

RM: This sounds great. I’m afraid I don’t remember the details of that talk very well. Could you recount one or two examples of the use of this technique? I think there are great possibilities for identifying higher level controlled variables using these techniques of linguistic analysis.

BN: By extending the substitution tests that identify phonemic contrasts, morphemes, and words to larger domains and by testing for relative acceptability (or for restrictions of context for optimum acceptability) a grammar of word-entry and reduction has been reached.

RM: Yes, this sounds very promising. I think it would be nice to hear about an example or two of using this technique to identify some higher level perception that is controlled in speaking.


BN: Whereas a formal grammar aims to account for ‘all and only the sentences of a language’, the grammar of word dependencies and reductions also accounts easily for the fragments and midstream recastings that are normal for ordinary, unedited human discourse.

RM: An example or two of how this is demonstrated would be nice.

Best

Rick



Richard S. Marken

