Shannon's project

[From Bill Powers (960104.0915 MST)]

Shannon Williams (960103) --

Now that I'm really awake I can work a bit more on your post. You're
approximating the theory of reorganization here, but slipping off into
reinforcement theory. Not to say you shouldn't, but you should be clear
about which theory you're using.

     See it this way:

****************
     1) At first the infant has some disturbance (hunger, cold, ?)

First you start with some "intrinsic" variable -- one that is sensed by
built-in sensors and needs no learning to be detected. It's best, I
think, to give the name of the variable involved rather than one of its
states. Hunger and cold are states of two variables. One variable we
might call nutritional state or stomach loading, so being hungry means
that this state is less than its reference state, and being overstuffed
means it is greater than its reference state. The other variable is
_temperature_: Being cold means sensing a temperature lower than a
reference temperature; being hot means sensing a temperature higher than
the reference temperature. The same variable is involved, but in two
different states. The name of a variable is the name of the scale on
which it is measured: nutritional state, temperature. The current state
of the variable is its measure right now on that scale: hungry, cold,
full, hot. Likewise, the reference level is a specific position on that
scale: three-quarters full, 98.6 degrees F.
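
If it helps to see the distinction concretely, here is a toy sketch in
Python (my illustration only; the scales and numbers are invented):

def error(reference, perception):
    # the discrepancy between the reference level and the current state
    return reference - perception

# "Nutritional state" as a fraction of stomach loading (invented scale):
print(error(reference=0.75, perception=0.20))   # positive error: hungry
print(error(reference=0.75, perception=0.95))   # negative error: overstuffed

# "Temperature" in degrees F:
print(error(reference=98.6, perception=95.0))   # positive error: cold
print(error(reference=98.6, perception=103.0))  # negative error: hot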

     2) If a perception fixes the disturbance then he tries to maintain
     that perception. This happens to decrease the likelihood of him
     encountering the disturbance again. In other words, a perception
     that proves successful becomes a reference.

Perceptions don't fix disturbances. Perceptions are just reports of the
magnitude of some variable along its scale of measurement. To say this
more precisely, you could say that when a perception is brought to some
specific reference level, doing so has a side-effect on other variables
such as temperature or fullness.

Reinforcement theory says that an action is performed because it has a
reinforcing effect: the reinforcing effect maintains the behavior.
Reorganization theory says that an action is performed because it was or
became part of a control process after the last reorganization.

The difference is like the difference between the impulse theory of
motion and Newton's laws. The impulse theory says that motion continues
only as long as there is something, like "impulse," maintaining it.
Newton said that motion naturally continues as it is unless some force
is applied to change it.

Reinforcement theory says that behavior will not continue in its present
form unless there is something to maintain it in that form.
Reorganization theory says that behavior will naturally continue in its
present form unless something happens to cause it to change to a
different form. Under reinforcement theory, what maintains the form of
the behavior is the reinforcement that it generates through
environmental contingencies. Under reorganization theory, what causes
behavior to change form is a departure of critical variables from
specific reference states.
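
To caricature the reorganization side in code (a toy sketch only;
random "E. coli"-style change driven by intrinsic error is one way of
modeling reorganization, and the names and step sizes here are all
invented):

import random

def reorganize(parameters, intrinsic_error, step=0.1):
    # The organization persists unchanged unless a critical variable
    # departs from its reference state; error drives random change in
    # the organization itself, and consequences decide what persists.
    if abs(intrinsic_error) < 0.01:
        return parameters              # no intrinsic error: no change
    return [p + random.uniform(-step, step) * abs(intrinsic_error)
            for p in parameters]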

     3) The reason that #2 works is because perception generates
     (causes) output. In other words, there is a direct relationship
     between what you perceive and how you behave.

This is definitely not the PCT premise, although it may be yours. In a
control system, specific perceptions do not produce specific behaviors.
If the perception is the position of your car relative to its lane, this
perception does not -- and MUST NOT -- produce any specific action on
the steering wheel. Instead, the driver must _vary_ the action on the
steering wheel in order to maintain the _same_ perception. This is
usually necessary because external disturbances are always pushing the
car this way and that. Because of these disturbances, the perception of
the car being in the center of its lane may go with any degree or
direction of action on the steering wheel.

What keeps the car in its lane is the difference between the position
that is perceived and a reference-position that the driver knows is the
right one. The car may be perceived as two feet to the right of the
center of the lane, but you can't predict what the resulting behavior
will be unless you know how far right of center the driver intends the
car to be. Perhaps he means it to be five feet to the right of the
center of the lane -- I'm sure you can think of examples where this
would be true.
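
A toy calculation makes the point (Python; the numbers and the sign
convention -- positive output meaning "steer rightward" -- are
invented):

def steering_output(reference, perception, gain=0.5):
    # the action is driven by the error, not by the perception alone
    return gain * (reference - perception)

perceived = 2.0   # car seen 2 ft right of lane center

print(steering_output(0.0, perceived))   # -1.0: steer left, toward center
print(steering_output(5.0, perceived))   # +1.5: steer further right
print(steering_output(2.0, perceived))   #  0.0: no action at all

Same perception, three different behaviors, depending entirely on the
reference.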

If you're going to construct a control-system model, you need to be
precise about your terms, because there are precise concepts behind
them.
-----------------------------------------------------------------------
Best,

Bill P.

[From Bill Powers (960104.2010 MST)]

Shannon Williams (960104) --

     Let us say that signals are sent to our muscle system:

     1) Aren't these signals generated by the output of some control
     loops?
     2) Don't these signals consist of input into the muscle system?
     3) Aren't these signals to the muscle system, the muscle system's
        perception of its world?
     4) Doesn't this perception directly generate output?

So far this is not a control system, but an input-output system you're
describing. The output (muscle tension) is a function of the input
(driving signals). I wouldn't use the term "perception" in this context,
although you might make a case for doing so if you're proposing that a
muscle by itself is a control system controlling its own inputs (which,
in a way, it is, except that its reference signal is always zero).

     5) Doesn't every single control loop in the brain have to
     contribute to this perception in order to have an influence on the
     muscle system?
     6) Doesn't every newly generated loop need to contribute to this
        perception?
     7) What part of the new loop would contribute to this 'overall'
        perception -- p, r, or e?

Yes to 5; no to 6 (there can be control loops that don't send signals to
a particular muscle because that muscle doesn't affect the controlled
perception). As to (7), the error signal in a control loop would be the
signal contributing to the input to the muscle.
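
In sketch form (Python; the loop names, magnitudes, and weights are all
invented): the driving signal reaching one muscle is a weighted sum of
error signals from the loops whose controlled perceptions that muscle
affects, and a loop the muscle cannot affect contributes nothing.

# error signals from three control loops
errors  = {"elbow-angle": 0.8, "grip-force": -0.2, "gaze-direction": 0.5}

# output weights from each loop to THIS muscle; zero weight means the
# muscle does not affect that loop's controlled perception
weights = {"elbow-angle": 1.0, "grip-force": 0.3, "gaze-direction": 0.0}

muscle_input = sum(weights[name] * errors[name] for name in errors)
print(muscle_input)   # 0.8 - 0.06 + 0.0 = 0.74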

     You will be unable to visualize an evolving reference until you can
     change your concept of perception. (How much do you want to bet?)

Before I decide whether I want to change my concept of perception, you'd
better explain what I'm supposed to change it _to_ (and _from_). Then
maybe I will be able to understand what you mean by "reference." It
doesn't seem to have much to do with what I mean by it. I don't
guarantee that I'll want to change my definition of perception; after
all, if those other guys can pre-empt the meaning of control, why can't
I pre-empt the meaning of perception? Maybe I'll tell you to go find
your own word. If people keep taking away my best words, I will be left
with no way to say anything.

     You are not talking about reorganization here. You are not going
     to visualize the problems that I am encountering with your model,
     until you address re-organization.

I don't know much about reorganization. Maybe you're about to make some
progress on it. Go for it. What problems?

-----------------------------------------------------------------------
Best,

Bill P.

[From Bill Powers (960115.0630 MST)]

Shannon Williams (960114.01:00)--

Me:

If the brain has the ability to predict how its perceptions will
change if it does and does not emit any behavior, the first project
you have to tackle is explaining how it does this, with a model.

     I have ten diagrams now that explain this simply. You can use a
     trainable neural network to explain how memory, imagination,
     language, math, ANYTHING can work. I even see how the networks
     could have successively evolved as species evolved. It's easy.

I agree that drawing diagrams is easy, and that having big general ideas
is easy. What isn't easy is showing that they're based on anything real.

     I have already posted two of the diagrams. You cannot comprehend
     the remaining diagrams if you cannot comprehend those two.

I don't know what you mean by comprehending a diagram. I have _seen_
your first two diagrams, but I can't say I comprehend them. You're going
to have to slow down and go into the details, like explaining what you
mean by the labels on the diagrams. You're using words that look like
the words in PCT, and you're drawing boxes and arrows, but you seem to
have your own meanings for all these things. If you want anyone else to
understand what you're talking about, you're going to have to be patient
and explain everything.

Consider our favorite example, a driver steering a car along a road.
How does the driver's brain predict what the path of the car will be
if the driver doesn't emit any behavior?

     You can only predict what is predictable. But if my car is veering
     to the left, it is easy for me to recognize/predict that it will
     continue veering to the left unless I or some obstacle does
     something about it.

If it's easy to recognize/predict that, you should be able to explain
how the brain does it. Here is an image of the scene in the windshield,
projected optically onto the driver's retina. How do we get from this to
a perception that your car is veering to the left? What form, what
physical form, would this prediction take? How is the prediction
converted into an action on the steering wheel that would actually keep
the car on course? These are the sorts of things that PCT tries to
explain (although the PCT model doesn't rely on prediction to accomplish
control, so we don't have to explain prediction).

So far we've discussed only the surface of your ideas. What's behind
them? Are you saying anything more than "Trainable neural networks can
do anything we can imagine, so the problem is solved?"

     Do you understand how recognition/prediction would work for a
     driver? Do you understand how the neural network would be used to
     recognize/predict?

No. Please explain. In the case I spoke of, there is no sensory
information available to tell the driver when the next bump in the road
will come, or in what direction and at what speed the next gust of
crosswind will occur. Yet the driver keeps the car on the road. How is
this done, when predicting even one second ahead is impossible?
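
In model form the answer is that no forecast is needed: acting on
present error is enough. A toy simulation (Python, invented numbers):

import random

reference, perception, gain = 0.0, 0.0, 0.8   # lane center, start, gain

for t in range(1000):
    gust = random.gauss(0.0, 0.2)        # disturbance: unpredictable gusts
    action = gain * (reference - perception)
    perception += action + gust          # no prediction anywhere in the loop

print(round(perception, 2))   # stays near 0 through 1000 unforeseen gusts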

-----------------------------------------------------------------------
Best,

Bill P.

[From Bill Powers (951230.0500 MST)]

I'm trying to get this morning thought down before it fades away. This
is really a comment on Shannon Williams' language project. This
introductory paragraph was written after the third paragraph below was
finished, which illustrates one of the points I'm trying to make.

Control behavior can look like open-loop generation of patterns. What
the observer doesn't see is that the output is continually being
monitored and compared with an intended goal-pattern, and continually
being corrected. The observer doesn't see that the outcome never quite
matches the intention; that little errors are always being made and
compensated for by changing the output from what it would have been
without the error.

This is more evident in spoken language than written language. What you
see of my written output looks like a machine-generated result -- you
never see that this sentence originally started, "What you see of my
output," with "written" being inserted during a pause after the word
"looks" had already been typed. Written language, especially with word
processing, allows perceived errors in meaning or grammar to be
corrected after they have been committed, so the reader never sees the
errors. The result is to create an illusion of machine-like open-loop
error-free production.

This creates the erroneous impression that language is generated from
the top down in an open-loop algorithmic way. All the theories of
linguistics that I have seen try to picture language in this way, as a
production process rather than a control process. Computer programs are
ideally suited to implementing such conceptions, because they don't make
mistakes. If you have correctly embodied the production rule in the
program, the program will produce sentences that are structured
according to the rule every time, with no need to delete, rearrange,
insert, substitute, or append words.
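
For instance (a toy Python sketch with an invented grammar), a
production-rule program can only ever emit what its rules allow:

import random

rules = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "boy"], ["the", "stove"]],
    "VP": [["cried"], ["got", "hot"]],
}

def produce(symbol):
    if symbol not in rules:                    # terminal word: emit as-is
        return symbol
    expansion = random.choice(rules[symbol])   # pick one alternative
    return " ".join(produce(s) for s in expansion)

print(produce("S"))   # e.g. "the boy cried" -- always conforms to the rules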

The computer cannot make a spelling error, or a syntax error -- at least
in terms of the rules that the programmer wrote into the program. There
may be errors in the sense that the output doesn't match what a real
person would say, but those are errors in the conception of the rule,
not its execution, and errors in the sense of "mistake" or "blunder,"
not in the sense of a PCT error signal. The computer always correctly
follows the rules that you give it. What linguists do is try to think up
rules that will produce natural looking language, use the rules to
generate output sentences, compare the result with a real person's
output, and try to correct differences by modifying the assumed rules.

That, of course, is a closed-loop control process; the linguist does not
simply generate a series of rules differing from the previous ones in a
pre-selected sequence, which would be a rather pointless exercise (why
not just generate the best rule first?). The rules are modified on the
basis of comparing the actual outcome with a desired outcome, the
desired outcome being what the programmer knows, as a normal human
being, that a normal human being would say. Of course the linguist also
happens to know how to produce correct sentences in other languages, but
for the most part the basic problems being tackled are in the linguist's
native language, and the reference condition is what the linguist knows
how to say without knowing why he or she knows how to say it. The
ambiguity in "The shooting of the hunters awakened me" is recognized not
because the linguist sees an ambiguity in the production rules that
supposedly govern language, but simply because the linguist, just like
the non-linguist, _recognizes_ the two meanings. So the linguist is
using the linguist's own natural language capabilities, which are not
understood, as the reference against which to compare the outcomes of
using various sorts of artificial language-producing rules.
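
Schematically (a Python toy of the linguist's loop as I have described
it, not of any real linguistics program; the rules and sentence are
invented):

def mismatch(generated, native):
    # crude word-by-word measure of the difference between outputs
    return sum(g != n for g, n in zip(generated, native)) \
           + abs(len(generated) - len(native))

native_speaker_says = ["the", "boy", "cried"]    # the reference condition

candidate_rules = {                              # rules as word orders
    "subject-verb": ["the", "boy", "cried"],
    "verb-subject": ["cried", "the", "boy"],
}

# keep whichever rule brings the generated output closest to the reference
best = min(candidate_rules,
           key=lambda r: mismatch(candidate_rules[r], native_speaker_says))
print(best)   # -> subject-verb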

A literal transcription of spoken language does not often look like the
written examples of language that we find in books. The uh talker,
speaker, one generating the language or language output is continuous,
continuously editing on the fly as if herding, or correcting, the
sentence as it developed, develops so it -- so the result somehow gets
across what the speaker is trying to say. When you see written down what
some speaker actually said, the result often looks like incoherent
babble. The ability to edit on the fly varies tremendously from one
person to another; some people are barely capable of producing their
native language at all, while others seem to produce edited copy without
any more than an infrequent blunder. Yet communication still seems to
take place, or the illusion of it.

I think that spoken rather than written language leads to a much more
realistic way of considering how language is created by a speaker or
writer. It shows how perceptions are continually involved in production,
because the errors are more evident, as is the process of error
correction. One doesn't simply convert a rule of language directly into
output; instead, one starts producing a sentence, perceives what has
been produced so far in terms of fit to some rule or principle, and
selects the next word, in part, to correct incipient departures from the
rule or principle.
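
In toy form (Python; the "rule" and the vocabulary are invented, and
real production surely involves many such rules at once), each next
word is chosen to keep the perceived fit to the rule from developing an
error:

def rule_error(words):
    # invented rule: no verb may appear before some subject has appeared
    subjects, verbs = {"boy", "stove"}, {"cried", "hurt"}
    seen_subject = False
    for w in words:
        if w in subjects:
            seen_subject = True
        elif w in verbs and not seen_subject:
            return 1                   # incipient departure from the rule
    return 0

sentence, pool = [], ["cried", "boy"]
while pool:
    # select the next word that keeps the rule-error at zero
    word = min(pool, key=lambda w: rule_error(sentence + [w]))
    pool.remove(word)
    sentence.append(word)

print(sentence)   # -> ['boy', 'cried'], never ['cried', 'boy']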

I think that developing sentences are being perceived from many points
of view simultaneously, with many control processes working at the same
time (and sometimes conflicting). Furthermore, I think that non-verbal
perceptions play a primary role in causing sentences to be formed as
they eventually come out. As we speak or write, we are listening and
reading. And as we listen and read, we are experiencing the meanings of
words (calling this "semantics" doesn't add much information). These
meanings are other perceptions evoked from memory by some associative
process, so that as the sentence unfolds, there unfolds also a mental
image, a story, an event, a structure of experience that is more than
just words.

When we are listening to or reading some other person's output or one of
our own previous outputs, we concentrate on the evoked experiences at
many levels, converting the buzz-hiss-pop-click-honks or successions of
chicken tracks into imaginary happenings, more or less as if they were
actually happening. But when we are speaking or writing, we begin with
some happening already in mind, and as we speak or write we are
comparing the experiences evoked by our own words with the reference
experiences we are trying to represent in language. We are not just
trying to produce word-symbols; we are trying to control perceptions
that arise because of listening to or reading the word-symbols. We are
trying to control _our own_ perceptions so they match the meanings we
are trying to convey, assuming that if we can make sense of the words, a
listener will make the same sense of them (often a vain hope).

We also monitor our verbal outputs so we can perceive in them whatever
linguistic structures we have learned they should have. This puts an
added constraint on the words we emit. A Chomskyian, for example, would
hear a slowly unfolding sentence as a series of fine structures which
lead to larger structures and finally to an overriding "deep structure".
In the process, the intermediate and deep structures might change
radically as the developing sentence rules out some analyses and
introduces others, so the final deep structure might not resemble the
one that seemed to be there when the sentence was half finished. A
linguist from another school would perceive the developing sentence in a
different way, seeing different principles or rules taking form and
shifting as the sentence makes its way from start to finish. Someone as
old as I (altered from "as old as me") would apply grammatical analysis
of a much older and I'm sure obsolete form, in terms of adverbs and
adjectives and prepositional phrases and so forth, again with the
particular parsing shifting and not arriving at any final structure
until the sentence is finished, hopefully.

What we are all looking for during this process of perceiving the
linguistic structure of our own sentences are mistakes. We are
monitoring our own words, analyzing our own sentences, to keep them in
conformity with whatever linguistic rules and principles we know
explicitly or have intuited implicitly. We may think "Who did you give
that to?" but by the time we have uttered or written it, it has turned
into "Whom did you give that to?" or even "To whom did you give that?"
Only if we don't know the applicable rules, or don't for the moment care
about them, or have some reason to produce a grammatically incorrect
sentence, do we actually say "Who did you give that to?" If we have
misunderstood the rule that everyone else uses, we will carefully amend
the sentence we thought, "I do hope you will come with him and me," into
"I do hope you will come with he and I." There isn't any innate rule
against which we try to match our productions; there are only the rules
as we understand them.

The difference between perceiving meanings and linguistic structures can
easily be seen. Consider this "sentence":

boy finger hurt stove hot cry

These words evoke a set of six imagined perceptions, which we can easily
link together into an imagined event. We can describe this event by
using a grammatical sentence:

The boy hurt his finger on the hot stove and cried.

In constructing this sentence, we reveal that there are experiences we
consider realistic and others we do not believe could happen. For
example, we would never say

The stove hurt its finger on the hot boy and cried.

There is no linguistic reason not to say this sentence; it is a
perfectly good sentence. But the _experiences_ to which it refers would
not be expected to occur. These ruled-out experiences are ruled out by
the sentence structure we actually chose; it was the boy's finger, not
the stove's; the boy, not the stove, was hurt and cried. The stove, not
the boy, was hot. So the sentence structure refers not only to the raw
experiences evoked by the individual words in the original set of six,
but to imagined higher-order perceptions having to do with what is
connected to what, what action was performed by what entity, what
attribute goes with what object. The detailed structure of the language
contains signals telling us how to link up the basic perceptions -- in
short, what higher-level perceptions to use.

Language must have evolved from simple lists of experience-evoking
noises or marks into conventional structures that remove ambiguities.
Suppose the original set of terms had been

Boy stove hurt I finger hot cry laugh

If a speaker said this to another person, the other person might reply

laugh I

cry boy

I hurt

The speaker would be frustrated, because the meaning is not getting
across:

No no no. Laugh boy. Cry _I_. Finger I. _I_ hurt. Bad boy.

It isn't hard to see why language conventions would arise, and how. If
we see communication not simply as a production process but as a process
of getting something back from another person, we can see how
communication errors are detected. What we want is for the other person
to give back some indication that the same meaning has been extracted as
was intended. When the other person said "laugh I" and so forth he
presented pairs of elements together that the speaker did not intend to
be grouped. In attempting to correct the error, the speaker presents his
own groupings: boy with laugh, I with cry, I with finger, I with hurt,
with pauses between (indicated by periods above) to demonstrate the
grouping. Presumably, the listener then indicated understanding, perhaps
by taking the other's hand and looking sympathetically at the finger or
saying "boy" and shaking his head disapprovingly.

Probably the best indication that this PCT view of language is not far
off the track is paraphrasing. There are different but closely
equivalent ways of describing the same experience. I can say "The hot
stove hurt my finger" or "my finger was hurt by the hot stove" or "I
burned my finger on the stove." If I say, "The hot stove hurt my
finger," and you reply "The hot stove hurt my finger" (repeating my
words exactly), I can be pretty sure you have not understood me. For one
thing, you should have changed "my" to "your." It would have been much
clearer that you understood me if you came back with a paraphrase or a
relevant add-on: "You burned your finger on the stove" or even "Does it
still hurt?" This tells me that you aren't simply playing back a
recording of my words without understanding the experiences to which
they point.

It's very hard to handle the generation of a paraphrase by using a top-
down production model. What kind of rule could possibly say "generate a
different sentence with a different structure and using different words
so that the meaning is the same as that of the original sentence?" Such
a statement of a rule is simply one long begged question. The ONLY way I
can see to explain paraphrasing is with a control model.

OK, the sump pump has cut off so I guess I've got rid of whatever
accumulated overnight. Hope all that was relevant to something, Shannon.

----------------------------------------------------------------------
Best,

Bill P.

(bob hintz - 961230 -23:21)

I have never actually sent a message to the net, but this subject is
coming too close to my own ongoing concerns to let it go by. I am not
very skilled with this medium but have been "lurking" for over a year.
I met many of you in Durango this past summer. I read B:CP many years
ago. I have read most of the stuff I picked up this summer and I have
read what Kent McClelland gave me a few weeks ago. I have even gotten
through a first reading of "Without Miracles".

I think the beginning question for understanding language acquisition
is how the infant learns that its own behavior makes a difference in
the control of its own error signals. It has
always bothered me that people treat language use as an individual
activity, when it is clearly social. The infant cannot control its
error signals and simply dies if there is no caretaker to provide the
necessary experiences (food, shelter, etc.) to allow the control of
what might be considered initial intrinsic error. The services are
initially provided as scheduled by the caretaker who makes every effort
to interpret the behavior of the infant for clues about its internal
state. The reference signal for the caretaker is some definition of a
contented or cared for baby. Some baby behaviors are perceived as
different from this reference, thus, generating an error signal which
results in output to alter the perception of the baby. The caretaker
must already have some programs which suggest good things to do for
babies. These are utilized on a best guess basis initially, as the
caretaker has no direct access to the baby's internal perceptions. The
baby doesn't have any organized output that can communicate this
information. Both the baby and the caretaker must create a method to
facilitate social interactions that each has a reason to engage in.

The only means a baby has to control its errors is the sound it makes
to influence its caretaker. This signal system will be unique to the
pair, since it is created primarily by the people in the interaction.
Baby sitters do not understand the meaning of the sounds the same way a
parent does.
This is frequently frustrating to both. The unit that learns language
is the dyad not the individual. It is initially created to communicate
about error for the purpose of control. I need information to provide
care, you need to provide it to facilitate receiving the care I
provide. Neither of us can succeed without the other.

Much of the complication of language arises when we join together to
deal with the world that exists outside both of us.

(I hope I have actually sent a message) - Baby Bob

[From Bill Powers (951231.0930 MST)]

Shannon Williams (951230) --

The PCT concept of control says that what we really control are
perceptions. The sounds we make in speech are physical events outside
us. A tape recorder reproduces sounds by reproducing the physical
events, but a human being reproduces the perceptions of the sounds, not
the sounds. This is good, because if you told me, in a woman's voice,
what your name is, I would not be able to repeat it aloud if I had to
reproduce the physical sound-waves. When I repeat your name, I repeat it
using different fundamental frequencies, different timings, and
different harmonic content -- in my voice, not yours. What I am
repeating, therefore, is the perceptual signal in me that arises when
either you or I say your name, but not the physical sound-waves.

The only way I can know I am repeating your name is that I get the same
perceptual signal that I got before. Therefore the reference signal
which I use for repeating your name has to be a copy of the perceptual
signal that arose inside me when I heard your name before. This
perceptual signal (and therefore the reference signal) doesn't have to
have any resemblance to the sound-waves.

Here's a very crude imaginary way in which a control process could be
used to reproduce a phoneme. One method of phoneme recognition uses
formant frequencies. By analyzing a sound spectrum into a few
frequencies, it is possible to characterize a phoneme as a few numbers:
these numbers would be like neural perceptual signals with specific
magnitudes, coming out of a perceptual function that does a spectral
analysis. When you have signals A, B, C, and D present, each with a
specific magnitude, a specific phoneme is present at the input. But the
signals themselves are not phonemes. They are simply indicators of four
magnitudes. If you can set four reference magnitudes, compare them with
the actual magnitudes, and convert the four error signals to the
amplitudes of four sine-wave output frequencies, you can control the
input frequencies to make the four perceptual signals match the four
reference signals. Then you will be hearing the phoneme that the four
reference signals specify.

Now to reproduce a phoneme that you are hearing, you record the
magnitudes of the four perceptual signals. Then you use these recorded
values as reference signals, and the control action will adjust the four
sine-wave output amplitudes so that the magnitudes of the four
perceptual signals match the recorded values. You will then be hearing
the same phoneme you heard before (I'm skipping over the design of the
output function).
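
The whole scheme fits in a few lines (Python; the perceptual function
here is a bare stand-in, and all the magnitudes are invented):

def perceive(amplitudes):
    # stand-in for the spectral-analysis input function; in this toy the
    # four perceptual magnitudes simply track the four output amplitudes
    return list(amplitudes)

recorded = [0.9, 0.4, 0.7, 0.1]   # A, B, C, D when the phoneme was heard
amps = [0.0, 0.0, 0.0, 0.0]       # current sine-wave output amplitudes
gain = 0.5

for _ in range(30):               # the loop drives all four errors to zero
    p = perceive(amps)
    errs = [r - x for r, x in zip(recorded, p)]
    amps = [a + gain * e for a, e in zip(amps, errs)]

print([round(a, 2) for a in amps])   # -> [0.9, 0.4, 0.7, 0.1]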

The system that selects a phoneme to reproduce does not have to generate
the actual reference signals. All it has to do is pick out of memory the
set of reference signals that is wanted, the set that occurred together.
It must choose _this_ set, not _that_ set. So the organization of the
system that picks the phoneme does not have to have anything to do with
phonemes. All it has to do, for example, is point to the set that is
associated in memory with some other experience, which is the occasion
for uttering the phoneme.
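
So the selector can be as simple as a lookup (a toy sketch, with
invented keys): it points at a stored set, and the control loop does
the rest.

reference_memory = {
    "occasion-1": [0.9, 0.4, 0.7, 0.1],   # magnitudes that occurred
    "occasion-2": [0.2, 0.8, 0.3, 0.6],   # together, stored as sets
}

def select_reference_set(occasion):
    # nothing phoneme-like here: just pointing at THIS set, not THAT set
    return reference_memory[occasion]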

(A long time ago, I learned a football cheer that started, "Sis! Boom!
Bah! Sis! Boom! Bah!" I learned it that way and cheered it that way, and
it wasn't until much later that I realized that Sis-Boom-Bah was the
fuse going SSSSS and the explosion going BOOM, followed by me going AH!
(not BAH). The AH is the phoneme that completes the sequence. The first
two elements of the sequence are generated independently, but I control
the sequence -- make it right -- by adding my own sound at the end.
Maybe the actual production of the AH happens the way I imagined it
above.)

Now consider your diagram:

input (supervisor says a/some letters) --------------------------
                                                                 |
                                                                \|/
                                                             ----------
                                                             |Black   |
                             -------> [Voice Synthesizer]--->|Box     |
                             |                               |which   |
              ------------   |        -----------            |converts|
              | Learned  |___|_______\|Generated|            |sounds  |
     -------->| pattern  |           /|reference|            |to      |
     |        | generator|----        -----------            |letters |
     |        ------------   |            |                 ----------
     |            /|\        |           \|/                    |
     |             |         ------------> C <-------------------
     |             |                       |                     |
     |             ----[pattern adjustment algorithm]----        |
     |                                                           |
     -------------------------------------------------------------

You can see that the black box converting sounds to letters isn't
necessary. All that is required is that the black box convert the input
sounds to some set of perceptual signals. The "pattern generator" should
be a higher-level input function (analyzer) that converts the basic
perceptual signals into another set of perceptual signals that represent
invariants, just as the four frequencies represented invariants of a
phoneme. The "generated reference" is then just a set of recordings of
the signals from the input function that occurred together. When the
supervisor speaks a sound, the input function produces its perceptual
signals which are stored. Then the same reference signals can be
selected (by a higher-level system not shown), and the output function
(when designed) will convert the error into the sounds that will
reproduce the same perceptual signals, via the "voice synthesizer."

This is very close to your original concept, just tweaked a little to
make it more familiar to me. What I've tried to bring out is that the
basic task is simply to reproduce what the supervisor is heard to say --
the fact that what is being said is the name of a letter is irrelevant.
It could be the name of anything, or just a sound pattern. Going through
the process of converting a sound to a letter, and then a letter back to
a sound, isn't necessary. The "letter" can be any perceptual signal.
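
In sketch form the tweaked loop looks like this (Python; the input
function is an invented stand-in for whatever analysis the black box
really performs):

def input_function(sound):
    # sound in, perceptual signals out: two invented invariants,
    # a mean level and a spread
    return [sum(sound) / len(sound), max(sound) - min(sound)]

stored = input_function([0.2, 0.9, 0.5])   # supervisor speaks; store signals

# later, the stored signals serve as the reference for reproduction; any
# output whose perceptual signals match them counts as "the same"
candidate = [0.5, 0.9, 0.2]
errors = [r - p for r, p in zip(stored, input_function(candidate))]
print(all(abs(e) < 1e-9 for e in errors))   # True: same percept, though
                                            # not the same sound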
-----------------------------------------------------------------------
Best,

Bill P.