# Different ways to use the Neural net

[from Shannon Williams (960111.14:14 CST)]

Bill Powers (960110.1730 MST)--

Suppose we shrink your diagram and show it as an adaptive input function
to a control system:

R1
>
input---> neural net --> p ------>C-------e--->--
> > /|\ | |
> > > > action
> > C <-------- |
> > /|\ |
> > > \|/
> > R2 |
> > >
> <------- feedback through environment <-----
>
---- <--- disturbance

This is not what I have in mind.

I read a book some time ago that showed how some signals generated by
some neurons seem to serve no purpose except to delay the signal. The author
then went on to explain how it was done. Afterwards, he issued a challenge for
anyone to figure out what purpose it served to delay some signals.

I cannot remember the name of the book or even what it was about. Last night
I searched all of my books, looking for the diagram that explained how the
delay happened. I can't find it. So maybe the delay concept is false, in
which case, the diagrams below have no meaning.

But anyway, this is the way that I envision using the neural network
described by Andy Clark:

-----> [neural net] -----> [output of neural net]
> /|\ |
delay | |
> > >
>----------> C <-------
>
[input from world]

This means that the neural net could be used to predict. Say you are
teaching the neural net to say the alphabet (what I actually have in mind is
describing smooth changes in our perceptions, like visual changes or
tactile changes, but it is easier to describe the alphabet). Then this is
how this schematic will learn to respond:

[input from world]   delayed [input]   [output from net]
------------------   ---------------   -----------------
A                    random            random
B                    A                 B
C                    B                 C
.                    .                 .
.                    .                 .
.                    .                 .
Z                    Y                 Z

This means that you have a method to store experiences. This schematic does
not show how you would access these experiences, but I can envision a
number of simple methods. I would want to see a neural net first though
before I chose a method for using it.
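
As a rough illustration of the learning table above, here is a minimal
Python sketch. It is only an assumption-laden stand-in: a plain lookup table
(`memory`) plays the role of the neural net, `delayed` plays the role of the
delay element, and the mismatch check plays the role of the comparator C.
None of these names come from the original posts, and a real network would
generalize rather than memorize.

```python
import string

def train_delay_predictor(stream):
    """Learn to map the delayed input (previous symbol) to the current input."""
    memory = {}      # hypothetical stand-in for the net's learned weights
    delayed = None   # contents of the delay element; empty at the start
    log = []         # rows of (input from world, delayed input, net output)
    for current in stream:
        predicted = memory.get(delayed)  # net output for the delayed input
        if delayed is not None and predicted != current:
            # Comparator found a mismatch: correct the stored association.
            memory[delayed] = current
        log.append((current, delayed, predicted))
        delayed = current                # shift the delay element
    return memory, log

memory, log = train_delay_predictor(string.ascii_uppercase)
print(memory["A"], memory["Y"])  # B Z
```

On the first pass the "net output" column is empty (None) rather than random,
because the lookup table starts with nothing stored; after one pass each
letter A to Y retrieves its successor.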

But if you had a number of these networks hooked up, or just looped the
output of the neural net back to its input, then you could base your final
output to the external world on forecasts from perceptions:

---------------------------------> [prediction]
> >
\|/ |
\ /
\---> [neural net] ----/ -----> [disabled output]
/|\ |
> > >
delay | |
> > >
>----------> C <----------------
>
[disabled input]
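
One way to picture the looped arrangement is the sketch below, in Python.
It assumes a predictor that has already been trained and is now cut off from
the world (learning and input disabled); the `learned` table and the
`forecast` function are hypothetical names used only for illustration.

```python
def forecast(predict_next, start, steps):
    """Feed the predictor's output back in as its next input."""
    trajectory = []
    state = start
    for _ in range(steps):
        state = predict_next(state)
        if state is None:   # nothing was learned beyond this point
            break
        trajectory.append(state)
    return trajectory

# Hypothetical already-trained predictor: previous letter -> next letter.
learned = {a: b for a, b in zip("ABCDEFG", "BCDEFGH")}

print(forecast(learned.get, "A", 3))  # ['B', 'C', 'D']
```

The forecast runs ahead of the world by as many steps as you ask for, and it
stops early once it reaches something the predictor never experienced.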

If you had another neural net between your current input and output, and
the input that the output causes, then you could learn to predict how your
output affects the world:

--> [neural net] ---> [predicted input from world]
> /|\ |
> > >
> > >
> C---------
[output sent to muscles] | /|\
> > >
> > >
AND ----------- [input from world]
>
>
[input from world as it appeared at the time output was sent to muscles]

If we can again disable the learning circuit, then we can combine the
last two schematics to see how we can predict how we can affect the world.
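
A minimal sketch of the "predict how your output affects the world" idea, in
Python. Everything specific here is an assumption made for illustration: the
net is reduced to two weights trained with the delta rule, and the toy world
obeys world_after = world_before + 0.5 * output. The comparator C corresponds
to the `error` line.

```python
def train_forward_model(samples, rate=0.1, epochs=200):
    """Fit predicted_input = w_in * world_before + w_out * output."""
    w_in, w_out = 0.0, 0.0
    for _ in range(epochs):
        for world_before, output, world_after in samples:
            predicted = w_in * world_before + w_out * output
            error = world_after - predicted      # comparator C
            w_in += rate * error * world_before  # delta-rule updates
            w_out += rate * error * output
    return w_in, w_out

# Hypothetical experience: (input when output was sent, output, input now).
samples = [(x, u, x + 0.5 * u) for x in (-1.0, 0.0, 1.0) for u in (-1.0, 0.0, 1.0)]

w_in, w_out = train_forward_model(samples)

def predict(world_before, output):
    """Use the trained model with learning disabled."""
    return w_in * world_before + w_out * output

print(round(predict(2.0, 1.0), 3))  # 2.5
```

With learning disabled, `predict` can be called on imagined outputs, which is
the combination of the last two schematics described above.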

I have four more diagrams in my notebook, but before I try to draw them
with this editor, tell me what you think of these.

What I see from these diagrams is:

1) Cumulative effects of our behavior can look like e=r-p. (This is because
our output to the external world can be based on current input or
predicted input. And our output can be generated automatically to
avoid situations that are 'unpleasant' or to seek situations that
are 'pleasant'.)
2) We do not need an internal structure that operates as
e=r-p in order to make our behavior look like e=r-p.
3) We automatically think in terms of time.

···

------------------------------------------------------------------------
Also, I am leaving Sunday for Calgary. I will be working there for the
next two to four weeks. Depending on how the project goes, I might not
respond for a while (after Sunday).

Take Care,
Shannon

PS. Does anybody want to have dinner in Calgary?

[Martin Taylor 960111 17:10]

Shannon Williams (960111.14:14 CST)

I read a book some time ago that showed how some signals generated by
some neurons seem to serve no purpose except to delay the signal. The author
then went on to explain how it was done. Afterwards, he issued a challenge for
anyone to figure out what purpose it served to delay some signals.

It's commonly used in artificial neural networks that "recognize" temporal
patterns (e.g. speech). In its simplest form, it allows differentiation
and the recognition of acceleration. The delays can be used to form a
shift register, the outputs of which can be used in parallel as inputs
to a simple neuron, as in the figure (data come in along the dashed lines
from the left, at a steady rate, moving along the line one delay unit per
dash; the data are picked off at the points where a vertical line crosses
the left-right dashed line):

> position | velocity | acceleration
> / \ /|\
> /+1 \-1 / | \
> \ / -1/ +|2 \-1
input -|-> \ / \ | /
input --||--> \ | /
\|/
input -|||--->

Much more complex patterns of weights can be used, making temporal patterns
as easy to work with as are spatial patterns. There are, of course, lots
of complications when one uses a temporal pattern as a controlled perception,
not least the fact that the output cannot affect the earlier parts of the
pattern as readily as it can affect the later. This may well be why languages
have evolved so that the early parts of significant units (sentence, word...)
are usually more mutually discriminable and informative than the later parts.
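
The shift-register idea can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the weights are read off the
figure (+1 for position, +1/-1 for velocity, -1/+2/-1 for acceleration), the
newest sample is assumed to sit at the first tap, and the register is assumed
to start filled with zeros. With the acceleration weights as drawn, the
output is the negative of the usual second difference; the sign is a matter
of convention.

```python
from collections import deque

# Tap weights as read from the figure; the tap ordering is an assumption.
WEIGHTS = {
    "position": (1,),
    "velocity": (1, -1),
    "acceleration": (-1, 2, -1),
}

def tapped_delay_line(samples, weights):
    """Weighted sum over the newest len(weights) samples at each time step."""
    taps = deque([0.0] * len(weights), maxlen=len(weights))
    outputs = []
    for x in samples:
        taps.appendleft(x)  # newest sample at tap 0; the oldest drops off
        outputs.append(sum(w * tap for w, tap in zip(weights, taps)))
    return outputs

signal = [t * t for t in range(6)]  # 0, 1, 4, 9, 16, 25
for name, weights in WEIGHTS.items():
    print(name, tapped_delay_line(signal, weights))
```

For this quadratic signal the velocity output grows steadily and the
acceleration output settles to a constant once the register has filled, which
is what one would expect from first and second differences.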

[Figure omitted]

This means that the neural net could be used to predict.

Yes, that's roughly what is sometimes done in the kind of self-training
I mentioned in an earlier message today. It's not the way I had envisaged
the Little Baby learning to predict letter strings.

The LB's predictions had to come from its failure to control adequately
the distance between its "fingertip" and the quasi-phonetic feature
representation of the time-smoothed representation of a syntactically
regular string (notice that the smoothing, known in speech analysis as
"co-articulation", is very helpful to a control hierarchy that learns
to predict, whereas it causes difficulty for an inflow pattern recognizer,
at least of the types used in current speech recognition systems).

The LB's predictions come from a higher level learning to perceive and to
control for syntactic regularities in its input. When the finger
synchronizes with, rather than follows, the trajectory of the input,
then the syntactic regularities of the input disappear from the finger-target
distance vector.

[input from world]   delayed [input]   [output from net]
------------------   ---------------   -----------------
A                    random            random
B                    A                 B
C                    B                 C
.                    .                 .
.                    .                 .
.                    .                 .
Z                    Y                 Z

This means that you have a method to store experiences.

For a short time. To store them for longer times by related mechanisms
requires recurrence. That's one hypothesis for how short-term memory works.

But if you had a number of these networks hooked up, or just looped the
output of the neural net back to its input, then you could base your final
output to the external world on forecasts from perceptions:

You've been reading Hans Blom's postings, haven't you?

Why not? There are certainly cases where prediction would be helpful,
however it's done. And at the higher levels, where imagination loops come
into their own, it would seem to be essential.

If you had another neural net between your current input and output, and
the input that the output causes, then you could learn to predict how your
output affects the world:

Same comment.

What I see from these diagrams is:

1) Cumulative effects of our behavior can look like e=r-p. (This is because
our output to the external world can be based on current input or
predicted input. And our output can be generated automatically to
avoid situations that are 'unpleasant' or to seek situations that
are 'pleasant'.)
2) We do not need an internal structure that operates as
e=r-p in order to make our behavior look like e=r-p.

I must be really on a different wavelength from you, because, while I agree
with a lot of what you say earlier in the message, this seems a total
_non sequitur_. I don't see any relation between e=r-p and what precedes it.

3) We automatically think in terms of time.

PCT would say this is true of any organism that has at least a sequence
level of control. (Personally, I'm not at all convinced that there is
a set of levels above which time-related perceptions are controlled and
below which they are not. I think that velocities and accelerations, both
temporal and spatial, are perceptions at the same level as positions. It
is only a logical analysis that says one has to have position before one
can compute velocity. The same shift register that provides input to a
position sensing system can provide input to a velocity or acceleration
sensing system, just as certain cells in the retina provide spatial
difference information of various kinds to other retinal cells or to
the optic nerve.)

Also, I am leaving Sunday for Calgary.

I hope you get a Chinook. But I forgot. Snow comes from Georgia, these
days, doesn't it?-)

PS. Does anybody want to have dinner in Calgary?

Love to. Are you paying the plane fare?-)

Martin