Input Functions

[From Rupert Young (980113.1300 UT)]

To go back a bit...

(Bill Powers (971207.0815 MST))

But now suppose you create a control system that senses their sum: s1 + s2
(both weights are 1). The perceptual signal is compared with some desired
value, and the error signal is amplified to produce an output that affects
both s1 and s2. The result is that the _SUM_ is under control: we end up
with s1+s2 = r, the reference value that exists at the moment.

Before we introduced the control system, a plot of s1 against s2 would just
show some spaghetti-like trajectory of the point (s1,s2), as external
forces pushed the values of the two variables around in random ways. A sort
of space-filling curve. But after the control system was built and turned
on, the plot would look like this:

*                 |  s1
    *             |
        *         |
            *     |
                * |
------------------------------------ s2
                  |  *
                  |      *
                  |          *
                  |              *
                  |                  *
                  |

That would be for r = 0. For other values of r, the line would shift in a
direction at right angles to the line.

Suddenly there is a relationship between s1 and s2: their sum is constant,
at an adjustable value. All those external forces can still push the point
(s1,s2) back and forth along the line, but they can't push it off the line
any more. Try it with your simulation and see.

One dimension of this two-dimensional universe has been brought under
control, and in the process we have _created_ a vector in the direction of
the line (or rather, normal to it).

If we weight the two variables differently as we sum them, we get a
perception equal to w1*s1 + w2*s2. We will find that a new vector has been
created, with a different slope. Now the disturbances can move (s1,s2)
along the line in the different direction, but still not off the line.

By vector do you mean the plot in the above graph ? The values of s1 and s2
could stay the same so the above line would stay the same. Is it the vectors
of {w1*s1 w2*s2} (ie. the plot of w1*s1 against w2*s2) that has a different
slope ?

Regards,
Rupert

[From Bill Powers (980113.0829 MST)]

Rupert Young (980113.1300 UT) --

To go back a bit...

(Bill Powers (971207.0815 MST))

But now suppose you create a control system that senses their sum: s1 + s2
(both weights are 1). The perceptual signal is compared with some desired
value, and the error signal is amplified to produce an output that affects
both s1 and s2. The result is that the _SUM_ is under control: we end up
with s1+s2 = r, the reference value that exists at the moment.
If we weight the two variables differently as we sum them, we get a
perception equal to w1*s1 + w2*s2. We will find that a new vector has been
created, with a different slope. Now the disturbances can move (s1,s2)
along the line in the different direction, but still not off the line.

By vector do you mean the plot in the above graph ?

Yes, or a direction at right angles to it.

The values of s1 and s2
could stay the same so the above line would stay the same.

No, they don't need to stay the same. They can vary, as long as the point
(s1,s2) stays on the line.

Suppose the perception is

p = 2*s1 + 3*s2

If the reference level is 12 units, all the following values of s1 and s2
will result in zero error:

    s1     s2
     1    10/3
     2     8/3
     3     2
     4     4/3

and so on.

Any other pairs of values will result in an error signal, which will be
amplified to increase or decrease both s1 and s2 (since both weights are
positive). As a result, the perception will change until 2*s1 + 3*s2 is
close to 12.

A disturbance that affects s1 and s2 in the ratio 3:-2 will be unresisted;
the point (s1,s2) will simply move along the line.
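A minimal numerical check of this example, in Python (nothing here is from
the original post; it is just the arithmetic of p = 2*s1 + 3*s2 with r = 12):

    def perception(s1, s2, w1=2.0, w2=3.0):
        return w1 * s1 + w2 * s2

    r = 12.0
    for s1, s2 in [(1, 10/3), (2, 8/3), (3, 2), (4, 4/3)]:
        # error rounds to 0.0 for every pair in the table
        print(s1, s2, round(perception(s1, s2) - r, 9))

    # A disturbance in the ratio 3:-2 moves the point along the line
    # 2*s1 + 3*s2 = 12, so it produces no error and is unresisted.
    s1, s2 = 3.0, 2.0
    d = 0.5                      # arbitrary disturbance size
    print(round(perception(s1 + 3*d, s2 - 2*d) - r, 9))   # still 0.0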

Is it the vectors
of {w1*s1 w2*s2} (ie. the plot of w1*s1 against w2*s2) that has a
different slope ?

The plot is always s1 against s2 (the dimensions of the space). The
weights determine the direction of a line in this space when the sum is
held constant. You have to remember the action of the control system. The
control system is maintaining w1*s1 + w2*s2 at some constant value (the
reference level). If you plot w1*s1 + w2*s2 = constant, you will get a
straight line in the s1,s2 space.

If you plot this line for different values of the constant, you will get a
family of parallel lines. Changing the reference level moves you from one
line to another line. The variables s1 and s2 are still free to move along
the line for any particular constant, but changing the constant, the
reference level, changes _which_ line they can move along when disturbed.

Any clearer?

Best,

Bill P.

[From Rupert Young (980113.1900 UT)]

[From Bill Powers (980113.0829 MST)]

>> If we weight the two variables differently as we sum them, we get a
>> perception equal to w1*s1 + w2*s2. We will find that a new vector has been
>> created, with a different slope. Now the disturbances can move (s1,s2)
>> along the line in the different direction, but still not off the line.

The plot is always s1 against s2 (the dimensions of the space). The
weights determine the direction of a line in this space when the sum is
held constant. You have to remember the action of the control system. The
control system is maintaining w1*s1 + w2*s2 at some constant value (the
reference level). If you plot w1*s1 + w2*s2 = constant, you will get a
straight line in the s1,s2 space.

Ah, I thought (from the first paragraph) that the values of s1 and s2 were
staying the same, but you mean that, with new weights and the _weighted sum_
kept constant, we get new values of s1 and s2 (assuming they are controlled),
from which a new vector (of s1 and s2 values) with a different direction is
formed.

If you plot this line for different values of the constant, you will get a
family of parallel lines.

Right, gotcha.

Any clearer?

Yes. Thanks.

Regards,
Rupert

[From Rupert Young (980114.1000 UT)]

(Martin Taylor 971214 18:10)

  The correlation indicates how the _form_ of the input matches the _form_
  of the perceptual function. The length of the input vector is the
  _strength_ of the input, and the length of the weight vector is the
  _sensitivity_ of the input function. You want to have these three separate,
  because they deal in factors that individually can change in ways that
  affect control.

  If the correlation is low, the input isn't very like what the perceptual
  function is looking for, and some other function orthogonal to this one
  may be found that will show a high correlation with the input. That, of
  course, wouldn't matter to this one, but if some higher system wants to
  perceive "what's out there", it might be able to compare (ratio, subtract,
  or whatever) the two (or more) outputs. If you don't consider the
  correlation as a separate factor, you tend to lose sight of this
  possibility.

Do you mean the correlation between the input vector and weight vector ?
Perhaps you could explain the significance of this correlation value ?
I've been playing around with some values and get the following,

From p = w1*s1 + w2*s2

and corr = sum(wi*si)/sqrt(sum(wi^2)*sum(si^2))

where p = 2, w1 = 0.8 and w2 = 0.3

we get

s1 s2 p corr
-5.00 20.00 2.00 0.11
-4.00 17.33 2.00 0.13
-3.00 14.67 2.00 0.16
-2.00 12.00 2.00 0.19
-1.00 9.33 2.00 0.25
0.00 6.67 2.00 0.35
1.00 4.00 2.00 0.57
2.00 1.33 2.00 0.97
3.00 -1.33 2.00 0.71
4.00 -4.00 2.00 0.41
5.00 -6.67 2.00 0.28

In all cases the perception is the same but the correlation varies wildly.

What does the correlation show us ? Aren't the input and weight values
vectors from _different_ spaces ?

Regards,
Rupert

[From Rupert Young (980114.1000 UT)]

(Martin Taylor 971214 18:10)

  The correlation indicates how the _form_ of the input matches the _form_
  of the perceptual function. The length of the input vector is the
  _strength_ of the input, and the length of the weight vector is the
  _sensitivity_ of the input function. You want to have these three separate,
  because they deal in factors that individually can change in ways that
  affect control.

Do you mean the correlation between the input vector and weight vector ?
Perhaps you could explain the significance of this correlation value ?
I've been playing around with some values and get the following,

From p = w1*s1 + w2*s2

and corr = sum(wi*si)/sqrt(sum(wi^2)*sum(si^2))

where p = 2, w1 = 0.8 and w2 = 0.3

we get

s1 s2 p corr
-5.00 20.00 2.00 0.11
-4.00 17.33 2.00 0.13
-3.00 14.67 2.00 0.16
-2.00 12.00 2.00 0.19
-1.00 9.33 2.00 0.25
0.00 6.67 2.00 0.35
1.00 4.00 2.00 0.57
2.00 1.33 2.00 0.97
3.00 -1.33 2.00 0.71
4.00 -4.00 2.00 0.41
5.00 -6.67 2.00 0.28

In all cases the perception is the same but the correlation varies wildly.

What does the correlation show us ? Aren't the input and weight values
vectors from _different_ spaces ?

Last question first. No they are not. They are referenced to each other.
s1 is called s1 because it is connected to the input that has the weight
called w1. The fact that one set of numbers refers to the intensity of
something (such as a voltage or a neural firing rate or the rate of incoming
photons) and the other is a transfer function (such as neural firing rate
per billion incoming photons) is neither here nor there. The "space"
is the space defined by the labelling of the inputs and of the weights.
Those labels are defined by virtue of the fact that a particular s.k goes
to the input that has a particular weight, and because it is labelled
s.k, not s.j, that input weight is labelled w.k, not w.j. The labelling
defines the space, and it is the same for the s as for the w.

Here's a graph of some of what you got, as best I can do in ASCII.

      5-|
        |  *
        |     *
     s1 |        *
        |           *
      0-|--------------*---------------
        |                  *
        |                      *
         -5     0     5     10      s2

It shows the line that Bill P has mentioned often in this discussion. It's
not easy to see in an ASCII graph, but you can tell by looking at your
numbers that when s1 changes by 3, s2 changes by -8. Since your weights were
w1=0.8, w2=0.3, the line is clearly perpendicular to the W vector, as
Bill has pointed out.

Now, what does the correlation show us? Here's one more line for your
list of numbers:

2.191 0.822 2.00 1.00

2.191/0.822 happens to be 0.8/0.3. In other words, the direction of the
{s1, s2} vector is the same as the direction of the {w1,w2} vector. What
this says is that the W vector describes the S vector exactly, with nothing
left over for any different input function. If there was another input
function orthogonal to your chosen W vector, it would produce zero output
for this single particular input vector. So long as the correlation is
1.00 between input and weight vectors, the single value of p describes
everything there is to know about this input.

If the correlation between the S vector and the W vector is not 1.00, then
there must be some other set of weights V orthogonal to this first set,
giving a non-zero value of p2 = v1*s1 + v2*s2. Let's try this with your
numbers (adding my extra line), using the weights v1 = 0.3, v2 = -0.8.

  s1 s2 p1 p2 corr
-5.00 20.00 2.00 -17.50 0.11
-4.00 17.33 2.00 -15.07 0.13
-3.00 14.67 2.00 -12.63 0.16
-2.00 12.00 2.00 -10.20 0.19
-1.00 9.33 2.00 -7.77 0.25
0.00 6.67 2.00 -5.33 0.35
1.00 4.00 2.00 -2.90 0.57
2.00 1.33 2.00 -0.47 0.97

2.19 0.82 2.00 0.00 1.00

3.00 -1.33 2.00 1.97 0.71
4.00 -4.00 2.00 4.40 0.41
5.00 -6.67 2.00 6.83 0.28

Since there are only two input values, s1 and s2, p1 and p2 specify them
exactly, whereas the specification of p1 = 2.00 only specifies that the
S vector falls on a particular line. But when the correlation is unity
between the S vector and the original W vector, the value of p1 totally
describes the input. When the correlation is far from unity, there's
lots left over, still to be described.
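A small Python sketch of this point, reconstructing s1 and s2 from p1 and p2;
the reconstruction formula is my addition, valid because W and V are orthogonal
and happen to have the same squared length (0.73):

    w = (0.8, 0.3)             # the original weight vector W
    v = (0.3, -0.8)            # the orthogonal weight vector V

    def p(weights, s):
        return weights[0] * s[0] + weights[1] * s[1]

    s = (-2.0, 12.0)           # one of the rows in the table above
    p1, p2 = p(w, s), p(v, s)  # about 2.00 and -10.20

    # Because w and v are orthogonal, s = (p1/|w|^2)*w + (p2/|v|^2)*v,
    # and here |w|^2 == |v|^2 == 0.73.
    norm2 = w[0]**2 + w[1]**2
    s1 = (p1 * w[0] + p2 * v[0]) / norm2
    s2 = (p1 * w[1] + p2 * v[1]) / norm2
    print(round(s1, 2), round(s2, 2))   # -2.0  12.0: the two signals pin the point down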

It's exactly the same as an analysis of variance. The calculations are
the same, and the implications are the same, provided you do the ANOVA
properly, and don't get shunted off into computing the so-called
"significance levels" (which never tell you anything but whether your
experiment is sensitive enough to show you the "effect" that is there--
but that's another story).

In an ANOVA, one uses a predetermined set of orthogonal weight vectors,
normalized so that they all have unit magnitude. The vector corresponding
to a perception of the mean is, for example: {1/sqrt(N), 1/sqrt(N), ...
1/sqrt(N)}, where N is the number of observations. (Notice that the
output perception is not the classical value of the mean. It is,
however, the appropriate perceptual value for the combined effect of
the inputs, equally weighted.) You can check other perceptions. For
example, if there are, say, 3 equispaced levels of some "independent
variable", you can see if there is a linear trend by checking the
magnitude of the perception corresponding to the weight vector
{-1/sqrt(2), 0, 1/sqrt(2)}. If you want to look for a linear trend
with 4 equispaced intervals, you can use the weight vector
{-3/sqrt(20),-1/sqrt(20),1/sqrt(20),3/sqrt(20)}. And so forth.
So long as a vector you choose is orthogonal to all the other vectors
you choose for your analysis, and has magnitude 1.0, it's a valid
perceptual function to use in the ANOVA.
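As a rough Python sketch of these weight vectors used as perceptual functions
(the observation values are invented for illustration; only the vectors come
from the paragraph above):

    import math

    N = 4
    mean_vec  = [1 / math.sqrt(N)] * N                    # {1/2, 1/2, 1/2, 1/2}
    trend_vec = [-3/math.sqrt(20), -1/math.sqrt(20),
                  1/math.sqrt(20),  3/math.sqrt(20)]      # linear-trend vector

    def perceive(weights, obs):
        return sum(w * x for w, x in zip(weights, obs))

    print(round(sum(a * a for a in trend_vec), 6))        # 1.0: unit magnitude
    print(round(perceive(mean_vec, trend_vec), 6))        # 0.0: orthogonal vectors

    obs = [2.0, 4.0, 6.0, 8.0]          # made-up observations with a linear trend
    print(round(perceive(mean_vec, obs), 3))   # 10.0 (not the classical mean, 5.0)
    print(round(perceive(trend_vec, obs), 3))  # clearly non-zero: a trend is present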

ANOVA is essentially identical to the effect of an orthogonal set of
perceptual functions at one level (at least at one very low level) of the
hierarchy. And they are there for the same purpose--to produce perceptual
values that let you or the higher levels see if there is an anomaly in the
input that might mean something. (Sorry for the
hyperpseudoanthropomorphizationing)(couldn't think of any more syllables to
stick in :-)

Martin

[From Rupert Young (980122.1800 UT)]

(Martin Taylor 971216)

In other words, the direction of the
{s1, s2} vector is the same as the direction of the {w1,w2} vector. What
this says is that the W vector describes the S vector exactly, with nothing
left over for any different input function.

So long as the correlation is
1.00 between input and weight vectors, the single value of p describes
everything there is to know about this input.

Excuse me for being dense but I'm still not following the relevance of the
correlations and am not sure what you mean by "nothing left over for any
different input function" and "describes everything there is to know about
this input".

Here's what I understand so far,

1) For a constant weight vector {w1 w2} any values of {s1 s2} which satisfy
the weighted sum p = s1*w1 + s2*w2 (ie. s1 and s2 are under control) will fall
on the same straight line.

2) If p=0 then the line will pass through the origin.

3) If p != 0 then the line will be parallel to the first line (same slope) at
a distance of p units.

4) The line will cut the s1 axis at p/w2 and the s2 axis at p/w1

5) The slope of the line is defined by the weights. The s1 in s2 gradient is
1 in w1/w2. If there are different weights then the slope will be different.

6) Two lines of different slope (two input functions) will always intersect.
The intersection point gives the values of s1 and s2 that satisfy both input
functions (weight vectors), both are controlling successfully at this point
(and no other). The same applies to orthogonal weight vectors.

7) Therefore with two inputs you can only have two functions which control
successfully. With three inputs, three functions. n inputs, n functions or, as
you said, n degrees of freedom. So, the more inputs, the more perceptions can
be controlled.

Is this correct so far ? Does this mean that the number of control systems
possible at one level depends on the number of different inputs available at
that level ? What is significant about orthogonal weight vectors, there is
still an intersection point ? I can see that the {s1 s2} vector you produced
{2.19 .82} has the same slope as the weights {0.8 0.3} but am not sure why
this is important as other values of {s1 s2} maintain the perception at the
reference.

In an ANOVA, ...

I'm not familiar with ANOVA.

Regards,
Rupert

[From Rupert Young (980125.1330 UT)]

(Martin Taylor 980123 18:40)

I hope this is beginning to make some sense.

Yes it is, a very helpful and useful message. Thanks.

The end result of reorganization should therefore be expected to consist
(at any one level of the hierarchy) of a set of control systems whose
perceptual functions tend to be orthogonal to each other, and whose outputs
are distributed so that the vector of output effects for each control
system is highly correlated with its own perceptual input function.

Does this mean that there should be output weights which are the same as those
on the input function, or that _direction_ of the output vector should be the
same as that of the input weights ?

Also does this mean that one can choose the weights on n input functions each
with n inputs by ensuring that they are all orthogonal ?

Regards,
Rupert

[From Rupert Young (971029.1100 BST)]

(Bill Powers (970919.162MDT))

... the largest perceptual signals being those from systems
whose inputs come closest to the vector defined by their input functions.

And reading again chapter 8, Sensation control, of B:CP, particularly pgs
104-105, I interpret that with each input function there is associated a set
(or vector) of weights corresponding to each connection to lower-level control
systems. Each connection has a signal value which is the perception from the
lower-level control system. The weights are multiplied by the connection
signals and added together to get a value for the input perception to the
current system.

So, keeping it simple with just two connections, if w1 and w2 are the weights
and s1 and s2 are the incoming perceptions the current perception.

        p = (w1 * s1) + (w2 * s2)

I have some questions about this but first I'd like to check if this looks
right so far. Is it ok ?


--
Regards,
Rupert

[From Bruce Gregory (971029.1245 EST)]

Rick Marken (971029.0930)

The PCT system, of course, is in control of the variables it
perceives; so any disturbance to the variables that look like,
say, "walking", will be resisted. The PCT system has the purpose
of producing the perceptions that constitute "walking"; the
conventional system has no such purpose.

Nice post.

Bruce

[From Rick Marken (971029.0930)]

Rupert Young (971029.1100 BST)--

So, keeping it simple with just two connections, if w1 and w2 are
the weights and s1 and s2 are the incoming perceptions the current
perception.

        p = (w1 * s1) + (w2 * s2)

I have some questions about this but first I'd like to check if
this looks right so far. Is it ok ?

Yes. This is the current hypothesis about the nature of the perceptual
functions that produce sensation signals (level two perceptions). The
functions that produce signals representing more complex perceptual
variables (like transitions, relationships, principles, etc) would, of
course, be quite different. In my spreadsheet hierarchy, the level 1
perceptual signals (p1) are simply a linear function of an
environmental variable (q), so:

p1 = k * q.

The level 2 perceptual signals (p2) are linear combinations of
level 1 perceptual signals (as in your sensation function above)
so:

p2 = Sum (i=1 to Number of level 1 signal inputs) w.i*p1.i

The level 3 perceptual signals (p3) are _logical_ functions of level
2 perceptual signals, so one level 3 perception might be:

if p2.1>p2.2 then p3 = 1 else p3 = 0

Note that this means that p3 is a _binary_ variable; the perceptual
signal is either on (true) or not (false).
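A toy Python sketch of these three kinds of perceptual function; the gain,
the environmental values, and the level-2 weights are made up, not taken from
the spreadsheet model:

    k = 0.5                                   # illustrative level 1 gain
    q = [10.0, 4.0, 6.0]                      # environmental variables

    p1 = [k * qi for qi in q]                 # level 1: linear intensity signals

    w_a = [1.0, 1.0, 0.0]                     # illustrative level 2 weight vectors
    w_b = [0.0, 1.0, -1.0]
    p2_1 = sum(w * p for w, p in zip(w_a, p1))
    p2_2 = sum(w * p for w, p in zip(w_b, p1))

    p3 = 1 if p2_1 > p2_2 else 0              # level 3: binary (logical) perception
    print(p1, p2_1, p2_2, p3)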

Of course, we have no understanding of the actual neural mechanisms
that can compute perceptions (as functions of lower level perceptual
signals) of things like relationships, programs or principles. But
people do control these kinds of perceptions so, in theory, the
brain must be computing neural signals that represent these perceptual
variables.

Your question reminded me of a nice way to distinguish PCT from
other approaches to developing artificial life forms (like
robots). Virtually all current approaches to developing artificial
living systems are aimed at discovering ways to get these systems
to generate the right "outputs" -- the ones that will allow the
system to produce the perception that the _developer_ of that system
wants to have. The PCT approach to developing artificial life
(I'm sure this is what Bill Powers is doing with his "bug" model)
is aimed at discovering ways to get the system to generate the right
"inputs" -- the ones that, when controlled by the system, allow the
system to produce the perceptions that the developer of that system
wants to have.

For example, the traditional approach to developing an artificial
"walking system" is to design the system so that it generates
outputs that look like "walking" to an observer. The perceptual
side of the story only enters the picture if it looks like walking
involves a response to sensory inputs (like "obstacle avoidance").
The PCT approach to developing an artificial walking system
starts by asking "what perceptual variables might be controlled
by the walker?". The designer than builds the perceptual functions
that compute these perceptual variables and provides the systems
with outputs that are the system's means of influencing these
variables.

The traditional approach can produce systems that are quite
good at producing the perceptions (like the perception of
"walking") that the developer wants to see; but the system can
only produce those perceptions for the developer when it does
so in an environment where this perception cannot be disturbed.
Since the system itself cannot perceive what the developer perceives,
it can do nothing to bring this perception back to a reference
state if it is disturbed; the system can't control.

The PCT system, of course, is in control of the variables it
perceives; so any disturbance to the variables that look like,
say, "walking", will be resisted. The PCT system has the purpose
of producing the perceptions that constitute "walking"; the
conventional system has no such purpose.

Best

Rick


--
Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken

[From Bill Powers (971029.1155 MST)]

Rupert Young (971029.1100 BST)--

So, keeping it simple with just two connections, if w1 and w2 are the
weights and s1 and s2 are the incoming perceptions the current perception.

       p = (w1 * s1) + (w2 * s2)

I have some questions about this but first I'd like to check if this looks
right so far. Is it ok ?

Yes. To anticipate what your problem with this might be, you should try two
control systems operating at the same level, with differently-weighted
input functions.

Suppose there are two environmental variables, s1 and s2. In one control
system, the perceptual signal is made up of a*s1 + b*s2, and in the second
one by c*s1 + d*s2. The output signals simply add to the values of s1 and
s2, with signs appropriate to the signs of the weightings (so as to produce
negative feedback around each possible loop). Make both output functions
into rather slow leaky integrators so you don't have to worry about stability.

Suppose the weights are a = 1, b = 1, c = 1 and d = -1. This will mean that
one control system perceives s1 + s2, while the other perceives s1 - s2.
You will find that the two systems can control their respective perceptual
signals quite independently of each other. If you vary either reference
signal, or both, the corresponding perceptual signal will follow it just as
if the other control system weren't there.

I'm sure that you can see how this can be generalized to any number of
control systems at a given level, each perceiving and controlling a
different function of a large set of s's. Simultaneous independent control
of all the perceptual signals will be possible as long at all the
perceptual functions are reasonably orthogonal. Of course you can't have
independent control if the number of perceptual signals and control systems
is greater than the number of s's. I say that each perceptual signal, in
cases like this, represents a different "aspect" of the set of s's. The
"aspects" are generated by the forms of the input functions.

Note that you don't have to adjust the output weightings; all you need is
+1 or -1 to keep the sign of feedback negative around each loop.
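A rough Python sketch of the simulation described above, assuming simple
leaky-integrator outputs and constant disturbances; the gain, leak, and step
size are plausible choices, not values from the original post:

    dt, gain, leak = 0.01, 50.0, 1.0
    o_a = o_b = 0.0                     # output quantities (leaky integrators)
    r_a, r_b = 3.0, 1.0                 # reference signals
    d1, d2 = 0.7, -0.4                  # arbitrary constant disturbances

    for _ in range(5000):
        s1 = o_a + o_b + d1             # outputs add into the environment
        s2 = o_a - o_b + d2             # signs chosen for negative feedback
        p_a = s1 + s2                   # system A perceives s1 + s2 (a=1, b=1)
        p_b = s1 - s2                   # system B perceives s1 - s2 (c=1, d=-1)
        o_a += dt * (gain * (r_a - p_a) - leak * o_a)
        o_b += dt * (gain * (r_b - p_b) - leak * o_b)

    print(round(p_a, 2), round(p_b, 2)) # each perception settles near its own reference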

If you play with this for a while, I think you'll see that it makes the PCT
model of perception make more sense.

Best,

Bill P.

[From Rupert Young (971207 1400 UT)]

I'd like to go back, if I may, to a query I had a while ago when my email
wasn't working.

(Rick Marken (971029.0930))

Rupert Young (971029.1100 BST)--

So, keeping it simple with just two connections, if w1 and w2 are
the weights and s1 and s2 are the incoming perceptions the current
perception.

        p = (w1 * s1) + (w2 * s2)

Yes. This is the current hypothesis about the nature of the perceptual
functions that produce sensation signals (level two perceptions).
The functions that produce signals representing more complex perceptual
variables (like transitions, relationships, principles, etc) would, of
course, be quite different.

But in some sense would not the higher functions still be the same because all
that neurons can do is a weighted sum of inputs.

(Bill Powers (971029.1155 MST))

>So, keeping it simple with just two connections, if w1 and w2 are the
>weights and s1 and s2 are the incoming perceptions the current perception.
> p = (w1 * s1) + (w2 * s2)
>I have some questions about this but first I'd like to check if this looks
>right so far. Is it ok ?
Yes. To anticipate what your problem with this might be, you should try two
control systems operating at the same level, with differently-weighted
input functions.

Ok, I tried that and it works. Though the main thing I am trying to understand
is how is it that a control system is sensitive to one input vector and not
another. And also what are the weights and what is their role ?
e.g. If we have a set of input intensity values that lie on the same vector,

x y
30 10
120 40
300 100

they would look something like this.


        |                              *
        |
        |
        |
        |
        |
        |           *
        |
        |
        |  *
        ----------------------------------------

What are the values of the weights ? Would they be wx = 3 & wy = 1 ?
In which case the magnitude would be

        mag = wx*x + wy*y

vector 1 3*30 + 1*10 = 100
vector 2 3*120 + 1*40 = 400
vector 3 3*300 + 1*100 = 1000

But isn't the magnitude of a vector the modulus ie. length of the hypotenuse ?

What if the inputs don't lie on this vector, there would still be a signal
wouldn't there, but for the wrong perception ?

Regards,
Rupert

[From Bill Powers (971207.0815 MST)]

Earliest sunset of the year today.

Rupert Young (971207 1400 UT) --

[Rupert]

I'd like to go back, if I may, to a query I had a while ago when my email
wasn't working.

So, keeping it simple with just two connections, if w1 and w2 are
the weights and s1 and s2 are the incoming perceptions the current
perception.

        p = (w1 * s1) + (w2 * s2)

[Rick Marken (971029.0930)]

Yes. This is the current hypothesis about the nature of the perceptual
functions that produce sensation signals (level two perceptions).
The functions that produce signals representing more complex perceptual
variables (like transitions, relationships, principles, etc) would, of
course, be quite different.

[Rupert]

But in some sense would not the higher functions still be the same because
all that neurons can do is a weighted sum of inputs.

Not so. We use linear weightings in our models because we can't solve many
nonlinear equations, not because neurons can only compute linear sums. My
basic hypothesis is that neurons can (roughly) add, subtract, multiply, and
divide, and also compute squares, cubes, logs, square roots, and many other
kinds of continuous functions -- approximately. They can also compute
dynamic functions like time integrations and differentiations. They do this
by analog computation, not symbolic or digital computations. Of course
these functions are not very exact, but in an adaptive system they can
probably become as exact as needed to account for the accuracy of living
control processes.

Suppose there were two signals, x and y, representing positions along
orthogonal axes in space. A perceptual signal made of x^2 + y^2 would
represent the square of the distance of the point (x,y) from some zero
point. The output signal of this control system would have to contain some
kind of switching so it could preserve negative feedback for all positive
and negative values of x and y, but assuming that could be done, we would
have a control system that controlled a position to keep it on a circle
with a radius whose square is equal to the reference signal of the control
system. Disturbances tangent to the radius would not be resisted; those
along the radius would be resisted.

That's just an example out of thin air. The actual way in which we control
in radius and angle probably involves many control systems each working in
a slightly different direction, with a higher system selecting which subset
of control systems is to be used over various arcs of a circle. That would
be more consistent with what is known about perception of directions in
space -- those visual vectors that Georgopoulos writes about. But we can
imagine simpler arrangements that would in principle do the same thing, and
get an understanding of control processes that deals with the same
_effects_ while avoiding the immense difficulties in achieving them
computationally in the anatomically correct way. The HPCT diagram is really
just a user-friendly fiction, but as long as we all know it is we can still
work out some useful principles.

I think you can take it for granted that perceptual input functions can
compute just about any reasonable function of the input signals that you
could think up.

(Bill Powers (971029.1155 MST))

[Rupert]

So, keeping it simple with just two connections, if w1 and w2 are the
weights and s1 and s2 are the incoming perceptions the current perception.

       p = (w1 * s1) + (w2 * s2)

I have some questions about this but first I'd like to check if this looks
right so far. Is it ok ?

[Bill]

Yes. To anticipate what your problem with this might be, you should try two
control systems operating at the same level, with differently-weighted
input functions.

[Rupert]

Ok, I tried that and it works. Though the main thing I am trying to
understand is how is it that a control system is sensitive to one input
vector and not another.

The vector is _created_ by the weightings of the input function. It doesn't
exist in the two inputs s1 and s2. There is no objective vector that the
input function has to "recognize." This is hard to get used to if you've
heard only the conventional, naive-realist, version of perception, in which
the environment contains entities which the perceptual functions then have
to recognize and represent as signals.

In our little universe there are two variables, s1 and s2. They can vary
independently in any old way, depending on how external forces act on them.
But now suppose you create a control system that senses their sum: s1 + s2
(both weights are 1). The perceptual signal is compared with some desired
value, and the error signal is amplified to produce an output that affects
both s1 and s2. The result is that the _SUM_ is under control: we end up
with s1+s2 = r, the reference value that exists at the moment.

Before we introduced the control system, a plot of s1 against s2 would just
show some spaghetti-like trajectory of the point (s1,s2), as external
forces pushed the values of the two variables around in random ways. A sort
of space-filling curve. But after the control system was built and turned
on, the plot would look like this:

*                 |  s1
    *             |
        *         |
            *     |
                * |
------------------------------------ s2
                  |  *
                  |      *
                  |          *
                  |              *
                  |                  *
                  |

That would be for r = 0. For other values of r, the line would shift in a
direction at right angles to the line.

Suddenly there is a relationship between s1 and s2: their sum is constant,
at an adjustable value. All those external forces can still push the point
(s1,s2) back and forth along the line, but they can't push it off the line
any more. Try it with your simulation and see.

One dimension of this two-dimensional universe has been brought under
control, and in the process we have _created_ a vector in the direction of
the line (or rather, normal to it).

If we weight the two variables differently as we sum them, we get a
perception equal to w1*s1 + w2*s2. We will find that a new vector has been
created, with a different slope. Now the disturbances can move (s1,s2)
along the line in the different direction, but still not off the line.

What's happened is that we have put one degree of freedom of this little
universe under control, while leaving the point (s1,s2) free to change in
the uncontrolled degree of freedom. The degree of freedom that's now under
control is in the direction at right angles to the line.

What happens if we now build a second control system, sensing the same two
variables but giving them different weights? This will create another line
at an angle to the first line. If the second control system were the only
one, disturbances could now move (s1,s2) along that line, but not at right
angles to it (or at any other angle except along the line).

When _both_ control systems are operating, each one keeps (s1,s2) on its
own line, and the result is obvious: the only place (s1,s2) can go is to
the intersection of the two lines. Now disturbances can't move the point at
all. There are no more degrees of freedom left.

When the reference signals of the two control systems change, the
corresponding line moves at right angles to the direction of the line, so
the line stays parallel to a specific direction set by the input weights.
This means that (s1,s2) has to move so as to stay on the moving
intersection between the lines. So as you alter the two reference signals
you move the point around in two dimensions and all disturbances, in any
direction in 2-space, are resisted.
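A small Python sketch of that intersection, solving the two weighted-sum
equations directly; the weights and reference values here are arbitrary
illustrations:

    w1, w2 = 1.0, 1.0        # first system perceives s1 + s2
    v1, v2 = 1.0, -1.0       # second system perceives s1 - s2
    r_a, r_b = 3.0, 1.0      # the two reference signals

    # Solve w1*s1 + w2*s2 = r_a and v1*s1 + v2*s2 = r_b (Cramer's rule).
    det = w1 * v2 - w2 * v1  # nonzero whenever the two lines are not parallel
    s1 = (r_a * v2 - w2 * r_b) / det
    s2 = (w1 * r_b - r_a * v1) / det
    print(s1, s2)            # 2.0 1.0: the single point both systems enforce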

[Rupert]

What if the inputs don't lay on this vector, there would still be a signal
wouldn't there, but for the wrong perception ?

The two perceptions would just be whatever they are. If they aren't on
either of the two lines, both control systems would contain an error
signal. Each control system "sees" a universe in which only one weighted
sum of the two variables exists. That sum has whatever value it has. _All_
positions of the point (s1,s2) will produce some value of perceptual signal
in both systems, except (0,0). And even 0 is a value, so we can forget the
"except."

The other way of looking at perception, which I think is what is giving
you trouble, is to think of a black box with many outputs and many inputs.
When the inputs are in one state, one of the outputs is activated. When
they're in a different state, a _different_ output is activated. So the
perceptual function is indicating _which input pattern is present_. This is
called identification or recognition. This is the sort of model that's used
by neural networkers, I think, and is the conceptual basis for conventional
models of perception. That's why everyone talks about "classification."
The output that is activated indicates the class to which the whole input
pattern belongs.

That kind of arrangement might exist at what I call the category level,
although as I described it, it falsely implies that we can perceive only one
category at a time in a mutually-exclusive way. But at the lower levels,
where we exert continuous control in many degrees of freedom at the same
time, this kind of model of perception just won't work. We need to have
_all_ the perceptions in _all_ the degrees of freedom present at the same
time, continuously variable, so we can have many control systems working in
parallel.

People who haven't thought much about models of perception come into PCT in
a nice naive state, and when I describe how the model works they say, "Oh,
okay." So I don't take the time to go through all this. But people who have
already been studying models of perception have most probably been assuming
the other kind of model, in which "patterns" are "recognized." That's the
box with multiple inputs and outputs, with only one output at a time being
activated to signal the "right category." And of course if they've missed
my infrequent discussions of this difference in models, they can get very
confused trying to make the perception-creation model work like the
pattern-recognition model. I hope this discussion has alerted you to the
difference, and has helped with some of your problems.

Best,

Bill P.

[From Rupert Young (971207.1800 UT)]

(Bill Powers (971207.0815 MST))

[Rupert]
>But in some sense would not the higher functions still be the same because
>all that neurons can do is a weighted sum of inputs.

Not so. We use linear weightings in our models because we can't solve many
nonlinear equations, not because neurons can only compute linear sums. My
basic hypothesis is that neurons can (roughly) add, subtract, multiply, and
divide, and also compute squares, cubes, logs, square roots, and many other
kinds of continuous functions -- approximately.

Ok. Could you give an example of how, say, a squared function could be
computed ?

[Rupert]
>Though the main thing I am trying to
>understand is how is it that a control system is sensitive to one input
>vector and not another.

The vector is _created_ by the weightings of the input function. It doesn't
exist in the two inputs s1 and s2. There is no objective vector that the
input function has to "recognize." This is hard to get used to if you've
heard only the conventional, naive-realist, version of perception, in which
the environment contains entities which the perceptual functions then have
to recognize and represent as signals.

Yes, it certainly is very hard. Aren't the inputs (to the sensation level at
least) intensities ? If we are controlling a colour sensation aren't we
controlling a combination of the r,g,b intensities ?

In our little universe there are two variables, s1 and s2. They can vary
independently in any old way, depending on how external forces act on
them.....
Before we introduced the control system, a plot of s1 against s2 would just
show some spaghetti-like trajectory of the point (s1,s2), as external
forces pushed the values of the two variables around in random ways.

Are you saying that if we are looking at something, a Mondrian picture for
example, that the environmental variables (that are out there) are constantly
fluctuating and we have some internal controlling behavior that keeps those
variables constant ?

Suddenly there is a relationship between s1 and s2: their sum is constant,
at an adjustable value. All those external forces can still push the point
(s1,s2) back and forth along the line, but they can't push it off the line
any more. Try it with your simulation and see.

Yes, I tried it and the sum stays constant at whatever the reference signal is.
Though when I changed the weights (to 0.3 and 0.8) the sum didn't stay
constant, would you expect this ?

[Rupert]
>What if the inputs don't lay on this vector, there would still be a signal
>wouldn't there, but for the wrong perception ?

The two perceptions would just be whatever they are. If they aren't on
either of the two lines, both control systems would contain an error
signal. Each control system "sees" a universe in which only one weighted
sum of the two variables exists. That sum has whatever value it has. _All_
positions of the point (s1,s2) will produce some value of perceptual signal
in both systems, except (0,0). And even 0 is a value, so we can forget the
"except."

The other way of looking at perception, which I think is what is giving
you trouble, is to think of a black box with many outputs and many inputs.
When the inputs are in one state, one of the outputs is activated. When
they're in a different state, a _different_ output is activated. So the
perceptual function is indicating _which input pattern is present_. This is
called identification or recognition. This is the sort of model that's used
by neural networkers, I think, and is the conceptual basis for conventional
models of perception. That's why everyone talks about "classification."
The output that is activated indicates the class to which the whole input
pattern belongs.

Yes, my understanding was that the reference signal was a particular magnitude
of a vector and the perceptual signal another magnitude along the _same_
vector, with the error being the difference between the two.

I hope this discussion has alerted you to the difference, and has helped
with some of your problems.

Very helpful thanks, though I feel further than ever from understanding.


--
Regards,
Rupert

[From Bill Powers (971208.1601 MST)]

Rupert Young (971207.1800 UT) --

Ok. Could you give an example of how, say, a squared function could be
computed ?

All you need is a neuron in which the output frequency varies as the square
of the input frequency -- approximately. Another way would be to have an
array of neurons with different biases, all receiving copies of the same
input signal. Each neuron produces an output frequency proportional to the
input frequency, but only when the input frequency is above some level
(known as "recruitment" to neurologists). When all the outputs of the
neurons are added together at their destination, the result will be a curve
with a slope that increases linearly (approximately) with input frequency,
which will produce a square-law input-output curve. I'm sure there must be
a dozen other ways to do it.
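A crude Python sketch of the second mechanism, summing many linear units with
staggered thresholds ("recruitment"); the number of units and the scale factor
are arbitrary:

    def recruited_output(freq, n_units=100, max_freq=100.0):
        total = 0.0
        for i in range(n_units):
            threshold = max_freq * i / n_units
            if freq > threshold:
                total += freq - threshold          # each unit is linear above its bias
        # scale so that the summed output approximates freq**2
        return total * (2.0 * max_freq / n_units)

    for f in (10.0, 20.0, 40.0, 80.0):
        print(f, round(recruited_output(f)), round(f ** 2))   # roughly square-law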

Yes, it certainly is very hard. Aren't the inputs (to the sensation level at
least) intensities ? If we are controlling a colour sensation aren't we
controlling a combination of the r,g,b intensities ?

Yes, but what we're controlling are the weighted sums _derived from_ the
intensity signals. Like c1 := i*r + j*g + k*b, with c2 and other colors
being sums with different weights.
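A minimal Python sketch of two colour-sensation signals derived from the same
intensity signals; the weights are invented for illustration, not actual cone
weightings:

    r, g, b = 0.9, 0.4, 0.1              # illustrative intensity signals

    c1 = 1.0*r + (-0.5)*g + (-0.5)*b     # one weighted sum of the same inputs
    c2 = 0.5*r + 0.5*g + (-1.0)*b        # a differently weighted sum

    print(round(c1, 2), round(c2, 2))    # two distinct sensation signals, 0.65 and 0.55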

Are you saying that if we are looking at something, a Mondrian picture for
example, that the environmental variables (that are out there) are
constantly fluctuating and we have some internal controlling behavior that
keeps those variables constant ?

No, I'm saying that we're experiencing a lot of different perceptions at
the same time (at the sensation level). That's what we try to describe as
greenish blue and reddish orange, and so on.

There is some reason to think, however, that there is internal feedback in
the perceptual functions themselves. Edwin Land's theory of vision says
that colors are evaluated relative to a sort of average over the visual
field. This could be achieved with negative feedback (from the whole visual
field or large patches within it) that reduces all signals, leaving the
strongest ones sticking up out of the array of signals. But don't quote me
-- I don't have any theory of color vision, just a few notions.

Suddenly there is a relationship between s1 and s2: their sum is constant,
at an adjustable value. All those external forces can still push the point
(s1,s2) back and forth along the line, but they can't push it off the line
any more. Try it with your simulation and see.

Yes, I tried it and the sum stays constant at whatever the reference
signal is. Though when I changed the weights (to 0.3 and 0.8) the sum
didn't stay constant, would you expect this ?

Yes. The _original_ sum, 1.0*s1 + 1.0*s2, didn't stay constant, but the new
one did, didn't it? In other words, the system was now controlling 0.3*s1 +
0.8*s2, and keeping THAT sum constant. The system is controlling the
derived perceptual signal, not the individual s's.

Yes, my understanding was that the reference signal was a particular
magnitude of a vector and the perceptual signal another magnitude along
the _same_ vector, with the error being the difference between the two.

This is right, but you have to realize that these signals are not
themselves vectors; they are scalars which can vary in magnitude only. The
direction in lower-level perceptual space is set by the input weights.
There isn't any actual arrow pointing in some direction.

Very helpful thanks, though I feel further than ever from understanding.

Is the problem an unfamiliarity with the mathematical concepts? Or just
with trying to make the transition from another point of view?

Best,

Bill P.

[From Rupert Young (971209.1100 UT)]

[From Bill Powers (971208.1601 MST)]

>Yes, it certainly is very hard. Aren't the inputs (to the sensation level at
>least) intensities ? If we are controlling a colour sensation aren't we
>controlling a combination of the r,g,b intensities ?

Yes, but what we're controlling are the weighted sums _derived from_ the
intensity signals. Like c1 := i*r + j*g + k*b, with c2 and other colors
being sums with different weights.

I'm having difficulty understanding what the weights are doing. Also how are
the weights determined ?

Won't c1 and c2 be controlling the _same_ environmental variables ? Maybe you
are saying this, that for each set of inputs (say each position in the retina)
there are multiple control systems controlling a different weighted sum of
those inputs ? If so, one of them will control successfully and the others
will have error ?

Yes. The _original_sum, 1.0*s1 + 1.0*s2, didn't stay constant, but the new
one did, didn't it? In other words, the system was now controlling 0.3*s1 +
0.8*s2, and keeping THAT sum constant. The system is controlling the
derived perceptual signal, not the individual s's.

Ah, you're probably right I'll try it tonight.

Is the problem an unfamiliarity with the mathematical concepts? Or just
with trying to make the transition from another point of view?

Mainly the latter, I think. I'm still thinking of the input intensities
defining a certain vector and detecting its direction in vector space and
comparing its magnitude with a reference along the same vector.


--
Regards,
Rupert

[From Bill Powers (971209.0457 MST)]

Rupert Young (971209.1100 UT)--

Is the problem an unfamiliarity with the mathematical concepts? Or just
with trying to make the transition from another point of view?

Mainly the latter, I think. I'm still thinking of the input intensities
defining a certain vector and detecting its direction in vector space and
comparing its magnitude with a reference along the same vector.

The input intensities don't define _any_ vector. Or rather, you can define
any vector you please within the space defined by all the degrees of
freedom that are present. In our example, we have two variables, s1 and s2,
which can take on any values whatsoever. But no direction is defined in
this 2-space until we write a function like w1*s1 + w2*s2. That defines a
direction in this space. The function w1*s1 - w2*s2 defines a _different_
direction in the same space. The directionality is created by the
perceptual input function where the weightings are applied. To answer
another question, the weighting is simply the "strength" of the synaptic
connections involved, or the number of branches into which a fiber
'arborizes' just before connecting to the dendrites of a succeeding neuron.

In the PCT theory of perception, particular perceptions are _created_ by
the nervous system, through altering the forms of perceptual input
functions. So in a real sense, the nervous system creates the world with
which it interacts.

Best,

Bill P.

[Martin Taylor 971209 10:40]

Bill Powers (971209.0457 MST) to Rupert Young (971209.1100 UT)--

I think the two of you are looking from opposite sides of the fence.

I'm still thinking of the input intensities
defining a certain vector and detecting its direction in vector space and
comparing its magnitude with a reference along the same vector.

The input intensities don't define _any_ vector. Or rather, you can define
any vector you please within the space defined by all the degrees of
freedom that are present. In our example, we have two variables, s1 and s2,
which can take on any values whatsoever. But no direction is defined in
this 2-space until we write a function like w1*s1 + w2*s2. That defines a
direction in this space. The function w1*s1 - w2*s2 defines a _different_
direction in the same space. The directionality is created by the
perceptual input function where the weightings are applied.

The input value pair (s1, s2) defines a direction in a space. The perceptual
input function weight pair (w1, w2) defines a direction in a space. If
that particular set of sensory inputs is applied to that particular
perceptual input function, the value is a single number: w1*s1+w2*s2,
not a direction, not a vector, not a magnitude of a vector. It is
proportional to the correlation between the s pair and the w pair.
As the values s1 and s2 change, so does the correlation, and hence the
value of the perceptual signal output by that perceptual input function.

An infinite number of different perceptual input functions can look at
the same pair of sensory values. Call three of them (x1, x2), (y1, y2)
and (w1, w2). Each perceptual input function provides an output that is
a simple number: X = x1*s1+x2*s2, Y = y1*s1+y2*s2, W = w1*s1+w2*s2.
These numbers are unrelated to one another, in that the values of the
x's, the y's, and the w's can be anything at all. But you find that if
you try to change the values of X, Y, and W by modifying s1 and s2, you
can't change all three in arbitrary ways. You can change two of them
(for almost all choices of the x, y, and w values), but then you find
that the third has been fixed for you. It's called a limitation of the
"degrees of freedom."

There are two degrees of freedom for the s values. You can define those
two in lots of ways: as s1 and s2, as s1+s2 and s1-s2, as 2*s1-s2, s1+2*s2,
and so forth. But there are only two. You can never find three ways to
combine the values of s1 and s2 so that all three can be changed
independently. The perceptual input functions are just ways to combine
the values of the inputs (often more complicated than just by weighted
sums). With two inputs, you can't have more than two perceptual signals
brought to arbitrary values at the same time. So you can't _control_
more than two perceptual signals derived from just those two sensory
inputs. With three sensory inputs, you can control three perceptual
signals. For example, in colour space, you have what are called "red"
"green" and "blue" receptors, but you can control what we call "hue"
"saturation" and "brightness". If you are colour blind, you have fewer
degrees of freedom for perceiving colour, because you have fewer distinct
kinds of colour receptors.
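A quick Python illustration of the degrees-of-freedom point: pick target
values for X and Y, solve for the two inputs, and W is then already fixed;
the weight values are arbitrary:

    x1, x2 = 1.0, 2.0        # three perceptual functions over the same two inputs
    y1, y2 = 2.0, -1.0
    w1, w2 = 1.0, 1.0

    X_target, Y_target = 4.0, 3.0
    det = x1 * y2 - x2 * y1
    s1 = (X_target * y2 - x2 * Y_target) / det
    s2 = (x1 * Y_target - X_target * y1) / det

    X = x1 * s1 + x2 * s2
    Y = y1 * s1 + y2 * s2
    W = w1 * s1 + w2 * s2
    print(X, Y, W)   # X and Y hit their targets; W is whatever the s's force on it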

The same applies throughout the hierarchy of control loops. The number
of different perceptual signals that can be controlled is limited by
the degrees of freedom available at the narrowest place in the system
of control loops (disregarding time-multiplexing effects, which are too
complex to go into here).

Hope this helps.

Martin

[From Bill Powers (971209.1605 MST)]

Martin Taylor 971209 10:40--

The input value pair (s1, s2) defines a direction in a space.

I suppose this is true for each pair, but the perceptual vector is the one
applied to _all_ input pairs.

The perceptual
input function weight pair (w1, w2) defines a direction in a space. If
that particular set of sensory inputs is applied to that particular
perceptual input function, the value is a single number: w1*s1+w2*s2,
not a direction, not a vector, not a magnitude of a vector. It is
proportional to the correlation between the s pair and the w pair.

Oops. The "w pair" is not an input; it's a fixed set of weights, and
doesn't correlate with anything.

As the values s1 and s2 change, so does the correlation, and hence the
value of the perceptual signal output by that perceptual input function.

Where are you getting this "correlation?" The weights w1 and w2 are constants.

An infinite number of different perceptual input functions can look at
the same pair of sensory values. Call three of them (x1, x2), (y1, y2)
and (w1, w2). Each perceptual input function provides an output that is
a simple number: X = x1*s1+x2*s2, Y = y1*s1+y2*s2, W = w1*s1+w2*s2.

These numbers are unrelated to one another, in that the values of the
x's, the y's, and the w's can be anything at all. But you find that if
you try to change the values of X, Y, and W by modifying s1 and s2, you
can't change all three in arbitrary ways. You can change two of them
(for almost all choices of the x, y, and w values), but then you find
that the third has been fixed for you. It's called a limitation of the
"degrees of freedom."

What you say is true, but it skips over the pedagogical point about the
nature of an input function. In a given input function, the weights are
fixed at some values. As s1 and s2 change, they produce changes in the
magnitude of the weighted sum. The effect is as though the magnitude were
the projection of the vector from the origin to the point (s1,s2) onto a
vector with a direction set by w1 and w2, which are like direction numbers
of a line. The perceptual signal is like the magnitude of a vector with
fixed direction numbers w1, w2 in the s1,s2 space. All pairs of input
variables are perceived as if projected onto that line.
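A one-off numerical check of the projection picture, in Python; the numbers
are just the 0.8/0.3 example from earlier in the thread:

    import math

    w = (0.8, 0.3)
    s = (2.0, 1.33)                        # a point from the earlier table

    p = w[0]*s[0] + w[1]*s[1]              # the perceptual signal (a scalar)
    mag_w = math.hypot(*w)
    mag_s = math.hypot(*s)

    proj = p / mag_w                       # length of S projected onto the W direction
    print(round(p, 3), round(mag_w * proj, 3))   # both about 2.0: p = |W| * projection
    print(round(p / (mag_w * mag_s), 3))         # direction cosine, about 0.97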

Best,

Bill P.

[Martin Taylor 971210 10:25]
Bill Powers (971209.1605 MST)

I think you think more clearly at 04:05 than at 16:05--or perhaps it's
just that in the wee small hours you are writing your own thoughts rather
than misreading others.

Martin Taylor 971209 10:40--

The input value pair (s1, s2) defines a direction in a space.

I suppose this is true for each pair, but the perceptual vector is the one
applied to _all_ input pairs.

Yes, as the next statement says.

The perceptual
input function weight pair (w1, w2) defines a direction in a space. If
that particular set of sensory inputs is applied to that particular
perceptual input function, the value is a single number: w1*s1+w2*s2,
not a direction, not a vector, not a magnitude of a vector. It is
proportional to the correlation between the s pair and the w pair.

Oops. The "w pair" is not an input; it's a fixed set of weights,

As stated.

and doesn't correlate with anything.

It correlates with any other pair of numbers anyone cares to mention.

As the values s1 and s2 change, so does the correlation, and hence the
value of the perceptual signal output by that perceptual input function.

Where are you getting this "correlation?" The weights w1 and w2 are constants.

The correlation is proportional to the dot product of the two vectors.
Alternatively, it is proportional to the projection of one vector on the
other. Alternatively, it is the direction cosine of the angle between
the vectors. I chose to use the word "correlation" as being likely to
be more familiar to Mr Young than the other, more geometric versions.

An infinite number of different perceptual input functions can look at
the same pair of sensory values. Call three of them (x1, x2), (y1, y2)
and (w1, w2). Each perceptual input function provides an output that is
a simple number: X = x1*s1+x2*s2, Y = y1*s1+y2*s2, W = w1*s1+w2*s2.

These numbers are unrelated to one another, in that the values of the
x's, the y's, and the w's can be anything at all. But you find that if
you try to change the values of X, Y, and W by modifying s1 and s2, you
can't change all three in arbitrary ways. You can change two of them
(for almost all choices of the x, y, and w values), but then you find
that the third has been fixed for you. It's called a limitation of the
"degrees of freedom."

What you say is true, but it skips over the pedagogical point about the
nature of an input function. In a given input function, the weights are
fixed at some values.

As I pointed out.

As s1 and s2 change, they produce changes in the
magnitude of the weighted sum.

As I pointed out.

The effect is as though the magnitude were the projection of the vector
from the origin to the point (s1,s2) onto a vector with a direction set
by w1 and w2, which are like direction numbers of a line.

Which is proportional to the correlation between S and W vectors.

The perceptual signal is like the magnitude of a vector with
fixed direction numbers w1, w2 in the s1,s2 space. All pairs of input
variables are perceived as if projected onto that line.

That's just the pedagogical point that I saw as giving trouble to Mr Young,
and the reason I made my posting--to eliminate that image, correct though
it may be.

He was seeing both the magnitude and the vector at the same time, whereas
the magnitude is of the _perceptual signal_ and both vectors are in
the space of physical observables sensed by the two sensors.
My attempt was to segregate these two kinds of space, so he could see
that what mattered was the correlation (angle, projection) between the
two vectors, and that the result was a _number_, not a magnitude of a
vector (though derivable by way of the magnitude of a vector if you
want to do so.)

Since your tone in commenting on my message seemed critical, but your
content was simply to repeat much of what I said except that you re-
introduced what I argued was the pedagogical difficulty, I fail to see
the point of your message.

Martin