[From Bill Powers (930913.1045 MDT)]

Avery Andrews (930913.0944) --

To put Gibson's "affordance" in the best light, we could understand it as meaning that there is an objective reality of some sort which provides the raw material from which perceptions can be derived (and although Gibson didn't say so as far as I remember, on which we can act). My problem with Gibson is that when he starts giving examples of such affordances, they are always in terms of other perceptions, which makes him look like a conditional realist ( = someone who is a realist under some conditions). When he says that "surfaces" or "ambient light" afford the perception of visual objects, he is apparently saying that surfaces and ambient light are not perceptions. In a way they aren't -- but they are concepts derived from perceptual interactions with the world and not knowable without the aid of perceptions.

--------------------------------------------------------------

..if a landscape contains smallish rocks, they afford throwing to humans and chimpanzees, but not dogs, etc.

Oh, dogs can throw rocks, too, or at least balls (not so hard on the teeth). The problem here is that while human beings do throw rocks, they can do all sorts of other things with them, too, like holding down pieces of paper, breaking up into gravel, carving statues out of, piling up into trail markers, commemorative cairns, and stocks of supplies for building roads, rolling down hills, holding books upright -- the longer you think about it, the more affordances a rock has.

This way of characterizing rocks in terms of human purposes in which they may play a part is simply too superficial; it doesn't take advantage of a more modern concept, which is _properties_. The properties of a rock can be stated without reference to how the rock will be used by an organism. Compressive strength, tensile strength, density, chemical composition, and so forth don't imply any particular use. The reason you can't pour a rock into a cup is not that it doesn't afford pouring, but that the rock does not have the properties of a liquid. A human being determined to pour the rock into a (graphite) cup can act to change the rock's affordances by heating it enough to melt it. But that way of putting it takes us back to the days of alchemy and before, when the idea of properties didn't exist and things behaved as they did because of "principles."

----------------------------------------------------------------

Hans Blom (930913) --

I'll chime in on your comments to Tom Bourbon. It's worthwhile putting in some effort to get the notations and diagrams properly translated into each other.

Your version of Tom's diagram isn't quite the same, because in Tom's diagram there is an explicit perceptual signal p representing the spatial distance between cursor and target, called "c - t". Also, the reference signal should go into the box containing the comparator, not be added to the target position. I'd change it like this:

        --------------------------------<---------------------
        |handle B |handle A                                   |
        |         |                                           |
       \|/       \|/                                          |
noise -----     -----   c    -----      -----     --------    |
----->| + |---->| + |------->|   |      |-  |     |      |    |
      -----     -----        |   |  p   | C |     |      |    |
                             |c-t|----->| O |---->|handle|-----
target t                     |   |      | A |     |      |
---------------------------->|   |      |+  |     |      |
                             -----      -----     --------
                                         /|\
            r [ = 0 ]                     |
            -------------------------------

In a linear system the order in which operations are done doesn't make a difference, but in matching the diagram to the physical situation or the presumed inner functions of the organism, I prefer to try to maintain a 1:1 correspondence as nearly as possible or convenient. Tom, does this diagram meet with your approval now?

This left-to-right arrangement is OK except in one regard: it doesn't make it easy to see which parts of the system are in the organism's environment and which are inside the organism. As drawn, the environment sort of surrounds the active system, which as redrawn consists of the perceptual function (c-t box) and the COA box. The handle, the physical effects on the cursor and the cursor itself, and the target are external to the organism.

Also, as shown above, the reference signal seems to come from outside the organism. In engineering applications this is perfectly appropriate: there's a knob by which the active system's reference signal can be altered. But in PCT, the reference signal is the output of a higher system inside the organism, and shouldn't be associated with the environment at all. Organisms don't have any reference signal inputs from their environments. Note that the value of the reference signal isn't necessarily zero, although you've shown it as equal to zero. A nonzero reference signal will result in maintenance of the cursor at a fixed distance from the moving target -- a nonzero value of c - t.
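As a minimal sketch of this point (my own illustration, not Tom's actual program), the loop can be simulated with an integrating output function; the gain, time step, disturbance range, and target motion are all assumed values:

```python
import random

def run(r=0.0, steps=2000, dt=0.01, k=50.0):
    """Track a moving target; return the final value of c - t."""
    handle = 0.0                      # handle A position
    c = 0.0                           # cursor position
    for i in range(steps):
        t = 10.0 * (i / steps)        # slowly moving target (assumed)
        noise = random.uniform(-0.1, 0.1)
        c = handle + noise            # cursor = handle effect + disturbance
        p = c - t                     # perceptual signal
        e = r - p                     # error computed against reference r
        handle += k * e * dt          # integrating output function
    return c - t
```

With r = 0 the cursor stays on the target; with a nonzero r the cursor is held at that fixed distance from the moving target, as described above.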

In your diagram showing the adaptive part explicitly, some changes also need to be made:

        --------------------------------<---------------------
        |handle B |handle A                                   |
        |         |                                           |
       \|/       \|/                                          |
noise -----     -----   c    -----      -----     --------    |
----->| + |---->| + |------->|-  |      |   |     |      |    |
      -----     -----        |   |   e  |   |  e' |      |    |
                             | C |--+-->|int|---->|handle|-----
target t                     |   |  |   |   |     |      |
---------------------------->|+  |  |   |   |     |      |
                             -----  |   --k--     --------
                              /|\  \|/   /|\
            r [ = 0 ]          |    |     |
            -------------------     |     |
                                 -----------
                                 |d(e^2)/dt|   [A]
                                 -----------

I've collapsed the perceptual function and comparator into "C". The output function is a pure integrator. The adaptive part of the system has a perceptual function that converts the error to the first derivative of the square of the error (or the absolute value, it makes little difference). The reference signal for this perception is zero, so it's omitted. The error signal is used to raise and lower k, the multiplying constant in the output function e' = k*int(e). The actual process inside [A] is a little more complex:

If the squared error is increasing, or greater than some threshold amount (I've used both with success), a random value of "delta" between -d and d (arbitrary limits) is computed. This value delta, times the absolute value of error, times a very small scaling factor, is added to the value of k on every iteration, whether or not the squared error is increasing. So the value of k is always changing.

When the square of the error begins to increase, delta is changed randomly; when the error is constant or decreasing, delta remains the same. Thus if k is changing in a direction that is reducing the error, it keeps on changing in that same direction because delta is added on every iteration without being changed. However, if the error begins to increase, delta is changed at random, so k might begin either increasing or decreasing at a greater or smaller rate, at random. As the absolute error declines, the size of the positive and negative limits of delta becomes smaller, so the changes in k become more gradual.
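The procedure can be sketched in code (my own reconstruction under assumed details: the tracking plant, step sizes, and trial-by-trial error measurement are illustrative, and the clamp on k is an added safeguard that keeps this sketch's search inside the stable range):

```python
import math
import random

def episode_error(k, n=200):
    """Summed squared error for gain k tracking a moving target."""
    c, total = 0.0, 0.0
    for i in range(n):
        t = math.sin(0.05 * i)        # stand-in for the tracking target
        e = t - c                     # error signal
        c += k * e                    # integrating output moves the cursor
        total += e * e
    return total

def reorganize(trials=2000, d=0.2, scale=0.05, seed=1):
    rng = random.Random(seed)
    k = 0.05                          # start with poor control
    delta = rng.uniform(-d, d)        # random rate of change for k
    prev = episode_error(k)
    for _ in range(trials):
        k += delta * scale            # k changes on every iteration
        k = min(max(k, 0.01), 1.8)    # added safeguard: stay in stable range
        sq = episode_error(k)
        if sq >= prev:                # squared error rose: new random delta
            delta = rng.uniform(-d, d)
        prev = sq                     # error falling: keep the direction
    return k, prev
```

Runs of kept deltas carry k toward the gain that minimizes the squared error, after which it random-walks in that neighborhood.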

The result is that k will spend more time changing in the direction that decreases the squared error than in the direction that increases it. In a very short time, k will attain the optimum value. It will still keep changing, but now all changes will result in an increase in the error, so k simply does a random walk in the vicinity of the optimum value.

This may seem a very elaborate way to achieve an elementary result -- why not, as some have asked, just start changing k, see if it is changing the right way, and if not change it the other way until the best result (least squared error) is achieved? There are two main reasons.

First, there are ongoing disturbances that make the error fluctuate in unpredictable ways, so you can't tell on a single trial if a reduction in squared error was due to a disturbance or to an improvement of control. The systematic approach thus doesn't have any advantages, and costs more computationally.

Second, this is only a simple application of a principle that extends to much more complex optimizations. I have used exactly this method to solve a system of 50 simultaneous linear equations, in which the squared error was the sum of 50 squared errors, one for each equation, and in which the random adjustments were made in 50 coefficients for each of the 50 equations. Each coefficient had its own associated delta which was chosen at random every time a reorganization was called for. Convergence to a solution took a while, but seemed reasonably efficient to me.
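The same trick can be sketched for equation solving. This is my own small-scale stand-in (a 3x3 system, with illustrative delta limits and scaling), not the 50-equation program: each unknown gets its own delta, all deltas are re-drawn whenever the summed squared residual increases, and the step size shrinks as the residual declines.

```python
import random

def sq_residual(A, b, x):
    """Sum of squared residuals of A x = b."""
    return sum((sum(aij * xj for aij, xj in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b))

def solve_by_reorganization(A, b, trials=20000, d=0.1, seed=2):
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    delta = [rng.uniform(-d, d) for _ in range(n)]
    prev = sq_residual(A, b, x)
    for _ in range(trials):
        # adjust every unknown on every iteration; steps shrink with error
        for j in range(n):
            x[j] += delta[j] * prev ** 0.5 * 0.1
        sq = sq_residual(A, b, x)
        if sq >= prev:                # residual rose: draw all-new deltas
            delta = [rng.uniform(-d, d) for _ in range(n)]
        prev = sq
    return x, prev

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, err = solve_by_reorganization(A, b)
```

The residual is driven toward zero without any use of the structure of the equations -- the solver never inverts or even inspects A.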

I have also used this same method to match a model to the behavior of a real person. In this case, the squared error was that between the model's handle positions and the real ones. I got the same value of integration factor that I got by other methods.

This method is powerful because it doesn't depend on any assumptions to speak of. It just feels around in the hyperspace looking for directions that cause the total squared error to decrease. Local minima tend to be overcome because once in a while there is a run of "bad luck" that drives the error toward larger values. If the local basin isn't too large, the system will eventually get over the hump and start searching elsewhere for a new minimum. Of course if the minimum is absolute, the system will eventually end up back at the same minimum.

I see no reason why this approach wouldn't work with a PID model as well as a simple integrating control system. The nice thing about it is that the number of parameters hardly seems to matter. Each parameter forms one axis of a space, and as long as you can define a zero-error condition, the random walk will find a direction aimed at zero error, no matter how many axes there are, even when there is random noise present. And you can leave this system turned on all the time, because as the error gets close to zero, so does the "velocity" of the moving point in hyperspace. The parameters will remain close to the optimum values (for reducing the error being monitored), but as soon as any conditions change, the search for a better minimum will automatically start again.

I particularly like this as a model for organismic reorganization because it is so dumb. Very little by way of built-in mechanism or computation is required, and nothing at all needs to be known about the details of whatever is getting reorganized. This seems to me a requirement on an inheritable reorganizing system that can begin working as soon as life begins.

We use a simple integrating control system, by the way, for the simple reason that it accounts for 99+% of the variance between the model and the real behavior. A more complex model might give a better fit over a larger range of circumstances, but in the experiments we've been doing it isn't necessary. We do find variations in the optimum value of k for a given individual when disturbances with a wide range of difficulties (bandwidths) are used. This implies that we need a nonlinear model to handle the whole range of difficulty -- but that remains to be done.

Also, our model can be simple because the experimental situation doesn't introduce any complex dynamics. Work in that direction also remains to be done.

------------------------------

A question. You chose A = A - k (e). Does the minus sign mean that, when the integral of the error becomes large, loop gain is REDUCED?

This is a program step, not an equation. It is simply an integrator with a negative coefficient. For a positive e and k, A will go more and more negative on each iteration by the amount k*e. And e is an absolute value or square when used for reorganization. Tom, you'd better check over the details here.
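Read as code (with names and an illustrative k of my own choosing), the step is just discrete-time integration with a negative coefficient:

```python
def step(A, e, k=0.1):
    """One pass through the program step A = A - k*e."""
    return A - k * e

A = 0.0
for e in [1.0, 1.0, 1.0]:    # constant positive error
    A = step(A, e)           # A moves more negative by k*e each pass
```

For positive e and k, A simply ratchets downward by k*e per iteration; the minus sign says nothing by itself about whether loop gain rises or falls.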

Actually, loop gain might need to be increased or decreased when error increases. The random component assures that the right adjustment is made, on the average.

---------------------------------------------------------------

Best,

Bill P.