Affordance; CT diagrams

[From Bill Powers (930913.1045 MDT)]

Avery Andrews (930913.0944) --

To put Gibson's "affordance" in the best light, we could
understand it as meaning that there is an objective reality of
some sort which provides the raw material from which perceptions
can be derived (and although Gibson didn't say so as far as I
remember, on which we can act). My problem with Gibson is that
when he starts giving examples of such affordances, they are
always in terms of other perceptions, which makes him look like a
conditional realist ( = someone who is a realist under some
conditions). When he says that "surfaces" or "ambient light"
afford the perception of visual objects, he is apparently saying
that surfaces and ambient light are not perceptions. In a way
they aren't -- but they are concepts derived from perceptual
interactions with the world and not knowable without the aid of
perception.

..if a landscape contains smallish rocks, they afford throwing
to humans and chimpanzees, but not dogs, etc.

Oh, dogs can throw rocks, too, or at least balls (not so hard on
the teeth). The problem here is that while human beings do throw
rocks, they can do all sorts of other things with them, too, like
holding down pieces of paper, breaking up into gravel, carving
statues out of, piling up into trail markers, commemorative
cairns, and stocks of supplies for building roads, rolling down
hills, holding books upright -- the longer you think about it,
the more affordances a rock has.

This way of characterizing rocks in terms of human purposes in
which they may play a part is simply too superficial; it doesn't
take advantage of a more modern concept, which is _properties_.
The properties of a rock can be stated without reference to how
the rock will be used by an organism. Compressive strength,
tensile strength, density, chemical composition, and so forth
don't imply any particular use. The reason you can't pour a rock
into a cup is not that it doesn't afford pouring, but that the
rock does not have the properties of a liquid. A human being
determined to pour the rock into a (graphite) cup can act to
change the rock's affordances by heating it enough to melt it.
But that way of putting it takes us back to the days of alchemy
and before, when the idea of properties didn't exist and things
behaved as they did because of "principles."

Hans Blom (9309813) --

I'll chime in on your comments to Tom Bourbon. It's worthwhile
putting in some effort to get the notations and diagrams properly
translated into each other.

Your version of Tom's diagram isn't quite the same, because in
Tom's diagram there is an explicit perceptual signal p
representing the spatial distance between cursor and target,
called "c - t". Also, the reference signal should go into the box
containing the comparator, not be added to the target position.
I'd change it like this:

          handle B        handle A
             |               |
            \|/             \|/
 noise     -----           -----    c   -------   p   -------
 --------->| + |---------->| + |------->|     |------>|-    |
           -----           -----        | c-t |       |  C  |
 target t                               |     |       |  O  |--->|handle|-->
 -------------------------------------->|     |       |  A  |
                                        -------       |+    |
                                                      -------
                                                        /|\
                                                         |
                                                    r [ = 0 ]

In a linear system the order in which operations are done doesn't
make a difference, but in matching the diagram to the physical
situation or the presumed inner functions of the organism, I
prefer to try to maintain a 1:1 correspondence as nearly as
possible or convenient. Tom, does this diagram meet with your
approval now?

This left-to-right arrangement is OK except in one regard: it
doesn't make it easy to see which parts of the system are in the
organism's environment and which are inside the organism. As
drawn, the environment sort of surrounds the active system, which
as redrawn consists of the perceptual function (c-t box) and the
COA box. The handle, the physical effects on the cursor and the
cursor itself, and the target are external to the organism.

Also, as shown above, the reference signal seems to come from
outside the organism. In engineering applications this is
perfectly appropriate: there's a knob by which the active
system's reference signal can be altered. But in PCT, the
reference signal is the output of a higher system inside the
organism, and shouldn't be associated with the environment at
all. Organisms don't have any reference signal inputs from their
environments. Note that the value of the reference signal isn't
necessarily zero, although you've shown it as equal to zero. A
nonzero reference signal will result in maintenance of the cursor
at a fixed distance from the moving target -- a nonzero value of
c - t.
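The loop in the diagram can be sketched in a few lines of code. This
is only an illustrative simulation, not Tom's or my actual program:
the gain k, the time step, the target motion, and the disturbance are
all made-up values.

```python
import math

# Illustrative simulation of the control loop in the diagram above:
# cursor c = handle + disturbance, perceptual signal p = c - t,
# error e = r - p, and a pure integrator as the output function.
# All numerical values here are assumptions, not fitted parameters.
def run_tracking(k=8.0, r=0.0, dt=0.01, steps=2000):
    handle = 0.0
    abs_err = []
    for i in range(steps):
        t_pos = math.sin(0.5 * i * dt)        # moving target
        noise = 0.5 * math.sin(1.3 * i * dt)  # smooth disturbance
        c = handle + noise                    # cursor position
        p = c - t_pos                         # perceptual signal
        e = r - p                             # comparator output
        handle += k * e * dt                  # integrating output function
        abs_err.append(abs(p - r))
    return abs_err

err = run_tracking()
print(f"mean |p - r|, last half of run: {sum(err[1000:]) / 1000:.3f}")
```

Running the same loop with a nonzero r holds the cursor at a fixed
distance from the moving target, as described above.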

In your diagram showing the adaptive part explicitly, some
changes also need to be made:

          handle B        handle A
             |               |
            \|/             \|/
 noise     -----           -----    c   -------    e    -------   e'
 --------->| + |---------->| + |------->|-    |-------->|  k  |------->|handle|-->
           -----           -----        |  C  |   |     -------
 target t                               |     |   |       /|\
 -------------------------------------->|+    |   |        |
                                        -------  \|/       |
                                          /|\  ---------------
                                           |   | d(e^2)/dt   |  [A]
                                           |   ---------------
                                      r [ = 0 ]

I've collapsed the perceptual function and comparator into "C".
The output function is a pure integrator. The adaptive part of
the system has a perceptual function that converts the error to
the first derivative of the square of the error (or the absolute
value, it makes little difference). The reference signal for this
perception is zero, so it's omitted. The error signal is used to
raise and lower k, the multiplying constant in the output
function e' = k*int(e). The actual process inside [A] is a little
more complex:

If the squared error is increasing, or greater than some
threshold amount (I've used both with success), a random value of
"delta" between -d and d (arbitrary limits) is computed. This
value delta, times the absolute value of error, times a very
small scaling factor, is added to the value of k on every
iteration, whether or not the squared error is increasing. So the
value of k is always changing.

When the square of the error begins to increase, delta is changed
randomly; when the error is constant or decreasing, delta remains
the same. Thus if k is changing in a direction that is reducing
the error, it keeps on changing in that same direction because
delta is added on every iteration without being changed. However,
if the error begins to increase, delta is changed at random, so k
might begin either increasing or decreasing at a greater or
smaller rate, at random. As the absolute error declines, the size
of the positive and negative limits of delta becomes smaller, so
the changes in k become more gradual.

The result is that k will spend more time changing in the
direction that decreases the squared error than in the direction
that increases it. In a very short time, k will attain the
optimum value. It will still keep changing, but now all changes
will result in an increase in the error, so k simply does a
random walk in the vicinity of the optimum value.
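Here is one way the procedure just described might be sketched in
code. The plant, the delta limits, the scale factor, and the
batch-style evaluation of the squared error are all my own
illustrative assumptions, not the actual reorganization program.

```python
import math
import random

random.seed(1)

def mean_sq_error(k, steps=200, dt=0.01):
    """Run the tracking loop briefly with gain k; return mean squared error."""
    handle, total = 0.0, 0.0
    for i in range(steps):
        t_pos = math.sin(0.5 * i * dt)
        c = handle + 0.5 * math.sin(1.3 * i * dt)  # cursor = handle + disturbance
        e = 0.0 - (c - t_pos)                      # reference r = 0
        handle += k * e * dt
        total += e * e
    return total / steps

k = 0.5                              # deliberately poor starting gain
d = 1.0                              # limits of the random delta
delta = random.uniform(-d, d)
prev = mean_sq_error(k)
for _ in range(500):
    sq = mean_sq_error(k)
    if sq >= prev:                   # squared error not decreasing: new delta
        delta = random.uniform(-d, d)
    # delta, times the (rms) error, times a small scale factor,
    # is added to k on every iteration, whether or not delta changed
    k = max(k + delta * math.sqrt(sq) * 0.5, 0.01)
    prev = sq

print(f"k after reorganization: {k:.2f}")
```

The gain climbs out of the bad starting value and then does a random
walk near the optimum, with steps shrinking as the error shrinks.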

This may seem a very elaborate way to achieve an elementary
result -- why not, as some have asked, just start changing k, see
if it is changing the right way, and if not change it the other
way until the best result (least squared error) is achieved?
There are two main reasons.

First, there are ongoing disturbances that make the error
fluctuate in unpredictable ways, so you can't tell on a single
trial if a reduction in squared error was due to a disturbance or
to an improvement of control. The systematic approach thus
doesn't have any advantages, and costs more computationally.

Second, this is only a simple application of a principle that
extends to much more complex optimizations. I have used exactly
this method to solve a system of 50 simultaneous linear
equations, in which the squared error was the sum of 50 squared
errors, one for each equation, and in which the random
adjustments were made in 50 coefficients for each of the 50
equations. Each coefficient had its own associated delta which
was chosen at random every time a reorganization was called for.
Convergence to a solution took a while, but seemed reasonably
efficient to me.
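For concreteness, here is a toy version of that idea with 5 equations
instead of 50 (the setup and constants are mine, not the original
program): each unknown carries its own delta, re-chosen at random
whenever the total squared error stops decreasing.

```python
import random

random.seed(2)
n = 5                                        # toy size; the original used 50

# A random, diagonally dominant (hence solvable) system A x = b
A = [[random.uniform(-1, 1) + (3.0 if i == j else 0.0) for j in range(n)]
     for i in range(n)]
b = [random.uniform(-1, 1) for _ in range(n)]

def sq_err(x):
    """Sum over all equations of the squared residual of A x = b."""
    return sum((sum(A[i][j] * x[j] for j in range(n)) - b[i]) ** 2
               for i in range(n))

x = [0.0] * n
delta = [random.uniform(-1, 1) for _ in range(n)]
prev = sq_err(x)
for _ in range(20000):
    e = sq_err(x)
    if e >= prev:                            # error not improving: new deltas
        delta = [random.uniform(-1, 1) for _ in range(n)]
    scale = 0.01 * e ** 0.5                  # steps shrink as error declines
    for j in range(n):
        x[j] += delta[j] * scale
    prev = e

print(f"squared error after reorganization: {sq_err(x):.6f}")
```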

I have also used this same method to match a model to the
behavior of a real person. In this case, the squared error was
that between the model's handle positions and the real ones. I
got the same value of integration factor that I got by other
methods.

This method is powerful because it doesn't depend on any
assumptions to speak of. It just feels around in the hyperspace
looking for directions that cause the total squared error to
decrease. Local minima tend to be overcome because once in a
while there is a run of "bad luck" that drives the error toward
larger values. If the local basin isn't too large, the system
will eventually get over the hump and start searching elsewhere
for a new minimum. Of course if the minimum is absolute, the
system will eventually end up back at the same minimum.

I see no reason why this approach wouldn't work with a PID model
as well as a simple integrating control system. The nice thing
about it is that the number of parameters hardly seems to matter.
Each parameter forms one axis of a space, and as long as you can
define a zero-error condition, the random walk will find a
direction aimed at zero error, no matter how many axes there are,
even when there is random noise present. And you can leave this
system turned on all the time, because as the error gets close to
zero, so does the "velocity" of the moving point in hyperspace.
The parameters will remain close to the optimum values (for
reducing the error being monitored), but as soon as any
conditions change, the search for a better minimum will
automatically start again.

I particularly like this as a model for organismic reorganization
because it is so dumb. Very little by way of built-in mechanism
or computation is required, and nothing at all needs to be known
about the details of whatever is getting reorganized. This seems
to me a requirement on an inheritable reorganizing system that
can begin working as soon as life begins.

We use a simple integrating control system, by the way, for the
simple reason that it accounts for 99+% of the variance between
the model and the real behavior. A more complex model might give
a better fit over a larger range of circumstances, but in the
experiments we've been doing it isn't necessary. We do find
variations in the optimum value of k for a given individual when
disturbances with a wide range of difficulties (bandwidths) are
used. This implies that we need a nonlinear model to handle the
whole range of difficulty -- but that remains to be done.

Also, our model can be simple because the experimental situation
doesn't introduce any complex dynamics. Work in that direction
also remains to be done.

A question. You chose A = A - k (e). Does the minus sign mean
that, when the integral of the error becomes large, loop gain
decreases?

This is a program step, not an equation. It is simply an
integrator with a negative coefficient. For a positive e and k, A
will go more and more negative on each iteration by the amount
k*e. And e is an absolute value or square when used for
reorganization. Tom, you'd better check over the details here.
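In code, the program step reads like this (the numbers are
illustrative only):

```python
# The program step "A = A - k (e)", executed once per loop iteration:
# A decreases by k*e each time through, so A accumulates the
# discrete-time integral of -k*e.
A, k, e = 0.0, 0.1, 1.0
for _ in range(3):
    A = A - k * e            # positive e and k drive A more negative
print(round(A, 6))           # prints -0.3
```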

Actually, loop gain might need to be increased or decreased when
error increases. The random component assures that the right
adjustment is made, on the average.

Bill P.