Bewildering Selection

[From Bruce Abbott (941216.1500 EST)]

Peter Burke (941216.1000)

All of this discussion which is focussed on keeping the inputs
separate from the outputs, and not confusing the two in any
discussion, fails to consider the gloss that is made when one
says that the outputs do not matter as long as the inputs are
controlled. Of course the outputs matter, because some of them
do not control the inputs, and some push the inputs further
away from the reference value. The question is, how are those
outputs selected? It's one thing to say they are selected
because they control the inputs, but it is another thing
entirely to show exactly how the error signal accomplishes this
proper selection of behavior. It may not be reinforcement that
selects the behavior, but the behavior is nevertheless selected
by some mechanism. Let's see some discussion of this issue!

Peter, if I may anticipate the responses of our PCT guru and vice-gurus, they
will say that the outputs are NOT selected; instead, perceptual control
systems are modified or constructed so as to bring the relevant perceptual
variables to their reference levels, if possible. Yet such control systems,
once brought into being, usually exhibit only a limited set of behavioral
responses. I agree with you that it needs to be shown "exactly how the error
signal accomplishes this proper selection of behavior." One suggestion
offered by Bill Powers is that parameters, connections, etc. are scrambled at
random until a workable "solution" is found (i.e., a system that achieves
control), but systematic strategies are also possible.

The behaviors one observes as the organism searches for a solution are, of
course, all produced by perceptual control systems and thus have a potentially
discoverable "goal-orientedness" about them. I was hoping we could all come
to some agreement on what to call these "acts," so that we could then address
the question you raise, which is how do particular "acts" end up as consistent
aspects of the behavioral outputs that serve to bring a given perceptual
variable under control? How does it come to be, for example, that when
pushing against a rod jutting up at a right angle from the floor, one cat
always does it by rubbing against the rod with its left flank, whereas another
cat pushes the rod with its nose? Both "solutions" push the rod enough to
release the door and let the cat out.

Bill's view is, I think, that the higher-level "reorganizing" system, which
becomes active when certain "intrinsic" variables are not being well-
controlled, randomly varies the reference levels [and perhaps other
parameters] of lower-level systems to bring about orderly sequences of goal-
directed behaviors; those that "function" (bring the perceptual variable
under control) cause reorganization to cease, leaving the current parameters
of those lower-level systems in place.
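
A minimal Python sketch of this kind of random reorganization follows. It is
an illustration under stated assumptions, not Powers's own model: the single
integrating control loop, the choice of loop gain as the one parameter being
scrambled, the mean-squared-error criterion, and every name and number below
(run_trial, reorganize, tolerance, and so on) are hypothetical.

```python
import random


def run_trial(gain, env_sign=1.0, reference=10.0, steps=200, dt=0.1):
    """Run one trial of a single integrating control loop.

    Returns the mean squared error over the trial. env_sign stands in for
    the environmental feedback function (+1 or -1); all values are assumed.
    """
    output = 0.0
    total_sq_error = 0.0
    for _ in range(steps):
        perception = env_sign * output      # environment maps output to input
        error = reference - perception      # comparator
        output += gain * error * dt         # integrating output function
        total_sq_error += error * error
    return total_sq_error / steps


def reorganize(env_sign=1.0, tolerance=5.0, max_trials=1000, seed=0):
    """Scramble the gain at random until a trial shows acceptable control.

    Reorganization simply stops when error is low, leaving the current
    gain in place. Returns (gain, trials_used), or (None, max_trials) if
    no workable gain was found.
    """
    rng = random.Random(seed)
    gain = rng.uniform(-5.0, 5.0)
    for trial in range(1, max_trials + 1):
        if run_trial(gain, env_sign) < tolerance:
            return gain, trial              # control achieved: stop changing
        gain = rng.uniform(-5.0, 5.0)       # otherwise scramble and retry
    return None, max_trials


if __name__ == "__main__":
    for sign in (1.0, -1.0):
        gain, trials = reorganize(env_sign=sign)
        print(f"env_sign={sign:+.0f}: kept gain={gain} after {trials} trial(s)")
```

Note that nothing in the sketch "selects" an output directly; only a parameter
value survives, and it survives merely because scrambling stopped. The sketch
also treats a whole trial as the unit of evaluation, which sidesteps rather
than answers the questions about timing raised below.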

This theory reminds me of Guthrie's idea that the last behavior to occur in
the puzzle-box prior to a change in the cat's situation (e.g., release from
the box) remains "selected." [Guthrie believed that reinforcers did NOT
"strengthen behaviors," but rather, served to change the situation and thus
freeze the current behavior into place.]

With you, I am concerned about the details of the envisioned process. Because
a single episode (push rod, walk out of box) is not likely to restore any
"intrinsic" variables to their reference levels, reorganization should, it
seems to me, continue. Why should this act now become more likely to be
observed, yet not occur immediately after the cat has been replaced in the
box? If reorganization is only partial at this point, what keeps the
reorganization system from reorganizing this partial solution away (via random
changes in lower-level system parameters)? If reorganization is complete, why
does the cat do all those useless things on the next trial prior to executing
the correct act once again? Can "reorganization" take place when an
"intrinsic" variable is not disturbed?

My feeling is that there may be many processes by which control systems are
constructed and optimized, the proposed reorganization mechanism being one
possibility. My earlier question (as yet unanswered) about how I am able to
quickly compensate for changes in the environmental feedback function (I used
reversal as an example) was directed toward exploring this issue.
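
For concreteness, here is one hypothetical "systematic" alternative to random
scrambling, sketched in Python. It assumes a single integrating control loop
whose environmental feedback reverses sign partway through a run, and a
made-up rule that flips the sign of the loop gain whenever the error magnitude
has grown for several consecutive steps. The rule, the function name
track_with_reversal, and all parameter values are assumptions for
illustration; nothing here is claimed to be how an organism actually does it.

```python
def track_with_reversal(steps=400, reverse_at=200, reference=10.0,
                        gain=2.0, dt=0.1, patience=5):
    """Control loop that flips its own gain sign when error keeps growing.

    The environment's feedback sign reverses at step `reverse_at`.
    Returns (errors, flips): the error at each step and the steps at
    which the gain sign was flipped.
    """
    env_sign = 1.0
    output = 0.0
    prev_abs_error = None
    growing = 0          # consecutive steps on which |error| increased
    errors, flips = [], []
    for t in range(steps):
        if t == reverse_at:
            env_sign = -env_sign            # environmental reversal
        perception = env_sign * output
        error = reference - perception
        abs_error = abs(error)
        if prev_abs_error is not None and abs_error > prev_abs_error:
            growing += 1
        else:
            growing = 0
        if growing >= patience:             # runaway error: reverse the gain
            gain = -gain
            growing = 0
            flips.append(t)
        prev_abs_error = abs_error
        output += gain * error * dt         # integrating output function
        errors.append(error)
    return errors, flips


if __name__ == "__main__":
    errors, flips = track_with_reversal()
    print("gain flipped at steps:", flips)
    print("final error:", errors[-1])
```

In this toy version the loop recovers within a few steps of the reversal,
whereas purely random scrambling would take a variable number of tries. That
contrast is the reason for asking which kind of process is at work.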