Peter Cariani (970214.1115 EST)
[From Bruce Gregory (970213.0650 EST)]
Peter Cariani (970212.1715 EST)
> I haven't seen Holland's recent book, but I do know his earlier work,
> and I think it's unwarranted to say that he "provides no mechanism for
> purposive behavior of any sort." The genetic algorithm embodies
> "feedback to structure" wherein those phenotypes that perform better
> (given some fitness criteria, which are, effectively, goals and
> purposes) propagate their genotypes into the next generation. If we
> were to stretch this scheme to fit into the PCT bed (in true
> Procrustean fashion), the population of organisms is collectively
> controlling for the "perception" of enhanced fitness among its members.
I'm having trouble building this simulation. How exactly does the
population perceive enhanced fitness? What is its reference for
enhanced fitness? Where is this reference located? How is it set?
How does the population act to resist lowered fitness? What
determines the gain? How can I perform the Test?
By the way, I like PCT precisely because it _isn't_ a procrustean
bed. Too many of those around for my taste -- I'm not all that
comfortable with them.
Just step back a bit. I was just taking issue with the notion that
genetic algorithms (GAs) aren't "purposive",
that they aren't in some sense adaptive systems
that have goals built into their structure.
Yes, the mapping of GAs into the PCT framework would certainly be a
force-fit.
Let's think about GAs (and adaptive systems in general) in terms of
an optimization process. (I don't in general advocate this perspective,
because it eliminates the material world and reduces perception to a
mathematical function, but it works for our purposes right here....)
I'll use the metaphor of exploring the properties of a mathematical
function (we can keep telling ourselves, "it's only a metaphor,
it's only a metaphor").
One can have different strategies for minimizing some error function.
Some will involve computing a gradient and following the gradient
downwards to a minimum. These approaches will work when you have
smooth (effectively continuous) surfaces with not many local minima.
On the other hand, the surfaces that one is trying to navigate may be
very "rugged," with many local minima; they may not be smooth at all
(they may be discrete), and there may not be much "spatial" order to
them. The space may not even be "metrical"; it may be a space of
discrete, unordered states. In these cases, a shotgun or Monte
Carlo-like approach might yield better results. It's not efficient,
but there's a better chance that one doesn't get hung up in a local
minimum.
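The contrast can be seen in a toy sketch. The function and parameter
values below are my own invented illustration, not anything from
Holland: a "rugged" one-dimensional objective (a smooth bowl plus many
ripples), attacked once by gradient descent and once by blind random
sampling.

```python
import math
import random

def f(x):
    # A "rugged" objective: a global bowl x^2 plus many local ripples.
    return x * x + 10.0 * math.sin(5.0 * x)

def grad_f(x):
    # Analytic derivative of f.
    return 2.0 * x + 50.0 * math.cos(5.0 * x)

def gradient_descent(x0, lr=0.005, steps=2000):
    # Deterministic error reduction: follow the local gradient downhill.
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

def random_search(lo=-10.0, hi=10.0, samples=1000, seed=0):
    # Shotgun / Monte Carlo: sample widely, keep the best point seen.
    rng = random.Random(seed)
    return min((rng.uniform(lo, hi) for _ in range(samples)), key=f)

x_gd = gradient_descent(x0=8.0)   # gets trapped in a nearby local minimum
x_rs = random_search()            # inefficient, but samples near the global one
print(f(x_gd), f(x_rs))
```

Started far from the global minimum, the gradient follower settles in
the first ripple it meets, while the shotgun search, for all its waste,
usually lands in a much deeper basin.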
In this optimization metaphor, the goal is the minimization of some
function, the reference signal (goal) is a zero error, the perceptual
signal is the value of the error function (which is the difference
between some function that one is constructing -- "model" -- and some
function that is given by observation). Actions are the adjustment
of model parameters (hopefully so that errors are reduced).
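As a minimal sketch of this mapping (all names and numbers here are
hypothetical, chosen only to make the loop concrete): the "perception"
is the value of the error function, the "reference" is zero error, and
the "action" nudges a model parameter.

```python
# Hypothetical data from an unknown process the model is trying to match.
observed = [(x, 3.0 * x) for x in range(10)]

def error(k):
    # Perceptual signal: squared mismatch between model k*x and observation.
    return sum((k * x - y) ** 2 for x, y in observed)

k = 0.0           # model parameter being adjusted
reference = 0.0   # goal: zero error
for _ in range(500):
    e = error(k)
    if e - reference < 1e-12:
        break     # error has reached the reference
    # Action: step the parameter in whichever direction reduces the error.
    step = 0.05
    k += step if error(k + step) < e else -step

print(round(k, 2))
```

This is the well-behaved case: one continuous parameter, a smooth
error surface, a single minimum, so a simple directed adjustment rule
suffices.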
A GA uses a procedure that is more like the shotgun approach than
directed error-reduction. It would be as if in a PCT model, instead of
computing the error and making the correction by a deterministic rule,
the organism or device had 20 different responses that it could try out
(in parallel) and determine whether the error was reduced. Let's say
that we have a device and it has 20 parameters that determine how its
parts interact, and therefore how it will respond to a given
(perceived) situation. There would be a mechanism for generating
variation in the responses (mutation, cross-over) that nevertheless
retained combinations of parameter values that proved useful (linkage),
so the search is somewhat constrained, though far less so than by a
deterministic rule. To the degree that the search is constrained,
it is "efficient"; to the degree that it is less constrained,
it is potentially more "robust" and "creative", albeit with less
efficiency.
In the device there is some means of measuring how well each
alternative response achieves a certain goal ("fitness function"),
and the alternatives are ranked according to how well they do. Some
fraction of the best alternatives are chosen, and their genetic
specification becomes the basis for the mutations & combinations that
form the next generation of alternative responses.
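That whole cycle fits in a few lines. This is a generic illustrative
GA of my own devising, not Holland's code: the genome sizes, rates,
and toy fitness function (count of 1-bits) are arbitrary stand-ins
for "how well a response achieves the goal."

```python
import random

rng = random.Random(1)
GENOME_LEN = 20
POP_SIZE = 20

def fitness(genome):
    # Toy fitness function: number of 1-bits (stand-in for any goal measure).
    return sum(genome)

def crossover(a, b):
    # Single-point cross-over: tends to keep useful runs of values together.
    point = rng.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome, rate=0.05):
    # Mutation: flip each bit with small probability, generating variation.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def next_generation(population):
    # Rank alternatives by fitness, keep the best fraction, and breed
    # the next generation of alternative responses from them.
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: POP_SIZE // 4]
    children = []
    while len(children) < POP_SIZE:
        a, b = rng.sample(parents, 2)
        children.append(mutate(crossover(a, b)))
    return children

population = [[rng.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]
for _ in range(40):
    population = next_generation(population)
best = max(fitness(g) for g in population)
print(best)
```

Note that no individual ever computes an error signal; the population
as a whole drifts uphill because better alternatives propagate.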
So the main differences, as I see them, between GAs and control systems
lie in the nature of the adaptive landscapes and the corresponding
parameter spaces:
GAs: discrete, possibly nonmetrical, ill-defined, badly behaved, many
local optima
PCT: continuous, metrical, relatively well-behaved with relatively few
local optima
The problems faced by GAs demand more shotgun-like approaches,
whereas the problems faced by PCT systems demand a highly efficient,
fast, reliable response.
In the GA, it's the whole system that's doing the adapting: the whole
population, fitness functions and all. Over time, the
selection-mutation process ensures that the alternatives in the
population will, on the average, improve, and this is what makes them
adaptive.
If it's easier to think in terms of individual devices, think about
an immune system whose task is to recognize a foreign invader. The
space of molecular shapes that it needs to search is horribly
ill-defined, and not simply ranked in terms of simple metrics. Many
different dimensions interact, so it's hard to decompose the space
into smaller search problems (although some of this can be done for
local "active sites"). There's no deterministic way of adjusting
antibodies to achieve the function of recognition, but there is a way
of assessing how well they are achieving recognition (binding
affinity). Many alternative responses are tested in parallel (as many
as there are species of antibodies), and how well each one does
determines its probability of being in the next generation. For this
kind of problem, a genetic approach isn't bad.
(My criticism of genetic algorithms is that the problem domains they
are best suited for are these kinds of real-world, ill-defined
situations where real, physical processes are at work; GAs are
usually applied, though, to formal domains that are usually more
efficiently dealt with via other strategies. In other words, GAs,
like many other adaptive paradigms, have been taken over by applied
mathematicians for use on formal problems rather than used as design
principles for building devices to adaptively interact with the real
world. They've been reduced to just another optimization process.)
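The immune-style rule, where success determines the *probability* of
appearing in the next generation, differs from simply keeping the
best few. A hypothetical sketch (the "antibodies," target, and
affinity measure are all invented for illustration):

```python
import random

rng = random.Random(7)

def select_next_generation(population, affinity, n):
    # Each alternative's chance of appearing in the next generation is
    # proportional to its score (a stand-in for binding affinity).
    scores = [affinity(x) for x in population]
    return rng.choices(population, weights=scores, k=n)

# Hypothetical "antibodies" are just numbers; affinity measures how
# closely each one matches an unknown target shape.
target = 0.42
antibodies = [rng.random() for _ in range(1000)]
affinity = lambda x: 1.0 / (1e-6 + abs(x - target))

survivors = select_next_generation(antibodies, affinity, n=1000)

# On average, survivors sit closer to the target than the original pool.
mean_err_before = sum(abs(x - target) for x in antibodies) / len(antibodies)
mean_err_after = sum(abs(x - target) for x in survivors) / len(survivors)
print(mean_err_before, mean_err_after)
```

No single antibody is "corrected"; the distribution of responses
shifts toward recognition, which is the population-level adaptation
described above.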
I hope this helps....
Peter