Truly random; evolution model progress

[From Bill Powers (940928.0850 MDT)]

RE: "truly random" etc.

Lewis LaRue (940928.0855 EDT)

A random move is two things: random, and a move. If it is a move, it
is a move from somewhere.

It's true that in a series of random numbers, each new number is a
change from the previous number. But given a range of random numbers,
with a maximum and a minimum, any "move" could be from the present
number to any other number within the range. There is no restriction
saying that the moves have to be small in relation to the total span. In
the E. coli type of random change, the difference between successive
positions in which we find the organism is dictated by swimming speed;
all the random variations do is change the direction of swimming. The
change of position remains small for any one random change.

For each axis, this is equivalent to picking a random rate of change of
position, rather than a random position. The "move" is from an old rate
of change to a new rate of change, not from an old position to a new
position.
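In outline, this kind of move looks something like the following (a
minimal sketch with arbitrary constants, not any actual program):

/* E. coli style of random change: each tumble picks a completely
   new direction, unrelated to the old one, but the position can
   change by only speed*dt per step, so successive positions stay
   close together. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define PI 3.14159265358979

int main(void)
{
    double x = 0.0, y = 0.0;        /* position                  */
    double speed = 1.0, dt = 0.1;   /* swimming speed, time step */
    double angle;
    int i;

    for (i = 0; i < 100; i++) {
        /* the "truly random" part: a new direction unrelated
           to the previous one */
        angle = 2.0 * PI * rand() / (double)RAND_MAX;
        /* but the change of position remains small */
        x += speed * dt * cos(angle);
        y += speed * dt * sin(angle);
        printf("%f %f\n", x, y);
    }
    return 0;
}

The random pick is unrestricted, yet the position itself can change by
at most speed*dt on any one step.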

Martin Taylor (940928.1350) --

But in real organisms, most phenotypic expressions, such
as height or eye colour, seem either to be responsive to many genes, or
to have only a small discrete number of possible (non-lethal) states.
The action of mutations in the former kind mimics the effect of
mutations that have small effects on single-gene characteristics.

Yes, that expresses the difference I'm talking about. The length of the
beak in a finch doesn't seem to vary much from one generation to the
next, in comparison with the total possible range. If there were n genes
affecting beak length, by your theory the probability of the largest
change would be the probability that all n of the contributing genes
would change in the same direction. The most probable change would be
smaller than the maximum possible change (unless the contributions were
multiplicative).
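As a quick check of the arithmetic: if each of n genes shifts the trait
by +1 or -1 with equal probability, the chance that all n shift the
same way is 2*(1/2)^n -- about 1/128 for n = 8 -- so the largest
possible change is also the least probable. A little program like this
(my sketch, nothing more) shows the effect:

/* With N_GENES genes each contributing +1 or -1 at random, the
   fraction of trials in which all genes move the same direction
   should come out near 2 * (1/2)^N_GENES. */
#include <stdio.h>
#include <stdlib.h>

#define N_GENES 8
#define TRIALS  100000

int main(void)
{
    int trial, g, sum, extreme = 0;

    for (trial = 0; trial < TRIALS; trial++) {
        sum = 0;
        for (g = 0; g < N_GENES; g++)
            sum += (rand() & 1) ? 1 : -1;
        if (sum == N_GENES || sum == -N_GENES)
            extreme++;
    }
    /* expected fraction: 2 * (0.5)^8 = 1/128, about 0.0078 */
    printf("fraction of all-same-direction changes: %f\n",
           (double)extreme / TRIALS);
    return 0;
}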

A similar effect, but perhaps smoother, would be obtained if the genes
regulated not the beak size directly, but the rate of manufacturing some
protein, the concentration of which led to developing a beak of a given
size. So a gene could jump back and forth randomly among a set of
possible alleles, with no restrictions as to the size of the jump, but
the result would be a smooth and much smaller change in the
concentration of the protein that actually determines beak size. This
would automatically give us the restricted range of changes in the final
product that is observed, while allowing changes from one allele to
another to remain "truly random."
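In program terms the idea might look like this (a sketch with invented
constants; the set of alleles and their rates are arbitrary):

/* The gene jumps at random among alleles -- a truly random choice,
   any size of jump allowed -- but each allele only sets a protein
   production rate, and the concentration that actually determines
   beak size follows that rate through a slow first-order lag. */
#include <stdio.h>
#include <stdlib.h>

#define N_ALLELES 10

int main(void)
{
    double rate[N_ALLELES];          /* production rate per allele */
    double conc = 0.5;               /* protein concentration      */
    double k = 0.05;                 /* slow integration constant  */
    int i, allele = 0;

    for (i = 0; i < N_ALLELES; i++)  /* arbitrary rates from 0 to 1 */
        rate[i] = i / (double)(N_ALLELES - 1);

    for (i = 0; i < 200; i++) {
        if (i % 20 == 0)                  /* a mutation: a jump  */
            allele = rand() % N_ALLELES;  /* of any size at all  */
        /* concentration changes smoothly despite the big jumps */
        conc += k * (rate[allele] - conc);
        printf("%d %f\n", allele, conc);
    }
    return 0;
}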

By "truly random" all I mean is that one random value is unrelated to
preceding or following values (by any algorithm we know about). This is
the sort of output we get from a pseudo-random number generating
algorithm. When I smooth such outputs by integration, for use in
tracking experiments, I get smoothly changing disturbance tables which
are "sort of" random, but in which changes from one number to the next
have a limited range.
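Roughly, such a table is built like this (a sketch of the method, not
the actual tracking-experiment code):

/* Raw pseudo-random values are unrelated from one to the next, but
   after smoothing by integration the change from each table entry
   to the next has a limited range. */
#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 1000

int main(void)
{
    double table[TABLE_SIZE];
    double d = 0.0, k = 0.02;   /* smoothing constant */
    int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        double r = 2.0 * rand() / (double)RAND_MAX - 1.0; /* -1..1 */
        d += k * (r - d);       /* leaky integration of raw values */
        table[i] = d;
    }
    for (i = 0; i < TABLE_SIZE; i++)
        printf("%f\n", table[i]);
    return 0;
}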

You propose a kind of organism that absorbs nutrients at some rate
determined by genetics and by the available concentration of nutrients
in the organism's environment. The organism doesn't excrete, and grows
at a rate proportional to its rate of intake. At some genetically
determined size, the organism splits into two (possibly mutated)
clones. There is some criterion for the death of an organism, without
offspring: you suggest that a growth rate less than some threshold
should lead to death. And the nutrient is replenished at some steady
rate. Am I right so far?

Pretty close, and I'm still working on the details. Actually the
organism could be considered to "excrete" since its growth in volume is
a factor Kg times the concentration of nutrient. The implication of a
small Kg is that only some of the nutrient is actually used in growth.

There was a technical hitch with my first idea of setting a minimum
volume for survival; I went to a minimum amount of absorbed nutrient
instead. Using a minimum volume ended up not eliminating any organisms,
because after a cloning, the volume simply halves and growth continues
from there. If I set the minimum volume at less than half the maximum
volume, no organism ever dies; if greater than half, they all die. A
minimum growth rate (equivalent to a minimum rate of absorption of
nutrients) works much better.

Without running it, I can't be sure of the outcome of this, but it
seems to me likely that in the absence of mutation, the population will
grow to a maximum and then all will die simultaneously.

Yes. I'm still playing with the constants to get that to occur at
a reasonable level of nutrient replenishment, for a reasonable level
of population. I have to limit the population because each organism is
described by a structure that uses about 30 bytes, meaning I can't have
more than about 2000 organisms at maximum (without getting fancy with
memory allocation). So I want the initial equilibrium levels to be
around 500 organisms at the point where they all die.
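For concreteness, here is the sort of record and update step I have in
mind (a sketch only; the names and constants are placeholders, and the
nutrient bookkeeping is simplified). Note that 2000 records of roughly
30 bytes each come to about 60,000 bytes, which fits in one 64K
segment without fancy memory allocation:

#include <stdio.h>
#include <stdlib.h>

#define MAXPOP 2000

struct organism {
    double volume;      /* current volume                         */
    double kg;          /* growth factor: dV = kg * nutrient * dt */
    double split_size;  /* volume at which the cell divides       */
    double min_intake;  /* absorption rate below which it dies    */
};                      /* four 8-byte doubles: 32 bytes apiece   */

struct organism pop[MAXPOP];
int npop = 1;
double nutrient = 1.0;        /* concentration in the bath       */
double replenish = 0.01;      /* steady replenishment per step   */

void step(double dt)
{
    int i, n = npop;          /* snapshot so new clones wait a step */

    nutrient += replenish * dt;
    for (i = 0; i < n; i++) {
        double intake = pop[i].kg * nutrient;
        if (intake < pop[i].min_intake) {
            pop[i].volume = -1.0;       /* mark as dead; cleaned
                                           out separately         */
            continue;
        }
        pop[i].volume += intake * dt;
        nutrient -= intake * dt;        /* nutrient consumed      */
        if (pop[i].volume >= pop[i].split_size && npop < MAXPOP) {
            pop[i].volume *= 0.5;       /* divide in two          */
            pop[npop] = pop[i];         /* (mutations come later) */
            npop++;
        }
    }
}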

There has to be some variation, either in the susceptibility to death
or in the feeding rate. So long as the population consists of clones,
nothing interesting will happen. The interesting things happen when
there is a variety of genomes, as has been pointed out before.

Yes. That's next, after I get all the organisms to grow and die
simultaneously, with reasonable numbers. I've had a first try at it.

You have several possible genes in your initial proposal. These
control (a) feeding rate at a given nutrient concentration, (b) growth
rate for a given nutrient intake, (c) size at split, and (d) minimum
growth rate below which the organism dies. There could be a couple
more, in particular a pair that affect the actions of (a) and (b): (e)
effect of size on feeding rate at a given concentration (big ones might
be more/less greedy), and (f) effect of size on growth rate for a given
nutrient intake (could be small ones grow quickly and big ones slowly,
like people).

These are all good ideas, to be introduced one at a time at the
appropriate stage. One idea I had was that the rate of absorption of
nutrients, besides depending on nutrient concentration, should depend on
surface area. Growth itself increases volume faster than surface area.
I'd have to get something in there about metabolic requirements which
increase with volume, so the rate of growth in a constant nutritive bath
would go as 1/radius. Bigger organisms would have a harder time
ingesting enough to sustain growth and metabolism. Needless to say, this
amount of detail will come in much farther down the road.
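When it does come in, that scaling might be expressed something like
this (my guess at the form, not the planned code): absorption goes as
surface area (r^2) and metabolic cost as volume (r^3), so the net
growth rate per unit of volume goes as r^2/r^3 = 1/radius.

#define PI 3.14159265358979

/* Net growth rate per unit volume for a spherical organism:
   intake scales with surface area, metabolic cost with volume,
   so the result falls off as 1/radius. */
double net_growth_rate(double radius, double nutrient,
                       double k_absorb, double k_metab)
{
    double surface = 4.0 * PI * radius * radius;
    double volume  = (4.0 / 3.0) * PI * radius * radius * radius;
    double intake  = k_absorb * nutrient * surface;  /* ~ r^2 */
    double cost    = k_metab * volume;               /* ~ r^3 */
    return (intake - cost) / volume;   /* per-volume rate ~ 1/r */
}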

I think you need some kind of nonlinear effect to make it possible for
more than one "species" to exist in a stable population. I'm not sure
about this, either, but you could try it out, and the (e-f) genes
provide that possibility.

Agree. Once I get the simplest simulation working, and get a feel for
the effects of the parameters, a lot of variations can be introduced.

Would it not be better to have the "selection-control" variable be
internal to the organism, rather than a sensed environmental quantity?

Well, sensing the concentration of nutrients would generate a variable
inside the organism, wouldn't it? I can imagine messenger molecules that
represent that concentration, but it's harder to think of how "rate of
growth" could be sensed. Why not just sense the variable on which rate
of growth depends, the amount of nutrient crossing the cell wall? But
your idea can certainly be tried.

What about Gactual-Gmin (actual growth rate minus growth rate below
which death occurs)?

That seems more external to the operation of the system than sensing
nutrient concentrations would be. The organism dies if the nutrient
absorption rate is below some level, but does the organism need to have
an internal representation of that level? It's easiest just to let the
organism die when that level is reached.

I set the mutation rate to be constant in my simulations, whereas you
change the mutation rate as a function of some perceptible variable,
but the effect of a mutation is no different in the two situations.

In your model, a mutation adds or subtracts a small increment to the
survival value of a gene. In the sense I defined above, that's not
"truly random." A truly random method would be to select a new survival
value at random from anywhere within the range from 0 to 1.0, without
regard to the previous value. That's why I said you are already using
the E. coli type of mutation. You're varying the rate of change of
survival value randomly, not the survival value itself.
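The contrast in program terms (a sketch with invented names):

#include <stdlib.h>

double uniform(void)             /* random value in 0..1 */
{
    return rand() / (double)RAND_MAX;
}

/* "Truly random": the new survival value is drawn from anywhere
   in 0..1, without regard to the previous value. */
double mutate_truly_random(double v)
{
    (void)v;                     /* old value deliberately ignored */
    return uniform();
}

/* E. coli type: the step, not the value, is randomized, so the new
   value stays near the old one. */
double mutate_ecoli(double v)
{
    v += 0.02 * (uniform() - 0.5);   /* small random increment */
    if (v < 0.0) v = 0.0;            /* keep within 0..1       */
    if (v > 1.0) v = 1.0;
    return v;
}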

The question is how much faster the E. coli approach works, and how
much safer from rapid environmental change an E. coli-mutating
population is than one that always has the same stable mutation rate.
Does the difference between them carry over into a high-dimensional
genetic space, since we can be pretty well assured that it works in a
2-D genetic space?

We'll see when we get there.

Yes, why not? For a reasonable size of population it could take quite
a bit of compute time, but if you can afford it for a few overnight
runs, why not do it?

Remember that I'm programming in C, which is quite a lot faster than
Hypercard. I get a thousand or so reproductions per second, even with
cleaning out the casualties after every time interval.
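Cleaning out the casualties can be a single compacting pass over the
array (a sketch, using the pop/npop variables from the earlier sketch,
where a dead organism was marked with a negative volume):

/* Copy the live organisms to the front of the array in one pass. */
void remove_dead(void)
{
    int i, live = 0;

    for (i = 0; i < npop; i++)
        if (pop[i].volume >= 0.0)     /* still alive */
            pop[live++] = pop[i];
    npop = live;
}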

Incidentally, the "parametric evolution" approach I'm taking leads to a
very different way of dealing with generations. The generations (with
random variations introduced) are no longer synchronous. A new
generation occurs whenever a cell reaches the critical size for
division, so there will actually be divisions going on all the time even
though one cell might go through 10 to 100 time cycles before dividing.

In my first try with no variability, the population grew in steps at an
exponentially increasing rate. As soon as I added about 30% variability
in the size at which division takes place, the steps were replaced by a
smooth curve.


----------------
Sleep study:

Yes, I would like to see your results. I've had a hangup with my own
approach, mainly that when I loaded all the data you sent onto my hard
drive, my operating system went kaflooey -- it took forever to find a
file among the thousands present, and after a while it just gave up the
ghost and corrupted its allocation table. I had to transfer everything
new off the hard disk onto floppies, reformat, and reload from my tape
backup, which thank Newton I have now. I'm going to have to (a) compress
the files, and (b) load them into a few large files accessed with an
index of offsets. That's sort of discouraging, so I've been doing other
things that are more fun. But when my courage returns, I'll do a backup
and try again.

It's too bad that we got so much data, and that it's so thinly scattered
among so many different conditions. I would have preferred to have far
fewer variables, and explore them in more detail. But -- we didn't get
to design the experiment. I'll take what you were able to get,
gratefully.
-----------------------------------------------------------------------
Best to all,

Bill P.