Degrees of freedom

[Martin Taylor 920824 15:40]
(Bill Powers 920824.1100)

I think that when you shift types as you go up levels, the degrees-of-
freedom problem takes on some new aspects. You already showed how you
could get 18 degrees of freedom out of 9, just by adding first
derivatives. Couldn't we get 9 more by adding second derivatives, and
so on? The question now is, what ELSE can be added that takes us out
of the domain defined by our original concept of intensities ->
sensations -> configurations?

The answer to the last question there is "Nothing." At least in respect of
the number of available degrees of freedom. Certainly adding a second
derivative gives you 9 more, but you need another independent time sample
to get it. You can get as many degrees of freedom as you want by taking
sufficiently many independent time samples.

Bandwidth really isn't the problem here -- we could keep all
variations well within the available bandwidth and still have a
problem to solve.

Well, in a sense it is THE problem, but when you get non-linear you can't
interchange time and frequency freely, so you need some kind of surrogate
for bandwidth. But for heuristic discussion, bandwidth will do.

All nine reference levels can thus vary independently with
respect to oscillation frequency.

In addition, the amplitude of oscillation can be varied independently
for all nine systems. There's an outer envelope set by bandwidth; the
bandwidth and momentary frequency set the maximum rate of variation
in amplitude, but within that envelope, there is complete freedom. Now
we have nine more dimensions.

No you don't, if I understand your layout. You only have the selection of
values within the original nine dimensions. You do change the information
rate if you change the power level (SNR) within a dimension, but you don't
add any degrees of freedom.

So the same nine systems
can be used to construct an infinity of different patterns of
variation at the event level. We now have not only nine derivatives
(at least) at an instant, but a time-spanning characteristic called
amplitude, and a time-spanning pattern characteristic that can extend
indefinitely through time.

Sure. There are an infinity of possible settings of any single variable, too.
But that variable would have only one degree of freedom instantaneously, and
2*W degrees of freedom per second, where W is the equivalent rectangular
bandwidth of the variation of that variable (in a linear system).
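The 2W figure is just the Nyquist sampling rate: a signal band-limited to W Hz carries 2W independent values per second. A minimal numerical sketch of this, assuming an ideal band-limited signal (the variable names are illustrative, not from the discussion):

```python
import numpy as np

# Sketch of the "2WT degrees of freedom" claim: a signal band-limited to
# W Hz over T seconds is fully determined by 2*W*T independent samples
# (the Nyquist rate).
W = 4.0     # bandwidth, Hz
T = 2.0     # duration, seconds
n = int(2 * W * T)               # 16 numbers: the degrees of freedom

rng = np.random.default_rng(0)
samples = rng.standard_normal(n)  # one free choice per degree of freedom

def bandlimited(t):
    # Whittaker-Shannon interpolation: the unique W-band-limited signal
    # passing through `samples` at the Nyquist spacing 1/(2W).
    k = np.arange(n)
    return np.sum(samples * np.sinc(2 * W * t - k))

# Re-sampling the reconstructed signal at the Nyquist instants returns
# exactly the original numbers: no information is lost or gained.
recovered = np.array([bandlimited(k / (2 * W)) for k in range(n)])
print(np.allclose(recovered, samples))   # -> True
```

Choosing more than 2WT sample points would not add freedom: the extra values are already determined by the interpolation.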

[Discussion of making melodic phrases, which can be compared with ones
in memory]

We've now created a conceptual space in which various
components of a perception can change while holding other components
the same. We can define axes arbitrarily in such spaces, can't we?

Sure, but no more informationally independent ones than the original set
of notes allows: 2NWT, where N is the number of simultaneously hearable notes
and T is the duration of the phrase. (I assume chords are allowed in your
phrases). You can define an infinite number of different axes, but measurements
on one won't be independent of measurements on another.

(Parenthetically: I think exactly this does happen in any "real" control
hierarchy, and it is what allows the neural architecture to deviate from
strictly reciprocal connections in which reference links parallel perceptual
links between higher and lower ECSs.)

When we
think of levels of perception in which temporal patterns are the
variable, the idea of simultaneity no longer applies. We can solve a
control problem that requires the nine systems to achieve their
SPACE-TIME RELATIONSHIPS and so on. These are actually solutions to
conflicts that would arise if we demanded that all nine systems reach
zero error at once. These solutions may explain why higher levels

Yep, that solution to the problem is directly implicit in the degrees of
freedom argument. If there were no output bottleneck in degrees of muscular
freedom, the patterns WOULD be simultaneously achievable and there would be
no conflict. It is the limitation of available df for control that leads
to all conflict situations (or most, anyway; I'm not prepared to stake my
reputation on "all").

Even if we think of only nine letters on a keyboard with one finger
for each letter, typing the word "keyboards" is impossible to do with
a linear combination of finger-presses. We can clearly control each
finger simultaneously and independently with nine control systems, but
when we try to do so, the keyboard won't respond and we won't see the
word "keyboards" on the screen. We'll see whichever letter was hit
ahead of the others by a millisecond.

So here, the df bottleneck is in the external world. The keyboard has one
instantaneous df, and as many df per second as it will accept keystrokes.

And with only the nine fingers plus two more degrees of freedom (x and
y), we can type everything that it is possible to type. How many
degrees of freedom have I used in saying all this so far?

I find it easier with ten. But I don't know how many df you have used. A
rough estimate would be to count the letters, but that could be either an
underestimate or an overestimate. It could be an underestimate because you
may have done a lot of erasing and retyping, which does not show in the post,
and it could be an overestimate because of the sequential redundancies among
the letters, which remove from you the choice of which key to strike on
some occasions if you want to make sense. But the number of characters is
a reasonable order-of-magnitude estimate.


[Martin Taylor 940824 18:20]

Bill Powers (I didn't note the date when I captured the quotes)

The following was written some weeks ago, but was not posted when I
went away. It isn't a tutorial on degrees of freedom, but there is
some aspect of that in it. Mainly it is a justification for the
claim that there are far more degrees of freedom for input than for
output in the human.



What is a degree of freedom? One really can't talk about such a question
out of context. A SYSTEM has so many degrees of freedom, which can in many
cases be expressed in an infinite variety of ways. For example, if a
system can be totally described by the values of x and y, it has two
degrees of freedom. But x is not necessarily one of them, unless you
choose to select it as such. You might choose x+y as one. If you have
already chosen x as one, you have no more free choice. You might choose
x^2+y^2 as one, and x/y as another. And again, you have no more free choices.
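A small sketch of this point, assuming y > 0 so the inversion has a single branch (the numbers and names are mine, purely illustrative):

```python
import math

# Two degrees of freedom, many coordinate choices.  The state is fully
# described by (x, y); (x**2 + y**2, x/y) is an equally valid pair of
# coordinates, and any further quantity such as x + y is then
# determined, not free.
x, y = 3.0, 4.0                     # one particular state
u, v = x**2 + y**2, x / y           # an alternative coordinate pair

# Invert the alternative coordinates (positive branch, y > 0):
y2 = math.sqrt(u / (1 + v**2))
x2 = v * y2
print(x2, y2)                       # -> 3.0 4.0: nothing was lost

# A "third" coordinate adds no freedom: its value is already fixed.
print(math.isclose(x2 + y2, x + y)) # -> True
```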

When you are dealing with time series, each sample could potentially
represent a new set of degrees of freedom. So you could have x1, y1, x2, y2
as four degrees of freedom for two samples, or you might choose x1, x2-x1,
y1, y2-y1. But you can't have x1, x2, and x2-x1 in the same set of
choices, because any one of the three is determined by the other two.
In other words, you cannot have independent degrees of freedom for
position and for velocity in the same time series.
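The dependence among x1, x2 and x2-x1 can be checked mechanically: writing each as a linear function of the underlying samples and taking the rank shows only two independent directions among the three (a sketch, with my own labels):

```python
import numpy as np

# Each candidate "degree of freedom" as a linear combination of the
# underlying samples (x1, x2):
rows = np.array([
    [1, 0],    # x1
    [0, 1],    # x2
    [-1, 1],   # x2 - x1
])
# Rank 2, not 3: any one of the three is fixed by the other two.
print(np.linalg.matrix_rank(rows))   # -> 2
```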

In the hand, arm, and shoulder I count 27 mechanical degrees of freedom.
This is 108 for all four limbs, which are structurally analogous. That
is for position control. However, each joint can also be used
simultaneously for force control (pushing, squeezing), velocity control
(movement speed), and acceleration control. This gets us to 432 degrees
of freedom.

Here is how I counted the degrees of freedom for physical movements.

I count 23 df for one arm and hand, and I think that is generous, because I
am assuming there that each of the finger joints can be bent independently.
A rare ability, if anyone has it. That gives 16 for the fingers and thumb
allowing 4 for the thumb because the bottom knuckle can move two ways.
Then there are two for the wrist and two for the elbow (or 3 and 1, because
you can't rotate them independently and one is for rotation that involves
both). Finally, 3 for the shoulder. It would be more reasonable to take the
top two finger joints as representing one df. But let it go at 23. I don't
see where you get 27, but I don't have a problem with miscounting by a
factor of 2 or 3, so let it pass.

For one leg, I get about 12 degrees of freedom (I allow one per toe here,
instead of three per finger and four for the thumb, since I find it hard to
imagine anyone being able to independently manipulate each toe joint!).
In this way, I get 70 for the limbs, still within a factor of 2 of your count,
so no problem. I then add about 25 somewhat arbitrarily for mouth and
vocal articulators, and about 10 for trunk movements. That comes to around
105 (in my original count, I did as you did, and took legs as equivalent to
arms, which gave about 120 in all). But 100 or 200 makes no difference
to the argument. The real number would not give so many independent
degrees of freedom, except possibly for circus contortionists.
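The tally above can be laid out explicitly (the figures are the post's own rough estimates, not anatomical fact):

```python
# Joint counts as given in the post, per arm:
arm = {
    "fingers and thumb": 16,   # 3 per finger, 4 for the thumb
    "wrist": 2,
    "elbow": 2,                # or 3 and 1, sharing the rotation
    "shoulder": 3,
}
leg = 12                       # about one df per toe, plus the big joints

limbs = 2 * sum(arm.values()) + 2 * leg
total = limbs + 25 + 10        # + mouth/vocal articulators + trunk
print(sum(arm.values()), limbs, total)   # -> 23 70 105
```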

Now in your count, you multiply the count by a factor of three to allow
for independent sets of df for velocity and acceleration. But you can't
have them independently of each other and of position. If something is at
place x0 at time t0 and at place x1 at time t1, then it has average velocity
(x1-x0)/(t1-t0) in the interim. By fixing the second position, you have
lost one df from the velocity. The same applies to acceleration. There
are no possibilities for adding degrees of freedom by measuring velocity
and acceleration. Each measure of one eliminates a degree of freedom in
another. So we are left with the same number as before, though they may
be distributed somewhat differently over time. In terms of the perceptual
hierarchy, the sensory layer limits the available degrees of freedom.
The available df can be reinterpreted at different layers, but not
independently. The new layer constitutes another set of choices for the same
number of degrees of freedom (or possibly fewer, but never more).

Now, bandwidth. You limit the output bandwidth to 1 Hz, when we know
that for 108 of these degrees of freedom the bandwidth is at least 2.5
Hz in a closed-loop system.

This is far from obvious to me. I assume that the 108 df refer to the
27x4 that you allow the arms and legs. You are asserting that the hip
rotation, the bending of the little toe, the rotation of the thumb, and
all of the other joints can simultaneously and independently be moved
at a rate of 2.5 Hz. All I can say is that you could make a LOT of
money as a one-man band!

The limit of 2.5 Hz bandwidth we see in
position control systems is imposed mostly by the mass and viscosity of
the movable parts, which also affects the effective bandwidth of the
sensors and should be considered part of the environment (a constraint
on the speed with which perceptions can change).

I can see how this might affect the bandwidth of the kinaesthetic sensors,
but not how it might affect the bandwidths of sensors for external events.
In any case, I didn't include the kinaesthetic sensors in the original
count, though you are welcome to add them if you want. It won't make
a 1% difference in the count.

For acceleration
control, the rise-time is about 9 milliseconds (tendon reflex) for a
bandwidth of about 15 Hz, and for velocity control it is about the same.
So the spinal control loop outputs involve about 100 * 2.5 + 200 * 15 Hz,
or 3250 df/sec.

Rise-time and bandwidth are not necessarily reciprocal entities. Rise
time is affected by the amplitude of an excursion (or vice-versa), which
has nothing to do with bandwidth. You have to deal with the continuous
rate of change of the thing you are dealing with. Acceleration may
well change from value X to value Y in 9 msec, but how many times per
second can it alternate between these values?

Even if you rule out acceleration and velocity control, we know that
clonus tremors are possible in all major muscles. They occur at a rate
of about 10 Hz, with an amplitude limited not by the speed of muscle
contraction but by the mass of the limbs. This would still leave us with
at least 1000 df/sec.

Now bandwidth. The rate for degrees of freedom is not the rate at which
something can oscillate. It is the rate at which such an oscillation can
change, perhaps, but it is not the rate of oscillation. If a motion can be
controlled at any time to oscillate at any chosen rate up to a frequency f,
then the number of degrees of freedom for that motion are 2f per second. If
it can oscillate only at frequency f, the number of degrees of freedom is one,
corresponding to the amplitude of that oscillation--and that is not one per
second; it is one, period. The clonus tremors come close to this. The
eyes have a tremor at around 60 Hz, but they don't have a 60 Hz bandwidth
for control of pointing direction. The bandwidth for that is around 2-5 Hz
in x and in y.
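The contrast in the paragraph above comes down to parameter counting (a sketch with assumed, illustrative numbers):

```python
# Case 1: motion controllable at any frequency up to f is a
# band-limited signal, hence 2*f independent values per second.
f = 10.0    # clonus-like tremor frequency, Hz
T = 5.0     # observation time, seconds
df_bandlimited = 2 * f * T     # 100 free choices over the 5 seconds

# Case 2: oscillation locked at frequency f, with only the amplitude
# free to be set: one degree of freedom, full stop.
df_fixed_tone = 1

print(df_bandlimited, df_fixed_tone)   # -> 100.0 1
```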

I don't limit the output bandwidth to 1 Hz. I said (as you quoted) that
some movements can be controlled at 10 times that rate. I estimated
(somewhat generously, I think) that the AVERAGE for all muscle groups
is about 1 Hz. After all, how fast can you control, say, a lean to the
right independently of a twist of the shoulder; how fast can you independently
change the rate at which you circle your hand on your stomach while at a
different changing rate pat your head with the other while separating and
closing up and curling the fingers on each hand independently? How fast can
you change the rate at which one eye moves to the right independently of the
change of rate of the other eye moving to the left? Most of the output
df per second come from fingers and mouth. The mouth might even contribute
100 or more df per second in speaking, and though it is hard to move the
fingers fast and independently, even for a trained musician, I could
go along with 5 df/sec per finger, for another 50 df/sec. But you won't
get much more than another 50 total out of the legs and arms and trunk.
I think it is a considerable overestimate to allow 200 df/sec, which
corresponds to an average 1 Hz bandwidth, within a factor of 2.
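The arithmetic behind that estimate, using the post's own (avowedly rough) figures:

```python
# Output degrees of freedom per second, as budgeted in the post:
mouth = 100                    # df/sec while speaking
fingers = 10 * 5               # 10 fingers at ~5 df/sec each
rest = 50                      # legs, arms, and trunk combined
output_df_per_sec = mouth + fingers + rest
print(output_df_per_sec)       # -> 200

# 200 df/sec spread over roughly 100 mechanical df corresponds, via
# the 2W rule, to an average bandwidth of about 1 Hz:
n_mechanical_df = 100
avg_bandwidth = output_df_per_sec / (2 * n_mechanical_df)
print(avg_bandwidth)           # -> 1.0 (Hz)
```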

Now input. There are two parts to your argument: that the sensors are
slower than I suggested, and that the environment is coherent. I'll deal
with each in turn, but preface it by saying that I believe, as I have
previously said, that the environmental coherence argument is correct.

There are, as I mentioned in earlier postings, some 10^6 fibres in
each optic nerve, already a reduction by a factor of 10^2 from the number
of individual receptors in the retina. There are some 3 x 10^4 fibres
in each auditory nerve, and I don't know how many sensory receptors of
different kinds in the skin. It would be most improbable if, in addition
to the reduction of a factor of 10^2 already achieved by passive statistics
in the retina, further passive statistical analysis could attain another
factor of 10^2 reduction in the df of the visual system. But I allowed
that--in other words, an overall reduction by a factor of 10^4, using
purely statistical analysis of the coherence of the receptor output,
permitted by the coherence of the environment. No computerized technique
for reduction of visual data has done this well, but maybe someday we will
learn how.
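The chain of reductions, in the round numbers used above (a sketch; the factor-of-100 allowances are the post's deliberate generosity, not measurements):

```python
# Visual input, order-of-magnitude bookkeeping:
receptors_per_eye = 10**8      # ~100x the optic-nerve fibre count
optic_nerve_fibres = 10**6     # fibres per optic nerve
retinal_reduction = receptors_per_eye // optic_nerve_fibres   # 100

# Allow a further, equally generous factor of 100 from passive
# statistical analysis of environmental coherence:
further_reduction = 100
momentary_input_df = optic_nerve_fibres // further_reduction
print(retinal_reduction, momentary_input_df)   # -> 100 10000
```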

That leaves at least 10^4 df for sensory input AT ANY ONE MOMENT. How
fast can they be modulated?

You estimate 10-100 Hz per df, but this is
a great overestimate. The white-light critical flicker frequency of
about 25 Hz is the frequency at which the detection of flicker is just
barely possible.

Something over 50 Hz, actually, but a factor of 2 is unimportant to
the argument.

The 0.707 response point (which we also used to measure
output bandwidth) must occur at a frequency far lower than that,
probably around 2 Hz, based on a rise-time of sensory response to
brightness changes of around 0.05 sec.

Actually, the dropoff is quite sharp. At normal room illumination, the
0.707 bandwidth is around 20 Hz. At very low light levels it might be as low
as 5 Hz. A change in light level of a factor of 10^4 makes a factor
of about 3 change in CFF. An interesting point is that the response is
nearer that of a bandpass filter paralleled with a low-pass filter than
that of a low-pass filter alone, but the low-frequency cutoff of the
band-pass element is around 5 Hz, so it makes very little difference when
we are talking orders of magnitude.

This frequency goes down with the
size of objects.

Yep. For a light as small as 0.5 minarc, at normal room intensity, CFF is
around 12 Hz. That is small enough to cover about 1 retinal cone, perhaps
two or three. It is about the limit of resolution. For a patch 1 minarc
diameter, the CFF goes up to near 20 Hz, and at 10 minarc, 35 Hz.
Since the shape of the modulation sensitivity function seems not to
vary much on the high side under a variety of test conditions, it is
probably not unreasonable to assume the 0.707 bandwidth to be in general
about a factor of 2.5 to 3 lower than the CFF. This would give a bandwidth
of around 4 Hz per optic fibre, or 8 * 10^6 df/second for one optic nerve.
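Stepping from CFF to a df/sec estimate, following the figures above (illustrative arithmetic only):

```python
# From critical flicker frequency to degrees of freedom per second:
cff = 12.0                 # Hz, for a ~0.5 minarc spot (post's figure)
bandwidth = cff / 3.0      # 0.707 point taken ~2.5-3x below CFF: ~4 Hz
fibres = 10**6             # fibres in one optic nerve

df_per_sec = 2 * bandwidth * fibres   # 2W df/sec per fibre, summed
print(round(bandwidth), df_per_sec)   # -> 4 8000000.0
```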

Remember that sensors are intensity receptors; all
else, even movement detection, is derived from intensity variations.

It might be more nearly correct to say that they are intensity CHANGE
receptors, since most of them adapt quite nicely to steady levels. The
truth is a bit in between. As noted above, the visual modulation
sensitivity function looks rather like a low-pass filter paralleled with
a bandpass filter. Temporal edge-sharpening, in other words.

Even if no further df were allowed for the other senses, we still wind up
with a sensory capability of over 10^7 df/sec. This is a little larger
than the 200 df/sec available for output, in contradistinction to your claim:

So by making estimates in a different way, we end up with about the same
df/sec for inputs and outputs. The enormous difference that you imagine
(and need for your argument) is simply not there.

Environmental coherence

I believe the environmental coherence argument to be correct in principle,
as I have acknowledged. I even wrote up the same argument in a commentary
on J.G.Taylor's theory in 1973 (S. African J. Psychol., vol. 3, pp. 23-45).
In that paper, I suggested that there might be a reduction by three or four
orders of magnitude between the number of receptors and the number
of coherent patches in the environment. The ratio may not be as large as
I claimed, but it is significant. That coherence is part of what permits
the reduction by a factor of 100 between the visual receptors and the
optic nerve. It permits passive statistical analysis of the scene,
and allows efficient use of the channel capacity of the optic nerve.
This is important for an animal that needs to move its eyes, because
the smaller and lighter the optic nerve, the faster the eye can move.
If one looks at the topographic representation of the retina in the visual
cortex, one finds that the central fovea is represented almost 1:1, but
the ratio in the periphery (most of the retina) is worse than 1:100.
The central fovea is the only area in which one can really make out
the shapes of things.

There's a lot of processing done in the retina on the data coming from
the peripheral sensors (i.e. those outside perhaps 2 degrees from centre).
That processing does something useful, or it wouldn't be done. What it
does cannot be accurately described in a few words, if indeed it is
accurately known. But one thing it does is to provide exquisite
sensitivity in the periphery to changes in the state of motion in
a local region (not acceleration, but motion where there was none,
or stillness over an area that was full of motion).

On input, once the passive statistical reduction brings the degrees of
freedom down to a number substantially lower than the number of receptors
would suggest, the resulting compressed data are available to the Perceptual
Input Functions of the lowest level of control systems (ECUs). These
are the transform templates I discussed in the Pattern and Transform
working paper. Each one potentially represents a degree of freedom, if
they are orthogonal in respect of the sensory input. Their outputs
may well not be orthogonal, if the changing environment is (as we both
argue) coherent over space and time. But that is NOT the issue.

The issue at hand is in the selection of which degrees of perceptual
freedom to control at any moment. Each PIF represents a potentially
controllable perception, and if they are orthogonal FUNCTIONS of the
sensory input, then an adequate output system could manipulate the
environment so as to satisfy all their references independently. That
the environment does not PASSIVELY change so much is irrelevant.

I have no idea how many of the possible df available from the sensory
systems are actually implemented in the relations among the lowest-level
PIFs. It may be that there are no more orthogonal functions than there
are output df. The transform generated by the set of lowest-level PIFs
may be far from spanning the available space. If that were so, it would
be correct to say that there would be no need for an alerting function.
There would be no time-division multiplexing. The only multiplexing
that would be applicable would be the multiplexing involved when each
lower-level control system gets its reference level from several at the
next higher level, which happens all the time.

But is it likely that the number of lowest-level PIFs is so small? That's
a matter of opinion, until we know some way of identifying individual PIFs
within the nervous system.