Dag's ads; Info about dist in perception (or CEV)

[From Bill Powers (940412.0700 MDT)]

Dag Forssell (940411 etc.) --

Two birds with one stone.

Bill Leach said

The real problem is not that business will not accept PCT but
rather that business wants some compelling evidence that this
is not just another "scam" and that it will work.

That would bear more emphasis in your advertising materials. Most
popular schemes are explained with the "try it, you'll like it"
approach. The PCT approach appeals to facts, reason, and
understanding. PCT is offered as a way of making sense of what
actually happens between people; when you understand how it makes
sense, you don't need formulas to tell you what to do. That ought to
appeal to engineers.


-------------------------------------------------------------
Martin Taylor (940411.1111) --

To get this post down to its present length I have had to delete
large gobs dealing with various aspects of your 7-hour product. I'm
trying to get away from incidental details and stick with the main
big issues. Some things have become clear while writing this, so I
hope the omitted materials will not matter.
----------------------------------
Martin, your reasoning about simultaneous integral and differential
equations boggles my mind.

The output is not a function of the current value of the
perceptual signal and the reference signal, except in the
special case that it is so defined. If the output function is
an integrator, it can take ANY value, given only the current
values of p and r.

If

p = o + d, and

o = k*integral(r - p),

that is all you need to say. Surely, you will admit that an integral
is a "function." The integral does not reach back through time to
give previous values of p an effect on present values of o. Those
previous values of p had their effects when they occurred, which are
still reflected in the present value of o. The second equation above
could equally well be written

do/dt = k*(r - p)

o = integral(do/dt)dt

The present value of r and p determine the present velocity of o:
how much o is going to change during the current dt. Differential-
integral equations describe the _present_ relationships that hold in
a system.
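This can be sketched numerically. The following is a minimal illustration, not from the original post; the gain k, the reference r, the constant disturbance d, and the step size dt are all arbitrary choices made for the demonstration:

```python
# Euler-stepped version of the pure-integrator loop:
#   p = o + d,   do/dt = k*(r - p)
# At every instant the *rate* of o is set by the present r and p;
# the *level* of o reflects the whole accumulated history.

def simulate(k=5.0, r=1.0, d=0.2, dt=0.001, steps=10000):
    o = 0.0
    for _ in range(steps):
        p = o + d              # perceptual signal: output plus disturbance
        o += k * (r - p) * dt  # do/dt = k*(r - p), integrated forward
    return o, o + d

o, p = simulate()
# p converges to r, so o converges to r - d: the loop cancels the
# disturbance even though only present values enter each update.
```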

The p that you substitute from p = o + d is identically the p in
do/dt = k*(r - p), save for whatever transport lags are present in
the input function and comparator. If those lags are present, you
can still solve the continuous differential equations (in principle)
with Laplace transforms, using the multiplier
exp(-s*tau) to represent the present-time effect of a pure delay
tau.

Only when you fail to let the discrete approximations go to the
limit is there any problem with past and present. The whole point of
the calculus is to convert from a representation of the world in
terms of a succession of static discrete values (in which finite
derivatives can't exist) into a continuous world describable
completely in present time. Many confusing aspects of physical
processes are cleared up by this transition. But apparently, Zeno
lives!

Then the question is how important is current "p" in
determining the value of "o". And the answer, if there is ANY
transport lag, is "not at all." All of the current value of o
is determined by past values of r and p.

To say that the present value of o is _attributable_ to past values
of r and p is correct, from the viewpoint of an observer with
memory. However, in a device with a delay followed by an
integration, the present value of o is _determined_ by its
immediately previous value and the present value of a signal
emerging from the delay device: do/dt = f(delayed signal). At the
time the delayed signal has its effect on do/dt, some other value
exists, in general, at the input to the delay device. The present-
time effect on o does not depend on the signal at the input to the
delay device, but on the signal at the output of that device.
-------------------------------------------------
If you're going to take the point of view of an external observer,
you must look simultaneously at the inputs and outputs of delay
devices or integrating devices. The output of a delay device is
occurring now; at that same time, there is a different input to the
delay device, one that will not show up until it has passed entirely
through the device, at which time it will become the output of that
device and become the next function's input. The input to an
integrator is affecting the rate of change of its output. Unless the
integrator also contains a transport lag, this effect of input on
output is instantaneous: the rate of change of output is
proportional to the _present_ value of input to the integrator.
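A small sketch of the delay-then-integrator arrangement described above (the delay length, gain, and input waveform are illustrative choices, not from the post):

```python
from collections import deque

# A pure transport lag followed by an integrator: do/dt depends on the
# signal *emerging* from the delay line, never on the signal currently
# entering it.

def step_with_delay(inputs, delay_steps=3, k=1.0, dt=0.1):
    line = deque([0.0] * delay_steps)  # the transport lag
    o = 0.0
    outs = []
    for x in inputs:
        delayed = line.popleft()  # value that entered delay_steps ago
        line.append(x)            # present input: no effect on o yet
        o += k * delayed * dt     # do/dt = f(delayed signal)
        outs.append(o)
    return outs

outs = step_with_delay([1.0] * 6)
# For the first delay_steps samples the integrator sees only zeros, so o
# stays at 0 even though the input to the delay is already nonzero.
```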
-------------------------------------------
[Picked up from later]

Anyway, what all this says is that for REAL control systems,
and for models that provide results that fit their actions
well, you cannot use the simplification of pretending that

p = o + d
o = f(r-p)

describes the relationships of the signals around the loop if p
is the same in both equations and f is a point-time (impulse)
function. You have to treat the values as time-varying
waveforms, and allow for convolution with non-impulse functions
(which can include time-lags).

Here is what the equations mean:

                        r
                        |
 ---------> p ------ [comp] --- e
            |                   |
            Fi                  Fo
            |                   |
            qi ---------------- o
            |
            d

Clearly, p = Fi(o+d), and

o = Fo(r - p).

And just as clearly, the variable p in both equations is precisely
the single variable indicated by the arrow in the diagram. That
variable, even if we write it p(t), can have only one value at a
time, that time being NOW. The same is true for all the other
variables, d(t), o(t), e(t), or r(t). The "t" in all those
expressions means NOW. However you write the mathematical
representation of this system, the representation must produce the
values of the variables that exist NOW. Even if Fo were defined as
SUM-over-tau{e(t-tau)*f(tau)}, as in the Artificial Cerebellum, only
the _present_ value of e enters that function, and only the
_present_ value of output comes out of it. If you start thinking of
the variables in this loop as scattered all over different times,
you are only going to push deeper into thickets of confusion.
Everything that has to do with time-dependent effects takes place
INSIDE the functions, not outside them. The variables all coexist in
present time.
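The point that time-dependent effects live INSIDE the function can be sketched directly. This is an illustrative construction, not the Artificial Cerebellum itself; the kernel values are arbitrary:

```python
from collections import deque

# An output function Fo defined as SUM-over-tau{e(t-tau)*f(tau)},
# packaged so that only the present e goes in and only the present o
# comes out. The history lives inside Fo as internal state.

class ConvolutionFo:
    def __init__(self, kernel):
        self.kernel = kernel
        self.history = deque([0.0] * len(kernel), maxlen=len(kernel))

    def __call__(self, e_now):
        self.history.appendleft(e_now)  # the past is internal to Fo
        return sum(k * e for k, e in zip(self.kernel, self.history))

fo = ConvolutionFo([0.5, 0.3, 0.2])
outs = [fo(e) for e in [1.0, 0.0, 0.0, 0.0]]
# outs == [0.5, 0.3, 0.2, 0.0]: each call takes the present e and yields
# the present o; the impulse response unfolds inside the function.
```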
------------------------------------------

... if it happens to be true that information about the
disturbance appears in perception, then FROM A PROPAGANDA
POINT OF VIEW, PCT becomes harder to distinguish from S-R
approaches to psychology.

[that statement of mine] is in no way close to suggesting that
you emphasize differences that don't really exist. Quite the
opposite. It emphasizes the importance of helping people to
see the differences that are real.

One of the differences that is real is the assumption that something
about the nature of the disturbing variable, called the "stimulus"
in S-R theory, must be known to the control system in order for it
to "respond" as it does. If you say there is information in the
perceptual signal about the disturbance, and mean "the disturbing
variable" as we have always meant in PCT or "the stimulus" as it is
described in S-R theory, then you are calling our thesis propaganda,
where we believe it to be a highly important fact.

The point with which I am in sympathy is that people who don't
understand PCT might be inclined not to see the differences
that do exist, if they take a superficial look at information
analyses. Consider--even after all this time, Rick can't see
the difference between my approach to control using information
theory and an S-R approach.

This is strictly because when you and Rick use the term
"disturbance" you are referring to different variables. Both of you,
I must say, doggedly assume that the other is talking about your own
definition, when in fact you are talking about variables at
different places in space, physically distinct from each other, and
differently related to the operation of the control system.

Your use of the term "disturbance" boils down to "a change in the
controlled variable." Rick's boils down to "the set of all
independent physical variables capable of causing a change in the
controlled variable." Obviously, the perceptual signal DOES contain
information about the former because the relationship is direct, and
just as obviously, it DOES NOT contain information about the latter
because the relationship is indeterminate.
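The indeterminacy claim can be made concrete with a toy example (the numbers are illustrative, not from the post):

```python
# Many different sets of disturbing variables, acting through different
# paths, can sum to the identical net effect on the controlled variable.
# The perceptual signal, being a function of the CEV alone, cannot tell
# them apart.

o = 0.5                      # some fixed output contribution

d_set_a = [0.3]              # one disturbing force
d_set_b = [2.0, -1.2, -0.5]  # three forces through different paths

cev_a = o + sum(d_set_a)
cev_b = o + sum(d_set_b)
# cev_a == cev_b: the same fluctuation of the controlled quantity, hence
# the same perceptual signal, from entirely different disturbances.
```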

You spend a long time castigating me for not using
"disturbance" in a specific way. But the way I use it
coincides with the wordings that we worked out over a long
discussion period. It is the output of the disturbance
function in your diagram of 940405.1400.

The way you use it coincides exactly with the way you decided to use
it before we ever talked. What we worked out was that we would
reserve the word "disturbance" to mean "disturbing variable," and
use another word, like "fluctuation", to refer to changes in the
controlled variable itself. You have ignored that agreement, not I.

The "output of the disturbance function" is not an observable
variable; it is a computational fiction used for convenience in
diagramming. If the disturbing variable is a force, it acts on the
mass of the object associated with the controlled variable along
with the force produced via the feedback path. The only observable
variable is the acceleration of the mass, which is a time-function
of position. If the controlled variable is position, then only the
second integral of the acceleration is observed. Whatever
contribution there is from the disturbing variable, it is
measurable only as a change in the controlled quantity, the
position. So your definition of disturbance boils down, as I said,
to "a change in the controlled quantity." That is why I wanted to
use a term like "fluctuation" for your version of disturbance. The
state of a fluctuation in the controlled quantity is totally
ambiguous with respect to the number and direction of disturbing
forces. You can get exactly the same fluctuation from an infinite
variety of different disturbances and paths by which they act. So it
is impossible for the perceptual signal to contain information
related to any specific disturbance. The only information in the
perceptual signal is about the state of the CEV.
----------------------------------------------------

According to PCT, p(t) = P(f(o(t)) + h(d(t))), where p is the
perceptual signal, P the perceptual function, f the feedback
function, o the output, h the disturbance function, and d the
disturbing variable. This is an example of a function of the
form X = F(S+N). It is wrong to say of such a function that X
_inherently_ is independent of S or of N.

Ah, I begin to see. We can show that for any KNOWN disturbance,
there is a correlated fluctuation in the CEV, the amount of
correlation depending inversely on the quality of control. By a
formal definition of information, we can therefore calculate how
much information related to that KNOWN disturbance is contained in
the fluctuations of the CEV. Some of that information will appear in
the perceptual signal.

Your assumption is that this information, the part of it that is due
to the disturbance, is important in the process of control. In
effect, this says that the control system must know at least
SOMETHING about the disturbance in order to control. But that is a
non-sequitur. The existence of this calculable amount of information
does not automatically mean that this information is used in
control. It is present, and to an external observer (or a higher-
level system) it might indicate the presence of one or more external
disturbances acting through unknown paths. So it may provide some
information to an observer or a higher-level system.

But the information that the control system uses is the state of the
CEV itself, regardless of what put it in that state. The more
accurately the control system can know the state of the CEV, the
more accurate will its actions be in effecting control. If there is
any information about the disturbance in the perceptual signal, this
information amounts to noise, because it makes the undisturbed state
of the CEV more difficult to determine. If the disturbance were
suddenly removed, the control system would immediately restore the
CEV exactly to its reference level. If the reference signal changed,
the CEV would be brought immediately to its new reference level. All
that the disturbance can do is interfere with the process of
control. If it "provides information," that is unwanted information,
like the information in a radio signal that tells you a lightning
storm is nearby when you are trying to hear the news.

This interpretation says that even if we can calculate the amount of
information in the perceptual signal due to the effective
disturbance, that information is simply noise with respect to the
process of control. What the control system needs is information
about the state of the CEV, and that information consists of the sum
of information about effects of the output and effects of
disturbances (or better, the difference), processed by the physical
laws governing the CEV. It is information about the effects of
output on the controlled variable that enables control to work;
information about the disturbance makes information about the output
effects harder to obtain.

This fits with the fact that the more information about the
disturbance we find in the perceptual signal, the WORSE we find
control to be. If it is the information about the disturbance that
is used for control, then the opposite should be true: in the best
control systems, we should find the most information about the
disturbance being passed through the perceptual signal (which is to
say, in the CEV represented by the perceptual signal). What we do
find is that such information, even if it exists, only interferes
with the process of control.
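The inverse relationship between control quality and disturbance information can be demonstrated with a correlation measure. This is an illustrative simulation; the gains, the slowly varying disturbance, and the use of correlation as a stand-in for information are all assumptions of the sketch:

```python
import math
import random

# As loop gain rises (better control), the correlation between the
# disturbance waveform and the perceptual signal falls.

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

def run(k, dt=0.01, steps=5000):
    random.seed(0)
    o, d = 0.0, 0.0
    ps, ds = [], []
    for _ in range(steps):
        d = 0.99 * d + 0.1 * random.gauss(0, 1)  # slowly varying disturbance
        p = o + d
        o += k * (0.0 - p) * dt                  # integrating output, r = 0
        ps.append(p)
        ds.append(d)
    return corr(ps, ds)

low, high = run(k=0.1), run(k=100.0)
# With poor control (low gain) p mirrors d closely; with good control
# (high gain) the disturbance leaves little trace in p.
```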

Control is not just the ability of a system to resist disturbances
of its CEV. It is its ability to bring the CEV to any reference
condition within the possible range and maintain it there. In the
complete absence of disturbances, this control is at its best.
Disturbances only reduce the degree of control.
--------------------------------------
We have now gone about 390 degrees around the Sun since this
discussion first developed. Perhaps if I had been able to think of
the present arguments it would not have gone on so long. We have
confused _presence_ of calculable information in the perception due
to the disturbance with the _use_ of this information in enabling
control to happen. This is exactly the mistake made by S-R
psychology: if presence of a stimulus leads to correlated action,
then the stimulus must be what is causing the action. This overlooks
the fact that what we call behavior is essentially always the
behavior of a controlled variable, and controlled variables change
significantly only when reference signals change. Actions don't
cause these changes; the changes are brought about when external
forces tending to change the controlled variable are supplemented
with just the actions needed to create the intended changes. We
customarily speak of disturbances as if they always make errors
worse. But they have just as much chance of making them smaller. The
control system's actions merely make up the difference so the
controlled quantity remains where it is intended to be.
------------------------------------

A control system can operate successfully in an environment
where a hundred simultaneous causes are acting on its
controlled variable in completely unpredictable ways, through
a thousand unknown physical linkages.

Do you really believe "completely unpredictable" in that? Will
it work if the control system has a loop delay of L, and the
disturbance might go from +R to -R and back in time L/2? (R =
control range)

Why do you pick the least important part of what I say to quibble
with? The point is that disturbances are multiple and act through
multiple paths, with neither the nature of the disturbing variables
nor the physical linkages connecting them to the controlled variable
being knowable by the control system. To this you can add that each
disturbance is unpredictable "outside the control bandwidth," which
merely exaggerates the problem and is by no means the main
difficulty. What the control system can know is confined to the CEV.
------------------------------

Your arguments about information theory would be more
interesting if they were based on work you had actually done.

Well, I tried that first. I gave you a theorem, which you said
was nonsense _a priori_ because it talked about information.

I don't consider a theorem to be "work done" no matter what it talks
about. I'll accept it as meaning something when it's shown to fit
data. Theorems hide multitudes of assumptions about real systems,
many of which can easily be wrong.
----------------------------------------

(5) Does knowledge of the perceptual signal, the output function
form and parameters, and the feedback function form and
parameters convey any information about the disturbance
waveform?

If the answer to (5) is "Yes" then, because the answers to (1)
to (4) are "No," the answer to (6) HAS TO BE "Yes."

But the answer to 5, as stated, is NO. You have to know in addition
the waveform of the reference signal and the form of the comparator,
as well as the form of the function through which the disturbance --
known to be single -- has an effect on the controlled variable. In
other words, you have to know all the functions and the value of the
reference signal before you can deduce the waveform of the
disturbance. As long as you have more than one unknown, all unknowns
remain indeterminate. If there is only one unknown, then your
knowledge of all the other aspects of the system permits it to be
calculated.
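The single-unknown case can be simulated: an observer who knows r, the output function, and the input and feedback functions can compute o and so recover d from the perceptual signal. The waveform and parameters below are illustrative choices:

```python
import math

# With everything else known, d = p - o is fully determined.

def simulate(k=2.0, dt=0.01, steps=2000):
    o = 0.0
    ps, os_, ds = [], [], []
    for i in range(steps):
        d = math.sin(0.01 * i)   # the hidden disturbance waveform
        p = o + d                # unit input and disturbance functions
        ps.append(p)
        os_.append(o)
        ds.append(d)
        o += k * (1.0 - p) * dt  # r = 1.0, integrating output function
    return ps, os_, ds

ps, os_, ds = simulate()
recovered = [p - o for p, o in zip(ps, os_)]
err = max(abs(rec - d) for rec, d in zip(recovered, ds))
# err is at floating-point noise level: one unknown, one equation.
```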
----------------------------------------------------

The faster and more accurate the use of the information by the
output, in countering the disturbance, the less you see in the
perceptual signal. This is basic, isn't it? I don't know why
you get so hung up on this point.

How does that information get to the output, if not via the
perceptual signal? This certainly is basic, but you're the one
waving the wand and chanting abracadabra. The information is there
in the perceptual signal, but it gets whipped away before we can
measure it, is that it? This reminds me of the Keynesian "pause"
during which savings are taken out to be invested, and then are
restored so they can be spent on the increased production, all while
time stands still.

Any information about the disturbance in the perceptual signal is
suppressed by successful control; what does get through simply
worsens control. The only information that matters in control is
information about the CEV.
----------------------------------------------------------------
Best,

Bill P.

<Martin Taylor 940412 18:00>

Bill Powers (940412.0700 MDT)

Martin, your reasoning about simultaneous integral and differential
equations boggles my mind.

The output is not a function of the current value of the
perceptual signal and the reference signal, except in the
special case that it is so defined. If the output function is
an integrator, it can take ANY value, given only the current
values of p and r.

If

p = o + d, and

o = k*integral(r - p),

that is all you need to say. Surely, you will admit that an integral
is a "function." The integral does not reach back through time to
give previous values of p an effect on present values of o. Those
previous values of p had their effects when they occurred, which are
still reflected in the present value of o.

This boggles MY mind. How can you write the second-last sentence and
IMMEDIATELY follow it with the contradictory last sentence? If the
past values of p are (as they in fact are) still reflected in o, how
can you say the function does not give previous values of p an effect
on present values of o?

And you seem to say this in order to contradict my statement that

If the output function is
an integrator, it can take ANY value, given only the current
values of p and r.

Do you really intend to deny this?

The second equation above
could equally well be written

do/dt = k*(r - p)

No it couldn't. The integral equation entails the differential one,
but the reverse is not true. It misses precisely the constant of
integration, which is where the effects of past values of p are found.
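The constant-of-integration point admits a two-line demonstration (the input waveform and initial values are illustrative):

```python
# Two integrators driven by the identical input satisfy the same
# differential equation do/dt = e, yet their outputs differ by their
# historical starting values throughout.

dt = 0.1
inputs = [0.5, -0.2, 0.3, 0.1]

def integrate(o0):
    o = o0
    trace = []
    for e in inputs:
        o += e * dt  # the same rate equation for both trajectories
        trace.append(o)
    return trace

a = integrate(0.0)
b = integrate(10.0)
diffs = [y - x for x, y in zip(a, b)]
# Every rate matches, but b - a stays 10.0: the differential form alone
# cannot tell you the present level of o.
```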

The present value of r and p determine the present velocity of o:
how much o is going to change during the current dt.

But not what the present value of o is, and the present value of o is
what affects the CEV in opposition to the present value of what-you-
don't-have-a-word-for-and-don't-want-to-call-the-disturbance. (I'll
call it the "wydhawf." The wydhawf is the value of the output of the
disturbance function.)

Differential-
integral equations describe the _present_ relationships that hold in
a system.

Of course they do. If you think I imply they don't, you misread me badly.

The p that you substitute from p = o + d is identically the p in
do/dt = k*(r - p), save for whatever transport lags are present in
the input function and comparator.

It is that, but the "o" in the differential equation doesn't plug into
the other equation. It isn't a straight substitution. I said that in
the previous form, the "p" was not the same. In those equations you
were going around the loop from p to p. In this form you are going from
o to o, and the "o" in p=o+d has a value that is the integral of the
derivative form, and therefore includes the effect of all past o, which
includes the effect of all past p.

If those lags are present, you
can still solve the continuous differential equations (in principle)
with Laplace transforms, using the multiplier
exp(-s*tau) to represent the present-time effect of a pure delay
tau.

Laplace transforms are indeed the easiest way to deal with convolutions
such as occur in control loops. At least in linear systems. Most
analyses of control systems use them, but the thought has been expressed
here that they are too complicated to introduce into this discussion. If
you want to do so, I'm happy. We might get things a bit clearer.

Here is what the equations mean:

                        r
                        |
 ---------> p ------ [comp] --- e
            |                   |
            Fi                  Fo
            |                   |
            qi ---------------- o
            |
            d

Clearly, p = Fi(o+d), and

o = Fo(r - p).

I think this is another example of a notation affecting the perception of what
lies behind the notation. Instead of seeing the dotted lines and letters
as representing links, visualize them as having heights or thicknesses
that correspond to the value of the signal at the moment when you are looking
at the figure. Then visualize that as a movie.

Let's change the reference signal by a step. What does e do? It goes
up by a step, instantaneously. Now, a microscopic moment after the step,
what value does o have? Almost exactly the same as before the step,
isn't that so? Over time, the value of o goes up, initially linearly. If
you want, let's assume that the feedback function (represented by a dotted
line, but normally a function over time) is just an instantaneous link,
as is Fi. So p goes up initially linearly. Here are some graphs of this
much.


 ref                        ________________
    _______________________|                   future

 error                      ________________
    _______________________|                   future

 output
                              .
    _________________________.                 future

 perception
                              .
    _________________________.                 future

Now, the increasing perception reduces the error, so it reduces the rate
at which output changes. Eventually, the rate of change of output goes
down to zero, as the perceptual signal comes up to the reference. The
rates are all based on current values, but the levels are not. They
are based on what has happened in the past. And that doesn't depend on
things changing in steps. Always, when you have an integral function,
the value of the output depends on past values of the input.
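This step-response narrative can be checked numerically. A minimal sketch, with instantaneous Fi and feedback as assumed above and illustrative values for the gain and step time:

```python
# After a step in r: the error jumps at once, while the output and
# perception start from their old values and ramp, the ramp rate
# shrinking as p approaches r. Rates are set NOW; levels by history.

def step_response(k=2.0, dt=0.01, steps=400):
    o = 0.0
    trace = []
    for i in range(steps):
        r = 1.0 if i >= 10 else 0.0  # step in the reference
        p = o                        # instantaneous Fi and feedback, no d
        e = r - p
        o += k * e * dt
        trace.append((r, e, o))
    return trace

trace = step_response()
r10, e10, o10 = trace[10]
# At the step: e has jumped to 1.0, but o has barely moved; later o
# climbs to the reference and the error dies away.
```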

Even if Fo were defined as
SUM-over-tau{e(t-tau)*f(tau)}, as in the Artificial Cerebellum, only
the _present_ value of e enters that function, and only the
_present_ value of output comes out of it. ...
Everything that has to do with time-dependent effects takes place
INSIDE the functions, not outside them.

Yes, that's been at the heart of what I've been talking about. What goes
into the input now may not affect what comes out of the output until much
later, or it may affect the output almost immediately. It happens inside
the function, and what happens is represented in f(tau), whether it be an
immediate impulse function (not physically realizable), a unit step
(an integrator) or a unit step after a delay (an integrator with transport
lag). What comes out NOW is what is used now at the next stage in the
loop. In a physical system, the output NOW cannot be affected by the
input NOW. There's always some delay, however small.

We are each trying to get the other to see that the variables that are
effective NOW are those that exist NOW at the various places in the loop.
I think we agree on that, but not on its implications. To me, the
implication is that what exists NOW in the loop can in no way be affected
by what exists NOW at any other place in the loop, and may be most strongly
affected by what existed some time ago at a different place in the loop.

------------------------------

Consider--even after all this time, Rick can't see
the difference between my approach to control using information
theory and an S-R approach.

This is strictly because when you and Rick use the term
"disturbance" you are referring to different variables. Both of you,
I must say, doggedly assume that the other is talking about your own
definition, when in fact you are talking about variables at
different places in space, physically distinct from each other, and
differently related to the operation of the control system.
...
The way you use it coincides exactly with the way you decided to use
it before we ever talked. What we worked out was that we would
reserve the word "disturbance" to mean "disturbing variable," and
use another word, like "fluctuation", to refer to changes in the
controlled variable itself. You have ignored that agreement, not I.

I remember it differently, but no matter. I think we need a word for the
output of the disturbance function, which is in the dimensions of the
CEV, just as is the output of the feedback function. If C is the present
value of the CEV, then

C = F(o) + H(d).

This is a form of equation frequently dragged into CSG-L discussions,
and so it is worth being able to talk about its various terms.

What I want a word for is the value of H(d). It isn't the fluctuation
of the CEV, because that is dC/dt, which is the sum of the derivatives
of the other two, if the effect is linear. Above, I used "wydhawf,"
but since there is already a word for "disturbing variable," why not
use the more familiar "disturbance," as I had (apparently wrongly) thought
we previously agreed? But if you really want to use two words for one
thing, I'll try to remember to use "wydhawf" in future--if I can remember
it tomorrow. It has no prior usage, so far as I am aware.

You follow this with a discussion of why the wydhawf is an unnecessary
construct, a computational fiction. I don't think it is unnecessary,
any more than is the error signal, which may not exist as a measurable
quantity in a real control system, such as one in which the output
function is based on a differential input Op-Amp. The wydhawf, as I
once before described it, is the effect the disturbance would have, were
it not opposed by the output effect (which is exactly as unobservable).
I think it a useful construct. And it does represent precisely the
disturbance in the tracking studies, as represented in a computable
way on the monitor screen.

The final thing I want to do in this message is to quarrel with one of
your statements, because the error in it is basic. I quote it out of
context, because the context is irrelevant to the problem:

just as obviously, it DOES NOT contain information about the latter
because the relationship is indeterminate.

It is quite wrong to say that an indeterminate relationship eliminates
information. It may, but more commonly it reduces rather than eliminates
information. I quote from my previous posting:

This is an example of a function of the
form X = F(S+N). It is wrong to say of such a function that X
_inherently_ is independent of S or of N.

For any function F, if the probability distribution of N is known (not
the value of N), then the probability distribution of S given X is a
function of the actual value of S. So knowledge of X reduces the
uncertainty about S, even though the relationship is indeterminate
since the value of N is uncertain. In everyday signal terms, X is
an observation of noise plus a possible signal. How big was the
signal? Was it there at all (non-zero)? You never know for sure,
but X provides information about it. The same holds if N is partitioned
into N1, N2, N3, ..., Nk.
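For the Gaussian case this can be made exact. A sketch under stated assumptions: F is the identity and S and N are independent equal-variance Gaussians, which are illustrative choices, not part of the original claim:

```python
import math

# Observing X = S + N narrows the posterior over S even though N is
# unknown. With Gaussian S and N, the posterior variance of S given X is
#   var(S|X) = var(S)*var(N) / (var(S) + var(N))
# which is always less than the prior var(S).

sigma_s, sigma_n = 1.0, 1.0
prior_var = sigma_s ** 2
post_var = (sigma_s ** 2 * sigma_n ** 2) / (sigma_s ** 2 + sigma_n ** 2)

# Information gained per observation, in bits:
bits = 0.5 * math.log2(prior_var / post_var)
# Here exactly half a bit: the relationship is indeterminate, yet X still
# carries information about S.
```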

All observations--all perceptions--are of this form.

That's all I want to say about the posting for now.

Until later.

Martin

[Bill Leach 940412.22:11 EST(EDT)]

[Bill Powers (940412.0700 MDT)]

If it "provides information," that is unwanted information,
like the information in a radio signal that tells you a lightning
storm is nearby when you are trying to hear the news.

Bill, that is a classic line. I believe that one line satisfies any of
the "information in the disturbance" questions that I have had in a "real
live analogy"... those vague ones that "sorta nag" at you.

-bill