Causality does not imply correlation

[From Richard Kennaway (2009.05.14.1558 BST)]

A new slogan I thought up.

I've been looking in the SEM literature to see how it would treat data like those I showed in http://www2.cmp.uea.ac.uk/~jrk/temp/opdfig.png. However, most of the work I've looked at so far turns out to have some exclusion that puts the example outside its scope.

Judea Pearl's book "Causality" discusses only acyclic causal dependencies.

Lacerda, Spirtes, et al., "Discovering causal models by ICA", allows cyclic causal dependencies, and uses ICA (independent component analysis) to discover cyclic causal models. However, the method makes a fundamental assumption that where variables are uncorrelated, there can be no causal influence of one on the other.

Another paper: Richardson and Spirtes "Automated Discovery of Linear Feedback Models" includes this:

"Under the assumption that all conditional independencies found in the observational data are true for structural reasons rather than because of particular parameter values, the algorithm discovers causal features of the structure which generated the data."

which I think is a statement of the same assumption that causality implies correlation. But we know from control systems that this is not true. Output and disturbance both physically affect the perception, but are uncorrelated with it.

I found a Matlab package for doing ICA (http://www.cis.hut.fi/projects/ica/fastica/), so I downloaded it to see what it would make of the data generated by my simulation. Because ICA assumes that the signals it's trying to detect in the data are non-Gaussian, I made both the "known" disturbance and the reference follow sawtooth waveforms of incommensurable frequencies. I then generated 100000 timesteps and collected the output, perception, known disturbance, and reference, and passed those to FastICA.

It detected two significant signals in the data. When I correlated them with the data, one signal was correlated with O and D, the other with P and R.

In hindsight, that's not surprising. Apart from the noise introduced by the unknown disturbance, the data really only have those two degrees of freedom. But the two signals that ICA detected have no fundamental physical existence.
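A minimal sketch of the kind of simulation described above makes the two-degrees-of-freedom point checkable without ICA (the gain, timestep, sawtooth periods, and noise level here are invented for illustration, not taken from the original run). It simulates an integrating controller driven by sawtooth disturbance and reference, then looks at how much of the variance of (O, P, D, R) sits in the top two principal components:

```python
import numpy as np

def sawtooth(t, period):
    """Sawtooth in [-1, 1) with the given period."""
    return 2.0 * ((t / period) % 1.0) - 1.0

rng = np.random.default_rng(0)
n, dt, gain = 50_000, 0.01, 10.0           # invented parameters
t = np.arange(n) * dt
d = sawtooth(t, 7.3)                       # "known" disturbance
r = sawtooth(t, 11.9)                      # reference, incommensurable period
dn = 0.05 * rng.standard_normal(n)         # small unknown disturbance

o = np.zeros(n)                            # output
p = np.zeros(n)                            # perception
for k in range(1, n):
    p[k] = o[k - 1] + d[k] + dn[k]                 # perception = output + disturbances
    o[k] = o[k - 1] + gain * (r[k] - p[k]) * dt    # integrating controller

# Eigenvalues of the covariance of (O, P, D, R): two dominate.
data = np.column_stack([o, p, d, r])
ev = np.sort(np.linalg.eigvalsh(np.cov(data.T)))[::-1]
top2 = ev[:2].sum() / ev.sum()
print(top2)   # close to 1: the four signals share only two degrees of freedom
```

With good control, P hugs R and O hugs R - D, so any blind source separation (ICA included) can only hand back mixtures of those two underlying signals.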

The algorithms for detecting causality from ICA calculations, as far as I understand them (I haven't found downloadable code), are going to say that the only causal relations are between O and D, and between P and R. If I tell the algorithms that D and R come from outside the system, then they will say that D causes O and R causes P.

There must be a paper in this.

···

--
Richard Kennaway, jrk@cmp.uea.ac.uk, http://www.cmp.uea.ac.uk/~jrk/
School of Computing Sciences,
University of East Anglia, Norwich NR4 7TJ, U.K.

[From Rick Marken (2009.05.15.1250)]

Richard Kennaway (2009.05.14.1558 BST)

A new slogan I thought up. [Causality Does Not Imply Causation] …

There must be a paper in this.

Yes, and I’m writing it;-) I’m currently preparing to have a student collect data and we hope to have the data analyzed and the paper ready by the end of Fall (it will take a while to get IRB approval). But I would give anything (almost;-)) to use your new slogan as the title! How about co-authorship?

Best

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

[From Rick Marken (2009.05.15.1740)]

[Rick Marken (2009.05.15.1250)]

Richard Kennaway (2009.05.14.1558 BST)

A new slogan I thought up. [Causality Does Not Imply Causation] …

There must be a paper in this.

Yes, and I’m writing it;-)

Oops. The title I would like to use is the one you actually suggested: Causality Does Not Imply Correlation. It’s so far out I can’t even write it correctly.

Best

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

A simpler example occurred to me of a causal relationship without correlation. Every bounded function on reals has zero correlation with its first derivative.

For example, the current through a capacitor is proportional to the rate of change of applied voltage, and if "cause" means anything at all, the voltage causes the current.
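A quick numerical check of the zero-correlation claim, with a sine wave standing in for the bounded function (any bounded waveform sampled over whole periods behaves the same way):

```python
import numpy as np

t = np.linspace(0.0, 200.0 * np.pi, 200_001)   # a whole number of periods
v = np.sin(t)          # bounded signal, e.g. capacitor voltage
i = np.cos(t)          # its derivative, proportional to the current

r = np.corrcoef(v, i)[0, 1]
print(r)               # essentially zero, despite the direct causal link
```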

···

--
Richard Kennaway, jrk@cmp.uea.ac.uk, http://www.cmp.uea.ac.uk/~jrk/
School of Computing Sciences,
University of East Anglia, Norwich NR4 7TJ, U.K.

[From Bill Powers (2009.05.19.1155 MDT)]

A simpler example occurred to me of a causal relationship without correlation. Every bounded function on reals has zero correlation with its first derivative.

For example, the current through a capacitor is proportional to the rate of change of applied voltage, and if "cause" means anything at all, the voltage causes the current.

However, if the capacitor is discharging through a resistor, the current drain through the resistor, while being caused by the voltage applied to the resistor, is causing the voltage on the capacitor to decline.

Note that there are two functional pathways here forming a closed loop. Does that suggest a generalization? Are all closed loops acausal?

dE/dt = -I/C

E = IR

from which we get two results:

dE/dt = -(E/R)/C

therefore

dE/dt = -E/(RC) (current is eliminated),

or

d(IR)/dt = -I/C

therefore

dI/dt = -I/(RC) (voltage is eliminated)

Since either E or I can be eliminated, neither is causal. The behavior of either variable, given the initial conditions, is a property of the loop, not of either part of it.
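Taking the discharge sign convention above (dE/dt = -E/(RC)), both eliminations give the same exponential decay with time constant RC, which is one way to see that the trajectory belongs to the loop rather than to E or I separately. A quick check with arbitrary component values:

```python
import math

R, C = 1000.0, 1e-6        # 1 kilohm, 1 microfarad -> RC = 1 ms (arbitrary values)
E0, dt, steps = 5.0, 1e-6, 1000

# Euler-integrate dE/dt = -E/(RC) for one time constant.
E = E0
for _ in range(steps):
    E += -E / (R * C) * dt

analytic = E0 * math.exp(-steps * dt / (R * C))
print(E, analytic)         # both close to E0/e, about 1.84 V
```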

Best,

Bill P.

···

At 06:32 PM 5/19/2009 +0100, Richard Kennaway wrote:

[From Rick Marken (2009.05.19.1220)]

Bill Powers (2009.05.19.1155 MDT)

A simpler example occurred to me of a causal relationship without correlation. Every bounded function on reals has zero correlation with its first derivative.

For example, the current through a capacitor is proportional to the rate of change of applied voltage, and if “cause” means anything at all, the voltage causes the current.

However, if the capacitor is discharging through a resistor, the current drain through the resistor, while being caused by the voltage applied to the resistor, is causing the voltage on the capacitor to decline.

Note that there are two functional pathways here forming a closed loop. Does that suggest a generalization? Are all closed loops acausal?

I would say that each component of the loop is, indeed, causal. It’s just that the causal connection between variables in the loop will not show up as a correlation between the variables under certain circumstances (in particular, when the input is determined by a “second” disturbance – as in Richard’s demo – or a noise disturbance – as in our tracking tasks). So in the compensatory tracking experiment, the closed loop is represented by the two (linearized) simultaneous equations:

o = k1 (r - i) and
i = k2 (o + d + dn)

So what Richard shows by simulation and what we have shown in real data is that the causal relationship between input (i) and output (o) does not show up as a correlation between i and o, which it should if r is constant; and which it does in simulations where dn (the “second” or noise disturbance) does not exist.

I think causality would imply correlation if the open loop model of behavior were correct. In that case o = k1(i) + e and a low correlation between i and o would imply that there is either no causal relationship between i and o or there is a large noise component (e) obscuring the relationship between i and o. So in the compensatory tracking task, one could say that the causal effect of i on o doesn’t show up as a correlation because there is a lot of noise (e) contributing to the output (o). But in my “repeated disturbance” version of the experiment, the nearly perfect correlation between outputs on two trials with the same disturbance shows that nearly all the variance in o is “predictable”; the causal relationship between i and o implies a very high correlation between i and o; but the correlation doesn’t show up, confirming Richard’s mantra: causality does not imply correlation.
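The repeated-disturbance point can be sketched in simulation (everything here is invented: a bare integrating controller with r = 0, a slow sinusoidal disturbance, and fresh noise on each trial; it is not fitted to real tracking data). The same disturbance run twice gives nearly identical outputs, yet input and output stay uncorrelated:

```python
import numpy as np

def run_trial(d, dn, gain=10.0, dt=0.01):
    """One tracking trial: i = o + d + dn, with integrating output o' = -gain * i."""
    n = len(d)
    o = np.zeros(n)
    i = np.zeros(n)
    for k in range(1, n):
        i[k] = o[k - 1] + d[k] + dn[k]
        o[k] = o[k - 1] - gain * i[k] * dt
    return i, o

rng = np.random.default_rng(0)
n, dt = 50_000, 0.01
t = np.arange(n) * dt
d = np.sin(2 * np.pi * 0.05 * t)                     # same disturbance on both trials
i1, o1 = run_trial(d, 0.2 * rng.standard_normal(n))
i2, o2 = run_trial(d, 0.2 * rng.standard_normal(n))  # fresh noise this time

corr = lambda a, b: np.corrcoef(a, b)[0, 1]
print(corr(o1, o2))   # high: the output is almost entirely "predictable"
print(corr(i1, o1))   # yet input and output stay essentially uncorrelated
```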

Best

Rick

PS. I’m still working on the multiple regression version of the test. It’s close but no cigar yet, largely due to my lousy programming skills. But I’ll get it.

···

At 06:32 PM 5/19/2009 +0100, Richard Kennaway wrote:

Richard S. Marken PhD

rsmarken@gmail.com

[From Bill Powers (2009.05.20.1335 MDT)]

Martin Taylor 2009.05.20.15.05

Maybe Richard K. can tell us what Martin was talking about. I don't think the angle he was talking about is the slope of the regression line, though that is what he seemed to be saying.

Oh, WOW! Do you ever misunderstand!!! After all the explanation I went through, and yet you can say something like that! How can this be possible?

You're always so surprised to find that your words have been ambiguous. Could it be that your explanations rather than your readers lack something? Or that you answered before understanding the question?

When I raised the question of the slope of the regression line, as I recall what my concern was then, I was wondering how the standard deviations and the correlation related to it since they appear in the equations for the regression line. You brought up the interpretation of the correlation coefficient as the cosine of the angle between two data vectors (which is what the wiki article you just cited seems to be about). I naturally assumed that you meant that this angle was the angle I was enquiring about, the angle whose tangent is the slope of the regression line.

The equation in question is (according to my Mathematics Manual)

Y = r[xy]*(Sigma[y]/Sigma[x])*(X - Xbar) + Ybar

where I've spelled out the symbol for sigma and used [xy] etc. to indicate subscripts. The slope of the regression line is

b = r[xy]*(Sigma[y]/Sigma[x]),

which as you see contains the correlation coefficient and the two standard deviations.

According to the Wiki article (and you), r[xy] is the cosine of the angle between two vectors made up of various values of x and y. However, you're now boggling loudly at how anyone could have thought you meant the angle was related to the slope given as above. Since the latter is the angle I was enquiring about, I don't really see anything to boggle at but your answer, which apparently was a non-sequitur.

Clearly, in the limit where there is no randomness in the relationship at all, r[xy] is 1, and Sigma[y] and Sigma[x] are zero, so to find the slope of the regression line requires something more than knowing that the cosine of some angle in hyperspace is 1, making the hypothetical angle zero. That still leaves you with 0/0 to cope with. I'm sure that when you go through all the derivatives approaching all the limits, you will find that the slope of the regression line is independent of the magnitude of the random noise added to the underlying relationship, while the correlation decreases as the noise magnitude increases. So whatever the "angle" related to the correlation may be, it is not the angle whose tangent is the slope of the regression line. I got the definite impression that you were saying it WAS that angle.
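That expectation checks out numerically: an OLS slope is unbiased under independent noise added to y, while the correlation shrinks. A sketch with invented data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)

def slope_and_r(noise_sd):
    """Fit y = 2x + noise; return the OLS slope b = r * sigma_y / sigma_x, and r."""
    y = 2.0 * x + noise_sd * rng.standard_normal(x.size)
    b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    r = np.corrcoef(x, y)[0, 1]
    return b, r

b_low, r_low = slope_and_r(0.1)    # little noise
b_high, r_high = slope_and_r(5.0)  # heavy noise
print(b_low, r_low)    # slope near 2, r near 1
print(b_high, r_high)  # slope still near 2, r much smaller
```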

If you had said the opposite originally we would not be having this delightful discussion.

Best,

Bill P.

[Martin Taylor 2009.05.20.23.00]

[From Bill Powers (2009.05.20.1335 MDT)]

Martin Taylor 2009.05.20.15.05

Maybe Richard K. can tell us what Martin was talking about. I don't think the angle he was talking about is the slope of the regression line, though that is what he seemed to be saying.

Oh, WOW! Do you ever misunderstand!!! After all the explanation I went through, and yet you can say something like that! How can this be possible?

You're always so surprised to find that your words have been ambiguous. Could it be that your explanations rather than your readers lack something? Or that you answered before understanding the question?

When I raised the question of the slope of the regression line, as I recall what my concern was then, I was wondering how the standard deviations and the correlation related to it since they appear in the equations for the regression line. You brought up the interpretation of the correlation coefficient as the cosine of the angle between two data vectors (which is what the wiki article you just cited seems to be about). I naturally assumed that you meant that this angle was the angle I was enquiring about, the angle whose tangent is the slope of the regression line.

Right, and it wasn't. It had no relation to the slope of the regression line. None.

According to the Wiki article (and you), r[xy] is the cosine of the angle between two vectors made up of various values of x and y. However, you're now boggling loudly at how anyone could have thought you meant the angle was related to the slope given as above. Since the latter is the angle I was enquiring about, I don't really see anything to boggle at but your answer, which apparently was a non-sequitur.

???

Martin

[From Bill Powers (2009.05.21.0322 MDT)]

Martin Taylor 2009.05.20.23.00 --

I naturally assumed that you meant that this angle was the angle I was enquiring about, the angle whose tangent is the slope of the regression line.

Right, and it wasn't. It had no relation to the slope of the regression line. None.

Yes, that's clear, but you still don't seem to realize what I'm saying. This was before your "hyperspace triangles" discussions. We were talking about the meaning of correlation. I had noticed that the slope of the regression line was given in my reference book as

b = r[xy] * sigma[y] / sigma[x]

and I didn't see offhand how to compute b for the case of perfect correlation, in which r[xy] is 1 and the other two factors are -- what? Zero? I was saying that the regression line's slope indicating how y was related to x was unrelated to the correlation, meaning that different correlations could be found in data sets with the same slope of the regression line. You seemed to be arguing that the correlation did describe that slope. I evidently misunderstood you, but you never seemed to realize that I was talking about the slope, while you were talking about something else. The discussion was about something like "degree of relatedness" and you were saying that correlation shows the relationship between two variables even if there is no noise. The "relationship" you were thinking of (degree of relatedness) was evidently not the "relationship" I was asking about (the function relating y to x).

According to the Wiki article (and you), r[xy] is the cosine of the angle between two vectors made up of various values of x and y. However, you're now boggling loudly at how anyone could have thought you meant the angle was related to the slope given as above. Since the latter is the angle I was enquiring about, I don't really see anything to boggle at but your answer, which apparently was a non-sequitur.

???

By non-sequitur I mean you were answering a question I hadn't asked. It may be true that the correlation can be interpreted as the cosine of some hypothetical angle between two vectors in hyperspace, but the angle I was asking about was arctan(b), so your answer was irrelevant to the question.

Best,

Bill P.

[From Bill Powers (2009.05.20.1109 MDT)]

Rick Marken (2009.05.19.1220) --

I think causality would imply correlation if the open loop model of behavior were correct. In that case o = k1(i) + e and a low correlation between i and o would imply that there is either no causal relationship between i and o or there is a large noise component (e) obscuring the relationship between i and o.

I think you have the final word on this. I've been confused by something Martin Taylor said about correlation coefficients showing the relationship between variables -- the cosine of the angle between them. I had always understood that correlations show how much noise there is in comparison with any regular relationship. I'm now leaning back toward that view. Maybe Richard K. can tell us what Martin was talking about. I don't think the angle he was talking about is the slope of the regression line, though that is what he seemed to be saying.

So in the compensatory tracking task, one could say that the causal effect of i on o doesn't show up as a correlation because there is a lot of noise (e) contributing to the output (o). But in my "repeated disturbance" version of the experiment, the nearly perfect correlation between outputs on two trials with the same disturbance shows that nearly all the variance in o is "predictable"; the causal relationship between i and o implies a very high correlation between i and o; but the correlation doesn't show up, confirming Richard's mantra: causality does not imply correlation.

That all makes excellent sense to me now. If the regression equation is y = ax + b, that is the underlying relationship of x to y, partially obscured by noise when the correlation is less than 1. Adding random noise does not change the slope or intercept of the relationship.

Best,

Bill P.

[Martin Taylor 2009.05.20.15.05]

[From Bill Powers (2009.05.20.1109 MDT)]

Rick Marken (2009.05.19.1220) --

I think causality would imply correlation if the open loop model of behavior were correct. In that case o = k1(i) + e and a low correlation between i and o would imply that there is either no causal relationship between i and o or there is a large noise component (e) obscuring the relationship between i and o.

I think you have the final word on this. I've been confused by something Martin Taylor said about correlation coefficients showing the relationship between variables -- the cosine of the angle between them. I had always understood that correlations show how much noise there is in comparison with any regular relationship. I'm now leaning back toward that view.

There's no contradiction. It's just a different way of looking at the relationship.

Maybe Richard K. can tell us what Martin was talking about. I don't think the angle he was talking about is the slope of the regression line, though that is what he seemed to be saying.

Oh, WOW! Do you ever misunderstand!!! After all the explanation I went through, and yet you can say something like that! How can this be possible?

Rather than get entwined in another of these interminable efforts to explain, or get Richard so entwined, I just point you to the relevant section of the Wikipedia article on correlation.
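The identity in question is exact and takes three lines to verify: Pearson's r is the cosine of the angle between the two mean-centred data vectors (nothing to do with the regression slope). A quick check with arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = 0.5 * x + rng.standard_normal(500)   # arbitrary related data

r = np.corrcoef(x, y)[0, 1]
xc, yc = x - x.mean(), y - y.mean()      # centre the data vectors
cos_angle = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
print(r, cos_angle)                      # identical up to rounding
```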

And of course causality doesn't imply correlation, nor does correlation imply causation. Nevertheless, when there is causation, one often finds correlation, and where there is correlation, one always finds either common influence or influence of the cause of one variable on the cause of the other.

As is so often the case on CSGnet discussions, there is a great danger here of confusing a probability less than unity with a probability equal to zero, or imperfect information with zero information.

Martin

[From Rick Marken (2009.05.20.1250)]

Martin Taylor (2009.05.20.15.05) –

And of course causality doesn’t imply correlation

This might be obvious to you but I think that it would come as a pretty big surprise to most psychologists, especially those testing the statistical significance of the correlations they obtain in their research. When, based on the observed value of r obtained in research, researchers fail to reject the null hypothesis, they are failing to reject the idea that there is no relationship (causal or otherwise) between the variables. So when r is used as a basis for inference in statistical hypothesis testing, it is definitely assumed that causality implies correlation. That’s why my finding of no correlation between input and output in a tracking task is so troubling to conventional psychologists; it implies that input is not the cause of output. In conventional, open-loop psychology, it is definitely assumed that causality implies correlation and, therefore, that no correlation between input and output implies no causality.

Best

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

[From Bruce Abbott (2009.05.20.1610 EDT)]
Bill Powers (2009.05.20.1109 MDT) --

Rick Marken (2009.05.19.1220)

I think causality would imply correlation if the open loop model of
behavior were correct. In that case o = k1(i) + e and a low correlation
between i and o would imply that there is either no causal relationship
between i and o or there is a large noise component (e) obscuring the
relationship between i and o.

BP: I think you have the final word on this. I've been confused by something
Martin Taylor said about correlation coefficients showing the relationship
between variables -- the cosine of the angle between them. I had always
understood that correlations show how much noise there is in comparison with
any regular relationship. I'm now leaning back toward that view. Maybe
Richard K. can tell us what Martin was talking about. I don't think the
angle he was talking about is the slope of the regression line, though that
is what he seemed to be saying.

Bill, you might want to have a look at the Wikipedia entry for correlation,
which explains this cosine business.

I'm wondering whether the "cause does not imply correlation" issue relates
to something called a "suppressor variable" in multiple correlation. Perhaps
Richard K. could comment on that, too.

Bruce A.

[Martin Taylor 2009.05.21.17.54]

[From Bill Powers (2009.05.21.0322 MDT)]

Martin Taylor 2009.05.20.23.00 –

I naturally assumed that you meant that this angle was the angle I was enquiring about, the angle whose tangent is the slope of the regression line.

Right, and it wasn’t. It had no relation to the slope of the regression line. None.

Yes, that’s clear, but you still don’t seem to realize what I’m saying. This was before your “hyperspace triangles” discussions. We were talking about the meaning of correlation. I had noticed that the slope of the regression line was given in my reference book as

b = r[xy] * sigma[y] / sigma[x]

and I didn’t see offhand how to compute b for the case of perfect correlation, in which r[xy] is 1 and the other two factors are – what? Zero? I was saying that the regression line’s slope indicating how y was related to x was unrelated to the correlation, meaning that different correlations could be found in data sets with the same slope of the regression line.

Then why did you even mention: “I’ve been confused by something Martin
Taylor said about correlation coefficients showing the relationship
between variables – the cosine of the angle between them. I had always
understood that correlations show how much noise there is in comparison
with any regular relationship. I’m now leaning back toward that view.
Maybe Richard K. can tell us what Martin was talking about. I don’t
think the angle he was talking about is the slope of the regression
line, though that is what he seemed to be saying.”

To answer the question you say you asked: the other two factors are the
sqrt (variance of y) and sqrt (variance of x). That has nothing to do
with noise.

You seemed to be arguing that the correlation did describe that slope.

An interpretation that is extremely strange, since it is almost a
kindergarten piece of knowledge (in learning statistics) that the
correlation between two variables is independent of the slope of the
regression line, except that positive correlation means the regression
line has positive slope and negative correlation means the regression
line has negative slope. I wonder why you, who knew that, and who knew
I knew that, would contemplate the possibility that I would argue the
opposite.

This seems all of a piece with something that has been going on for
more than a decade: the validity of a mathematical statement seems to
depend not on the statement, but on who makes it. If Richard Kennaway
says that when the reference value is constant, output is a function of
disturbance, it’s a valuable insight. If I say it, I am a closet S-R
theorist and a fifth columnist. If I discuss the relation of the
correlation between two sets of data and the angle between the vectors
that represent the two datasets, and never mention the regression line,
and then take issue with a suggestion that I think this angle is the
slope of the regression line, I am ambiguous. This sort of thing has
been going on for so long (a decade and a half), and happens with such
high probability when I point out some implication of perceptual
control (whether it’s previously been accepted or not), that from time
to time I get VERY frustrated. It should not matter who writes the
equations or points out the implications. At some point, I will be
quite likely to please Rick in particular, and reorganize so that my
reference value for propagating and extending my understanding of PCT
will not result in outputs toward CSGnet.

It’s not long since I specifically suggested that if there are
ambiguities in my writing, it might be considered uncharitable that,
every time, it is the silly reading that is taken to be the meaning I intended.

I end with a quote from a message entitled “Alice and Kafka” that I
posted 11 years ago. I came across it serendipitously when looking for
where I got the idea for using the vector angle approach to correlation
in analyzing the maximum possible correlation between the disturbance
and the perceptual signal (see http://www.mmtaylor.net/PCT/Info.theory.in.control/Control+correl.html
for the actual argument; interestingly, this 1998 argument uses the
point that causality does not imply correlation, using Richard
Kennaway’s observation of a couple of days ago that a function and its
derivative are causally related but uncorrelated).
It’s not the only time I have been led to make similar comments over
the years before or after 1998. But this kind of
interaction seems to have become more frequent and more probable in the
last year or so, and to tell the truth, I’m getting really fed up with
it. Yesterday I felt it necessary to tell a new researcher who
wants to learn about PCT to avoid CSGnet (I put him onto Warren
Mansell’s web site and will lend him some reading material).
[Martin Taylor 980228 22:40]
Martin

···

http://www.mmtaylor.net/PCT/Info.theory.in.control/Control+correl.html
Apparently, the PCT world is one in which what matters is not what is
said but who said it, where functions go from output to input, where
functions of several arguments are automatically multiple-valued, and
where what is true on Tuesday is false on Thursday.
I do not care to continue to try to discuss matters using technical and
logical arguments in a world in which such arguments are invalid. And
I do not care to substitute personal for technical argument. When
you and Rick can come to an agreement on which terminology I should use,
when you can distinguish a function’s arguments from its value, and
its output from its input, when you understand the difference between
“X is true” and “only X is true,” then perhaps there might be merit in
continuing.

[Martin Taylor 2009.05.22.0010]

[From Rick Marken (2009.05.20.1250)]

Martin Taylor (2009.05.20.15.05) –

And of course causality doesn’t imply correlation

This might be obvious to you but I think that it would come as a pretty
big surprise to most psychologists, especially those testing the
statistical significance of the correlations they obtain in their
research. When, based on the observed value of r obtained in research,
researchers fail to reject the null hypothesis, they are failing to
reject the idea that there is no relationship (causal or otherwise)
between the variables. So when r is used as a basis for inference in
statistical hypothesis testing, it is definitely assumed that causality
implies correlation. That’s why my finding of no correlation between
input and output in a tracking task is so troubling to conventional
psychologists; it implies that input is not the cause of output. In
conventional, open-loop psychology, it is definitely assumed that
causality implies correlation and, therefore, that no correlation
between input and output implies no causality.

You may be right. I have very little respect for the mathematical
intuition of a high proportion of psychologists. People who seriously
use statistical significance and “the null hypothesis” have even less
of my respect.

But don’t forget the flip side of this, that the fact of a non-zero
correlation between A and B does imply either that one variable leads
to the other (is one of the possibly many influences on the other,
where influence is often labelled “causation”), or that there is
something else, X, that influences both variables. Lack of correlation
between A and B does not imply lack of influence between them, but the
existence of non-zero correlation does imply the existence of
influence, whether it is of A on B or of X on both A and B.

Martin

[From Rick Marken (2009.05.23.1800)]

Bruce Abbott (2009.05.20.1610 EDT)

I’m wondering whether the “cause does not imply correlation” issue relates to something called a “suppressor variable” in multiple correlation.

Martin Taylor (2009.05.22.0010)

But don’t forget the flip side of this, that the fact of a non-zero
correlation between A and B does imply either that one variable leads
to the other (is one of the possibly many influences on the other,
where influence is often labelled “causation”), or that there is
something else, X, that influences both variables. Lack of correlation
between A and B does not imply lack of influence between them, but the
existence of non-zero correlation does imply the existence of
influence, whether it is of A on B or of X on both A and B.

What you both are suggesting is that the causal path from input to output can be recovered (from the near zero correlation between input and output that is observed in a closed loop task) by taking into account a possible “suppressor” or “third” variable that influences both input and output. This is a great suggestion because, if true, it would mean that the finding of a low correlation between input and output in a closed loop task does not demonstrate Kennaway’s mantra that “causality does not imply correlation” because the input-output correlation (implied by the causal effect of input on output) could then be recovered using standard statistical techniques, such as partial correlation (a first cousin to multiple regression). So causality would imply correlation as long as you know how to find the correlation that corresponds to the causality, which involves using what is basically multiple regression (the general linear model).

So I did a little research to test this and discovered, to my great glee, that it is apparently not true; even taking a “suppressor” variable into account, causality does not imply correlation in a closed loop task. But at first it looked like you guys might be right, which threw me into a brief depression because it meant that I would have to abandon using Kennaway’s mantra as the title of my next paper. But it all turned out well in the end and I even discovered a great new way to demonstrate the behavioral illusion.

Here’s how I did my research:

What we find in a closed-loop compensatory tracking task is the following:

    r.di ~ 0.0       r.io ~ 0.0
d --------------> i --------------> o

d ---------------------------------> o
            r.do ~ -0.99

where d is the disturbance variable, i is the input variable (cursor-target) and o is the output (in my case the mouse) variable, r.di is the correlation between disturbance and input, etc. It’s that near zero correlation between i and o (r.io~0) that’s the problem. The input is all the subject can see so it must be the cause of the subject’s outputs; and control theory says that it is the cause of outputs. So i and o should be highly correlated but they are not.
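A bare-bones simulation of this situation (an integrating controller with invented gain, disturbance, and noise; not the real tracking data) reproduces the whole pattern in the diagram:

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt, gain = 50_000, 0.01, 10.0          # invented parameters
t = np.arange(n) * dt
d = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.13 * t)
dn = 0.2 * rng.standard_normal(n)          # the "second" (noise) disturbance

o = np.zeros(n)
i = np.zeros(n)
for k in range(1, n):
    i[k] = o[k - 1] + d[k] + dn[k]         # input = output + disturbances
    o[k] = o[k - 1] - gain * i[k] * dt     # integrating output, r = 0

corr = lambda a, b: np.corrcoef(a, b)[0, 1]
print(corr(d, i))   # r.di: near zero
print(corr(i, o))   # r.io: near zero
print(corr(d, o))   # r.do: strongly negative, near -1
```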

Bruce and Martin suggest that the problem may be a “suppressor” or “third” variable that is correlated with both i and o. The obvious possibility is d, which is correlated with both i and o. So I did a partial correlation analysis which determines what the correlation between i and o would be if d (the suppressor or third variable) were held constant. The symbol for this partial correlation is r.io|d. It’s the correlation between i and o with d held statistically “constant”.

I won’t describe the calculations for the partial correlation; there are sites on the net that will calculate it for you based on your input correlations (I used http://faculty.vassar.edu/lowry/par.html). The correlations that I put into the analysis were obtained in my “Nature of Control” compensatory tracking task (http://www.mindreadings.com/ControlDemo/BasicTrack.html). Here are the correlations from my first run:

r.io = .003

r.di = .095
r.do = -.995

The resulting partial correlation is:

r.io|d = .98
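The first-order partial correlation can be computed directly from the three pairwise correlations with the standard textbook formula (presumably what the Vassar page computes under the hood); here is a minimal Python sketch using the first-run numbers above:

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# First-run correlations: r.io, r.di, r.do
r_io, r_di, r_do = 0.003, 0.095, -0.995

# r.io|d: correlation of i and o with d held statistically "constant"
print(round(partial_r(r_io, r_di, r_do), 2))  # -> 0.98
```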

This is where I started to get depressed. The correlation goes from .003 (when the disturbance is ignored) to .98! It looks as if, by taking the “third” variable (d) into account, the causal relationship between i and o is revealed in the partial correlation. Before gathering my few fans around me and getting out my emergency bottle of hemlock, I decided to do another run just to make sure. Here are the results:

r.io = .23

r.di = .05

r.do = -.98

As you can see, I was quite shaken, so my controlling wasn’t as good this time, as evidenced by the higher r.io and lower r.do correlations. Plugging these values into the partial correlation analysis I found:

r.io|d = .91

So, again, the partial correlation analysis (“partialing out” the disturbance effect on o) seemed to pick up the causal relationship between i and o.

As I was lifting the hemlock to my lips (for some reason the one fan I could find didn’t beg me to stop) an equation appeared in my mind’s eye:

o = k (r-i)

This is the causal relationship between i and o according to PCT. What this says is that when r is constant (as it is in the tracking task) the causal relationship between i and o is negative! So I realized that the causal relationship between i and o that is “revealed” in the partial correlation is precisely the opposite of the actual causal relationship between these variables. Indeed, if the causal relationship between i and o were positive (as per the partial correlation, r.io|d) there would be no control; there would be positive feedback.

Then, dropping the bottle of hemlock and shouting “Eureka” I realized that what the partial correlation analysis was producing was a version of the behavioral illusion!! The partial correlation, r.io|d, represents the causal effect of o on i, which is positive linear, not the causal effect of i on o, which is negative (and not necessarily linear).

At least this is what I think is going on. This is where Richard Kennaway comes in. I think I need a proof that the partial correlation, r.io|d, is actually a representation of the causal link from o to i (the feedback connection i = g(o)) and not a representation of the causal link from i to o.

I bet these results turn on the fact that i and o are in a closed loop: i causes o and o causes i. But all we observe with correlation is the relationship between i and o. The partial correlation analysis is based on the general linear model which, in this case, assumes:

o = k1 i + k2 d

Partial correlation solves for k1, which is equivalent to r.io|d. So the analysis assumes one way causality. But in the tracking task there is circular causality. So when this open-loop causal analysis is applied to a closed-loop situation, the result is a version of the behavioral illusion: the observed k1 (which is r.io|d) does not reflect the causal connection from i to o (as is implied by the formula) but, rather, the causal connection from o to i (which this analysis assumes does not exist).
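That circular causality can be seen in a toy simulation. The sketch below is not the tracking model used above; it is a minimal integrating controller (reference r = 0; the gain and disturbance parameters are arbitrary assumptions) in which the forward causal link from i to o is negative by construction. Fitting the open-loop GLM o = k1 i + k2 d to its data nevertheless recovers a positive k1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Slowly varying disturbance (AR(1), close to a random walk)
w = rng.standard_normal(n)
d = np.zeros(n)
for t in range(1, n):
    d[t] = 0.999 * d[t - 1] + w[t]

# Closed loop with reference r = 0:
#   feedback:  i = d + o          (environment)
#   forward:   o integrates -k*i  (organism) -- a NEGATIVE causal gain
k_dt = 0.5
i = np.zeros(n)
o = np.zeros(n)
for t in range(1, n):
    i[t] = d[t] + o[t - 1]
    o[t] = o[t - 1] - k_dt * i[t]

# Fit the open-loop GLM  o = k1*i + k2*d  by least squares
k1, k2 = np.linalg.lstsq(np.column_stack([i, d]), o, rcond=None)[0]

# Partial correlation r.io|d via residuals (regress i and o on d first)
ri = i - np.polyval(np.polyfit(d, i, 1), d)
ro = o - np.polyval(np.polyfit(d, o, 1), d)
r_io_d = np.corrcoef(ri, ro)[0, 1]

print("r.io   =", np.corrcoef(i, o)[0, 1])  # small, despite the causal link
print("r.do   =", np.corrcoef(d, o)[0, 1])  # near -1: good control
print("k1     =", k1)                       # POSITIVE, though the forward gain is negative
print("r.io|d =", r_io_d)                   # near +1: the behavioral illusion
```

In this sketch the fit recovers a positive k1 because, once d is accounted for, the only exact linear relation among o, i and d is the one induced by the feedback connection, not the negative forward path.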

I think if Richard Kennaway can come up with a nice, simple (so I can understand it) analysis of what’s going on here, it would make a great paper entitled “Causation Does Not Imply Correlation”, because that’s what’s going on here. The negative causal connection from i to o does not show up in the observed correlation between i and o, even when d is “partialled out” (in which case what shows up is a positive correlation reflecting the causal connection between o and i). The idea would be to show that statistical models, like the general linear model, which assume an open-loop connection between variables, give misleading results when applied to an analysis of behavioral variables that occur in a closed loop of causation.

Best regards

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.05.23.23.14]

[From Rick Marken (2009.05.23.1800)]

[Bruce Abbott (2009.05.20.1610 EDT)]

I’m wondering whether the “cause does not imply correlation” issue
relates to something called a “suppressor variable” in multiple
correlation.

Martin Taylor (2009.05.22.0010)

But don’t forget the flip side of this: the fact of a non-zero
correlation between A and B does imply either that one variable leads
to the other (is one of the possibly many influences on the other,
where influence is often labelled “causation”), or that there is
something else, X, that influences both variables. Lack of correlation
between A and B does not imply lack of influence between them, but the
existence of non-zero correlation does imply the existence of
influence, whether it is of A on B or of X on both A and B.

What you both are suggesting is that the causal path from input to
output can be recovered (from the near zero correlation between input
and output that is observed in a closed loop task) by taking into
account a possible “suppressor” or “third” variable that influences
both input and output.

I made no such suggestion, nor can I see where you can get anything
remotely relating to it in what I wrote.
Let me repeat:

  1. If there is zero correlation between A and B, you can’t say whether
     there is any causal connection between A and B. There may be, or there
     may not be. The example used in Wikipedia to demonstrate that "if the
     variables are independent then the correlation is 0, but the converse
     is not true" is to consider y = x^2, tested over a range of x symmetric
     about zero. With or without noise, y has zero correlation with x. A
     better example is the one I used to compute the maximum possible
     correlation between p and d in the presence of control:
http://www.mmtaylor.net/PCT/Info.theory.in.control/Control+correl.html
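The Wikipedia example is easy to verify numerically; a short Python check (x symmetric about zero, no noise):

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 1001)  # symmetric about zero
y = x ** 2                         # y is completely determined by x

print(np.corrcoef(x, y)[0, 1])     # ~0: dependence without correlation
```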

even taking a “suppressor” variable into account,
causality does not imply correlation in a closed loop task.

Or anywhere else.


[From Rick Marken (2009.05.24.1145)]

Martin Taylor (2009.05.23.23.14) –

Rick Marken (2009.05.23.1800)–

What you both are suggesting is that the causal path from input to
output can be recovered (from the near zero correlation between input
and output that is observed in a closed loop task) by taking into
account a possible “suppressor” or “third” variable that influences
both input and output.

I made no such suggestion, nor can I see where you can get anything
remotely relating to it in what I wrote.

Yes, I can see that that’s the case. Sorry, I misread you. Nevertheless, it led to a very interesting (to me) analysis that showed the problem of using open-loop models to analyze data obtained from closed-loop systems.

even taking a “suppressor” variable into account,
causality does not imply correlation in a closed loop task.

Or anywhere else.

Wow. Well, that would explain why you don’t seem to like doing research. But why do all these other people do research? Perhaps you mean that the causality may be obscured by noise so it won’t show up in a correlation. But that’s the point of analyses like partial correlation and multiple regression. These analyses are aimed at finding the variables that account for the unexplained variance (noise) that is presumably preventing the causal relationship from being seen in the correlation. If causality did not imply correlation (which is simply an observed relationship between variables) then there would really be no reason to do research.

Since when control is good, d and o are highly (negatively) correlated,
you can use an analysis very like the one in the page referenced above
to show what the maximum (negative) correlation can be between i and o.
It is roughly -1/CR where CR is the control ratio (negative because the
output is compensating for changes in the disturbance that are
reflected in the small deviations of the input from the reference).

I just ran a control model to see what the relationship is between -1/CR and r.io. Here are my results:

-1/CR:  -1.0   -.07        -.028
r.io:   -.99   -.999998    -.999999992

It does not look to me like r.io is proportional to -1/CR.

Perhaps the problem is that these results were collected from a model without noise (that second disturbance that Richard Kennaway mentioned). It turns out that this noise is crucial to getting model results with r.io close to 0.0 (as is observed in the human studies). So I repeated the runs of the control model with a dash of noise (about 15% of the disturbance amplitude) added, adjusting the gain to get measures of control (-1/CR) equivalent to those found with the noiseless model. Here are the results:

-1/CR:  -1.0   -.07   -.028   -.014
r.io:   -.05   -.15   -.08     .12

These r’s are averages of several runs that produced nearly the same CR values; I added an extra -1/CR value to show that the observed r.io starts going positive more often (due to the greater influence of noise on the controlled variable, I think) when control gets very good (which is what a large CR value means). These results look a little better for your -1/CR idea; r.io does tend to be less negative as control improves. I think it might help if you included noise (the second disturbance) in your derivation, since the -1/CR rule doesn’t even come close to working unless there is noise in the system. Your rule should, perhaps, include a term representing the assumed signal-to-noise ratio for the variance in the controlled variable (i).
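The sign change at high gain can be reproduced in a stripped-down sketch (not the model runs above; a purely proportional loop solved at equilibrium, with an assumed noise amplitude of 15% of the disturbance’s). With d and the noise independent, cov(i, o) is proportional to k(k·var(noise) - var(d)), so r.io crosses zero once the gain k exceeds var(d)/var(noise):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
d = rng.standard_normal(n)               # disturbance
noise = 0.15 * rng.standard_normal(n)    # "second disturbance" on the input

def r_io(k):
    # Proportional loop solved at equilibrium each sample:
    #   o = k * (0 - (i + noise)),  i = d + o
    o = -k * (d + noise) / (1 + k)
    i = (d - k * noise) / (1 + k)
    return np.corrcoef(i, o)[0, 1]

for k in (5, 20, 100):
    print(k, round(r_io(k), 2))  # r.io climbs from negative toward positive as k grows
```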

One last little interesting nail in the coffin for conventional methodology. If behavior were actually noise-free, the correlations you would get in a closed-loop control task, like the compensatory tracking task, would look like this:

r.io = -.997
r.do = -.999
r.di = .996

If you put these correlations into a partial correlation analysis, the partial correlation between i and o (r.io|d) goes to -.5; so the partial correlation incorrectly reduces the correlation between i and o when d is “factored out” statistically; in fact, r.io|d should approach -1.0 when d is factored out, because i is the actual cause of o. Also interesting is that the partial correlation between d and o (r.do|i) goes down slightly, to -.867; in fact, it should approach 0.0 when i is factored out, since it is i, not d, that is the actual cause of o. Finally, the partial correlation between d and i (r.di|o) goes to -.001, which also makes no sense; if the effect of o on i is factored out, then the correlation between d and i should approach 1.0.
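Those three partials follow directly from the quoted noise-free correlations; a quick check with the standard first-order formula (a sketch, assuming the same formula the online calculator uses):

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Noise-free closed-loop correlations quoted above
r_io, r_do, r_di = -0.997, -0.999, 0.996

print(round(partial_r(r_io, r_di, r_do), 3))  # r.io|d -> about -0.5
print(round(partial_r(r_do, r_di, r_io), 3))  # r.do|i -> about -0.866
print(round(partial_r(r_di, r_do, r_io), 3))  # r.di|o -> about -0.001
```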

What this all suggests to me is that conventional statistical analysis, which includes partial correlation and multiple regression, will give misleading or incorrect results when applied to the study of the behavior of a closed-loop system. This, I believe, is because these analyses are based on an open-loop causal model of behavior. What is interesting is that these analyses give the wrong results even if all the proper variables are measured and the measurements are nearly noise free. You’ve just gotta have the right model… or the will to imagine that a closed-loop system can be turned into an open-loop one by preventing its output from having an effect on its input ;-)

Best regards

Rick

···


Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.05.24.14.52]

[From Rick Marken (2009.05.24.1145)]

Martin Taylor
(2009.05.23.23.14) –

Rick Marken (2009.05.23.1800)–

What you both are suggesting is that the causal path from input to
output can be recovered (from the near zero correlation between input
and output that is observed in a closed loop task) by taking into
account a possible “suppressor” or “third” variable that influences
both input and output.

I made no such suggestion, nor can I see where you can get anything
remotely relating to it in what I wrote.

Yes, I can see that that’s the case. Sorry, misread you. Nevertheless,
it led to a very interesting (to me) analysis that showed the problem
of using open loop models to analyze data obtained from closed loop
systems.

even taking a “suppressor” variable into
account,
causality does not imply correlation in a closed loop task.

Or anywhere else.

Wow. Well, that would explain why you don’t seem to like doing
research. But why do all these other people do research? Perhaps you
mean that the causality may be obscured by noise so it won’t show up in
a correlation.

No. I mean exactly what I said, what the Wikipedia article points out,
and what Richard Kennaway demonstrated with his capacitor example.

http://www.mmtaylor.net/PCT/Info.theory.in.control/Control+correl.html


[From Bill Powers (2009.05.24.1720 MDT)]

[Martin Taylor 2009.05.24.14.52]

Are we talking about the same thing? I’m assuming your “i” is
perception.

“i” is the “input” to the controlled variable – that
is, it’s the disturbance as we usually define it. I think this happened
because of using someone else’s equation and preserving the notation.
Richard Kennaway introduced “d2”, a second disturbance of the
CV that is not taken into account but which causes random fluctuations in
the controlled variable and destroys the perfect correlation otherwise
expected.

Best,

Bill P.