A TABLE SHOWING THE RELATIONSHIP BETWEEN

SEVERAL DESCRIPTIVE STATISTICS*

____________________________________________

 r        r²       k²       k        E
____________________________________________

 1.00     1.00     .00      .00      100 %
 .9995    .999     .001     .032      97 %
 .9987    .997     .003     .054      95 %
 .995     .99      .01      .099      90 %
 .954     .91      .09      .299      70 %
 .90      .81      .19      .435      56 %
 .87      .756     .244     .493      51 %
 .865     .748     .252     .50       50 %
 .80      .64      .36      .60       40 %
 .75      .56      .44      .66       34 %
 .71      .50      .50      .70       30 %
 .65      .42      .58      .76       24 %
 .60      .36      .64      .80       20 %
 .55      .30      .70      .83       17 %
 .50      .25      .75      .87       13 %
 .45      .20      .80      .89       11 %
 .40      .16      .84      .92        8 %
 .35      .12      .88      .94        6 %
 .30      .09      .91      .95        5 %
 .25      .06      .94      .97        3 %
 .20      .04      .96      .98        2 %
 .15      .02      .98      .99        1 %
 .10      .01      .99      .995       0 %
 .00      .00      1.00     1.00       0 %
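Every column in the table can be derived from r alone. The following is a minimal sketch of that derivation, assuming the usual relations (the second column is r squared, the third is 1 minus the second, the fourth is the square root of the third, and E is (1 - k) times 100); the function name `table_row` is only illustrative and is not part of the original table.

```python
import math

def table_row(r):
    """Return (r, r squared, k squared, k, E) for a correlation r."""
    r2 = r * r               # coefficient of determination
    k2 = 1.0 - r2            # coefficient of nondetermination
    k = math.sqrt(k2)        # coefficient of alienation
    e = (1.0 - k) * 100.0    # index of forecasting efficiency, in percent
    return r, r2, k2, k, e

# Print a few rows in roughly the same layout as the table.
for r in (0.80, 0.60, 0.50):
    print("%.2f  %.2f  %.2f  %.2f  %3.0f %%" % table_row(r))
```

The printed values should agree with the corresponding rows of the table after rounding.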

DEFINITION AND INTERPRETATION OF THESE STATISTICS**

All of these measures describe two variables (X, Y) within a particular sample:

r is a correlation (or coefficient of correlation), which describes the linear association of one variable with another. It can also be characterized as "... a relative measure of the degree of association between two series" of values for two variables. It varies from -1 (perfect negative correlation) through 0 (no linear association) to 1 (perfect positive correlation). The closer this measure is to a perfect correlation, positive or negative, the more confidence one has in "predicting" the values of one variable from another variable.
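As an illustration, r can be computed for a small sample directly from the deviations of each variable from its mean. This is a sketch using the standard product-moment formula; the function name `pearson_r` and the five data points are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample, chosen so that r comes out at .80.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(round(pearson_r(xs, ys), 2))  # 0.8
```

With r = .80, this sample corresponds to the .80 row of the table.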

r² is a measure of "explained" variance (or coefficient of determination). It describes "shared" variation: the amount of variance in one variable that is "explained" by the other variable, or the proportion of the sum of y² (the squared deviations of Y from its mean) that is dependent on the regression of Y on X. The larger the numerical value of this measure, the more confidence one has in "predicting" the values of one variable from another.

k² is a measure of "unexplained" variance (or coefficient of nondetermination), equal to 1 - r². It describes "unshared" variation: the amount of variance in one variable that is NOT "explained" by the other variable, or the proportion of the sum of y² (the squared deviations of Y from its mean) that is independent of the regression of Y on X. The smaller the numerical value of this measure, the more confidence one has in "predicting" the values of one variable from another.
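The "explained" and "unexplained" proportions can be checked by fitting the least-squares regression of Y on X and splitting the sum of squared deviations of Y into a part accounted for by the line and a residual part. The sketch below assumes a simple straight-line fit; `variance_shares` and the sample data are illustrative names and values only.

```python
def variance_shares(xs, ys):
    """Return (explained, unexplained) proportions of the variation in Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # least-squares slope of Y on X
    intercept = mean_y - slope * mean_x
    total = sum((y - mean_y) ** 2 for y in ys)
    residual = sum((y - (intercept + slope * x)) ** 2
                   for x, y in zip(xs, ys))
    return (total - residual) / total, residual / total

# Hypothetical sample with r = .80.
explained, unexplained = variance_shares([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(explained, 2), round(unexplained, 2))  # 0.64 0.36
```

The two proportions always sum to 1, which is why the second and third columns of the table are complements.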

k is a measure (called the coefficient of alienation) which describes the lack of linear association of one variable with another. It is the square root of k², and equals the ratio of the standard error of estimate to the standard deviation of the predicted variable (Y). The smaller the numerical value of this measure, the more confidence one has in "predicting" the values of one variable from another.

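The ratio interpretation of k can be checked numerically. The sketch below assumes that both the standard error of estimate and the standard deviation of Y are computed with the same divisor (n); with a degrees-of-freedom correction on only one of them the ratio would differ slightly. The function name and data are hypothetical.

```python
import math

def alienation_from_errors(xs, ys):
    """Standard error of estimate divided by the standard deviation of Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Spread of Y around its own mean (assumption: divisor n).
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Spread of Y around the regression line (same divisor n).
    se_est = math.sqrt(sum((y - (intercept + slope * x)) ** 2
                           for x, y in zip(xs, ys)) / n)
    return se_est / sd_y

# Hypothetical sample with r = .80, so k should be .60.
print(round(alienation_from_errors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]), 2))  # 0.6
```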
E is an "index of forecasting efficiency" (Downie and Heath, 1965: 226), computed as (1 - k) × 100. It indicates the "improvement" in prediction gained by knowing the coefficient of correlation (r) for two variables, as contrasted with knowing nothing about the linear association of the two variables. For example, with a coefficient of correlation of .71 one can "predict" the values of one variable from another 30% better (on the average) than one could "predict" those values WITHOUT any knowledge of the relationship between the two variables. Put another way, knowing that the correlation of the two variables is .71 decreases the size of the "error of prediction" by 30% (on the average).
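The worked example can be reproduced, and also run in reverse to ask how large r must be to reach a given efficiency. This is a sketch under the formula E = (1 - k) × 100 given above; the function names are illustrative only.

```python
import math

def forecasting_efficiency(r):
    """Index of forecasting efficiency E, in percent, for a correlation r."""
    k = math.sqrt(1.0 - r * r)   # coefficient of alienation
    return (1.0 - k) * 100.0

def r_needed_for(e_percent):
    """Size of correlation needed to reach a given E (in percent)."""
    k = 1.0 - e_percent / 100.0
    return math.sqrt(1.0 - k * k)

print(round(forecasting_efficiency(0.71)))  # 30
print(round(r_needed_for(50), 3))           # 0.866
```

The second result shows why the table needs a correlation of about .865 before the "error of prediction" is cut in half.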

REFERENCES

Arkin, Herbert and Raymond R. Colton. 1956. Statistical Methods. College Outline Series, Fourth Edition, Revised.

Downie, N. M. and R. W. Heath. 1965. Basic Statistical

Methods. Second Edition. New York: Harper and Row.

______________________________

*Compiled by Charles W. Tucker with the encouragement and assistance of the Control Systems Group CSG-L @ UIUCVMD (especially Gary Cziko) and the comments of Jimy Sanders. Other comments appreciated - N050024 AT UNIVSCVM.BITNET

**It should be noted that these descriptions and interpretations, especially those involving "predictions," are limited to a particular sample; if another sample is not a random sample from the same population, then predictions about the other variable ("Y") will be unpredictably worse than in the original sample.