An astrologer among the PCTers

[From Bruce Gregory (97111610 EST)]

Bruce Abbott (971119.1550 EST) --

>Rick Marken (971118.0910)

Nice gag Rick, but I saw through it at once. Bill assures me
that Bruce Abbott really does understand PCT. This is obviously
your attempt to make him look like a dyed-in-the-wool S-R man. I
wasn't taken in for a minute.

All we have learned at this
stage of the research is that pay does have an influence on task rating, an
influence that emerges on average over a range of values of a host of other
variables, and that ratings tend to be higher when pay is lower. Given that
this unexpected result is what was predicted by cognitive dissonance theory,
it supports (_not_ "confirms") the theory. Given this result, followup
research would then begin to identify other variables (including subject
variables) whose values influence the nature of this relationship.

I recognize your fine hand at work Rick. We should be
sympathetic toward the rich because they hate their work so
much. On the other hand, the poor are drawing immense psychic
benefits. They should be paying _higher_ taxes. I can hear those
darkies singing in the fields. God! Those _were_ the good days!

Bruce

[From Bruce Abbott (971119.1550 EST)]

Rick Marken (971118.0910)

All psychological theories are, like PCT, theories of individual
behavior. These theories purport to describe processes in the
individual that give rise to visible behavior. Yet nearly all of
these theories are tested against group data. If the group averages
go (significantly) in the "right" direction (based on qualitative
"predictions" derived from the theory) then the theory is confirmed.
For example, if the average rating of liking for a task is higher
for a group that was paid little for doing a task than for a
group that was paid a lot, then a prediction of cognitive dissonance
theory is confirmed. Never mind that many individuals rated the high
pay task higher than the low pay task -- disconfirming the theory
at the individual level. It's the average results that matter.

The statistical theory behind this method is as follows. The basic logic of
an experiment is to manipulate some variable and observe some other variable
thought to be affected by it, while holding all other variables constant.
(The variable manipulated by the experimenter is called the independent
variable; the variable observed while the independent variable is being
manipulated is called the dependent variable.) In the experiment Rick
describes, the amount paid to a participant for doing the task is the
independent variable and the participant's rating of how well he or she
liked performing the task is the dependent variable. Efforts would have
been made to hold constant all other variables that might have affected this
dependent measure. If participants rate the task higher when they are paid
a small amount than when they are paid more, then the only possible
explanation for this result (since all other variables have been held
constant) is the difference in pay.

Unfortunately, individuals are, well, individuals. They differ in many
ways, from height and weight to hair color to the detailed functioning of
their brains. Most of these individual differences cannot be eliminated,
and some have a strong impact on whatever behavior is being measured in an
experiment. In the cognitive dissonance experiment, for example,
individuals may like the task to a greater or lesser degree for reasons
having nothing to do with how well they were paid for doing it. If one
person likes the task better than another, to what extent is this due to
their individual differences on those extraneous variables and to what
extent is this due to the difference in the amount they were paid for doing
the task? We can't tell.

So this is where statistical theory comes in. If participants have been
assigned _at random_ to the two groups (one receiving the lower pay and one
receiving the higher pay), statistical theory says that the two groups will
on average be essentially equivalent on any extraneous variable you care to
mention, most of the time. Let's assume that the independent variable
(amount paid) has _no effect_ on liking for the task. Even if randomization
fails to equalize the two groups on some extraneous variable that affects
liking, statistical theory allows one to evaluate the probability that a
given difference between the two groups in average liking for the task will
emerge _by chance_. If this probability turns out to be low enough, then
one may reject chance as the explanation for the observed difference. For
example, if the observed difference between the two groups in their average
ratings of the task could have occurred by chance less than one time in 20
experiments, then one might be willing to conclude that this is not that one
time (the odds are 19 to 1 in your favor). If the observed difference is
unlikely to have occurred by the chance failure of extraneous variables to
balance out across groups, then the only remaining explanation for the
difference is that it is due to an influence of the independent variable.
Such a difference is said to be "statistically significant."
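
In outline, that logic can be expressed in a few lines of Python. The
ratings below are invented for illustration (they are not the study's
data); the test asks how often random assignment alone would produce a
difference at least as large as the one observed:

    import numpy as np

    rng = np.random.default_rng(0)

    # Invented liking ratings (1-10) for two randomly assigned groups.
    low_pay = np.array([7, 8, 6, 9, 7, 8, 5, 9, 7, 6])
    high_pay = np.array([5, 6, 7, 4, 6, 5, 8, 4, 6, 5])
    observed = low_pay.mean() - high_pay.mean()

    # If pay had no effect, the group labels would be arbitrary. Count how
    # often a random relabeling yields a difference at least this large.
    pooled = np.concatenate([low_pay, high_pay])
    hits = 0
    for _ in range(10000):
        rng.shuffle(pooled)
        if pooled[:10].mean() - pooled[10:].mean() >= observed:
            hits += 1
    print("one-tailed p =", hits / 10000)
    # If p < 0.05 (one time in 20), chance is rejected as the explanation.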

The conclusion would be that the amount of pay given for performing the task
does influence liking of the task. There is a small chance that this
conclusion is incorrect (it may indeed have occurred through a failure of
randomization to balance extraneous variables across groups), but a
significant finding usually means followup research in which such a fluke
would quickly become evident.

So, in all likelihood, ratings tend to be higher when pay is low than when
pay is high. Does this mean that every individual would rate the task more
highly if paid low than if paid high? No. The influence of pay on liking
in any given individual probably will depend on a host of factors (those
pesky individual differences again). At certain values of those other
factors pay may have a strong influence. At other values it may have little
or no influence, or even a reverse influence. All we have learned at this
stage of the research is that pay does have an influence on task rating, an
influence that emerges on average over a range of values of a host of other
variables, and that ratings tend to be higher when pay is lower. Given that
this unexpected result is what was predicted by cognitive dissonance theory,
it supports (_not_ "confirms") the theory. Given this result, followup
research would then begin to identify other variables (including subject
variables) whose values influence the nature of this relationship.

The fact that some individuals in the high-pay group gave higher ratings
than some in the low-pay group has no bearing on supporting or disconfirming
the theory, because in these individual cases the benefits of averaging
across cases are lost (extraneous variables may be at work in addition to
any effect of the independent variable in the individual case). A low
rating by an individual in the low-pay group may actually be higher than it
would have been if the same individual had been in the high-pay group instead.

I bring this up because I have been reading about research on
addiction and virtually all of this research tests theories of
individual behavior using group data. Thus, practitioners, who
deal with _individual_ addiction problems, have only group-data
based conclusions to guide their work. This strikes me as being
a rather unfortunate state of affairs.

What has been missing in traditional statistical analyses (and this is now
being corrected) is some measure of the strength of the effect observed,
relative to variation contributed by extraneous variables. Given a large
enough sample size, it is possible for independent variables having
relatively weak effects on some dependent measure to yield statistically
significant differences. Are these effects strong enough in most
individuals to have practical implications for treatment? Quite possibly
not. An analysis of the strength of the effect is needed in order to answer
that question, and too often in the past this has not been done. But that's
not the fault of the experimental design.
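
To make the point concrete, here is a minimal Python sketch (all numbers
invented) of how a weak effect can reach significance with a big enough
sample, using Cohen's d as the strength-of-effect measure:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # A true but weak effect: the IV shifts the DV by a tenth of a
    # standard deviation. With 5000 subjects per group...
    control = rng.normal(0.0, 1.0, 5000)
    treated = rng.normal(0.1, 1.0, 5000)

    t, p = stats.ttest_ind(treated, control)
    pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
    d = (treated.mean() - control.mean()) / pooled_sd
    print("p = %.4g (significant), Cohen's d = %.2f (weak)" % (p, d))
    # Significant, yet the two distributions overlap almost completely --
    # far too weak to carry practical implications for any one individual.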

In addition, practitioners have treated results from such group-based
designs as if they applied universally, but in fact the effectiveness of a
variable on some measure may depend strongly on other, individual
differences. Again, this is not an inherent fault of the experimental
design. The problem of individuals responding differently to a treatment
than "the average person" is a real one but it simply comes down to
identifying all the effective variables and their interactions, and this
requires additional experiments in which these other variables are assessed
for their effects. Inability to predict individual response comes down to
incomplete knowledge of the relevant factors, not inappropriateness of a
given experimental design. Group-based designs may foster
overgeneralizations of the type Rick points out (e.g., "Pisces are
sensitive") but if so, that is the fault of those who overinterpret the
results and not of the designs themselves. For an average effect of an
independent variable to emerge, the variable must influence behavior at the
individual level to a large enough extent, across a large variety of
combinations of subject variables, to create a statistically significant
difference across groups or treatments. It may not have that same effect on
everyone, but it certainly does in enough cases to make it worth noting -- a
nice catch in the cast net.

Regards,

Bruce

[From Rick Marken (971119.1400)]

Bruce Gregory (97111610 EST) re: "Bruce Abbott (971119.1550 EST)"

Nice gag Rick, but I saw through it at once.

Darn! I was just about to post a reply to my little canard to make
it even more convincing, but I guess that's no longer necessary. I
want to apologize to the real Bruce Abbott for making him look like
a pompous apologist for the statistical S-R approach to understanding
human nature. I'm sure he's busy right now writing up that paper on
how to test for the variables controlled by individual organisms;-)

Best

Rick

--
Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken

[From Bill Powers (971119.1428 MST)]

Bruce Abbott (971119.1550 EST)--

I'm not much surprised, but I am disappointed at your defense of the
application of group statistics to individual characteristics. You've been
part of this discussion for some years now; haven't you heard anything
that's been said about this subject by various members of this group?

You say

For an average effect of an
independent variable to emerge, the variable must influence behavior at the
individual level to a large enough extent, across a large variety of
combinations of subject variables, to create a statistically significant
difference across groups or treatments. It may not have that same effect on
everyone, but it certainly does in enough cases to make it worth noting -- a
nice catch in the cast net.

The problem is that what you end up with is a characteristic of a group,
not of an individual. At any given time, some members of the group may show
the dependent effect while others do not. At a different time,
different members may show the effect, while a different group does not.
There is absolutely no evidence that this effect is shown consistently over
time by any individual. To show that, you would have to study each
individual over time. And if you did study the individual over many trials,
you would soon find that _all_ individuals caught on to the fact that a low
evaluation led to high pay.

There is absolutely no evidence that what a person says he feels about a
task is what he feels about the task. When pay is involved, a person is
likely to give the response that he thinks will earn the most money. A
balanced design will do the experimenter no good: if the subjects figure
(individually or in collusion outside the experiment) that the most
unlikely evaluation will give the greatest pay (otherwise why would this
experiment be done?) the answers will be skewed just as if more pay really
went with less satisfaction.

But all that's a minor quibble compared with the most glaring fault of this
supposed "evidence in favor of the theory." In fact, the theory is
disconfirmed by every individual who did not behave as predicted. It's no
good to protest, "Yeah, but look at all the people who _did_ behave as
predicted." If the theory can't predict _which_ people will behave as
predicted and which will not, it's no theory at all -- at least it's not a
theory of individual behavior. Newton's Universal Law of Gravitational
Attraction would be disproven by a single valid observation of a mass that
fell upward. Pointing to all objects that fall downward would not save the
law: it would no longer be universal. And not being universal, it could not
be applied in any specific case until you had discovered what was different
about the mass that fell upward.

Of course you can counter that by saying that if one object in a million
behaved anomalously, you would still be justified in using the Law whenever
you liked, the number of failures being insignificant. It might not be a
truly Universal law, but it would certainly be a practically usable one.

So the question is, what kind of law is it that comes out of this study? Is
it one of those findings that is so robust that one could use it in dealing
with individuals with little risk of a wrong prediction? To answer that
question we have to look at the results of the experiment. Rick has
evidently looked at them; perhaps you might want to double-check him. And
Richard Kennaway, who has analyzed statistical findings in some detail,
might do us the favor of commenting on this.

What we need to know are the chances of predicting incorrectly a person's
evaluation of a task for which the pay is high or low. To determine this,
we need to know how many people behaved in the expected way, and how many
in the other way. Those who understand statistics can then compute the
chances that a prediction will be wrong for a given individual.
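
The arithmetic itself is trivial; a sketch, with invented counts standing
in for the ones we need:

    # Invented counts (the actual numbers from the study are what we need):
    n_as_predicted = 35   # rated the low-pay task higher, as predicted
    n_otherwise = 25      # showed no difference or the reverse

    # Predicting the majority direction for a new individual fails with
    # probability equal to the minority proportion:
    p_wrong = n_otherwise / (n_as_predicted + n_otherwise)
    print("chance of a wrong prediction for an individual: %.0f%%"
          % (100 * p_wrong))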

And Bruce, why this knee-jerk defense of what is obviously a heinous
scientific crime?

Best,

Bill P.

[From Bruce Abbott (971119.2000 EST)]

Bill Powers (971119.1428 MST) --

Bruce Abbott (971119.1550 EST)

I'm not much surprised, but I am disappointed at your defense of the
application of group statistics to individual characteristics. You've been
part of this discussion for some years now; haven't you heard anything
that's been said about this subject by various members of this group?

It's not a defense of the application of group statistics to individual
characteristics, and that's where you and others on CSGnet are missing the
boat. It's a defense of the application of group statistics to the
identification of influences on behavior (casting nets). I just wanted to
point out that this use of group methods is legitimate, even though it can
be taken only so far. It would be easy to conclude on the basis of Rick's
piece (and your followup to it) that these methods are worthless for this or
any other purpose. "The method that can most reliably produce an unending
output of superstitions is known as Analysis of Variance" is a case in
point. The problem lies not with the method but with the fact that its
results are often misinterpreted.

You say

For an average effect of an
independent variable to emerge, the variable must influence behavior at the
individual level to a large enough extent, across a large variety of
combinations of subject variables, to create a statistically significant
difference across groups or treatments. It may not have that same effect on
everyone, but it certainly does in enough cases to make it worth noting -- a
nice catch in the cast net.

The problem is that what you end up with is a characteristic of a group,
not of an individual. At any given time, some members of the group may show
the dependent effect while others do not. At a different time,
different members may show the effect, while a different group does not.
There is absolutely no evidence that this effect is shown consistently over
time by any individual. To show that, you would have to study each
individual over time.

What is shown in such an evaluation is that behavior differed on average
across the levels of the independent variable, under the conditions of the
study, when other factors are unlikely to account for the difference. This
indicates that the independent variable very likely has some influence on
the behavior that was observed and recorded, in the sample of individuals
tested. It does not indicate that this variable was effective in every
individual tested, nor does it show that this variable would have an effect
consistently over time in the same individual. Nevertheless, it is a
variable having an influence on this dependent measure in enough
individuals, at the time of testing, to be detected by the procedure.

And if you did study the individual over many trials,
you would soon find that _all_ individuals caught on to the fact that a low
evaluation led to high pay.

Let's at least get the experiment right. A low evaluation did not lead to
high pay. Low pay tended to lead to a higher evaluation, after the fact.

There is absolutely no evidence that what a person says he feels about a
task is what he feels about the task. When pay is involved, a person is
likely to give the response that he thinks will earn the most money. A
balanced design will do the experimenter no good: if the subjects figure
(individually or in collusion outside the experiment) that the most
unlikely evaluation will give the greatest pay (otherwise why would this
experiment be done?) the answers will be skewed just as if more pay really
went with less satisfaction.

I agree that we can't be sure that what a person tells you he or she feels
is what he or she really feels, but in this case there is no biasing
incentive that would lead the participants to lie about it more in one
condition than in another. When they were asked to rate (among other
things) how well they liked the work, they had already been paid and were
not expecting more. Anyway, this is aside from the statistical argument.

But all that's a minor quibble compared with the most glaring fault of this
supposed "evidence in favor of the theory." In fact, the theory is
disconfirmed by every individual who did not behave as predicted.

In this study there is _no_ individual who can be said _not_ to behave as
predicted. As I pointed out in my post, if a person in the low-pay
condition gave a low liking-rating, there is no way to tell whether it
wouldn't have been even lower in the high-pay condition (barring a floor effect).

It's no
good to protest, "Yeah, but look at all the people who _did_ behave as
predicted." If the theory can't predict _which_ people will behave as
predicted and which will not, it's no theory at all -- at least it's not a
theory of individual behavior. Newton's Universal Law of Gravitational
Attraction would be disproven by a single valid observation of a mass that
fell upward. Pointing to all objects that fall downward would not save the
law: it would no longer be universal. And not being universal, it could not
be applied in any specific case until you had discovered what was different
about the mass that fell upward.

Again, my argument was not in favor of using group data to predict
individual behavior. What I argued is that group procedures can be used to
identify variables that are effective (in at least most of those tested in
the experiment, under the test conditions). As to whether a theory is a
theory or "no theory at all" if it cannot predict individual behavior, I
suggest that by your definition atmospheric physics offers no theory at all
of the weather.

Of course you can counter that by saying that if one object in a million
behaved anomalously, you would still be justified in using the Law whenever
you liked, the number of failures being insignificant. It might not be a
truly Universal law, but it would certainly be a practically usable one.

I'm not arguing that the discovery of a relationship between levels of a
variable and average performance constitutes a universally-applicable Law of
behavior, although clearly you are imagining that I am. (As a reality check
you might try re-reading my post. You won't find any such argument there.)
However, I did provide a valid reason why the apparent "exceptions" found by
looking at individual scores within a group do not say anything about the
possible effect of the independent variable on that individual. Apparently
you overlooked this.

So the question is, what kind of law is it that comes out of this study? Is
it one of those findings that is so robust that one could use it in dealing
with individuals with little risk of a wrong prediction? To answer that
question we have to look at the results of the experiment. Rick has
evidently looked at them; perhaps you might want to double-check him. And
Richard Kennaway, who has analyzed statistical findings in some detail,
might do us the favor of commenting on this.

What came out of this study was a relationship -- confirmed in a large
number of followup studies -- between two variables that can be observed in
more-or-less normal college students when other factors are statistically
equated. A lot more would have to be known about other interacting factors
before one would be in a position to use this information for individual
prediction.

What we need to know are the chances of predicting incorrectly a person's
evaluation of a task for which the pay is high or low. To determine this,
we need to know how many people behaved in the expected way, and how many
in the other way. Those who understand statistics can then compute the
chances that a prediction will be wrong for a given individual.

This assumes that this variable alone would be used for this purpose,
without any further investigation of conditions under which, in the
individual, it would be expected to influence the person's rating of the
task in the predicted way. Geez, Bill, I _agree_ that it probably isn't
much use by itself. I never argued that it would be.

And Bruce, why this knee-jerk defense of what is obviously a heinous
scientific crime?

I am not defending any "heinous scientific crimes," Bill. I certainly do
agree, however, that there has been a lot of knee-jerking going on in
response to my post. At the beginning of your post you said you were
disappointed in me. If anyone has a right to feel disappointed, I do. I
expected a more careful reading on your part of what I had to say than you
evidently gave. You merely used my post as an excuse to trot out all your
old arguments against a position I wasn't even defending.

Regards,

Bruce

[From Rick Marken (971119.1900)]

Bruce Abbott (971119.2000 EST) --

It's not a defense of the application of group statistics to
individual characteristics, and that's where you and others
on CSGnet are missing the boat. It's a defense of the
application of group statistics to the identification of
influences on behavior (casting nets).

What do you mean by "influences on behavior"? The only thing
group statistics identifies is influences on _group_ behavior;
they don't identify any influence on the behavior of any
individual in the group. The "cognitive dissonance" experiment
I described reveals nothing about the behavior of an individual.
But it was used as a test of "cognitive dissonance" theory -- a
theory of _individuals_. The theory says that an individual will
experience cognitive dissonance under the conditions of the
experiment and behave in a particular way. But the experiment did
not show that each individual behaved according to the predictions
of the theory; it showed that the group averages did (statistically).
Yet the experimenter (and you) say that the results of this study
_support_ the cognitive dissonance theory of individuals. This is
the "criminal" approach to science you are defending.

It would be easy to conclude on the basis of Rick's piece (and
your followup to it) that these methods are worthless for this or
any other purpose.

Thanks for the help but I think I made it very clear that I was
saying only that group methods are worthless if one is interested
in testing theories of _individuals_. In case that wasn't clear,
let me say that I am a big fan of statistical sampling methods
for the study of groups and I think the Republican objection to
the use of such methods in the next census is another example of
their pathetic greed and stupidity.

"The method that can most reliably produce an unending output
of superstitions is known as Analysis of Variance" is a case in
point. The problem lies not with the method but with the fact
that its results are often misinterpreted.

The only correct interpretation of the results of an ANOVA is
in terms of groups; the ANOVA tells you the probability that the
observed ratio of between to within groups variance (F) was drawn
from a population of _groups_ with a mean F ratio of 1.0. That's it.
It says nothing at all about the individuals that make up these
groups. Use of this group approach to test theories of individuals
is (you got it) criminal.
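
To make plain how little the computation delivers, here is a minimal
one-way ANOVA in Python on invented ratings:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    # Three invented groups of ratings; the third group's mean is shifted.
    g1 = rng.normal(5.0, 2.0, 30)
    g2 = rng.normal(5.0, 2.0, 30)
    g3 = rng.normal(6.5, 2.0, 30)

    f, p = stats.f_oneway(g1, g2, g3)
    print("F = %.2f, p = %.4g" % (f, p))
    # F is the ratio of between-group to within-group variance; p is the
    # probability of an F this large if every group were drawn from the
    # same population. Nothing in either number describes any individual.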

Again, my argument was not in favor of using group data to predict
individual behavior. What I argued is that group procedures can
be used to identify variables that are effective.

Effective at what? Influencing group behavior? If so, fine. But
every research report I've seen concludes that these variables
influence _individual_ behavior. Besides, these experiments are
being done to test _theories of individual behavior_ which is
completely inappropriate.

As to whether a theory is a theory or "no theory at all" if it
cannot predict individual behavior, I suggest that by your definition
atmospheric physics offers no theory at all of the weather.

And an even better example would be electronics, which cannot predict
the behavior of individual electrons. But did you notice how well
these theories (particularly electronics) predict the "group" data.
If the people doing the group experiments in psychology were able
to predict their statistical results -- means, variances, proportions,
etc -- as well as electronics can predict the mean
rate of movement of the electrons in a wire, then they could
dress up and play real scientist. As it is, they'll just have
to keep wearing stripes and stay the hell out of civilized society;-)

Wurst

Rick

--

Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken/

[From Bruce Abbott (971120.0020 EST)]

Rick Marken (971119.1900)

Bruce Abbott (971119.2000 EST) --

It's not a defense of the application of group statistics to
individual characteristics, and that's where you and others
on CSGnet are missing the boat. It's a defense of the
application of group statistics to the identification of
influences on behavior (casting nets).

What do you mean by "influences on behavior"? The only thing
group statistics identifies is influences on _group_ behavior;
they don't identify any influence on the behavior of any
individual in the group.

The independent variables under test in group-based designs do not exert
their influences (if any) on groups. Rather, they exert their influences on
individuals, and relatively consistent influences add up to produce group
differences in mean performance on the dependent variable. That is,
differences in group behavior emerge from relatively consistent differences
in individual behavior across the levels of the independent variable. The
variables identified as effective in such designs are thus variables that
influence individual performance.

Imagine that you had mixed two dozen small batches of cookie dough, each
with a somewhat different mix of ingredients. You randomly assign half of
the batches to one of two conditions and the other half to the other
condition. One condition calls for baking the cookies at 300 degrees F and
the other at 375 degrees F. (Both groups are baked for 20 minutes.) After
baking the cookies, you measure the brownness of each cookie on a
quantitative scale. You then compute the average brownness of the cookies
baked under each temperature condition. You find that the cookies baked at
375 are on average a darker brown than those baked at 300, and that this
difference is statistically significant, and conclude that baking
temperature affects the brownness of the cookies. However, within each
group, some cookies are lighter or darker than others. Thus it is easy to
see that the color-response to baking at a given temperature varies somewhat
with the composition of the cookie.

I conclude that cookies baked at the higher temperature tend to darken more
than those baked at the lower temperature. I do not conclude that this is
necessarily true of every cookie (composition also appears to be an
important factor in determining the influence of baking temperature on
darkness), but it is clear that baking temperature does have a clear, if
somewhat variable effect.

You conclude on the same evidence that nothing has been learned about the
influence of baking temperature on cookie brownness. Somehow the baking
temperature has produced a difference in the average brownness of the two
groups without exerting any effect at all on any individual cookies. It is
a miracle!
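
For anyone who wants to try it, here is a minimal simulation of the cookie
experiment in Python (the brownness function and all numbers are invented):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n = 12   # batches randomly assigned to each temperature

    # Invented brownness model: a temperature effect plus a batch-specific
    # composition effect.
    def brownness(temp_f, composition):
        return 0.02 * (temp_f - 250.0) + composition

    comp_300 = rng.normal(0.0, 0.6, n)
    comp_375 = rng.normal(0.0, 0.6, n)
    browned_300 = brownness(300.0, comp_300)
    browned_375 = brownness(375.0, comp_375)

    t, p = stats.ttest_ind(browned_375, browned_300)
    print("mean brownness: 375F = %.2f, 300F = %.2f, p = %.4g"
          % (browned_375.mean(), browned_300.mean(), p))
    # The 375F group is reliably browner on average, even though
    # composition makes some 300F cookies darker than some 375F cookies.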

The "cognitive dissonance" experiment
I described reveals nothing about the behavior of an individual.

No, it reveals nothing about the behavior of any _particular_ individual,
which is a different thing altogether. The baking experiment does not
reveal anything about how brown a particular cookie composition will get,
but it _does_ indicate that individual cookies within the range of
compositions investigated tend to get browner when baked at the higher
temperature. Although not always effective in particular cases, if you want
your cookies browner, you would be well advised to cook them at the higher
temperature. Further experimentation would probably reveal what
compositional parameters yield the greatest effect of temperature on browning.

But it was used as a test of "cognitive dissonance" theory -- a
theory of _individuals_. The theory says that an individual will
experience cognitive dissonance under the conditions of the
experiment and behave in a particular way. But the experiment did
not show that each individual behaved according to the predictions
of the theory; it showed that the group averages did (statistically).
Yet the experimenter (and you) say that the results of this study
_support_ the cognitive dissonance theory of individuals. This is
the "criminal" approach to science you are defending.

Cognitive dissonance research showed that such effects are common enough
across individuals to yield reliable differences in group behavior. I don't
think that the theory asserts that every individual will be affected to the
same degree -- there are always individual differences that would have to
be taken account of. Given individual variation, the next step would be to
determine what additional, as yet unidentified factors lead to these
individual differences in the effectiveness of the payment variable. In
fact, if I recall, cognitive dissonance theory suggested other factors that
would be expected to interact with pay.

"The method that can most reliably produce an unending output
of superstitions is known as Analysis of Variance" is a case in
point. The problem lies not with the method but with the fact
that its results are often misinterpreted.

The only correct interpretation of the results of an ANOVA is
in terms of groups; the ANOVA tells you the probability that the
observed ratio of between to within groups variance (F) was drawn
from a population of _groups_ with a mean F ratio of 1.0. That's it.
It says nothing at all about the individuals that make up these
groups. Use of this group approach to test theories of individuals
is (you got it) criminal.

As an inferential procedure, ANOVA cannot tell you whether observed
differences in mean performance under different experimental conditions are,
with certainty, a result of the differences in experimental conditions.
However, it does provide a basis for drawing the conclusion that differences
in mean performance are not due to chance differences in extraneous
variables. The only remaining factor to explain these differences is then
the independent variable. As argued above, it does tell you something about
the likely typical effect of that variable on the dependent measure, an
effect that can only be produced by influencing behavior at the level of
individuals.

Again, my argument was not in favor of using group data to predict
individual behavior. What I argued is that group procedures can
be used to identify variables that are effective.

Effective at what? Influencing group behavior? If so, fine. But
every research report I've seen concludes that these variables
influence _individual_ behavior. Besides, these experiments are
being done to test _theories of individual behavior_ which is
completely inappropriate.

You will have to demonstrate where my argument fails. Simply reasserting
your unsupported claims is not sufficient. It would also be nice if you
would provide _some_ sort of analysis of the procedure (as I did) that at
least seems to support your conclusion.

As to whether a theory is a theory or "no theory at all" if it
cannot predict individual behavior, I suggest that by your definition
atmospheric physics offers no theory at all of the weather.

And an even better example would be electronics, which cannot predict
the behavior of individual electrons. But did you notice how well
these theories (particularly electronics) predict the "group" data.
If the people doing the group experiments in psychology were able
to predict their statistical results -- means, variances, proportions,
etc -- as well as electronics can predict the mean
rate of movement of the electrons in a wire, then they could
dress up and play real scientist. As it is, they'll just have
to keep wearing stripes and stay the hell out of civilized society;-)

That is _not_ a better example -- so far as we know, electrons are basically
simple and essentially identical. Predicting their behavior is a piece of
cake. Human beings and hurricanes are neither, and physicists are
evidently no better at predicting what an individual hurricane will do than
psychologists are at predicting what an individual person will do. The
physicist knows more about the basic mechanisms that create and drive
hurricanes than psychologists do about the basic mechanisms that underlie
human performance, but the main problem is the complexity of the systems and
incomplete knowledge about the current states of all relevant variables.

Bruce

[Hans Blom, 971120d]

(Bill Powers (971119.1428 MST)) to (Bruce Abbott (971119.1550 EST))

I'm not much surprised, but I am disappointed at your defense of the
application of group statistics to individual characteristics.
You've been part of this discussion for some years now; haven't you
heard anything that's been said about this subject by various
members of this group?

Imagine that you were perfectly able to predict the future, yet the
future was not fully predetermined and you were still free to choose
your actions. You would know, for instance, what the stock market
would do the coming week and which lottery ticket would show up with
the winning number tomorrow. You can easily come up with even more
desirable predictions. No doubt most people would consider themselves
to be far better in control than we are now.

In control engineering terms, we would say that this situation
corresponds with the case where the controller fully knows the
"environment function" and where there are no disturbances. In fact,
controllers can be designed to operate with zero error in such cases.

Thus, it is important (for the quality of control) to know the
environment function. Regrettably, we can't in real life: we cannot
predict the future -- almost anything might change in unforeseeable
ways. Yet, the more foreseeable/predictable changes are, the better
we will be able to control if we also have the means to influence the
future course of affairs. Think of it as being able to greatly
increase the signal to noise ratio in the path from action to
perception.

Thus humans invent ways to forecast the future. We cannot _know_ the
environment function, but we can make a more or less accurate guess
of what it is. And the more accurate that guess is, the better we are
able to control.
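
A toy illustration in Python (a one-gain "environment function"; all
numbers invented) shows the idea: the better the controller's guess at
that function, the lower its average error:

    import numpy as np

    rng = np.random.default_rng(4)
    k_true = 2.0                     # actual environment function: y = k*u + d
    d = rng.normal(0.0, 0.5, 1000)   # unpredictable disturbances
    r = 10.0                         # reference value

    def rms_error(k_guess):
        # Feedforward from the guessed model, plus a weak feedback trim.
        y, errs = 0.0, []
        for t in range(1000):
            u = r / k_guess + 0.1 * (r - y) / k_guess
            y = k_true * u + d[t]
            errs.append(r - y)
        return float(np.sqrt(np.mean(np.square(errs))))

    for k_guess in (1.0, 1.5, 2.0):
        print("guessed gain %.1f -> rms error %.2f"
              % (k_guess, rms_error(k_guess)))
    # The closer the guess to the true environment function, the better
    # the control -- but the disturbance sets a floor no model can remove.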

Similarly with human individuals. If we know him/her very, very well,
we will usually be able to accurately predict his/her behavior even
on the basis of very little data. "I'm going to mail this letter",
she says, and already your mental eye can see her leave the house
through the front door, carefully closing it behind her back. You see
her take a right turn at the sidewalk. You see the movements of her
body and of her long blue overcoat. You see the size of her steps.
Your imagination could follow her to the mailbox, see her carefully
slide the letter in, and follow her path back home the other way
around the block. It's not that she is an inflexible robot that
always behaves the same way: you can even see that if she meets one
particular neighbor on her way, she'll take a few minutes to exchange
gossip. You imagine lots of other if/thens as well.

Why do you see all that? Because you've frequently done those same
things with her, and you've done them _her way_, not imposing your
reference levels on her. Thus you know how she does things.

Not so with complete strangers: no familiarity, no model. Still, if
it is important to us, we would like to have the best possible model
of them. A _perfect_ model is impossible, of course. Yet, we know a
lot: he is a human, with all that that implies. We may also know that
he's from a certain culture and a certain social level. And we know
that an _arbitrary_ unknown individual is an _average_ individual
from the class of all people of whom we have the same information.
That's a funny type of knowledge, however: it is the best possible
_guess_, as statistics proves -- and as anyone who's into betting
knows from experience. In control terms, it's the guess that will on
average (!) present you with the best signal to noise ratio. If you
know more about the individual, the accuracy of your prediction about
his behavior will increase.

In a control context, the significance of all this is to improve the
quality of control, which is inherently a measure of an _average_ (or
integral) of some sort, such as a sum of squared errors. The more we
can attribute our perceptions to our actions according to a known
environment function, the more we're in the position of someone who
knows in advance which lottery ticket to buy.

That's an attractive proposition. No wonder people have investigated
all kinds of tricks -- including statistics -- that allow us to come
closer to the ideal of being able to predict the future. You can
object and demonstrate that it's impossible. And you would be right,
of course: the future is unpredictable. Yet, even beating the odds
just a little makes quite a difference in the long term if no one
else can do as well...

Greetings,

Hans

[From Bill Powers (971120.0430 MST)]

Bruce Abbott (971119.2000 EST)--

It's not a defense of the application of group statistics to individual
characteristics, and that's where you and others on CSGnet are missing the
boat. It's a defense of the application of group statistics to the
identification of influences on behavior (casting nets).

It's an identification of (1) _apparent_ influences on (2) _group_ behavior.

If this apparent influence on group means were as strong as the apparent
influence of cooking temperature on cookie brownness (which you gave as an
example in another post) I would object far less. _All_ of the cookies get
more done than they started out, even at the lower temperature. The normal
result is for the cookies to vary in brownness across temperatures far more
than individuals vary from the mean brownness at each temperature. But this
is not normal for ANOVA results from psychological experiments. What you
end up with in most psychological experiments is the equivalent of some
cookies getting _less_ done than they were when they were raw dough.

In psychological experiments, what we usually observe when the independent
variable is changed is that some subjects show an increase in the dependent
variable, some show no change, and some show a decrease. If there are
slightly more people who show an increase than a decrease, the customary
conclusion is stated as if _all_ of the people showed a _slight_ increase.
If the increase is highly significant, it is simply said that there is an
increase, without mentioning its magnitude. The obvious assumption is that
there was in fact an effect in every person, but that uncontrolled
variables introduced random noise that masked the effect in any individual.

The problem is that there is absolutely no way to prove that that
assumption is justified. It is just as likely that some individuals show
the effect and others show the opposite effect, while the rest are not
measurably affected at all. However, to interpret the results this way
would make use of the hypothesis being tested untenable, because invariably
the hypothesis is stated as if it is meant to be true of _all_ people. If
you conclude that job satisfaction is inversely related to pay, you can't
blame others for interpreting this to mean that the opposite relation
doesn't occur with almost equal frequency. If you see this large variation
in the supposed effect, you have two choices: the hypothesis is true of all
people, but random variations make it difficult to verify, or the
hypothesis is false for about as many people as it is true.
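
The point is easy to demonstrate with a minimal simulation (Python; all
numbers invented):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    n = 400
    # Invented per-subject true effects of the IV: half increase, a third
    # decrease, the rest show no change at all.
    true_effect = rng.choice([1.0, -1.0, 0.0], size=n, p=[0.5, 0.3, 0.2])
    measured = true_effect + rng.normal(0.0, 0.5, n)   # measurement noise

    t, p = stats.ttest_1samp(measured, 0.0)
    print("mean change = %+.2f, p = %.4g" % (measured.mean(), p))
    print("subjects whose true effect is opposite:",
          int((true_effect < 0).sum()))
    # The group mean is "highly significant," yet the stated conclusion
    # ("the IV increases the DV") is false for roughly a third of the
    # individuals and meaningless for a fifth of them.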

Incidentally, I was asking about the actual numbers in that experiment for
real, not rhetorically. Can someone supply them?

I just wanted to
point out that this use of group methods is legitimate, even though it can
be taken only so far. It would be easy to conclude on the basis of Rick's
piece (and your followup to it) that these methods are worthless for this or
any other purpose. "The method that can most reliably produce an unending
output of superstitions is known as Analysis of Variance" is a case in
point. The problem lies not with the method but with the fact that its
results are often misinterpreted.

The method invites and almost demands misinterpretation. How about this
conclusion: when you give low pay for a job, the people who pride
themselves on doing a job for its own sake report higher satisfaction than
others who have no opinion on that subject. When you give high pay, _other_
people who consider monetary reward to be the index of satisfaction report
higher satisfaction than the others report, including those who would
report greater satisfaction for lower pay. So the results are not
indicating any general human characteristics; the differing rates of pay
are selecting different subpopulations. While it may be true that there is
an inverse relation between pay and job satisfaction, that is true only of
some people, and for the rest it is false or the exact opposite of the
truth. And since you have no way to detect these subpopulations in advance,
the hypothesis is, in practice, useless for dealing with individuals.

In Rick Marken's special issue of the American Behavioral Scientist, I
published a little simulation study in which 4000 control systems produced
some degree of effort to produce some amount of reward in the presence of
varying costs (disturbances). The individuals had reference levels for the
desired amount of reward that were distributed over a range around a mean
value. For the whole population, the result was an apparent increase of
effort with increasing reward. With 4000 samples, this was a highly
significant relationship.

This apparent relationship followed from the fact that individuals with
higher reference levels for reward had to work more to get the higher
levels of reward. But for _every individual_, the actual relationship was a
strong _decrease_ in effort with increasing reward. As the obtained reward
increased toward the reference level (due to random decreases in costs) the
behavior sharply _decreased_. So the apparent group relationship between
independent and dependent variables was as wrong as it could get as an
indicator of individual characteristics.

This paper, besides being about PCT, was intended as a cautionary tale. The
_apparent_ relationship you get from varying IVs and measuring DVs over a
population is NO INDICATOR AT ALL of the actual relationship between IV and
DV for any individual in the group, and can be completely wrong for ALL of
them, as in my example. There is simply no substitute for measuring
individual characteristics, one at a time.
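
For anyone who wants to see the effect without digging out the paper, here
is a simplified reconstruction in Python (not the original program; the
gain, reference spread, and cost range are invented):

    import numpy as np

    rng = np.random.default_rng(6)
    n, trials, gain = 4000, 50, 20.0
    ref = rng.uniform(5.0, 15.0, n)   # reference levels for desired reward

    all_effort, all_reward, within = [], [], []
    for r in ref:
        k = rng.uniform(0.5, 2.0, trials)   # varying cost: reward gained
                                            # per unit effort fluctuates
        # Steady state of effort = gain*(r - reward), reward = k*effort:
        effort = gain * r / (1.0 + gain * k)
        reward = k * effort
        all_effort.append(effort)
        all_reward.append(reward)
        within.append(np.corrcoef(reward, effort)[0, 1])

    pooled = np.corrcoef(np.concatenate(all_reward),
                         np.concatenate(all_effort))[0, 1]
    print("pooled corr(reward, effort): %+.2f" % pooled)   # positive
    print("mean within-individual corr: %+.2f"
          % float(np.mean(within)))                        # strongly negative
    # Across the population, more reward goes with more effort; within
    # every individual, effort falls as obtained reward approaches the
    # reference level.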

The problem is that what you end up with is a characteristic of a group,
not of an individual. At any given time, some members of the group may show
the dependent effect while others do not. At a different time,
different members may show the effect, while a different group does not.
There is absolutely no evidence that this effect is shown consistently over
time by any individual. To show that, you would have to study each
individual over time.

What is shown in such an evaluation is that behavior differed on average
across the levels of the independent variable, under the conditions of the
study, when other factors are unlikely to account for the difference. This
indicates that the independent variable very likely has some influence on
the behavior that was observed and recorded, in the sample of individuals
tested. It does not indicate that this variable was effective in every
individual tested, nor does it show that this variable would have an effect
consistently over time in the same individual. Nevertheless, it is a
variable having an influence on this dependent measure in enough
individuals, at the time of testing, to be detected by the procedure.

But there is no reason to think that it is a correct indicator of any
individual's characteristics, as I tried to show in my ABS paper. You're
echoing the standard rationale for statistical studies, and that standard
rationale is simply wrong.

If you're dealing strictly with populations, as would be the case for
educational institutions, insurance companies, government programs, and
market research organizations, then you don't care _why_ any observed
population characteristic exists. You simply take advantage of it, to your
profit. That's where casting nets is a valid approach. But if your success
in predicting population effects leads you to think you've learned
something about human nature, you're simply deluded. You've learned nothing
about any individual. To understand individual characteristics, you have to
use the method of testing specimens, and this means making and testing
models. If people share anything in common, it is at the level of
organization, not behavior. We are all control systems, but what we
control, and how, and to what ends, is almost infinitely variable.

And if you did study the individual over many trials,
you would soon find that _all_ individuals caught on to the fact that a low
evaluation led to high pay.

Let's at least get the experiment right. A low evaluation did not lead to
high pay. Low pay tended to lead to a higher evaluation, after the fact.

I haven't read the report, but even accepting your correction, what you say
is most likely false. Low pay did NOT lead to a higher evaluation after the
fact. It also led to a LOWER evaluation after the fact. You're simply
ignoring the counterexamples, treating them as meaningless statistical
fluctuations. That's the very assumption to which I'm objecting.

If you had to repeat this experiment with a single individual again and
again, the individual would perceive any pattern that existed. If there
were no pattern, that is if job performance were equally likely to be
followed by high or low pay, I would expect the individual to abandon any
initial bias. If there were a pattern -- an inverse relationship
predominated -- I would not be surprised if the pattern were discovered and
used to increase pay.

But all that's a minor quibble compared with the most glaring fault of this
supposed "evidence in favor of the theory." In fact, the theory is
disconfirmed by every individual who did not behave as predicted.

In this study there is _no_ individual who can be said _not_ to behave as
predicted. As I pointed out in my post, if a person in the low-pay
condition gave a low liking-rating, there is no way to tell whether it
wouldn't have been even lower in the high-pay condition (barring a floor
effect).

Then there is no individual who could be said to behave as predicted, either.

Again, my argument was not in favor of using group data to predict
individual behavior. What I argued is that group procedures can be used to
identify variables that are effective (in at least most of those tested in
the experiment, under the test conditions). As to whether a theory is a
theory or "no theory at all" if it cannot predict individual behavior, I
suggest that by your definition atmospheric physics offers no theory at all
of the weather.

That's a bad example: there is only one global weather system, and the
predictiveness of the theory can be judged only over many forecasts.
Anyway, modern weather theory is based more and more on models, which makes
it into an example of testing specimens.

I'm not arguing that the discovery of a relationship between levels of a
variable and average performance constitutes a universally-applicable Law of
behavior, although clearly you are imagining that I am. (As a reality check
you might try re-reading my post. You won't find any such argument there.)

What I find are statements like "low pay is followed by a high evaluation."
If that doesn't sound like a universally-applicable law of behavior, I
don't know what does. How would that sound if you described the results
more truthfully? "Sometimes people will give a high evaluation to a job if
the pay is low, and sometimes they won't, and I can't tell you when the
one or the other result will occur."

What came out of this study was a relationship -- confirmed in a large
number of followup studies -- between two variables that can be observed in
more-or-less normal college students when other factors are statistically
equated. A lot more would have to be known about other interacting factors
before one would be in a position to use this information for individual
prediction.

In _all_ more or less normal college students? Obviously not. And anyway,
"confirming" this relation for more populations shows only that it exists
in populations. It is quite possible that the actual relationship in every
individual is different from or even opposite to the population effect. And
don't tell me that's impossible: I've proven that it's possible.

What we need to know are the chances of predicting incorrectly a person's
evaluation of a task for which the pay is high or low. To determine this,
we need to know how many people behaved in the expected way, and how many
in the other way. Those who understand statistics can then compute the
chances that a prediction will be wrong for a given individual.

This assumes that this variable alone would be used for this purpose,
without any further investigation of conditions under which, in the
individual, it would be expected to influence the person's rating of the
task in the predicted way. Geez, Bill, I _agree_ that it probably isn't
much use by itself. I never argued that it would be.

Then can we drop the idea that this experiment somehow "supports" the idea
of cognitive dissonance?

I am not defending any "heinous scientific crimes," Bill. I certainly do
agree, however, that there has been a lot of knee-jerking going on in
response to my post. At the beginning of your post you said you were
disappointed in me. If anyone has a right to feel disappointed, I do. I
expected a more careful reading on your part of what I had to say than you
evidently gave. You merely used my post as an excuse to trot out all your
old arguments against a position I wasn't even defending.

Perhaps I do you an injustice. Are you now agreeing that population studies
can't tell us anything of use about individual characteristics?

Best,

Bill P.

[From Bruce Gregory (971120.1010 EST)]

Bill Powers (971120.0430 MST)]

Bruce Abbott (971119.2000 EST)--

Bruce Abbott has provoked Bill into providing one of the
clearest and most lucid treatments of the value of studies of
groups. (No offence to Phil, I love his book.)

Thanks Bruce.

the Other Bruce

[From Bill Powers (971120.0655 MST)]

Hans Blom, 971120d --

Imagine that you were perfectly able to predict the future, yet the
future was not fully predetermined and you were still free to choose
your actions. You would know, for instance, what the stock market
would do the coming week and which lottery ticket would show up with
the winning number tomorrow. You can easily come up with even more
desirable predictions. No doubt most people would consider themselves
to be far better in control than we are now.

True, but this is a fairy-tale. Don't get me wrong: I'm not saying that
predicting the future is useless, especially when we're talking about
population effects. But I think people have relied too much on planning and
prediction, which (in the real world) are very limited in their capacity to
forecast correctly. It is at least as important to be able to deal with
disturbances as they arise, even when one does not know they exist and has
not anticipated them. It's nice to have your airline ticket all ready to be
collected as you reach the gate, but you can't predict that you're going to
drop it so you have to hold up the line while you retrieve it and turn it
the right way around. Nevertheless, you _can_ retrieve it and restore
everything to its proper state. If you relied entirely on prediction, you'd
be helpless the moment something unpredicted happened.

Thus, it is important (for the quality of control) to know the
environment function. Regrettably, we can't in real life: we cannot
predict the future -- almost anything might change in unforeseeable
ways. Yet, the more foreseeable/predictable changes are, the better
we will be able to control if we also have the means to influence the
future course of affairs. Think of it as being able to greatly
increase the signal to noise ratio in the path from action to
perception.

I think of it more as being able to _slightly_ increase the signal-to-noise
ratio concerning mainly _cognitive_ aspects of experience (the kind you
feature in your examples). It's too easy to overlook the uncertainty in our
predictions, and blot from memory all the times when failure of a
prediction has got us into more trouble than it was worth. The fact that it
may be crucially important to make a correct prediction has nothing to do
with how correct our predictions will actually be. The more important it is
to predict correctly, the higher will be the cost when the prediction is
wrong.

Thus humans invent ways to forecast the future. We cannot _know_ the
environment function, but we can make a more or less accurate guess
of what it is. And the more accurate that guess is, the better we are
able to control.

That's true, but this doesn't mean that all control can be substantially
improved by making good predictions. If you can already control in the
presence of the kind of disturbances that actually occur, and do it so well
that any further improvement would be of negligible importance, investing
in a lot of predictive machinery would be a waste of resources. I think
this is pretty much the case for the lower five or six levels of control in
the HPCT hierarchy.

Don't forget that even after you have predicted what is likely to happen,
and have selected the appropriate action to take, you still must turn that
selection into the actual action that has the required effect. And that
requires present-time closed-loop control, because even your muscles act in
unpredictable ways, gaining and losing sensitivity to neural signals
according to recent use. And of course the details of the external world
keep shifting in countless ways, so you have to be able to vary your
actions according to the current external circumstances, only a small part
of which you can sense. You can't plan how you're going to turn the
steering wheel before you take the automobile trip.

Similarly with human individuals. If we know a person very, very well,
we will usually be able to predict his or her behavior accurately, even
on the basis of very little data. "I'm going to mail this letter",
she says, and already your mental eye can see her leave the house
through the front door, carefully closing it behind her. You see
her take a right turn at the sidewalk. You see the movements of her
body and of her long blue overcoat. You see the size of her steps.
Your imagination could follow her to the mailbox, see her carefully
slide the letter in, and follow her path back home the other way
around the block. It's not that she is an inflexible robot that
always behaves the same way: you can even see that if she meets one
particular neighbor on her way, she'll take a few minutes to exchange
gossip. You imagine lots of other if/thens as well.

All these examples of prediction involve the highest levels of perception
and cognition, as well as large degrees of uncertainty. You may predict
that she is going to mail a letter, and that she might meet a friend on the
way, and lots of other things, but you can't predict that it is going to
start raining, or that she will have to dodge a car, or that she will
remember something she meant to buy and make a side-trip to a store. You
can make only vague and general predictions at this level, which means that
the kinds of variables you can control are also vague and general. You can
carefully time your actions so that just as she returns you will greet her
at the door with a glass of wine -- only to find yourself waiting, half an
hour later, while she chats with the postman at the mailbox. How
thoughtless of her, to behave in a way you hadn't counted on!

It's nice to be able to predict the future, but in my opinion we trust our
predictions too much and tend to forget the failures, at the expense of
learning how to deal with life as it happens.

Best,

Bill P.

[From Bruce Gregory (971120.1050 EST)]

One example of the perils of inferring anything about
individuals from group statistics is the apparent penchant of
Americans for a President from one party and a Congress
dominated by the other party. While the American People have
this preference, very few voters do. Split-ticket voting is
quite rare.

Bruce

[From Richard Kennaway (971120.1500 GMT)]

Bill Powers (971119.1428 MST):

So the question is, what kind of law is it that comes out of this study? Is
it one of those findings that is so robust that one could use it in dealing
with individuals with little risk of a wrong prediction? To answer that
question we have to look at the results of the experiment. Rick has
evidently looked at them; perhaps you might want to double-check him. And
Richard Kennaway, who has analyzed statistical findings in some detail,
might do us the favor of commenting on this.

You're all doing fine, and I have appreciated the discussion. I'm
(finally!) putting the finishing touches to my paper on correlations and I
hope to be sending it off to Science before the end of the month, though
tracking down some references may take longer. The recent discussions here
have given me some useful ideas for stating its conclusions clearly and
forcefully. Here's what the current version says about the sort of
statistical argument discussed in this thread:

    Suppose that the bivariate data arise from taking some number of
    individuals, and obtaining from each individual some number of pairs
    $(x,y)$. The set of all the data will have a certain correlation $c$
    between $x$ and $y$. The set of data from the $i$th individual will
    have a correlation $c_i$. What relation may hold between $c$ and the
    individual $c_i$? What relation may hold between the regression line
    for the whole data and the regression lines for individuals?

    No relation need hold at all. To visualise why this is, imagine the
    scatterplot of the whole set of data. If $c$ is positive, this will
    have the general shape of an oval in the $xy$ plane whose long axis
    has positive gradient, as in Figure~\ref{figure-contours}. Each
    individual's data will consist of some subset of those points.
    Clearly, it is possible to cover the oval with smaller ovals whose
    eccentricities and long axis directions bear no relationship to each
    other nor to those properties of the whole oval.

    The moral of this is that no argument can be made from a population
    correlation to the correlation for any individual. The population
    correlation is a property only of the population, and not of any
    individual in it; {\it any} relationship between the population
    variables is consistent with {\it any} individual relationship.
    See~\cite{Powers} for an experimental example in which the
    relationship between an independent variable and a dependent
    variable was, for every individual in a population, the opposite of
    the relationship of the population averages of the variables.

Of course, when the number of data points for each individual is 1, as it
appears to be in the pay/satisfaction study, the argument from population
to individual reaches even greater heights of silliness.

\cite{Powers} is the Am.Beh.Sci. paper that Bill Powers mentioned, but I
haven't seen it yet. Can someone give me the exact reference? The library
here only has volumes 10-28 (1966/67-84/85). I have some of the PCT
anthologies at home -- is it republished there?

Bruce Abbott (971119.2000 EST)

What came out of this study was a relationship -- confirmed in a large
number of followup studies -- between two variables that can be observed in
more-or-less normal college students when other factors are statistically
equated. A lot more would have to be known about other interacting factors
before one would be in a position to use this information for individual
prediction.

Can you give me references for some of these studies? The government took
my guns away, so this is the next best thing. :-)

BTW, Science are strict about prior publication, so I won't be placing this
on the Web, and I've taken down the older version. Anyone who wants a copy
when it's finished can ask, letting me know whether they would like a
PostScript file by email or a printed copy.

-- Richard Kennaway, jrk@sys.uea.ac.uk, http://www.sys.uea.ac.uk/~jrk/
   School of Information Systems, Univ. of East Anglia, Norwich, U.K.

[From Rick Marken (971120.0800)]

Bill Powers (971120.0430 MST)

In Rick Marken's special issue of the American Behavioral Scientist,
I published a little simulation study... This paper, beside being
about PCT, was intended as a cautionary tale. The _apparent_
relationship you get from varying IVs and measuring DVs over a
population is NO INDICATOR AT ALL of the actual relationship
between IV and DV for any individual in the group, and can be
completely wrong for ALL of them, as in my example. There is simply
no substitute for measuring individual characteristics, one at a
time.

Thanks. I meant to mention this little beauty yesterday. I'm sure it
will not convince Abbott that statistical studies tell us nothing
about individuals -- he's got a psychological statistics book to
write and sell, after all; it's business, not personal -- but
anyone with a shred of intellectual integrity left will find it
most illuminating.

If you're dealing strictly with populations, as would be the case
for educational institutions, insurance companies, government
programs, and market research organizations, then you don't care
_why_ any observed population characteristic exists. You simply
take advantage of it, to your profit. That's where casting nets
is a valid approach. But if your success in predicting population
effects leads you to think you've learned something about human
nature, you're simply deluded. You've learned nothing about any
individual. To understand individual characteristics, you have to
use the method of testing specimens, and this means making and
testing models. If people share anything in common, it is at the
level of organization, not behavior. We are all control systems,
but what we control, and how, and to what ends, is almost
infinitely variable.

This paragraph lays out, as clearly and concisely as I have ever
seen it, the difference between the PCT and conventional approaches
to psychological research. In particular, I like the idea that what
people share in common is not behavior but _organization_. In fact,
the group-based research approach of conventional psychology
_assumes_ a model of organization (S-R) and turns the statistical
crank to see what kinds of behaviors emerge from this (unconsciously
assumed) organization. The individual-based research approach of
control theory does not assume a particular organization; rather,
it is aimed at testing to determine _whether_ variables are, indeed,
under control and, if they are, what kind of individual organization
can account for this control.

Hank Folson (971119) to Hans Blom (971118) --

Beautiful post, Hank.

David Goldstein (971120) --

The research literature is a source of possible treatments,
which may or may not work in any particular case. We are stuck
with an almost trial and error mode. The value of the research
literature is that it suggests which approaches may be more
worthwhile to try. It is understood that there is a warning label
on the research: This author will not be held responsible if the
treatment does not work for you. Buyer beware!

This seems like an absolutely ridiculous state of affairs to me.
Why pay for clinical research if treatment is a trial and error
process anyway? I don't think the research is justified as a
"source of possible treatments"; heck, clinicians are certainly
capable of thinking up _possible_ treatments on their own. Why
not just call off this dumb, group level research process as a
way of studying individuals (and treatments are, as you note, aimed
at individuals) and start doing the research the right way -- one
individual at a time? Actually, I already know the answer to
_that_ question -- the Bruce Abbotts (instead of the Richard
Kennaways (971120.1500 GMT), god bless him) of the world are
running the psychological research establishment and will be for
the foreseeable future. PCT is a barely noticeable and only slightly
annoying gnat in the library of textbooks on how psychological
research should _really_ be done.

Best

Rick

--
Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken

[From Rick Marken (971120.0830)]

Richard Kennaway (971120.1500 GMT) --

\cite{Powers} is the Am.Beh.Sci. paper that Bill Powers mentioned,
but I haven't seen it yet. Can someone give me the exact reference?

AMERICAN BEHAVIORAL SCIENTIST, Volume 34/Number 1, September/October
1990. Issue devoted to: Purposeful Behavior: The Control Theory
Approach, Edited by Richard S. Marken.

It's published by Sage Publications. They have a Web page (URL not
handy at the moment, sorry) and a London office so it should be easy
to get a copy. I think Bill's contribution to that volume would be a
great citation for your Science article (and I will be briefly
converting to my favorite religion, Catholicism --if it's not
mystical, prejudiced and superstitious it's just not a religion,
as far as I'm concerned;-) -- so that I can pray for acceptance of
your article in Science.

Best

Rick

--
Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken

[From Bruce Abbott (971120.1300 EST)]

Richard Kennaway (971120.1500 GMT) --

You are of course correct that relationships shown by population means need
not correspond to the relationship found in any individual from that
population. I have illustrated this well-known fact below:

               |                            .
               |                             .
               |                              .
               |                 .
           B   |                  .
               |                   .
               |      .
               |       .
               |        .
               +---------------------------------
                                A

Here, each individual is represented by three points, in which an increase
in variable A is associated with a decrease in variable B. However, the
trend evident _between_ individuals shows a different relationship, such
that an increase in A is associated with an increase in B. The inverse
relationship is a characteristic of each individual in the group, whereas
the direct relationship is a characteristic of the group as a whole. So
far, so good.

Question: why does the group as a whole display a direct relationship
between variables A and B? Answer: The individuals must differ by some
third factor which affects _both A and B_, and in the same direction.
(Note: the "third factor" could actually consist of the composite effect of
many "third" factors.) Thus, as the unobserved factor C varies, it carries
both A and B with it (e.g., as C increases, both A and B increase).

If this third factor were held constant, then all individual functions would
lie atop one another and (in this example) group means at each level of
variable A would follow the same trend as the individual functions.

If we randomly assign these individuals to two groups, individual values
of C will tend to equalize, on average, across groups. We now apply a
particular value of variable A to one of the groups and a different value to
the other. To the extent that C is on average equal in the two groups, mean
differences in B between the groups cannot be due to differences in C, the
factor whose influence on both A and B in the original dataset produced the
spurious population relationship. The observed relationship between group
means will look much like the individual relationships, insofar as these are
similar to one another. (Large individual differences in this relationship
will tend to produce an apparent lack of relationship between the two
variables in the group means rather than a strong spurious one in some
irrelevant direction.)

The Powers demo to which you referred is not an experiment but a
correlational design, and it does indeed suffer from the defect you note.
But a correlational design lacks the statistical control of extraneous
variables found in randomized-groups designs, so it does not provide a
valid demonstration of any problem inherent in the latter.

Regards,

Bruce

[Martin Taylor 971120 15:40]

Bill Powers (971120.0430 MST)

In Rick Marken's special issue of the American Behavioral Scientist, I
published a little simulation study in which 4000 control systems produced
some degree of effort to produce some amount of reward in the presence of
varying costs (disturbances). The individuals had reference levels for the
desired amount of reward that were distributed over a range around a mean
value. For the whole population, the result was an apparent increase of
effort with increasing reward. With 4000 samples, this was a highly
significant relationship.

This apparent relationship followed from the fact that individuals with
higher reference levels for reward had to work more to get the higher
levels of reward. But for _every individual_, the actual relationship was a
strong _decrease_ in effort with increasing reward. As the obtained reward
increased toward the reference level (due to random decreases in costs) the
behavior sharply _decreased_. So the apparent group relationship between
independent and dependent variables was as wrong as it could get as an
indicator of individual characteristics.

This paper, beside being about PCT, was intended as a cautionary tale.
... as in my example. There is simply no substitute for measuring
individual characteristics, one at a time.

Isn't it a little disingenuous of you STILL to be citing that paper in this
context? Or have you reversed your understanding of some years ago that
the average of slopes is not necessarily the slope of averages? The error
is _only_ in assuming that taking slopes commutes with taking averages,
NOT in assuming that:

The
_apparent_ relationship you get from varying IVs and measuring DVs over a
population is NO INDICATOR AT ALL of the actual relationship between IV and
DV for any individual in the group, and can be completely wrong for ALL of
them

If I were you, I'd stop trying to pretend that your paper has anything
whatever to do with the use of group statistics for predicting individual
effects. It doesn't, and never did.

There are plenty of other arguments, many of them mentioned in this wordy
set of messages. But most of them come down to one of two questions: can
"significance tests" ever be used as a valid index of anything?; and how
much variation in individual behaviour is important before the usefulness
of the group measure becomes essentially zero?

My answer to the first question is and always has been (since graduate school
days) NO--significance tests NEVER tell you anything of value. My answer
to the second question depends on the specific circumstances. If I want to
predict what an individual will do under specific circumstances, very small
variations in individual behaviour are enough to scupper me; if I want to
tease out what influences affect the way a person controls, then individual
variation can be very large compared to the group effects without becoming
uninformative. (Incidentally, this was my big complaint about Richard
Kennaway's draft paper, and if he hasn't changed his conclusions, it will
be my complaint about the publication version as well).

Bill implicitly agreed with this position in his criticisms of our study
on the effects of sleep deprivation and drugs on tracking performance in
a range of tasks. He asserted that it was necessary for a lot of extraneous
circumstances to be held constant or the model wouldn't be appropriate.
Naturally:-) But there are also probably a lot of other circumstances that
an experimenter doesn't know about, and would like to know about. If
the precision of control varies when the experimenter tries changing
some of these (or does so inadvertently), it shows up as correlated
variation (or as individual differences).

------------------

I think the argument is being conducted on the wrong grounds. There's no
"scientific crime" in looking for effects that are small compared to
individual variation in the thing studied, if the idea is to try to see
what influences exist. Every time you can find an independent source
of 5% or 10% of the variability, the next test becomes more precise (or
the experimenter can make it so). That's incontrovertible.

The "scientific crime" comes in a quite different area--two areas, in
fact. One is the joint equation of "not significant" with "non-existent"
and of "significant" with "important."

The second area of crime is much more relevant to PCT. When the researcher
looks at the inputs to the individual and the outputs, the relationship
can be stable only under very special conditions--normally called "good
laboratory control." In PCT terms, those conditions include (but are not
limited to) stable values of reference inputs to the control
systems active during the experiment; stable disturbances, or at least
disturbances varying _only_ as the experimenter wishes and measures;
constancy of all "unexperimented" sensory input to the subject; constancy
of all "uninteresting" output that nevertheless might be part of the
subject's control loop(s),... A lot of holding things constant and requiring
stabilities that cannot be guaranteed. Lack of constancy in any of these
things shows up as "noise" in the experimental results, meaning that
there will be individual differences and session differences within
individuals showing up in the statistics--differences that would not show
up in a properly conducted "Test".

That's quite apart from the theoretically based objection (which is valid
only if you start by asserting that PCT is correct, which will not be true
of your run-of-the-mill psychologist) that the experiments study the
wrong thing in the first place, the input-output relationship rather
than Rick's oft-restated "what is controlled?"
--------------------

PCT is a theory that applies to all living things, and to human psychology
in particular, in the same way as Newton's laws apply to all material
things, and planets and gas molecules in particular. And as with Newton's
laws, PCT can be used to predict individual behaviour only if all the
boundary conditions are well specified, as they are in simple tracking
studies, and as they are not when we deal with the social interactions
of groups of people, or even the behaviour of individuals when the
environmental feedback loops are complex.

But as with Newton's laws, with which the interactions of large groups
can be described statistically, so with PCT laws, the interactions of
large groups of people can in principle be described statistically (I tried
to make a start at that in a talk I gave in late October, and I hope to
put some of it on a Web page soon). The laws of PCT may well describe
accurately the behaviour of every individual, and yet not be adequate
for prediction of the behaviour of any individual, for the same reasons
that Newton's laws cannot be used for an accurate prediction of the motion
of three or more bodies orbiting each other in a gravitationally uniform
space (let alone a space with other bodies in it!).

Summing up, I think that you guys have been conducting a religious war,
not a rational discussion. Which is why I didn't chime in earlier. I hoped
that thought would eventually be substituted for rhetoric. But since
that doesn't seem to be happening...

Sorry.

Martin

[From Bruce Abbott (971120.1700 EST)]

Bill Powers (971120.0430 MST) --

Bruce Abbott (971119.2000 EST)

It's not a defense of the application of group statistics to individual
characteristics, and that's where you and others on CSGnet are missing the
boat. It's a defense of the application of group statistics to the
identification of influences on behavior (casting nets).

It's an identification of (1) _apparent_ influences on (2) _group_ behavior.

(1) Apparent in that, given the statistical nature of the analysis, it is
always slightly possible that the observed effect is due merely to chance
(and the analysis reveals just how unlikely such a result would be if chance
and chance alone were operating). (2) Not only on group behavior (see reply
to Kennaway).

If this apparent influence on group means were as strong as the apparent
influence of cooking temperature on cookie brownness (which you gave as an
example in another post) I would object far less. _All_ of the cookies get
more done than they started out, even at the lower temperature. The normal
result is for the cookies to vary in brownness across temperatures far more
than individuals vary from the mean brownness at each temperature. But this
is not normal for ANOVA results from psychological experiments. What you
end up with in most psychological experiments is the equivalent of some
cookies getting _less_ done than they were when they were raw dough.

There is no way you could know this by examining the experimental data from
such a study. A cookie baked at a given temperature is a certain shade of
brown. You can't tell by looking at it whether it would have been more or
less brown if baked at some other temperature. As to the complaint about
typical size of effect, I discuss this below.

In psychological experiments, what we usually observe when the independent
variable is changed is that some subjects show an increase in the dependent
variable, some show no change, and some show a decrease.

If you observe separate groups of subjects under different levels of the
independent variable, how can you tell? Each subject has been observed
under only one level. If you refer to repeated measures designs (each
subject is measured under all the different levels of the independent
variable), then you do get the effect of the independent variable, but these
are contaminated to some degree by extraneous variables (e.g., fluctuations in
attention). If changes in performance differ across subjects, this will be
apparent in the data and will tend to weaken the average effect of the
independent variable. Each subject in this design serves to replicate a
single-subject manipulation. If most individuals tend to respond to the
independent variable in a similar way, the design will reveal that. If not,
the design will reveal that, too.

If there are
slightly more people who show an increase than a decrease, the customary
conclusion is stated as if _all_ of the people showed a _slight_ increase.
If the increase is highly significant, it is simply said that there is an
increase, without mentioning its magnitude. The obvious assumption is that
there was in fact an effect in every person, but that uncontrolled
variables introduced random noise that masked the effect in any individual.

If variables have such inconsistent effects across individuals, then there
are other variables that explain these differences. One can search for them
using the same methods. As to the problem of weak but statistically
significant effects, I agree with you and so do a lot of other experimental
psychologists. Journals are now demanding that some measure of "effect
size" be given in addition to the usual p values and group means. Effect
size shows the magnitude of the difference in means relative to level of
within-group variability, a kind of signal-to-noise ratio.

The problem of a low signal-to-noise ratio arises in many psychology
experiments because individual differences (both long-term and momentary)
often have a relatively large impact on the dependent measure, thus tending
to swamp the "signal" of the independent variable. There are certain
exceptions: arm/hand movements, for example, tend to be precise relative to
the range of possible motion and are not much affected by most subject
variables except in pathological cases or extreme conditions. This is the
reason PCT tracking studies generate high correlations between disturbance
and mouse movement both within a given subject and across subjects. Cursor movement
is a nice, relatively noise-free dependent measure, so the signal-to-noise
ratio (in assessing the effect of disturbances during the tracking task) is
high. This is great for PCT as it allows good predictions about tracking
behavior to be generated from the model. In other investigations of other
variables the influence of extraneous variables on the dependent measure can
be large and beyond the ability of the experimenter to physically remove
them. This is the situation for which statistically-based designs were
developed.

Incidentally, I was asking about the actual numbers in that experiment for
real, not rhetorically. Can someone supply them?

The experiment in question was published in 1959 and I do not have immediate
access to the journal. Perhaps Rick does -- his local campus library is
better than mine.

The method invites and almost demands misinterpretation. How about this
conclusion: when you give low pay for a job, the people who pride
themselves on doing a job for its own sake report higher satisfaction than
others who have no opinion on that subject. When you give high pay, _other_
people who consider monetary reward to be the index of satisfaction report
higher satisfaction than the others report, including the others who
would report greater satisfaction for lower pay. So the results are not
indicating any general human characteristics; the differing rates of pay
are selecting different subpopulations. While it may be true that there is
an inverse relation between pay and job satisfaction, that is true only of
some people, and for the rest it is false or the exact opposite of the
truth. And since you have no way to detect these subpopulations in advance,
the hypothesis is, in practice, useless for dealing with individuals.

Your example assumes that the only information to be had on the effect of
the variable in question is that first study in which the phenomenon was
documented. According to you, we are stuck with making broad
generalizations about the applicability of the finding, based on this one
result. In reality, the initial finding was followed up by a substantial
body of research aimed at identifying those very boundary conditions that
limit its generality.

In Rick Marken's special issue of the American Behavioral Scientist, I
published a little simulation study in which 4000 control systems produced
some degree of effort to produce some amount of reward in the presence of
varying costs (disturbances). The individuals had reference levels for the
desired amount of reward that were distributed over a range around a mean
value. For the whole population, the result was an apparent increase of
effort with increasing reward. With 4000 samples, this was a highly
significant relationship.

This apparent relationship followed from the fact that individuals with
higher reference levels for reward had to work more to get the higher
levels of reward. But for _every individual_, the actual relationship was a
strong _decrease_ in effort with increasing reward. As the obtained reward
increased toward the reference level (due to random decreases in costs) the
behavior sharply _decreased_. So the apparent group relationship between
independent and dependent variables was as wrong as it could get as an
indicator of individual characteristics.

This paper, beside being about PCT, was intended as a cautionary tale. The
_apparent_ relationship you get from varying IVs and measuring DVs over a
population is NO INDICATOR AT ALL of the actual relationship between IV and
DV for any individual in the group, and can be completely wrong for ALL of
them, as in my example. There is simply no substitute for measuring
individual characteristics, one at a time.

See my reply to Richard Kennaway. The phenomenon you describe results from
your use of an inadequate research design (correlational in nature rather
than experimental), and has nothing whatever to say about the results
obtained from properly controlled experiments. Kennaway is no doubt a fine
mathematician, but he seems to be rather ignorant about the nature of
experimental designs as pioneered by Sir Ronald Fisher, and about what
conclusions can be drawn from their results. (I allow the possibility that
I will soon be eating crow!)

What is shown in such an evaluation is that behavior differed on average
across the levels of the independent variable, under the conditions of the
study, when other factors are unlikely to account for the difference. This
indicates that the independent variable very likely has some influence on
the behavior that was observed and recorded, in the sample of individuals
tested. It does not indicate that this variable was effective in every
individual tested, nor does it show that this variable would have an effect
consistently over time in the same individual. Nevertheless, it is a
variable having an influence on this dependent measure in enough
individuals, at the time of testing, to be detected by the procedure.

But there is no reason to think that it is a correct indicator of any
individual's characteristics, as I tried to show in my ABS paper. You're
echoing the standard rationale for statistical studies, and that standard
rationale is simply wrong.

No, what is simply wrong is your understanding of that rationale. Again, I
refer you to my reply to Kennaway.

To understand individual characteristics, you have to
use the method of testing specimens, and this means making and testing
models. If people share anything in common, it is at the level of
organization, not behavior. We are all control systems, but what we
control, and how, and to what ends, is almost infinitely variable.

I agree that to understand _an_ individual's behavior, we have to study
_that individual_. But to learn what sorts of factors may influence
_individual behavior_ (not _an_ individual's behavior), the group-based
methods I have been defending here can yield useful information, and
sometimes they are the _only_ way to obtain that information.

I'm not arguing that the discovery of a relationship between levels of a
variable and average performance constitutes a universally-applicable Law of
behavior, although clearly you are imagining that I am. (As a reality check
you might try re-reading my post. You won't find any such argument there.)

What I find are statements like "low pay is followed by a high evaluation."
If that doesn't sound like a universally-applicable law of behavior, I
don't know what does. How would that sound if you described the results
more truthfully? "Sometimes people will give a high evaluation to a job if
the pay is low, and sometimes they won't, and I can't tell you when the
one or the other result will occur."

We are going to have to get this study described correctly. It involved
paying subjects either a high or low amount of money after they had
completed a boring and fatiguing hour-long task. The subjects were paid for
their efforts at the end of the hour, and then were asked to help the
experimenter by deceiving the next subject to be tested, by telling him or
her that the task would be interesting and lots of fun. (The "next subject"
was really a decoy.) The real subject was then interviewed to get his or
her private opinion about the task. On average, those who were paid little
tended to rate the task more positively than those who were paid much more.

You will be surprised to learn that this was a study about control of
perception, and that its results were what control theory would predict.
For now I will say simply that the investigators were much more cautious
about the implications of their results than you suggest here. Please note
(this tends to be overlooked here) that I am not defending the sort of
overgeneralizations about which you complain, and I find them just as
appalling as you do when I encounter them. However, that such abuses do
occur is no justification for condemning the research method itself.

In _all_ more or less normal college students? Obviously not. And anyway,
"confirming" this relation for more populations shows only that it exists
in populations. It is quite possible that the actual relationship in every
individual is different from or even opposite to the population effect. And
don't tell me that's impossible: I've proven that it's possible.

You have proven no such thing. What you've done is call attention to the
"third variable problem" in correlational research. It doesn't apply to
properly controlled experiments.

Are you now agreeing that population studies
can't tell us anything of use about individual characteristics?

In general they are not much use for predicting the behavior of a _given_
individual. For that you need a knowledge of individual parameters. But I
am still asserting that these methods (and they are not "population
studies") can and do provide useful information about variables that
influence individual behavior. Fair enough?

Regards,

Bruce

[From Rick Marken (971120.1530)]

Bruce Abbott (971120.1300 EST), Bruce Abbott (971120.1700 EST) --

I presume that you are now willing to admit that you were, are
and have always been a Klingon;-) I hope Scotty will be a bit
more careful about what he beams up from now on;-)

Martin Taylor (971120 15:40) --

Oh, please, Martin!

More detailed responses to follow as soon as I finish reading my
horoscope.

Best

Rick

--
Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken

[From Rick Marken (971120.1800)]

re: Bruce Abbott (971120.1300 EST), Bruce Abbott (971120.1700 EST), and
Martin Taylor (971120 15:40), I said:

More detailed responses to follow as soon as I finish reading my
horoscope.

Again, what am I thinking? I'm not going to be able to tell
these guys anything that will change their minds. I admire
the persistence of those (like Richard Kennaway and Bill Powers)
who are willing to continually re-expose the impropriety and
perniciousness of using group data to study individual behavior.
And I suppose Abbott and Taylor should get some credit for
showing Richard Kennaway the kind of garbage he can expect to
get back from the Science reviewers.

But the group-based statistical approach to behavior research
is one big ship and nothing has been able to change its course
yet. It's a comfortable and familiar way to do research; it
doesn't require much thought (once you get the knack of it) and
it provides the kinds of results that make good psychological
"sound bites": bystanders will help in an emergency if they see
that no one else is around to help; people will express greatest
satisfaction with the task they are paid least to perform;
reinforcing consequences increase the probability of a response;
alcoholics who believe in god are more likely to be able to
quit drinking than those who don't; Pisces are better lovers
than anyone else (OK, that one's just in the research proposal
stage ;-)), etc. This is really what statistical group-based
research is all about -- careers are made or not depending
on who can get results that have the greatest crowd appeal.

I think the only thing we PCTers can do about this is just go
about our business of studying individual behavior the correct way;
we can provide examples and explanations of how to do this but
that's about it. My "Dancer..." paper, for example, is a first
step at trying to explain to an audience of behavioral scientists
the proper way to do behavior research. That paper should be out
in a couple weeks (the abstract is up at the APA web site --
http://www.apa.org/journals/met/1297tc.html) and it will be interesting
to see if it creates any disturbance at all (probably
not; Bruce Abbott, an officer on the good ship Conventional
Research, read it and had no problem with it). But some people,
more interested in doing science than writing National Enquirer
headlines, will follow our lead (if we provide it). Those who
want to sell textbooks (or be on Oprah) will continue doing statistical
group-based studies and assuming that the results
apply to individuals.

By the way, this post is so incredibly intelligent because I
have been listening to Mozart (a symphony he composed at the age
of 12!) while writing it. Maybe there's something to that
group-based research after all ;-)

Best

Rick

--

Richard S. Marken Phone or Fax: 310 474-0313
Life Learning Associates e-mail: rmarken@earthlink.net
http://home.earthlink.net/~rmarken/