The Right Tool For the Job

[From Bruce Abbott (950102.1615 EST)]

My posts seem to be getting "stuck" in our SMTP server. The last one
languished there for a day or two. Ah, well...

Bill Powers (941231.1235 MST)

Bruce Abbott (941230.1700 EST)

The problem with the standard statistical analysis has nothing to do
with the analysis per se. It's in the assumed organization of the system
that is analyzed. What your analysis would reveal would be a high
correlation between the controlled variable of a control system and the
setting of its reference signal. But a person knowing nothing of how the
system in the aircraft works might well think that the joystick controls
a stimulus which affects something in the aircraft that causes it to
produce a response in the form of a control surface angle.

Yes, I agree.

Isn't it interesting how much I was able to learn about the system
using that outdated methodology described in my text?

You could learn about observable relationships, but the standard
statistical methods can't tell you that your underlying model is right.
I don't believe that the statistical approach would have suggested that
the control surfaces would push back against your attempts to deflect
them.

Perhaps not, but that fact is easy to demonstrate empirically using ordinary
"IV-DV" methods. I could push or pull the control surfaces with various
amounts of force (IV) and measure the counterforces generated by the servo
(DV). In fact, this is all that "the Test" amounts to, isn't it, i.e.,
applying a disturbance to some variable and observing the system's response?
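
The pushback Bruce describes is easy to sketch numerically. The servo model and gain below are hypothetical, invented only to illustrate the force-vs-counterforce measurement; they come from no actual airplane:

```python
# Hypothetical high-gain servo holding a control surface at its reference
# angle.  Pushing on the surface (IV) elicits an opposing torque (DV) that
# nearly cancels the applied force; only a small error remains.
def servo_counterforce(applied_force, gain=100.0):
    # At equilibrium the residual error is applied_force / (1 + gain),
    # so the counterforce is -applied_force * gain / (1 + gain).
    return -applied_force * gain / (1.0 + gain)

forces = [-2.0, -1.0, 0.0, 1.0, 2.0]               # the manipulated IV
counter = [servo_counterforce(f) for f in forces]  # the measured DV
# counterforce is nearly -applied_force: the surface "pushes back"
```

Plotting `counter` against `forces` would show the near-perfect negative linear relationship the IV-DV analysis would report.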

Anyway, the correlations you would get in this experiment would be in
the high nineties, the kind we're interested in in PCT. I don't know
your opinion on the subject of correlations in the eighties or lower, or
on the subject of using population measures to predict the behavior of
individuals.

In my methods text I recount how, when inferential statistical tests were
first making their way into experimental psychology, journals sometimes
published research articles that contained elaborate ANOVA tables with all the
sources of variance, sums of squares, degrees of freedom, and so on, but which
failed to contain any mention of the actual data--not even the means were
presented. The story is a warning against believing that the results of the
inferential analysis are the important result of the research. I also warn
students that relationships shown by group averages may bear little
resemblance to those shown by individual subjects (Sidman, 1960), and devote a
whole chapter to single-subject methods.

My view of research methods and of statistics is that they are tools. As with
all tools, when used correctly for the purpose for which they were designed,
they can be helpful. When used incorrectly, for the wrong job, they can be
worse than useless. I would not argue that you remove the screwdriver from
your tool-kit just because someone might try to use it as a chisel.

Pearson correlation is one of those tools that too many researchers use
incorrectly, but which can be useful if one is aware of its limitations. It
is fine as a summary index of relationship when the relationship being
summarized is linear, there are enough independent points (two points will
always give r = 1.0), there are no serious outliers in the data, and the
ranges of the
variables involved are not overly restricted (r basically compares the scatter
above and below the best-fitting straight line to the scatter ALONG the line;
the more elongated the oval, the higher the r). I would prefer to work with
scatterplots, residual plots, and measures of variance accounted for (of which
r-squared is one) than with Pearson r, but as a compact summary of the
relationships obtained in PCT experiments it seems perfectly suited.
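
The caveats above are easy to demonstrate in a few lines. `pearson_r` is a hand-rolled helper written for this sketch, not a routine from any statistics package:

```python
import math

def pearson_r(xs, ys):
    # Plain Pearson product-moment correlation, computed from scratch.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Two points always yield |r| = 1.0, whatever the "relationship":
r_two = pearson_r([1, 2], [5, -3])   # -1.0: any two points lie on a line

# A perfectly lawful but nonlinear relationship gives r = 0:
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]             # y = x^2, fully deterministic
r_quad = pearson_r(xs, ys)           # 0.0: r misses the curve entirely
```

The quadratic case makes the linearity caveat vivid: a perfectly deterministic relationship that r summarizes as no relationship at all.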

Anyway, the correlations you would get in this experiment would be in
the high nineties, the kind we're interested in in PCT. I don't know
your opinion on the subject of correlations in the eighties or lower...

It depends on the context. As a student of the experimental analysis of
behavior, I was taught that sloppy relationships indicate a lack of sufficient
experimental control; that such results should precipitate an investigation
aimed at identifying the uncontrolled sources of variation and bringing them
under experimental control, and that such a search often identifies important
variables which themselves may become the subject of future research. The
high correlations achieved between disturbance and response in the PCT model
are impressive precisely because they demonstrate this kind of experimental
"control" (in the EAB sense). In this context, one would want the
correlations to be very high--as close to 1.0 as possible. In other contexts,
moderate correlations may be useful.

But let's return to the context of my original post. What got me started
along this line was Rick's assertion that the methods outlined in my text are
useless for studying "living control systems." I agree with you that a
correct model is necessary if certain data are to be interpreted correctly,
and that different methods of analysis (from those that have been applied in
the past) are required to properly study some aspects of the system. Yet
there are many, many questions about how humans and other animals function for
which these stock-in-trade methods work well, and I strongly disagree with
Rick's assertion that what traditional research methods texts teach must be
seen as total nonsense after one adopts PCT. It is one thing to argue that
the wrong tool has been chosen for the job at hand, but quite another to argue
that the same tool must be used on every job. Even "living control systems"
offer plenty of "jobs" for which those other tools are well suited.

Regards,

Bruce

Tom Bourbon [950104.0950]

Just beginning to catch up after the holidays.

[From Bruce Abbott (950102.1615 EST)]

A quick question, Bruce. Have you read Phil Runkel's book, _Casting Nets
and Testing Specimens: Two Grand Methods of Psychology_?

Another quick (double) question. When you signed on to csg-l, you said you
were planning a new edition of your book on methods -- on all of the right
tools. Have you decided to include anything on PCT? If so, have you
decided how you will present the PCT material, relative to all of the
traditional tools? If so, how? (I guess this was a quick *triple*
question.)
. . .

Bruce was replying to Bill P:

Bill Powers (941231.1235 MST)

who had replied to a post by Bruce:

Bruce Abbott (941230.1700 EST)

Bruce had described the movements of the control surfaces on a model
airplane "in response to" his manipulating the joystick that sets the
reference signal for the device that "controls" the control surfaces. Bruce
had intended this as an example of how an experimenter can manipulate an
independent variable (IV -- joystick position), observe a dependent variable
(DV -- position of the control surface), and learn a lot about the system
(the servo inside the airplane? the pilot-joystick-servo-surface system?).
In subsequent posts, Rick and Bill have dealt with the fact that the example
was about what happens when a pilot changes the reference signal of the
servo, and that it was not about IV-DV relationships, as traditionally
defined. The pilot in the example does not play a role analogous to that of
the experimenter in behavioral research: experimenters cannot manipulate a
joystick to directly, continuously adjust the reference signals inside their
living participants.

Bruce had said:

Isn't it interesting how much I was able to learn about the system
using that outdated methodology described in my text?

Bill:

You could learn about observable relationships, but the standard
statistical methods can't tell you that your underlying model is right.

In fact, using those traditional ("right") tools, you could learn about an
infinite number of observable relationships. Enough to keep you publishing,
famous, and employed for a professional lifetime. It is almost impossible
to fail when you play the IV-DV game. For one thing, it's very hard --
nearly impossible -- to fail to find a statistically significant difference
between two or more sets of scores, given a sufficiently large sample size.
What's more, you can always find new IVs, previously untested combinations
of IVs, and magnitudes of each IV that will fill in gaps or extend the range
of the magnitudes previously tested. The journals become full; new journals
are created; behavioral science flourishes. Of course, there is at least one
problem in that traditional (right) scenario: no one learns anything about
the phenomenon of control, or about how living things might create that
phenomenon. One of the defining facts of life goes unnoticed and
unexplained. Other than that, everything is fine. :-)
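
Tom's point about sample size is simple arithmetic. The numbers below (a half-point mean difference against a standard deviation of 15) are invented for illustration:

```python
import math

def t_statistic(mean_diff, sd, n):
    # Two-sample t with equal group sizes and equal SDs:
    # t = diff / (sd * sqrt(2/n)), so t grows with sqrt(n).
    return mean_diff / (sd * math.sqrt(2.0 / n))

diff, sd = 0.5, 15.0   # a trivially small difference (e.g., half an IQ point)
for n in (100, 10_000, 1_000_000):
    t = t_statistic(diff, sd, n)
    print(n, round(t, 2), "significant" if abs(t) > 1.96 else "n.s.")
```

With n = 100 per group the difference is nowhere near significance (t about 0.24); with a million per group the same trivial difference yields t about 23.6, which is exactly why a "significant" result by itself says nothing about whether an effect matters.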

I don't believe that the statistical approach would have suggested that
the control surfaces would push back against your attempts to deflect
them.

Perhaps not, but that fact is easy to demonstrate empirically using ordinary
"IV-DV" methods. I could push or pull the control surfaces with various
amounts of force (IV) and measure the counterforces generated by the servo
(DV). In fact, this is all that "the Test" amounts to, isn't it, i.e.,
applying a disturbance to some variable and observing the system's response?

Bill (and Rick?) already replied to your portrayal of "the Test." Enough
said on that.

Bill:

Anyway, the correlations you would get in this experiment would be in
the high nineties, the kind we're interested in in PCT. I don't know
your opinion on the subject of correlations in the eighties or lower, or
on the subject of using population measures to predict the behavior of
individuals.

Bruce:

In my methods text I recount how, when inferential statistical tests were
first making their way into experimental psychology, journals sometimes
published research articles that contained elaborate ANOVA tables with all the
sources of variance, sums of squares, degrees of freedom, and so on, but which
failed to contain any mention of the actual data--not even the means were
presented. The story is a warning against believing that the results of the
inferential analysis are the important result of the research.

Good! But you shouldn't speak of that practice in the past tense. I am
putting together a book about some of the horrors that are wrought when
behavioral and life scientists abuse their experimental designs, their
statistics, and their participants. Examples of the practice you describe
here are among the easiest abuses to find in the *contemporary* literature.
I have a drawer full of them.

I also warn
students that relationships shown by group averages may bear little
resemblance to those shown by individual subjects (Sidman, 1960), and devote a
whole chapter to single-subject methods.

Great! If only more authors would do that. (Have you read Runkel? I
think you would like him.) When I was teaching courses on human perception,
at the start of each semester my students read a nifty little article by
Baloff & Becker (1967), "On the futility of aggregating individual learning
curves." (Not one animal has a learning curve like the aggregated curve, but
guess what shows up in the journals and textbooks.) For the remainder of
the semester, every time they ran an experiment and collected psychophysical
or perceptual data, they would calculate the points for the class aggregate
curve or function. We would post the average curve in the center of a big
bulletin board, surrounded by 25-50 sets of data on individuals, not one of
which looked just like the average. Of course, they could also compare our
wall-o-data with the functions and discussions in their textbook and in
journal articles. It got to where each year a few bright psychology majors
changed to some other major, after a semester of seeing the discrepancies.
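
The wall-o-data lesson can be reproduced in a few lines. The all-or-none "insight" curves below are an idealization invented for this sketch, not Baloff and Becker's data:

```python
# Each simulated subject performs at 0% until an abrupt switch trial, then
# at 100%.  The subjects differ only in when the switch occurs.
def step_curve(switch_trial, n_trials=20):
    return [0.0 if t < switch_trial else 1.0 for t in range(n_trials)]

subjects = [step_curve(s) for s in range(2, 18)]   # 16 simulated subjects
average = [sum(col) / len(col) for col in zip(*subjects)]
# `average` climbs smoothly from 0.0 to 1.0 across trials -- a gradual
# "learning curve" that not one of the 16 individuals ever produced.
```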

My view of research methods and of statistics is that they are tools. As with
all tools, when used correctly for the purpose for which they were designed,
they can be helpful. When used incorrectly, for the wrong job, they can be
worse than useless. I would not argue that you remove the screwdriver from
your tool-kit just because someone might try to use it as a chisel.

Agreed. If procedures that are suited for polling data are used in
properly-run polling studies, that's great. But when someone uses those
methods (often called the "traditional" experimental methods in behavioral
science) to study *individuals*, that's not great; that's an abuse of the
worst order. I say that because the people who use polling methods (methods
suitable for *groups* of people) to learn something about *individuals*
often believe they learned something about individuals. They didn't. The
consumers of research results (therapists, parents, counselors, consultants,
policy makers, etc.) often believe they can trust those studies as guides to
what people will do, or why they will do it. They shouldn't.
. . .

Bill:

Anyway, the correlations you would get in this experiment would be in
the high nineties, the kind we're interested in in PCT. I don't know
your opinion on the subject of correlations in the eighties or lower...

Bruce:

It depends on the context. As a student of the experimental analysis of
behavior, I was taught that sloppy relationships indicate a lack of sufficient
experimental control; that such results should precipitate an investigation
aimed at identifying the uncontrolled sources of variation and bringing them
under experimental control, and that such a search often identifies important
variables which themselves may become the subject of future research.

Sounds familiar. That's the way we were taught. That's what we taught to
the generation(s) behind us. And "that" is what is wrong, so often. Notice
the unspoken reliance on a lineal model of causality, in those remarks:
behavior, the DV, is a function of very many (an infinity of) antecedent
environmental events (sources of variation; IVs). If my data are sloppy,
it means I haven't identified and controlled all of the IVs that control
behavior. If I keep at it, if I'm a diligent little scientist, I'll be
rewarded by identifying the immense matrix of antecedent IVs responsible
for every slightly different value of the DV. That's the way we account
for data that once were sloppy. The problem is, living things don't work
like that; they control, all of the time. The fact of control by organisms
doesn't show up in the final, grand IV-DV matrix.

The
high correlations achieved between disturbance and response in the PCT model
are impressive precisely because they demonstrate this kind of experimental
"control" (in the EAB sense). In this context, one would want the
correlations to be very high--as close to 1.0 as possible.

Say again? Does "good experimental control" also explain the same high and
negative correlations when we observe them in the parts of nature that
happen to be outside the lab, for example, the high and negative correlations
one could find between the "responses" of a driver and the net disturbances
to the position of the automobile on the road? Those are the kinds of
correlations you will find nearly every time, if you have identified a
reasonably well-controlled variable. Those are the kinds of correlations
that probably define life; they are not flukes of good experimental design
in the laboratory.
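
Tom's driving example can be simulated directly. The control loop below is a minimal sketch with invented gains and noise levels, not a model taken from any of the posts:

```python
import math
import random

def pearson_r(xs, ys):
    # Hand-rolled Pearson correlation for this sketch.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(1)
o, d = 0.0, 0.0                  # system output, environmental disturbance
outputs, disturbances = [], []
for _ in range(2000):
    d += random.gauss(0, 0.1)    # disturbance drifts like wind and curves
    cv = o + d                   # controlled variable (e.g., car position)
    o += 0.5 * (0.0 - cv)        # act to bring cv back to the reference, 0
    outputs.append(o)
    disturbances.append(d)

r = pearson_r(disturbances, outputs)   # strongly negative, near -1.0
```

The near -1.0 correlation falls out of the closed loop itself; no experimenter is holding any IV constant.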

In other contexts,
moderate correlations may be useful.

But not very useful, at least not if the uses to which they are put are
significant in the lives of innocent people.

. . .

there are many, many questions about how humans and other animals function for
which these stock-in-trade methods work well, and I strongly disagree with
Rick's assertion that what traditional research methods texts teach must be
seen as total nonsense after one adopts PCT. It is one thing to argue that
the wrong tool has been chosen for the job at hand, but quite another to argue
that the same tool must be used on every job. Even "living control systems"
offer plenty of "jobs" for which those other tools are well suited.

Yes, *if* we want to know which proportions of a particular species of
living control systems will do X, Y or Z, under conditions A, B, or C, then
group methods are the *only* way to go. But we had better not, on pain of
ranting by Marken and Bourbon, say we learned anything specific about any
individual, or about individuals in general -- remember the aggregated
curves and keep them holy. If we want to know about individuals, or about a
particular individual, we must study individuals, no two ways about it. ;-)

Later,

Tom