[From Jeff Vancouver (2007.08.09.1215 EST)]
> Bill Powers (2007.08.09.0150 MDT)--
>> Rick Marken (2007.08.08.1900) --
>> If you look at the article you will find that this is not true for
>> every mouse; this is an average result being pitched as something that
>> is true of every individual mouse.
I skimmed this article. Given the discussion, I found it interesting that in
the results section the authors were careful to say group instead of mice,
while in the discussion they referred to mice. At the same time, they often
said the results "suggested" this or that. I implore my students not to say
that results suggest something; only people do. Nonetheless, I see this
grammatical error all the time. I know what the authors really mean, and I
doubt it really leads to misinterpretation. That is, it is probably no big
deal. What is my point? The argument is about what interpretations
researchers and readers are making of data and analyses. Thus, it seems the
question is not how you and I would interpret the text, because I interpret
the researchers as being largely aware of what the data could mean (i.e., a
statistically significant mean difference does not mean that all the mice in
one group differed from all the mice in the other), whereas Rick and Bill
seem to interpret the opposite. That is, the question at hand is how these
statistics are interpreted by those in the target population (e.g.,
psychological researchers). That requires asking the interpreters (not just
us). To study that, I might suggest constructing a questionnaire that asks
questions about an article that differentiate these two types of
interpretations (or other types or degrees of interpretational issues). For
example, we might ask a reader of the paper in question: which is the most
accurate interpretation of the results related to Figure 4, panel D?
a. All the young mice in the toy condition swam slower than all the other
young mice.
b. On average, the young mice in the toy condition swam slower than the
other young mice.
Or
True/False
___ a. No young mouse in the toy condition swam faster than any young mouse
in the control condition.
___ b. Some young mouse in the toy condition might have swum faster than some
young mouse in the control condition.
We would have to agree on the questions and what would be most diagnostic,
but otherwise we are all just extrapolating our belief about how others
would interpret these reports. It is my contention that most (but not all, of
course) researchers know better. But that empirical question cannot be
answered by reading what they say, because shortcuts (like saying "the data
suggest") have permeated scientific writing.
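The interpretational distinction above can be illustrated with a small
simulation (invented numbers, standard library only, not the article's data):
two groups whose means clearly differ can still overlap so much that the
group-level result does not hold for every individual comparison.

```python
import random
import statistics

random.seed(1)

# Hypothetical swim times (seconds) for two conditions; the numbers
# are made up for illustration, not taken from the article.
toy = [random.gauss(30, 5) for _ in range(50)]      # "toy" condition
control = [random.gauss(25, 5) for _ in range(50)]  # control condition

mean_diff = statistics.mean(toy) - statistics.mean(control)

# Individuals that run against the group-level pattern: control mice
# that were *slower* (higher time) than the fastest toy-condition mouse.
exceptions = sum(1 for c in control if c > min(toy))

print(f"mean difference: {mean_diff:.1f} s")
print(f"control mice slower than the fastest toy mouse: {exceptions}")
```

A sizable mean difference coexists with many individual exceptions, which is
exactly why interpretation (a) above claims more than the data warrant.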
> Actually, I see no reason to suppose that their conclusions are true of ANY
> mouse in the study. The problem is not just the statistics (presented in as
> uncommunicative a way as humanly possible), but the leaps of interpretation
> about the meaning of the various treatments to a mouse.
Hey, I'm just trying to deal with one problem at a time;-)
The leap you refer to is the construct validity problem. That is, we
psychologists are often dealing with the problem of indirect measures or
manipulations of the underlying latent constructs we hypothesize are
involved. Indeed, when one infers the reference level from the TCV, one is
indirectly measuring that parameter. Otherwise, one gets dangerously close
to radical behaviorism, where one is restricted to a science of only the
observable. Now that that perspective has been largely rejected, we have to
be concerned that our indirect measures are contaminated or deficient.
Related to the issue above, writing practices differ on this. Some authors
are careful to speak of the variable (measure or manipulation) and not the
construct when discussing a result. Much more common is for researchers to
refer to the construct (as in the article example) and maybe (if a reasonable possibility)
refer to construct validity issues in a limitations section. A lot depends
on reviewers and editors. I just had a paper accepted where in an earlier
version I had devoted a lot of discussion to the possible limitations of my
measures and manipulations (largely because reviewers had asked for it).
Yet, for this last journal (versions of it had been rejected from 4 journals
previously), the editor and reviewers thought this discussion was
unnecessary and just made the paper longer than necessary. So I deleted it.
I chalk this up to differences in what the reviewers/editors saw as
"reasonable" possibilities. It would not surprise me if someone questioned
the findings of that study by questioning the construct validity of one or
more of the measures/manipulations. That would lead to a study addressing
the issue with their measures and results. So it goes. We are always
questioning each other's variables, interpretations, generalizability, etc.
That is what we do: consider explanations and alternative explanations for
findings, each trying to promote (or undermine) some theory or another.
> Jeff said that there are all kinds of examples of psychological
> research on groups where the researchers make it clear that the
> results are relevant only to the group, not necessarily any individual
> therein.
I did not say it this way. For example, I would say that if one found a mean
difference between groups, then at least one individual in the one group must
have differed from at least one individual in the other group (I think what
you meant to say was not necessarily EVERY individual therein, not ANY).
Note that when Bill says he thinks the mice finding might not refer to ANY
mice, I believe he is talking about the construct issue, not the variable
(group-defining) issue. That is, he is not saying that it is possible that no
young mouse in the toy condition swam slower than any young mouse in any
other condition; he is saying that it is possible that no young mouse who had
a more mentally stimulating environment swam slower, because toys might not
translate into mental stimulation. This latter case is possible. How
reasonable it is would be subject to the question of what other thing in the
mouse/environment system toys might be affecting. That is, the reasonableness
of an interpretation is judged against alternative explanations.
Unfortunately, alternative explanations are limited to what
reviewers/editors/researchers can think of at the time. But what can you do?
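My claim above, that a mean difference entails at least one differing pair of
individuals (but not that every individual differs), is simple logic: if every
member of one group equaled every member of the other, the means would be
identical. A toy check in Python, with made-up scores, makes the point:

```python
def some_pair_differs(a, b):
    # If every member of a equaled every member of b, all values would be
    # identical and the means would be equal; so unequal means imply
    # at least one differing pair.
    return any(x != y for x in a for y in b)

# Made-up scores: the means differ, yet most individuals are identical.
group1 = [3.0, 3.0, 4.0]
group2 = [3.0, 3.0, 3.0]

means_differ = sum(group1) / len(group1) != sum(group2) / len(group2)
print(means_differ, some_pair_differs(group1, group2))  # prints: True True
```

Note that only one pair differs here; nothing follows about the other
individuals, which is the whole point of the distinction.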
> So I went to the psychological research literature and tried
> to find examples of group level data in psychology used to study only
> groups (as is appropriate) or used to study individuals (which is
> not). I quickly found 2 examples of studies where the group level data
> is taken to tell us something about individual level processes. So
> it's now:
>
> Inappropriate use of group data 2
> Appropriate use of group data 0
>
> Maybe Jeff can even out the score or tilt the balance toward
> "appropriate".
See above (i.e., I do not think the data you present are very relevant to the
issue at hand).
Although I am hesitant to do this (because it will mean even more of my time
is spent arguing with you all rather than trying to promote PCT to the
psychological community), I am inclined to submit our paper referenced above
as an example. The curious thing about that paper is that I discuss some of
the issues related to this debate within it. My position was similar to
yours. That is, we needed to look at individual models. I had 2 studies
where we examined these models in some detail. We were trying to compare
four possible empirical models of what an individual was doing. But we
rejected many individuals because they did not act consistently during the
study, so none of the models would have applied. This bothered reviewers.
However, there was one study where we had a between-person manipulation. In
such a case, one cannot remove individual subjects because that would create
a mortality threat to internal validity. Thus, we included all subjects. The
paper that was finally accepted is this one where no subjects were dropped.
The effect we found was smaller (because, as with the other study, some were
not following instructions, not trying to do the task well, or whatever), but
still diagnostic in terms of addressing our research question. Nonetheless, I
think it violates the prescriptions you two are suggesting (we look at
individual empirical [not process] models, but we never present individual
parameters, only average parameters, mostly using analysis and statistics
that the uninitiated will find hard to follow). In other words, you will
find a lot to complain about.
So if you want the paper, please send me your email address (it is
copyrighted to APA, so I cannot post it to the list). It is not yet in proof
stage, so you might not want to print it (it is still in double-spaced form).
Jeff V.