Statistics and psychotherapy

[From Bill Powers (931102.0730 MST)]

David Goldstein (931102.0630) --

Yes, I do give you a hard time about statistical data. This
doesn't mean that I can't recognize good data:

Thatcher (1989) was able to correctly identify 94.8% of
the cases with respect to the question: Did this person suffer
from a mild head trauma?

That's an impressive result. I automatically ask, however (as I
would ask of anyone's results, including my own) how good the
data are on both sides of the comparison. I have no doubt that
Thatcher can accurately identify the condition of elevated EEG
activity. How well could he identify "mild head trauma"?
Presumably, he had some independent way of determining whether
such a head trauma had actually occurred, but what were the
criteria? You stated no time limit; there is hardly a person
alive who has not had a bump on the head in the last five years,
and maybe even in the last year or two. If Thatcher found a
correlation with verified moderate head injuries within the
previous week or month, that would be one thing. If he merely
asked "Have you ever bumped your head fairly severely?" that
would be another.

Are you satisfied that Thatcher actually identified mild head
trauma within a significant time frame, and that his results
suffer neither from a false-positive nor a false-negative defect?
If you are satisfied, then so am I. I trust your ability to
evaluate statistical manipulations when you put your mind to it.
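The false-positive/false-negative worry can be made concrete with Bayes' rule. Here is a minimal sketch, assuming (purely for illustration; the report's actual error breakdown isn't given) that the 94.8% figure holds as both sensitivity and specificity, and trying two hypothetical base rates of genuine head trauma in the sample:

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Fraction of positive classifications that are true positives (Bayes' rule)."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# Same nominally 94.8%-accurate test, applied where true trauma is
# common (50%) versus rare (5%). Illustrative numbers, not Thatcher's.
for base_rate in (0.5, 0.05):
    ppv = positive_predictive_value(0.948, 0.948, base_rate)
    print(f"base rate {base_rate:.0%}: positive predictive value = {ppv:.1%}")
```

With a 50% base rate the positive predictive value matches the headline figure, but at a 5% base rate roughly half of the positive identifications are false alarms -- which is why the criterion for "mild head trauma" (a verified recent injury versus "ever bumped your head") matters so much.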

All this, however, doesn't provide a reply to my question about
the relationship between depression/elation and elevated activity
in the left and right forebrain. Citing one good experimental
result doesn't make up for a different bad experimental result. I
think you would agree that Thatcher's results are outstanding,
meaning that they are exceptionally good. If his results hold up,
they show that in at least some circumstances, statistical
results can be scientifically believable. But they don't show
that all statistical results are believable. To see whether they
are believable, you have to ask the right questions and get the
right answers for each purported fact.

Sonders, a psychiatrist, reports that 80% of the people who
were given EEG Neurofeedback based on these generalizations
[about left/right excitation and depression] experienced
meaningful reductions of depression symptoms.

How many people who did NOT have the hypothesized imbalance of
excitation but WERE depressed were helped by the same EEG
neurofeedback? If that was also 80%, then the generalization is
useless even if the treatment works. And among people who had the
imbalance but were given random feedback, what percentage had
their depression alleviated? I realize that this condition would
probably raise ethical objections (withholding a treatment that
might help merely to obtain a control population). But in its
absence, one would not be able to say whether the treatment was
actually effective for the assumed reason. Perhaps the treatment
helps 80% of depressed people whether they show any right-left
imbalance or not. Perhaps being in that situation is helpful,
whether systematic feedback is created or not.
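The two control questions above reduce to simple comparisons of improvement rates. A sketch, in which only the 80% treated figure comes from the quoted report and every other number is invented to illustrate the logic:

```python
def risk_difference(rate_treated, rate_control):
    """Improvement attributable to the treatment beyond the control condition."""
    return rate_treated - rate_control

# Question 1: does the left/right-imbalance generalization add anything?
# Suppose depressed people WITHOUT the imbalance also improve 80% of the time.
with_imbalance, without_imbalance = 0.80, 0.80
print(f"predictive value of the imbalance: "
      f"{risk_difference(with_imbalance, without_imbalance):+.0%}")

# Question 2: does systematic feedback itself matter?
# Suppose random (sham) feedback helps 75% of the time.
real_feedback, random_feedback = 0.80, 0.75
print(f"effect of systematic feedback: "
      f"{risk_difference(real_feedback, random_feedback):+.0%}")
```

If the first difference is zero, the generalization is useless even though the treatment "works"; if the second is small, most of the benefit comes from the situation rather than the feedback.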

I realize that my skepticism sounds like your father downgrading
your accomplishments; having experienced a good bit of that
myself, I sympathize. But one has to get beyond these
associations if one hopes to make sound evaluations. The question
here is not whether David Goldstein is an effective therapist (I
am sure, from talking with you for something close to twenty
years, that you are among the most effective ones). It is whether
the statistical methods you are using are actually helping you in
your work -- whether you are a good psychotherapist because of
using statistical aids, or in spite of them.

I am much more sympathetic with your plight as a clinician than I
used to be. When a person is desperate for your help and you have
no clear and certain solution to the problem, all you can do is
try whatever seems to work, without getting into abstract
scientific philosophy. I hope I have conveyed to you that I
consider that approach not only reasonable but necessary. Your
willingness to tackle any problem that is presented, even when
all you have to go on is caring deeply for your patients, is one
of your greatest strengths.

But that only defines the status quo; it doesn't mean that we
have to be content with it. The point here is that the better
your tools are, the better your results will be both for you and
for the individuals you treat. You have already done considerable
work on sharpening your tools; simply by deriving self-evaluation
adjective lists from your patients' own experiences and language,
rather than from the ambiguous random lists customarily used, you
have made your preliminary evaluations more immediately relevant
to each patient. By using principles of PCT, you have created
self-evaluation methods that go quickly to the real problems that
your patients have.

Why object, then, to a critical and realistic examination of ALL
the standard approaches to human behavior? If one of your tools
is rusty and dull, why not replace it with a better one? At the
very least, shouldn't you discard "facts" that have less than a
50% chance of being true of any one individual who comes to you
for help? Psychotherapy is, or ought to be, intensely concerned
with the welfare of each individual, not simply with achieving a
statistical track record that's slightly better than chance over
the long run. Should you prescribe an expensive and prolonged
treatment when it has a 95% chance of helping 10% of the people
presenting the same symptoms, and a 95% chance of being
psychologically and financially burdensome to 100% of the
patients? Should you act on the basis of a test result for a
given patient when it is known to have less than an even chance
of being correct for a given individual?
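The arithmetic behind the hypothetical in the preceding paragraph is worth spelling out. Reading the figures as stated there (a 95% chance the treatment is effective at all, helping 10% of presenting patients when it is, and a 95% chance of burden for every patient):

```python
p_treatment_works = 0.95   # chance the treatment is effective at all
helped_if_works = 0.10     # fraction of presenting patients it then helps
p_burden = 0.95            # chance of financial/psychological burden, per patient

expected_helped = p_treatment_works * helped_if_works
print(f"expected fraction of patients helped:   {expected_helped:.1%}")
print(f"expected fraction of patients burdened: {p_burden:.0%}")
```

Roughly ten patients bear the burden for every one who is helped -- the individual-welfare question the long-run track record conceals.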

I'm realistic enough not to demand 0.95 correlations in the field
where you work. Some day, yes. Now, no. All I really hope for is
to generate serious dissatisfaction with correlations less than
0.6, so much dissatisfaction that they are not taken seriously
any more. That will lead to more attempts to sharpen the tools --
after which I will raise the ante to 0.7, and so on until we get
into a useful range. Of course not many people read the papers
every day to see what my latest criterion is, but some people
care, and some of them will make the attempt. All it takes,
really, is to stop accepting the current state of affairs as the
best that can be done. That kind of acceptance is like saying
that human behavior is just too variable to understand, which is
enough to prevent all attempts at improvement.
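What those correlation thresholds mean for any one individual can be computed. A sketch, using the standard bivariate-normal result that the probability of predicting which of two randomly chosen people scores higher on the outcome, from their predictor scores, is 1/2 + arcsin(r)/pi (the assumption of joint normality is mine, for illustration):

```python
import math

for r in (0.6, 0.7, 0.95):
    variance_explained = r * r
    # P(two random people are ordered the same on outcome as on predictor),
    # for jointly normal variables with correlation r.
    pairwise_accuracy = 0.5 + math.asin(r) / math.pi
    print(f"r = {r}: variance explained = {variance_explained:.0%}, "
          f"pairwise prediction accuracy = {pairwise_accuracy:.0%}")
```

At r = 0.6 the predictor explains only 36% of the variance and gets the individual comparison right about 70% of the time, not far from a coin flip -- which is the sense in which such correlations fail the single patient even while they look respectable in the long run.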



Bill P.