Statistical Analysis

Of course it can be calculated,
and easily, too. Did my detailed tutorial on how to do that [Martin
Taylor 2007.07.17.11.08] simply not arrive in your mailbox? It arrived in
mine, and I’ve made reference to it several times in this discussion. And
yet you persist in saying that nobody wants to touch that, as if it were
some taboo, rather than a trivial operation. Rick pointed out that you
can get your answer from standard tables or the outputs of standard
programs.
[From Bill Powers (2007.07.21.1655 MDT)]

Martin Taylor 2007.07.21.15.36 –

Yes, I got that post, but I must have missed the one in which you
actually did the calculations with some data, like Rick’s spreadsheet
data.

I guess you really
are controlling for a perception of “statistics” with a value
of “useless”, and doing so at a rather higher gain than mere
skepticism would warrant.

Perhaps. I’m trying to learn enough about spreadsheets and trying to find
some real data that I don’t have to pay for and trying to learn the
statistical jargon so I can do these calculations myself. I seem to be
the only one interested in answering my question, but unfortunately I’m
probably the least competent one here to actually do it. So far I haven’t
seen you touching a spreadsheet to show me the trivial operations that
will give me the answers. Not that you have any obligation to do so, but

Best,

Bill P.

[Martin Taylor 2007.07.21.15.52]

[From Bill Powers (2007.07.21.1010 MDT)]

Martin Taylor 2007.07.21.11.08 --

In other words, you would like to get more information about each individual, but would discard the information you are actually able to get, on the grounds that it is obtained from people "like" the person of interest, rather than from that person directly.

No, not on those grounds: on the grounds that it is more likely to be false for a given individual than true.

Perhaps, but less so than if you didn't have the group data.

I would prefer high-quality facts about each individual to low-quality facts, where "quality" is simply the probability that the fact will prove to be true of the individual being studied.

Wouldn't everybody? You aren't unique in wanting to act in a well-known environment rather than in a fog. How to clear the fog is the question.

An example would be the height of the individual. I would prefer actual measures of the individual's height, and discard measures of people "like" that individual in respects other than height.

What if you can't get hold of the person to see whether he would fit in your box? Wouldn't you rather select a pygmy bushman than a Masai warrior if you need a short person and all you know is tribal background?

I would rather know a person's actual grades in math courses than the grade point average of people who had similar SAT scores.

Unfortunately, those scores are in the future at the time you are making your choice. Too bad.

But would the result of interviews establish the "fact" you want to know about the individual (presumably the degree to which they will benefit from the education you offer, or the success of the medical intervention, or ...)?

It would come a lot closer than an SAT score would. As to medical interventions, I would prefer an MRI scan to a survey of people of similar physical traits who show similar symptoms of a brain tumor.

You are switching the situation. And in all cases you mention so far, you are asserting the possibility of getting information from other, and better, sources, which everyone will agree is a good idea. The issue is that you want to NOT get information from one particular kind of source, even when it is available to you, and you seem to be asserting that rejecting the information from group statistics is justified because to use that information is WORSE than making a pure guess.

Let's consider the SAT - GPA scatterplot, which we all acknowledge shows a pretty poor correlation, reliable though it may be. MSAT accounts for only 22% of the variance in GPA. If we define the median GPA as the pass-fail point, 25 will pass and 25 fail. Suppose we choose 25 at random from the 50 candidates in the scatterplot. Different choices of them will result in different numbers of failures, but on average, 12 or 13 will fail. However, if we use the MSAT, and choose those that are above the median in the scatterplot you posted, we find that 16 pass and 9 fail (using the GPA of all of them to obtain the median). Of the ones rejected, 9 would have passed and 16 would have failed.

I think if you were an admissions officer, and GPA were your criterion for success, you would prefer to think that you would have 16 passes and 9 failures rather than 12 or 13 passes and 13 or 12 failures in the 25 you select. It's not a big improvement, but it is an improvement.
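Martin's median-split arithmetic here is easy to check with a small simulation (a sketch added for illustration, not part of the original exchange; it assumes the scatterplot is roughly bivariate normal with r of about 0.47, i.e. r-squared of about 0.22):

```python
import numpy as np

rng = np.random.default_rng(0)

def trial(n=50, r=0.47):
    # Draw n hypothetical (MSAT, GPA) pairs from a bivariate normal
    # with correlation r, so that r**2 is about 0.22.
    cov = [[1.0, r], [r, 1.0]]
    msat, gpa = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    passed = gpa > np.median(gpa)      # top half on GPA counts as a "pass"
    selected = msat > np.median(msat)  # admit the top half on MSAT
    return passed[selected].sum()      # passes among the 25 admitted

mean_passes = np.mean([trial() for _ in range(2000)])
print(mean_passes)  # about 16 of 25, versus 12.5 under random admission
```

The long-run average is about 16 passes among the 25 admitted, matching the 16-pass/9-fail split read off the scatterplot.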

You're offering all the standard defenses of statistics that I've heard all my life. They all sound like attempts to defend doing something that one intends to go on doing no matter what, even if it's wrong.

If you've heard all these so-called "defences" (I'd call them explanations), I'm wondering even more bewilderedly why you are defending so strongly the position that it is wrong to use what information you can when assessing a situation.

Back to Martin:

No, your interview would not establish the "fact" you want to know. All you would do by conducting the interview is to make it more probable that you would select those who would be most likely to benefit (two levels of chance, here).

That would depend on what the interview is about. I would be trying to see how much the person wants to attend college, and why, and how much work the person is willing to contract to do, and whether the person's word is any good.

As I said: "All you would do by conducting the interview is to make it more probable that you would select those who would be most likely to benefit". Are you disagreeing with me? It sounds as though you are just giving examples to support my statement.

Who is there who doesn't deserve a chance to try, even if the end result is not something dazzling? And who is to say that a person who does poorly in high school will not get his act together in college? Not me, for certain! And who has the nerve to tell someone he doesn't deserve an education just because he isn't smart? Who decides what a "benefit" is? Would not a person with a middling to poor intellectual history benefit more than a person who already does intellectual gymnastics with ease?

All most laudable sentiments, with which I think few would disagree. However, admissions officers aren't in the pleasant position of being able to offer places to everyone who wants one. They MUST tell some applicants "Sorry, we don't expect you to benefit as much as these others who we have accepted, and because we accepted them, we don't have room for you."

You say you don't want to be the one to select. Fine. I'd hate the job, too. But if I were in a position where I had to make a choice (about anything), I'd want as much information as I could reasonably get, recognizing that acting to get that information is likely to conflict with the requirement to act to control some other perceptions, and that if I don't make the choice before time T, I might as well not choose at all. I use what I can get, and so do real-life admissions officers.

With a high likelihood of misclassifying the person and wrongly treating the person. How high? That can be calculated, but nobody wants to touch that calculation with a 10-foot yardstick, apparently.

I answered that one a few times already. But you follow the Nelson principle of putting your telescope to your blind eye when there is something you don't want to see.

If you can show me what the test results have to be to lower the probability of misjudging an individual to an acceptable level, I will pay attention.

ANY correlation is better than none. ANY information is better than none. What you need in order to get the probability down to an "acceptable level" depends on the situation. If you have to make your decision NOW, then you guess based on what you know NOW. If you can wait, you can try to get more information. If the only acceptable level is zero error, you have to take an infinite time before you make your choice (Shannon).

Of course we then have to agree on what is acceptable. Richard Kennaway did that analysis once, and the results were so shocking that people with a vested interest in statistics as it is done immediately leaped on him and then abandoned the subject as quickly as possible.

Not my understanding of what happened.

There is always a way to get more information if you really want it.

Aye, there's the rub!

That stops all the discussion, right there. You can always wait, you can always devote more resources, if you really want it. But in the real world what you say simply is not true. You can't always wait, and you don't always have access to more resources. Replace "is always" with "may well be" and I'll agree with you.

Sixty percent correct may not be much better than 50%, but it IS better. Causality doesn't enter into it; the correlation helps you make the choice that is more likely to turn out well.

I think that scheme is a lot like the one so many amateur gamblers come up with. If they just double their bets every time they lose, they will eventually come out ahead.

Nonsense. If you win 60% of the time with an even payoff win or lose, you win big-time by doing that, if you don't lose all your capital early on in the game. The random-walk problem of a long losing run that exhausts your capital very soon has a low probability of occurring within any reasonable finite time.
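Martin's claim about the long losing run can be put in numbers (an illustrative sketch with assumed stakes, not part of the original exchange: flat even-money one-unit bets and a 10-unit starting bankroll, rather than the doubling-up martingale Bill describes):

```python
import random

random.seed(1)

def ruined(capital=10, p_win=0.6, max_bets=2000):
    # Flat even-money betting: win 1 unit with probability p_win, else lose 1.
    for _ in range(max_bets):
        capital += 1 if random.random() < p_win else -1
        if capital == 0:
            return True  # bankroll exhausted
    return False

trials = 2000
ruin_rate = sum(ruined() for _ in range(trials)) / trials
print(ruin_rate)  # near the gambler's-ruin value (0.4/0.6)**10, about 0.017
```

With a 60% edge, the classical ruin probability from a bankroll of k units is (q/p)^k = (2/3)^10, roughly 1.7%, so the player almost always survives the early random walk and then wins steadily.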

And anyway, the choice is not between 50% and 60% right for an individual; it goes the other way for many statistical tests. Look at that paper I did for Hershberger's collection, showing that there was an upward trend in a dataset where the same variables were related in the direction opposite to the trend, for every individual in the set.

I wondered why you hadn't brought up earlier that example of why effects for individuals occasionally don't go the same way as the correlations for the group would suggest. I had thought of mentioning it myself as a reason to be guarded. This is a perfectly valid argument for looking carefully at the implications of any correlation. You could make the same argument if the group correlation is 0.95. The effect still might go the other way within any individual. It ties in with the issue of confusing correlation with causality.
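The effect being conceded here, a group trend opposite in sign to the trend within every individual, is what is now commonly called Simpson's paradox. A toy illustration (made-up numbers, not the Hershberger dataset):

```python
import numpy as np

# Three made-up "individuals", each with a NEGATIVE x-y slope, but offset
# so that the pooled (group) trend comes out POSITIVE.
x = np.concatenate([np.arange(5.0), np.arange(5.0) + 4, np.arange(5.0) + 8])
y = np.concatenate([2 - np.arange(5.0), 8 - np.arange(5.0), 14 - np.arange(5.0)])

pooled_slope = np.polyfit(x, y, 1)[0]
within_slopes = [np.polyfit(x[i:i + 5], y[i:i + 5], 1)[0] for i in (0, 5, 10)]
print(pooled_slope, within_slopes)  # pooled slope > 0; each individual slope = -1
```

Fitting the pooled data gives a clearly positive slope even though the slope within each individual is exactly -1.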

You do not have enough capital to survive being wrong 40% of the time. It wouldn't help the gambler to know that a certain strategy would increase his chances of winning from 50% to 60%.

It would. That's what the house bets on.

But if 40% of my patients die from the medication, there's little solace in knowing that 50% might have died without it. I would go on looking for a better medication, rather than stopping to practice medicine before I knew what I was doing.

Not knowing of a better medication, then, I presume you would withhold the medication and callously accept a 50% death rate, rather than using the medication you have available and reducing the death rate to 40%. I'd call that criminal negligence.

But you do a disservice to the individuals you want to serve if you arbitrarily discard relevant information on the grounds that probabilities are not certainties.

I have just the opposite feeling: I do a disservice if I use a treatment based on bad data on the grounds that on the average it provides an improvement -- but knowing that for any individual, it will most probably be useless or harmful.

"On the average it provides an improvement, but for any individual it will MOST PROBABLY be useless or harmful". Think about that statement for a moment.

I can think of statistical distributions for which this statement could be true, but they are pathological and very, very rare in real-life situations. Why take such situations as the norm, and base your actions on the supposition that they are likely to occur, before you have evidence (easily obtained) as to whether the situation of interest has this pathological form?

I write a lot of words, but the gist is this: some information is better than none, and good information is better than poor. If you have time and resources to get good information on something that matters to you, go for it. If you have to act now, don't, on ideological grounds, discard information that you do have.

Martin

[From Rick Marken (2007.07.21.2320)]

Bill Powers (2007.07.21.1205 MDT) --

Then please explain the quote I put in my next to last post to Martin:

There you say you don't think it is a mistake to use the regression line as
a predictor of individual results.

Poorly worded. I meant that it's not a mistake to predict individual
results when the goal is to achieve a certain group result. For
example, past performance can be used as a basis for making many
individual hiring decisions in order to get the best overall set of
employees for the firm.

You were using the regression line as a
predictor of individual countries' infant mortality rates

I did that only to show you how statisticians currently answer what I
thought was your question: how far off is a regression- based
prediction likely to be for each individual? The spreadsheet shows
that it will be pretty far off (18 on average) even when the
correlation is quite high (.80). I would not use those data to predict
any individual country's infant mortality unless I wanted to achieve a
result at the "group of countries" level (see my "god" example below);
nor would I use it to say that a country's per capita income tells me
something important about the quality of healthcare in that country.
All the spreadsheet analysis shows is that there is a pretty strong
relationship between log per capita income and infant mortality over
this set of countries.

If I were god and I had wanted to cut down the population of my
creation but had only a limited number of counties I could flood to
achieve this end and knew only each country's per capita income, then
I would use this data to determine which countries I should flood to
get the best overall result; I would flood the countries with the
highest per capita income. I'd know that, by doing this, I'd probably
be flooding some countries that actually have a pretty high infant
mortality and not flooding some with a pretty low one. But at the
group level if I know only the per capita income of the countries I
would get the best results, population reduction-wise, by flooding the
high per capita income countries.

Right. And therefore if your aim is to serve or work with individuals, you
should not use group statistics to evaluate the individuals.

Yes. That's a message I am always giving to my students, especially
those who are going into clinical practice.

You should use
modeling and experiments, or the method of levels, or any other way of
testing specimens.

Yes. I was hoping to be able to include this in a course on research
methods. The problem was that it took me all semester to present the
standard material and, more importantly, there isn't much of this kind
of testing-specimens research that has been done. If you and I
are the only people doing the research -- and since I am only modestly
competent at doing it anyway -- then there's a real problem with
developing a curriculum for teaching a model-based science of control
by specimens.

If the admissions officer says "I'm sorry, we're accepting only SAT scores
above 700 this year," and turns an individual away, that officer cares more
for the school's policies than that individual student's education. Is it
rude to say that? Maybe. Is it true? Yes.

I just meant that the officer himself may care for individuals; but
his job is to enforce policies designed to improve the group outcome.
This is quite a different role than that of a teacher or physician,
who deals with individuals (at least I do, as a teacher); if a teacher
or physician deals with an individual based on group data then they
are just doing what I call institutionalized prejudice; and I teach my
clinical students not to do that.

I think we have the heart of my complaint right here. I keep asking "What
are the chances of guessing wrong about an individual," and the answer I
keep getting is, "Well, of course sometimes we will be wrong about an
individual." I want to know what "sometimes" means. If "sometimes" means
2% of them, that's unfortunate but hard to imagine improving. However, if it
means 40% of them, that's an entirely different matter. If it means 75%, we
have a case verging on fraud. So HOW DO WE CALCULATE THE ODDS THAT AN
INDIVIDUAL WILL BE MISCLASSIFIED, MISDIAGNOSED, MISTAKENLY
REJECTED AND SO ON?

And after I find out how, I want to see it actually done, not just
talked about

I'll try to work it out for you in the spreadsheet tomorrow morning.
It doesn't seem like it would be too hard. I think this is what is
done in discriminant analysis (and signal detection theory, for that
matter); depending on the strength of the relationship we should be
able to determine the probability that a person will be misclassified
(False Alarm + Miss probability). I can't do it analytically but I bet
Martin or Richard Kennaway could. What I think you want is a formula
for:

Pr(False Alarm | D) + Pr(Miss | D) = Pr(Misclassification | D) = f(r)

where D is the decision variable, like SAT and r is the correlation
between D and the criterion variable, like GPA.
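For one common special case this f(r) has a closed form. If D and the criterion are bivariate normal and each is cut at its median (as with a pass/fail line at the median GPA), the orthant probability P(both above their medians) = 1/4 + arcsin(r)/(2*pi) gives Pr(Misclassification) = 1/2 - arcsin(r)/pi. A sketch under that assumed model (an illustration added here, not from the original thread):

```python
import math

def p_misclassified(r):
    # Probability that an individual falls on opposite sides of the two
    # medians (selected but fails, or rejected but would have passed),
    # for a bivariate normal (D, criterion) with correlation r.
    return 0.5 - math.asin(r) / math.pi

for r in (0.0, 0.47, 0.8, 0.95):
    print(r, round(p_misclassified(r), 3))  # 0.5, 0.344, 0.205, 0.101
```

At r = 0.8 about 20% of individuals are misclassified, and at r of about 0.47 about 34%, close to the 9-of-25 (36%) failure rate among those admitted in the scatterplot exercise.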

If a student has any choice, my advice would be not to take any of those
tests

I think this Ruby Ridge attitude is a little uncooperative.

I don't get the allusion to Ruby Ridge.

The allusion was to people who don't want to follow agreed-on policies.

If a student is turned down for entrance to a college, that is very serious
for the student, though the college can always find some other student who
tests better. If the chances of being turned down for invalid reasons are
very low, we can just shrug and say "It happens." But if the chances are
very large, that's not an option if we're trying to be fair. So it's
important to find out whether those chances are low or high.

I would hope that the chances of being incorrectly rejected (and
accepted) are lower than they would be if your entry were simply
determined by whether your name was pulled from a hat. But we'll see
what happens with the formula. If

Pr(Misclassification | D) >= Pr(Misclassification)

where Pr(Misclassification) is the probability of misclassifying given
by simply random classification, then I would agree that there is a
serious problem here.

It is, at the group level. But it's possible to improve the group measures
while still doing damage to the majority of people in it.

This seems quite counter intuitive. I can't see how it can be correct,
but then I couldn't see how the solution to the Monty Hall probability
problem could be correct either. So I'm not saying anything until I
write the simulation in the spreadsheet.

I thought that
sacrificing the good of individuals for the improvement of the group is a
bad thing -- isn't that called Fascism? Or do I have it mixed up with
Communism?

If this is what is demonstrably going on when you use a decision
variable like SAT to improve group performance then that would be a
jolly bad thing, no matter what you call it.

Best

Rick


--
Richard S. Marken PhD
Lecturer in Psychology
UCLA
rsmarken@gmail.com

[From Bill Powers (2007.07.22.0440 MDT)]

Martin Taylor (2007.07.21.15.52) –

No, not on those grounds: on the grounds that it is more likely to be false for a given individual than true.

Perhaps, but less so than if you didn’t have the group data.
Not necessarily. Suppose you come up with a treatment that markedly helps
20% of a population, but you don’t know which 20% will be helped, so you
apply it to everyone. Since 20% of the population will in fact be helped,
the group status of the population improves significantly on the relevant
measure. But since 80% of them will not be helped, and will only suffer
the expense and whatever range of side-effects of the treatment there may
be, most of the people in the population will either spend their money
for nothing, or in addition be somewhat harmed in other ways.

As this situation is usually presented, the impression is given that
everyone is helped 20% by such a treatment. “Helps prevent heart
attacks” is the way it’s put. You get the idea that if you take the
pill, you will be at least somewhat protected against a heart attack. But
that’s not how it works. A heart attack is either prevented or it’s not
prevented; you can’t 20% prevent something. You don’t know which people
would have had a heart attack without the treatment, and you don’t know
which 20% of those who didn’t have one didn’t have it because of the
pill. The “helps prevent” concept is a clever advertising scam
that presents group data as if it applies to individuals. Most of medical
advertising works this way, and always with the disclaimer, “Consult
your physician”. That means “We have just told you a lie.”
Insurance advertising also does this: there’s one insurance ad that shows
accidents in reverse, with everything being put back together as if it
had never happened.

I would prefer
high-quality facts about each individual to low-quality facts, where
“quality” is simply the probability that the fact will prove to
be true of the individual being studied.

Wouldn’t everybody? You aren’t unique in wanting to act in a well-known
environment rather than in a fog. How to clear the fog is the
question.

A lot of my discomfort here is not so much the fogginess but the apparent
contentment with the statistical approach. Of course we must use the best
abilities we have to deal with problems, even if they’re pretty feeble.
But enormous investments are made in these feeble measures, just as if
they were powerful and worked well. And when the suggestion is made that
we really need better ways of solving problems, all the defenses go up,
making it even less likely that we will end up with something
better.

I have noticed that in the last couple of years, something called
“systems biology” has finally appeared on the scene. They are
finally starting to ask how this system works, not one little local
chemical reaction at a time or even less specifically, but globally and
in detail. This is going to be so much more effective than the
traditional statistical approach that I predict the demise of statistics
for all uses but those specifically concerned with group
phenomena.

You are switching
the situation. And in all cases you mention so far, you are asserting the
possibility of getting information from other, and better sources, which
everyone will agree is a good idea. The issue is that you want to NOT get
information from one particular kind of source, even when it is available
to you, and you seem to be asserting that rejecting the information from
group statistics is justified because to use that information is WORSE
than making a pure guess.

Maybe in the light of what I said above my reactions are a little clearer
now. It’s not that I think we should stop using these small chances of
helping; it’s that I don’t think we should defend that as a good thing,
to the point where we spend all our research money on it instead of the
systems approach which is infinitely more likely to give us real
solutions. Why not just admit that statistics is a dull tool that we use
when we don’t understand how something works? When something goes wrong
with a system that we understand, we don’t just bang on it and shake it
and drip things into it because sometimes doing that has helped in the
past. We open it up and fix it, and we know it’s fixed because we found
what the problem was. That approach is just in a different universe from
the statistical approach.

You’re offering all
the standard defenses of statistics that I’ve heard all my life. They all
sound like attempts to defend doing something that one intends to go on
doing no matter what, even if it’s wrong.

If you’ve heard all these so-called “defences” (I’d call them
explanations), I’m wondering even more bewilderedly why you are defending
so strongly the position that it is wrong to use what information you can
when assessing a situation.

Maybe you understand what I’m saying now that I’m saying it better. You
can keep accumulating better statistical information until you are
stuffed full of it, but that way of solving problems will never approach
the success rate that comes from understanding the system. You’re
assuming that we will be stuck with the statistical method forever, so
all we can hope for is to improve the probabilities. But that’s not true.
When your car stops, and you do a little investigation that shows you
have run out of gas, you fill the tank and drive on. You don’t review all
the past cases in which the car stopped, and try to figure out which of
the remedies that worked in the past is the most likely to work now. By
the time that method has brought you around to filling the tank, you
would have been on your way a week ago by the systems method.

As I said:
“All you would do by conducting the interview is to make it more
probable that you would select those who would be most likely to
benefit”. Are you disagreeing with me? It sounds as though you are
just giving examples to support my statement.

This question presupposes that the statistical approach is all we will
ever have to go on, and that there is no way of understanding why one
person does well in school and another doesn’t, as if the universe is run
by chance instead of regular relationships. If a student says “I’m
applying because my parents want me to go to college, but I don’t give a
damn about going to college and would rather do something else
instead,” the admissions officer has a rather easy decision to make
that doesn’t depend on probabilities at all, at least not the sort of low
probabilities he’s used to. The more we understand about intentions and
desires and the structure of the control hierarchy, the less we will have
to rely on uncertain measures to make decisions.

Who is there
who doesn’t deserve a chance to try, even if the end result is not
something dazzling? And who is to say that a person who does poorly in
high school will not get his act together in college? Not me, for
certain! And who has the nerve to tell someone he doesn’t deserve an
education just because he isn’t smart? Who decides what a
“benefit” is? Would not a person with a middling to poor
intellectual history benefit more than a person who already does
intellectual gymnastics with ease?

All most laudable sentiments, with which I think few would disagree.
However, admissions officers aren’t in the pleasant position of being
able to offer places to everyone who wants one.

That’s because we don’t know how to teach, either. If we did, we would
simply take everyone who applied, and if the school was full, we’d build
another one.

They MUST
tell some applicants “Sorry, we don’t expect you to benefit as much
as these others who we have accepted, and because we accepted them, we
don’t have room for you.”

That’s not because of some kind of natural law; it’s because of
deficiencies in our understanding. Is there some good reason why everyone
who wants to get a college education shouldn’t be able to get
one?

You say you don’t
want to be the one to select.

I didn’t say that. I described what my criteria would be. If I believed
that the applicant really wanted a college education, I’d say “Fine,
you’re in.” That may explain why nobody has hired me to be an
admissions officer.

With a high
likelihood of misclassifying the person and wrongly treating the person.
How high? That can be calculated, but nobody wants to touch that
calculation with a 10-foot yardstick, apparently.

I answered that one a few times already.

No you didn’t. You described how, in principle, it would be done by
someone with the necessary skills, but you didn’t actually do it. The
only ones trying to get to the point of actually doing it are Rick and
I – Rick promises to start in the morning.

But you
follow the Nelson principle of putting your telescope to your blind eye
when there is something you don’t want to see.

I’m not looking where your signal flag is because you’re sending an
irrelevant message. I don’t want to be told the method, I want to be told
the results of applying it. I’m having to fumble my way toward learning
the method, but if it’s as trivially easy as you claim, why aren’t you
the one doing it?

If you can
show me what the test results have to be to lower the probability of
misjudging an individual to an acceptable level, I will pay
attention.

ANY correlation is better than none. ANY information is better than none.
What you need in order to get the probability down to an “acceptable
level” depends on the situation. If you have to make your decision
NOW, then you guess based on what you know NOW. If you can wait, you can
try to get more information. If the only acceptable level is zero error,
you have to take an infinite time before you make your choice
(Shannon).

I can’t think of a better description of the weaknesses of the
statistical approach. That’s exactly why I take my car to a repair shop
instead of a statistician. The mechanic may not have a PhD, but he
understands cars.

You do not have
enough capital to survive being wrong 40% of the time. It wouldn’t help
the gambler to know that a certain strategy would increase his chances of
winning from 50% to 60%.

It would. That’s what the house bets on.

No. Casinos work on a statistical margin of around 3%. That’s how you
avoid killing the goose that’s laying the golden bets.

Not knowing of a
better medication, then, I presume you would withhold the medication and
callously accept a 50% death rate, rather than using the medication you
have available and reducing the death rate to 40%. I’d call that criminal
negligence.

I would too. I think you know now that this is not the course I
advocate.

“On the
average it provides an improvement, but for any individual it will MOST
PROBABLY be useless or harmful”. Think about that statement for a
moment.

I think I explained it in the opening part of this post.

I can think of
statistical distributions for which this statement could be true, but
they are pathological and very, very rare in real-life situations. Why
take such situations as the norm, and base your actions on the
supposition that they are likely to occur, before you have evidence
(easily obtained) as to whether the situation of interest has this
pathological form?

Perhaps now you can see that this statement is a lot more likely to be
true than you claim.

I write a lot of
words, but the gist is this: some information is better than none, and
good information is better than poor. If you have time and resources to
get good information on something that matters to you, go for it. If you
have to act now, don’t, on ideological grounds, discard information that
you do have.

I quote Monty Python: And now for something completely different:
understanding.

Best,

Bill P.

[From Bill Powers (2007.07.22.0610 MDT)]

Rick Marken (2007.07.21.2320) –

Attached is the IM spreadsheet. I changed the chart to a log plot, which
shows that if we do log transformations we should get an even higher
correlation. 0.9 might be possible. I think this illustrates the idea
that with high enough correlations at the group level, it is possible to
make some modestly accurate predictions at the individual level.
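The gain from the log transform can be illustrated with stand-in numbers (a sketch using hypothetical data generated from a power law with noise, not the actual figures in the attached spreadsheet):

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in data: infant mortality falling as a power law of per-capita
# income, with multiplicative noise (NOT the real country figures).
income = rng.uniform(500, 40_000, size=50)
mortality = 2e4 * income ** -0.8 * rng.lognormal(0.0, 0.3, size=50)

r_raw = np.corrcoef(income, mortality)[0, 1]
r_log = np.corrcoef(np.log(income), np.log(mortality))[0, 1]
print(round(r_raw, 2), round(r_log, 2))  # the log-log correlation is stronger
```

A power-law relation is linear in log-log coordinates, so the correlation measured there runs higher in magnitude than on the raw scale, which is the sense in which 0.9 "might be possible".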

Best,

Bill P.

imlogplot.xls (42 KB)

[Martin Taylor 2007.07.22.10.21]

[From Bill Powers (2007.07.22.0440 MDT)]

If I understand your basic point, it is that money spent on statistical surveys would be better spent on discovery of mechanisms. However, your previous messages have argued that one should not use the results of statistical surveys when considering actions relating to individual members of the classes surveyed. It's that "Do not use information available to you" stance to which I object.

Discovery of mechanism is obviously better than blind statistical survey, but survey frequently precedes mechanism. How, other than by statistical survey, would the source of cholera in the Broad Street pump have been found, and how would we, from that discovery, have learned that cholera is a water-borne disease?

Practitioners don't have the luxury of researching mechanism. They have to use what information is available to them. If that is limited to statistical data, so be it. That's what they must use, no matter how much they might prefer a more ideal understanding of what is going on.

I'm not going to try to answer each point in your message, as these interchanges tend to expand unnecessarily. But I am commenting on perhaps more points than necessary, below. The gist is above.

Martin Taylor (2007.07.21.15.52) --

No, not on those grounds: on the grounds that it is more likely to be false for a given individual than true.

Perhaps, but less so than if you didn't have the group data.

Not necessarily. Suppose you come up with a treatment that markedly helps 20% of a population, but you don't know which 20% will be helped, so you apply it to everyone. Since 20% of the population will in fact be helped, the group status of the population improves significantly on the relevant measure. But since 80% of them will not be helped, and will only suffer the expense and whatever range of side-effects of the treatment there may be, most of the people in the population will either spend their money for nothing, or in addition be somewhat harmed in other ways.

As I said yesterday:

I can think of statistical distributions for which this statement could be true, but they are pathological and very, very rare in real-life situations. Why take such situations as the norm, and base your actions on the supposition that they are likely to occur, before you have evidence (easily obtained) as to whether the situation of interest has this pathological form?

Perhaps now you can see that this statement is a lot more likely to be true than you claim.

I don't agree, but rather than starting a "Yes it is" "No it isn't" conflicted dialogue, let's try to see why.

You postulate the case in which statistical data show that 20% of the people will be markedly helped, but say nothing about the other 80%. That's a very strange survey. Let's instead postulate a survey in which the fate of the other 80% is in fact known, and let's also postulate that the results for this 80% fall moderately evenly around small benefit and small harm, with some known percentage showing nasty side-effects. I think that's a more realistic case.

Now what is a doctor to do when confronted with a patient showing the condition? Surely it depends on how ethical that doctor is. I suspect most doctors would say "You have a 1 in 5 chance of getting better if you take this drug, but there's a 5% chance it will make you appreciably worse. Do you want to take it?"

What would a medical researcher be likely to do? The statistical data indicate that there is an important difference between one sub-population and the rest of the people. The very first thing would be obvious: see whether any characteristic property of the people who benefit, as compared to those who don't, was already noted in the data collection: say almost all of those who benefit have names starting with letters S or later in the alphabet. Of those people 95% get the big benefit, and of the others only 3% do.

Now, knowing that, what does the ethical doctor do? I leave that as an exercise for the reader, and postulate that the doctor does know the name of the patient.

What does the medical researcher do? Knowing full well that correlation does not imply causality, I suggest that the next step is clearly to look and see if there is anything that affects people with late alphabetic names differently than it affects people with earlier alphabetic names. In other words, the strange data revealed in the initial statistical survey lead to clues about mechanism. (Actually, I'm thinking of stuff like sickle-cell anemia and some of the more recent linkages of drug response with genetic factors).
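
Martin's figures here are explicitly hypothetical, but they do hang together arithmetically. A quick sketch (all numbers are his hypotheticals, not real data): if 95% of the late-alphabet group benefit and 3% of the rest do, the overall 20% benefit rate pins down how large the late-alphabet group must be:

```python
# Consistency check on Martin's hypothetical numbers: 95% of the
# "S-or-later" group benefit, 3% of the rest do, and the overall
# benefit rate is 20%.  What fraction p of patients must be in the
# S-or-later group for all three figures to hold at once?
#   0.95*p + 0.03*(1 - p) = 0.20  =>  p = (0.20 - 0.03) / (0.95 - 0.03)
p = (0.20 - 0.03) / (0.95 - 0.03)
print(f"implied S-or-later fraction: {p:.3f}")         # ~0.185

# Bayes in the other direction: given that a patient benefited,
# how likely is a late-alphabet name?
posterior = 0.95 * p / 0.20
print(f"P(S-or-later | benefited):  {posterior:.3f}")  # ~0.878
```

So the scenario is internally consistent if roughly 18.5% of patients have late-alphabet names, and nearly 88% of those who benefited should turn out to be in that group.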

As this situation is usually presented, the impression is given that everyone is helped 20% by such a treatment..... The "helps prevent" concept is a clever advertising scam that presents group data as if it applies to individuals. Most of medical advertising works this way, and always with the disclaimer, "Consult your physician". That means "We have just told you a lie." Insurance advertising also does this: there's one insurance ad that shows accidents in reverse, with everything being put back together as if it had never happened.

Come on, now! You must be getting pretty desperate to sustain your "statistics is useless" belief if you have to resort to drug advertising to sustain it! How long were tobacco companies advertising that smoking was good for you after they knew it killed -- using statistics?

A lot of my discomfort here is not so much the fogginess but the apparent contentment with the statistical approach.... And when the suggestion is made that we really need better ways of solving problems, all the defenses go up, making it even less likely that we will end up with something better.

That's not a very strong straw man. Medical research (any research) asks "why", whereas practitioners ask "what is best for me to do now?" Practitioners take what information they can get; in the example above, if the patient's name is Zachary, they suggest taking the drug, with the warning that there's a 5% chance it will do no good and might be harmful. If the patient's name is Arnold, they say that there is this drug, but there's only a 1 in 30 chance it will help, and about 50% chance it will do harm. Unless Arnold is on the brink of death, Arnold will probably not choose to take the drug, but Zachary quite probably will, even if his symptoms are not life-threatening.

Researchers, on the other hand, take the statistical data as evidence, and trawl through it for suggestions as to where to look next for mechanism. I see no evidence of "defences", in either case.

I have noticed that in the last couple of years, something called "systems biology" has finally appeared on the scene. ...This is going to be so much more effective than the traditional statistical approach that I predict the demise of statistics for all uses but those specifically concerned with group phenomena.

It will be interesting to see if you are correct. When I was in graduate school, I thought that significance tests in statistics would be long forgotten before twenty years were out. Yet here it is almost half a century later, and they are still the (fools') gold standard for publishing in experimental psychology.

Yes, statistics can be, and often are, misused. But I think you look in the wrong places for those misuses.

All most laudable sentiments, with which I think few would disagree. However, admissions officers aren't in the pleasant position of being able to offer places to everyone who wants one.

That's because we don't know how to teach, either. If we did, we would simply take everyone who applied, and if the school was full, we'd build another one.

That's a political problem, not a scientific one. It is the socialist ideal, a wonderful, impractical dream. Taxes pay for those schools, and EVERYONE KNOWS that lower taxes trump all those idealistic fantasies.

There is, however, a scientific issue, which is how to find enough teachers of sufficient knowledge and teaching skill. Ideally every student should be able to interact freely with the teacher, but each teacher has a limited capability to conduct multiple conversations. Even if we had perfect virtual presence systems, you still couldn't effectively teach large classes. You have to train teachers first.

With a high likelihood of misclassifying the person and wrongly treating the person. How high? That can be calculated, but nobody wants to touch that calculation with a 10-foot yardstick, apparently.

I answered that one a few times already.

No you didn't. You described how, in principle, it would be done by someone with the necessary skills, but you didn't actually do it. The only ones trying to get to the point of actually doing it are Rick and I -- Rick promises to start in the morning.

The "necessary skill" basically is the ability to count how many points in a scatterplot are above or below a line on the graph. No great skill, but if you have equations that will do it for you, the job is easier, isn't it?

From yesterday's post, here are numbers from the plot you sent:

Let's consider the SAT - GPA scatterplot, which we all acknowledge shows a pretty poor correlation, reliable though it may be. MSAT accounts for only 22% of the variance in GPA. If we define the median GPA as the pass-fail point, 25 will pass and 25 fail. Suppose we choose 25 at random from the 50 candidates in the scatterplot. Different choices of them will result in different numbers of failures, but on average, 12 or 13 will fail. However, if we use the MSAT, and choose those that are above the median in the scatterplot you posted, we find that 16 pass and 9 fail (using the GPA of all of them to obtain the median). Of the ones rejected, 9 would have passed and 16 would have failed.

Isn't that what you asked for? I used the method I described in the tutorial, and assumed that you were going to select half of the applicants. You could, of course select 25% or 75%.

Martin

[David Goldstein]

Dear Bill, Martin and Rick,

Here is a small point. The existence of community colleges in the USA gives all students the chance to show that they can do college work.

If they perform adequately well at the community college, they can transfer to a 4-year college.

David

···


[From Rick Marken (2007.07.22.0930)]

I have to go visit my adorable mother-in-law and write a letter to the
LA Times. So I won't have time to get a final answer to Bill's question
completed today. Bill's question was:

So HOW DO WE CALCULATE THE ODDS THAT AN
INDIVIDUAL WILL BE MISCLASSIFIED, MISDIAGNOSED, MISTAKENLY
REJECTED AND SO ON?

I have built a spreadsheet that will let me calculate the probability
(not the odds; I'm not a gambler so I don't know what those are) that
an individual will be misclassified as a function of the correlation
between predictor scores and outcomes. I just have to write the loop
that does the calculation. But just clicking through the program and
looking at the misclassification probability as a function of
correlation, it looks to me like the correlation between predictor and
criterion has to be at least .5 before you start getting a
misclassification probability that is equal to or less than the
misclassification probability using random guessing.

So Bill is right again!! Using a predictor variable for selection in
order to produce group results that are better than those you would
get from just random selection will lead to _worse_ group results and
a higher probability of misclassification of individuals unless the
correlation between predictor and criterion is pretty high by
conventional standards -- probably .5 or greater.

This seems like a VERY important point that should be made in every
statistics text that discusses the use of regression analysis for
prediction. I'll definitely include it in my lectures on regression
and correlation from now on.

I seem to recall that Richard Kennaway has already given the mathematical
solution to this problem. And I think Tom Bourbon pointed out the same
thing at a meeting. So I am probably just reinventing (or
rediscovering) the wheel.

Anyway, does this address your question, Bill?

Best

Rick

···

--
Richard S. Marken PhD
Lecturer in Psychology
UCLA
rsmarken@gmail.com

[From Bill Powers (2007.07.22.1050 MDT)]

Rick Marken (2007.07.22.0930) –

Using a predictor variable for selection in
order to produce group results that are better than those you would
get from just random selection will lead to worse group results and
a higher probability of misclassification of individuals unless the
correlation between predictor and criterion is pretty high by
conventional standards – probably .5 or greater.

You don’t suppose the break-even point is going to turn out to be 0.7071
(for one-dimensional correlations), do you?

I seem to recall that Richard Kennaway has already given the mathematical
solution to this problem.

Yes, I’m trying to find that paper. I know it’s on my computer
somewhere.

Anyway, does this address your question, Bill?

Right on the button.

Best,

Bill P.

[Martin Taylor 2007.07.22.14.44]

[From Bill Powers (2007.07.22.1050 MDT)]

Rick Marken (2007.07.22.0930) --

Using a predictor variable for selection in
order to produce group results that are better than those you would
get from just random selection will lead to _worse_ group results and
a higher probability of misclassification of individuals unless the
correlation between predictor and criterion is pretty high by
conventional standards -- probably .5 or greater.

You don't suppose the break-even point is going to turn out to be 0.7071 (for one-dimensional correlations), do you?

I seem to recall that Richard Kennaway has already given the mathematical
solution to this problem.

Yes, I'm trying to find that paper. I know it's on my computer somewhere.

Try http://www.cmp.uea.ac.uk/~jrk/distribution/corrinfo.pdf

However, nowhere does Richard say anything remotely like what Rick suggests.

Richard's arguments deal largely with the reliability of relationships. He asks questions like "What is the correlation you need between X and Y in order to be 95% sure that if measure X for an individual is above the median then measure Y will be as well". That's very different from saying that the probability that a randomly selected individual will have Y above the median is greater than the same probability taking the individual's X values into account, if the correlation is low (misclassification).

Considering just the median, if you don't consider the correlation at all, then the probability that a randomly selected individual will have a measure above the median is exactly 0.5. You can't do worse than that, because if some other measure led you to a probability of only 0.4 that a person (given the value of that measure) is above the median, then you should reverse your selection procedure to get 0.6.

If you want to know Y, and you have values for an X known to be correlated with Y, then you have more information about Y (lower standard deviation, if it's a normal distribution) than you would without knowing X.
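
Both of Martin's points can be made quantitative under a bivariate normal model (an assumption; the real scatterplots need not be normal). For a median split, the misclassification probability is 1/2 - arcsin(r)/pi, which never exceeds 1/2 for r >= 0, and the residual standard deviation of Y given X is sqrt(1 - r^2):

```python
import math

# For bivariate normal (X, Y) with correlation r, the probability that
# an individual above the X median is nevertheless below the Y median is
#     P(miss) = 1/2 - arcsin(r) / pi
# which never exceeds 1/2 for r >= 0: a correlated predictor cannot do
# worse than random guessing on a median split.
def miss_prob(r):
    return 0.5 - math.asin(r) / math.pi

for r in (0.0, 0.3, 0.47, 0.7071, 0.9):
    print(f"r = {r:.4f}   miss = {miss_prob(r):.3f}   "
          f"residual sd = {math.sqrt(1.0 - r * r):.3f}")

# Kennaway-style question: what correlation makes
# P(Y above median | X above median) reach 95%?
# Invert 1/2 + arcsin(r)/pi = 0.95:
r95 = math.sin(0.45 * math.pi)
print(f"r needed for 95% agreement: {r95:.3f}")   # ~0.988
```

Note that r = 0.7071 is where r-squared = 0.5, i.e. where the predictor halves the variance in Y; that, rather than any crossing of the 0.5 misclassification line, may be the sense in which it is a break-even point.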

Martin

[From Bill Powers (2007.07.22.1335 MDT)]

Martin Taylor 2007.07.22.10.21 –

Discovery of mechanism is
obviously better than blind statistical survey, but survey frequently
precedes mechanism. How, other than by statistical survey, would the
source of cholera in the Broad Street pump have been found, and how would
we, from that discovery, have learned that cholera is a water-borne
disease?

I agree that survey precedes mechanism. Before much was understood about
natural laws, about the only method of prediction available was to go by
past observations and extrapolate them into the future. Modern statistics
represents that prescientific approach brought up to a state of maximum
usefulness.

I suppose (not being much of a historian) that the concept of explanatory
models, around the time of Galileo, started the transition to what I call
“understanding how things work.” That led to science, and the
far more powerful methods now available.

Practitioners don’t
have the luxury of researching mechanism. They have to use what
information is available to them. If that is limited to statistical data,
so be it. That’s what they must use, no matter how much they might prefer
a more ideal understanding of what is going on.

Well, I hope it’s clear now that I don’t militate against using
statistical facts when they are all we have. I think we have reached the
point of diminishing returns on that approach, however, and I recommend
putting a much larger part of our resources into studying explanatory
models, which potentially have a far better payoff.

I’m not going to
try to answer each point in your message, as these interchanges tend to
expand unnecessarily.
Not necessarily. Suppose you come up with a treatment that markedly helps
20% of a population, but you don’t know which 20% will be helped, so you
apply it to everyone. Since 20% of the population will in fact be helped,
the group status of the population improves significantly on the relevant
measure. But since 80% of them will not be helped, and will only suffer
the expense and whatever range of side-effects of the treatment there may
be, most of the people in the population will either spend their money
for nothing, or in addition be somewhat harmed in other ways.

You postulate the case in which statistical data show that 20% of the
people will be markedly helped, but say nothing about the other 80%.
That’s a very strange survey.

No, it’s not. You send out the questionnaires and dump the returned data
into your statistical program, which reveals that there is a correlation
between administering the pill to this group and getting an improvement
which is not seen in the control group. The fate of all the people is
known, but there is so much variability in the data that it’s not obvious
who benefited. If you could simply measure the amount of benefit in each
individual, you wouldn’t need the statistics. Instead, you see signs of
benefit, often indirect, superimposed on a lot of noise. It takes a
statistical analysis to find that there was an effect of the treatment
relative to the effect on the control group. And since statistics uses
group averages, sigmas, and other measures that use all the data, there
is no way to tell which people produced the effect. You can’t tell
(except from common sense) the difference between 20% of the people
getting all the benefit, and all the people each getting 20% of the
benefit – if the noise is as large as the signal.

Let’s instead
postulate a survey in which the fate of the other 80% is in fact known,
and let’s also postulate that the results for this 80% fall moderately
evenly around small benefit and small harm, with some known percentage
showing nasty side-effects. I think that’s a more realistic
case.

But use common sense. How do you get 20% prevention of heart attacks?
There is no such thing. The basic data is the incidence of heart attacks,
which either occur or don’t occur. You’re measuring the incidence after
treatment, but obviously you’re not very often measuring a second heart
attack – usually it’s different people having the attacks afterward.
Because you have to use all of the data set to compute the statistical
measures (such as means, standard deviations, and covariance), the
mathematics does not allow for identifying the individuals.

Now what is a
doctor to do when confronted with a patient showing the condition? Surely
it depends on how ethical that doctor is. I suspect most doctors would
say “You have a 1 in 5 chance of getting better if you take this
drug, but there’s a 5% chance it will make you appreciably worse. Do you
want to take it?”

We’re not arguing about the best way to present the results once a person
has decided to base a decision on the group statistics. We’re discussing
how accurate such decisions are likely to be.

What would a
medical researcher be likely to do? The statistical data indicate that
there is an important difference between one sub-population and the rest
of the people. The very first thing would be obvious: see if there is any
characteristic property of the people who benefit, as compared to those
who don’t, was already noted in the data collection: say almost all of those
who benefit have names starting with letters S or later in the alphabet.
Of those people 95% get the big benefit, and of the others only 3%
do.

I doubt that you could identify the 3% that get the smaller benefit or
none. You have to use all the data to detect small differences. If you
use subsets of the data, you’re doing what Phil Runkel called fine
slicing, which has the principal effect of increasing the variance in
each slice, negating the attempt to get more precision. See Richard K.'s
discussion of determining deciles of effect.

As this situation
is usually presented, the impression is given that everyone is helped 20%
by such a treatment… The “helps prevent” concept is a
clever advertising scam that presents group data as if it applies to
individuals. Most of medical advertising works this way, and always with
the disclaimer, “Consult your physician”. That means “We
have just told you a lie.” Insurance advertising also does this:
there’s one insurance ad that shows accidents in reverse, with everything
being put back together as if it had never happened.

Come on, now! You must be getting pretty desperate to sustain your
“statistics is useless” belief if you have to resort to drug
advertising to sustain it! How long were tobacco companies advertising
that smoking was good for you after they knew it killed – using
statistics?

What’s this “statistics is useless” crap? I have never made
such a statement. Isn’t what I actually say bad enough not to need
exaggeration?

A lot of my
discomfort here is not so much the fogginess but the apparent contentment
with the statistical approach… And when the suggestion is made that we
really need better ways of solving problems, all the defenses go up,
making it even less likely that we will end up with something
better.

That’s not a very strong straw man. Medical research (any research) asks
“why”, whereas practitioners ask “what is best for me to
do now?”

Most doctors simply prescribe drugs nowadays. Most drugs are developed
through statistical analysis, even when there is some attempt to get
beneath the surface a millimeter or two (“beta blockers”). But
if that’s all we have, that’s all we have, however crude it is. If a
person isn’t interested in research, then just read the journals and
apply what they say. We can’t all be researchers.

Researchers, on the
other hand, take the statistical data as evidence, and trawl through it
for suggestions as to where to look next for mechanism. I see no evidence
of “defences”, in either case.

Well, what are those two sentences, then, if not defenses?

I have noticed that
in the last couple of years, something called “systems biology”
has finally appeared on the scene. …This is going to be so much more
effective than the traditional statistical approach that I predict the
demise of statistics for all uses but those specifically concerned with
group phenomena.

It will be interesting to see if you are correct. When I was in graduate
school, I thought that significance tests in statistics would be long
forgotten before twenty years were out. Yet here it is almost half a
century later, and they are still the (fools’) gold standard for
publishing in experimental psychology.

That’s because all the money has gone into statistics-based research, not
modeling.

I answered that one
a few times already.

No you didn’t. You described how, in principle, it would be done by
someone with the necessary skills, but you didn’t actually do it. The
only one trying to get to the point of actually doing it are Rick and I
– Rick promises to start in the morning.

The “necessary skill” basically is the ability to count how
many points in a scatterplot are above or below a line on the graph. No
great skill, but if you have equations that will do it for you, the job
is easier, isn’t it?

No. I have to understand the equations first, which is slow for me, and
I’m not sure I’m applying them correctly anyway. When I can see how
someone who knows the ropes uses them, I can understand them a lot
better.

What you’re carefully not mentioning is my needle about your not actually
doing any of the computations. Put your spreadsheet where your mouth
is.

From yesterday’s
post, here are numbers from the plot you sent:

Let’s consider the SAT - GPA
scatterplot, which we all acknowledge shows a pretty poor correlation,
reliable though it may be. MSAT accounts for only 22% of the variance in
GPA. If we define the median GPA as the pass-fail point, 25 will pass and
25 fail. Suppose we choose 25 at random from the 50 candidates in the
scatterplot. Different choices of them will result in different numbers
of failures, but on average, 12 or 13 will fail. However, if we use the
MSAT, and choose those that are above the median in the scatterplot you
posted, we find that 16 pass and 9 fail (using the GPA of all of them to
obtain the median). Of the ones rejected, 9 would have passed and 16
would have failed.

Isn’t that what you asked for? I used the method I described in the
tutorial, and assumed that you were going to select half of the
applicants. You could, of course select 25% or
75%.

I was selecting 100% of them; another data set of the same size should
provide better predictions than a smaller set, shouldn’t it? Over a
quarter of the predictions were in error by 100% or more. When you’re
asking a question like “Which nations should be given help in
combating infant mortality,” isn’t a 50% probability of error in a
quarter of the cases too much?

This is moot, however, because using the right model we can get the
correlation up to 0.9, which makes the predictions a lot better. Not
great, but certainly much better.

Best.

Bill P.