Validity

[From Bill Powers (990916.1916 MDT)]

After Phil Runkel's delicious ruminations on validity, usefulness, and
purposes, I have only a short commentary to add about the MOL and the
"therapy" thread.

The MOL is a method of psychotherapy which is very simple, quite probably
harmless, and of unknown efficacy. It wasn't designed particularly for
resolving conflicts, although it seems to help with them. The chief reason
I would like to enlist people's help in testing it is to find out whether
it is (a) teachable, and (b) good for anything.

The time to say what the MOL is good for is after we know what it is good
for. In order to know that, we have to try it out and see what outcomes
result. In order to know the outcomes of therapy using the MOL, we have to
be sure we are doing what we define as the MOL, and also -- especially --
we must not mix it with other treatments at the same time. Mixing it with
other treatments results in committing a sin called "confounding", which it
seems I must explain even though those to whom I am explaining this might
have been thought to know this already.

If we do other things at the same time, we will not know what part of the
outcome was due to using the MOL and what part was due to the other things
we did. It would be possible to attribute bad results to the other things
we did when it was the MOL that caused them. We could attribute good
results to the MOL when they were due to the other things we did.
Similarly, bad results could falsely be attributed to the MOL, and good
ones could be falsely attributed to the other things we did.
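
A minimal sketch of the confounding problem described above, using
made-up numbers. It assumes, purely for illustration, that the effects
of the MOL and of the "other things" simply add; the three scenarios
and the function name are hypothetical, not results from any study.

```python
def observed_improvement(mol_effect, other_effect):
    """Outcome seen when both treatments are given together (assumed additive)."""
    return mol_effect + other_effect

# Three hypothetical worlds that differ in what the MOL actually does.
# Each entry is (effect of the MOL, effect of the other treatment).
worlds = {
    "MOL helps, other does nothing": (5, 0),
    "MOL does nothing, other helps": (0, 5),
    "MOL harms, other helps a lot": (-3, 8),
}

# Outcome when the MOL is mixed with the other treatment.
combined = {name: observed_improvement(m, o) for name, (m, o) in worlds.items()}

# Outcome when the MOL is used alone (other effect set to zero).
mol_alone = {name: observed_improvement(m, 0) for name, (m, o) in worlds.items()}

print(combined)   # the same outcome in every world
print(mol_alone)  # a different outcome in each world
```

With the treatments mixed, all three worlds yield the same observed
outcome, so the data cannot say which world we are in. Only the
MOL-alone runs tell them apart.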

Ignoring these principles simply means that we still will not know whether
the MOL is a worthwhile method of therapy. There is no point in adding it
to one's "toolbox" if nobody knows its effects when used alone.

Best,

Bill

Re.: Bill Powers (990916.1916 MDT)

Bill: There is no point in adding it to one's "toolbox" if nobody knows its
effects when used alone.

David: This is a compromise way of exploring what the MOL is good for, and
with whom MOL is even workable, within the constraints of actual therapy
work. So, I think there is a point.

As far as I can tell, it does no harm.

If I, and the person I am working with, don't feel it has any benefit, we
don't use it again. This is the standard I use for all of my "tools." We
usually give it more than a single try. The MOL is still in my toolbox
because sometimes it has proven helpful. I can't really say that I have seen
any "MOL miracles."

I leave it for others, working in academic settings, to follow the
one-variable-at-a-time approach. I will certainly be interested in what is
found. This sort of research is needed. It would make a terrific Ph.D.
thesis project.

With respect to the interchange with Tim, I have already apologized to him
and other CSGnet members. I am not going to respond to posts about this
interchange.

From: David M. Goldstein, Ph.D.
Subject: Re: Re: Validity
Date: 9/17/99

At 16:22 Philip wrote about Validity on 16 Sep 99,

Dear friends:

I don't often contribute to the CSGnet, because I try to save
time for my writing. But now and then my comparator buzzes, and

[snippity lots of commentary]

You can do head counts of what people achieve when they use (in
some manner or other) a Slammer hammer or a Glopper golf club or
an ETS Entrance Exam or a Phonics curriculum or a Blank
sentence-completion inventory or an MOL. (Sometimes a head count
is useful; I won't take space here for how that can come about.)
But you are hitting the duck over the head if you try to
ascertain the validity of one of those _things_.

Thanks for the validation. <g>

(I'm controlling for Branden's sentence completion technique for
facilitating awareness and resulting "shifts" of reference levels or as
Carlos Castaneda's Don Juan taught "assemblage points").

Excellent stuff. Best I've read in days.

nth

Dear friends:

I don't often contribute to the CSGnet, because I try to save
time for my writing. But now and then my comparator buzzes, and
I have to do something other than bite my fingernails. Yesterday
I was reading some remarks about MOL and therapy.

My brother was a fanciful fellow. One day he told me he had
discovered a cure for the common cold. All you need is a duck
and a mallet. You identify your cold with the duck, hit the duck
over the head with the mallet, and your cold is gone!

I sometimes think of that seductive reasoning when I read or hear
about a proclivity such as schizophrenia, intelligence, or one of
those ailments that sound to my ear something like heavy-traffic
disorder, burping-in-polite-company syndrome, or hyperactive
resentment of boredom. In looking for creatures upon which to
blame the troubles of humankind, one rarely comes across ducks,
chimeras, ravens, sphinxes, or magpies in the indexes of
psychology books or in the catalogs of psychological test
publishers, but one finds any number of inventions to take their
places.

As to psychological tests, I was for some years a testing expert
myself, and I am still amazed at the inventiveness of that
subgroup of humankind.

I was also for some years an expert on "program evaluation" -- an
activity in which you arrive at a pronouncement of whether some
program, treatment, training or anything, indeed, that someone
does to someone else, has been successful. There are books and
books about how to do that. I did some writing about it myself.

Someone on the CSGnet asked whether there was any research on the
validity of MOL. I've written some (long ago) on validity, too.
Some years back I got to worrying about some strange features of
the validity concept, and I wondered how physicists or chemists
might think of it. I went to the library and got hold of the
physics periodical -- I forget the name of it -- that is parallel
to Psychological Abstracts. It's the one that gives short
abstracts every month of every article in all the journals
touching on physics. I discovered that "validity" and
"reliability" were NOT key words in physical research. You can
NOT find anything on those topics through the physics abstracts.
Then I discovered that those terms were not key in chemistry,
either. Nor even in biology! Furthermore, I discovered that
neither term occurs in the index to B:CP!

How, pray tell, can physicists, chemists, biologists, and W.T.
Powers lay claim to respectability?

Here are some typical questions asked by school administrators,
teachers, advertisers, physicians, organizational consultants,
psychological therapists, pastors, and other people who do things
for somebody's own good: Does this treatment produce an
improvement? Will this method of teaching increase average test
scores? Is this diagnostic instrument valid? Is this technique
of therapy suitable for ailment X?

Notice how easy it is to ask questions of that sort without any
hint that humans are involved -- without any thought of
somebody's purposes.

In texts about program evaluation, the authors typically say that
you must state the goals of the improvement program very
carefully and objectively. How else can you ascertain whether
you have attained them? Oh? Whose goals? (Actually, that
question often causes program evaluators no worry; they simply
adopt the goals of the persons who are paying their fees.) And
when are you going to ask for the visualization of the goals --
before anyone has heard of the improvement program? After the
boss has told them about it? After the program has gone on for
three years? After the boss has resigned and a new boss has
taken over? Suppose, after the program is declared over (or
after the evaluators have reached the end of their budget and
have gone home) you discover that one-third of the employees are
glad the program had been undertaken. Is that a validation of
the program or not? I leave you to think of the questions that
last question raises.

If a new curriculum is followed by an increase of so-and-so much in
an average of scores on a standardized test, does that show that
the new curriculum is superior to the old? Imagine what would
happen if the principal were to tell the teachers that the new
curriculum was a stinker and he was sure the new curriculum was
going to depress the average score on the standardized test. Is
it the curriculum that increases or decreases the scores? Whose
purposes are being pursued when the curriculum is (or is not)
being followed?

Is this treatment or this test suitable for use with ailment X?
For use with patients from Bulgaria? Whose purposes are being
pursued when the treatment or test is being used?

We make use of features of the environment to pursue our
purposes. Managers, teachers, psychiatrists, pastors all pursue
their purposes. So do the underlings: workers, pupils,
patients, parishioners. And they all make use, to the degree it
seems promising, of features of the environment such as
intelligence tests, Rorschach tests, standardized subject-matter
tests, interviewing techniques, office decoration, and so on.
And, too, managers make use of workers, workers of managers,
therapists of patients, patients of therapists, and so on.

Other people bring disturbances to our controlled variables, but
they also serve as resources we can use in maintaining control of
perceived variables. Yesterday I took a bus home from the
hearing clinic. I got on the bus and turned over my entire
welfare to the hands (that is, purposes) of the bus driver.
Tomorrow Claire will go to her physician and turn over her
welfare, perhaps her very life, to that physician's purposes.
Someone goes to a psychotherapist and turns over his or her
sanity, or at least peace of mind, to that therapist. Maybe that
therapist uses a bundle of principles he or she calls "MOL" in
carrying out his or her purposes with the patient.

I am not saying that the therapist is selfish or fiendish or in
any way reprehensible for having purposes. The audiologist, the
bus driver, the physician, the therapist all have purposes in
connection with their customers, clients, or patients. Thank
goodness I can understand fairly well the purposes that are
mostly controlling while the bus driver is driving me. The other
kinds of servitors I watch more closely.

I remember the tale of one school, after the idea of teaching
machines became glamorous (early 1960s?), that installed a
battery of teaching machines in several classrooms. In the
beginning, it looked as if the machines were aiding the task of
the teacher. But then it was discovered that the teachers were
no longer sending the students to the teaching machines. The
teachers had found themselves at loose ends. What were they to
do while the students were happily punching the keyboards?
Teachers are supposed to be _doing_ something. So they returned
to _doing_ something.

And the agricultural experts who wanted to help the Navaho (I
think it was) grow more corn. (This was way back in the 1930s.)
They had a new seed that would grow better in that dry climate
than the seed being used. Sure enough, after the crop with the
new seed matured, the plants bore enough corn not only to feed
the family, but enough over that to sell. The families that year
actually had some disposable cash. The experts were very happy.
But the next year, the farmers planted the old kind of corn!
Enquiring, the experts discovered that the wives had demanded the
old kind of corn, because flour from the new kind did not make
good tortillas.

A diagnostic instrument may or may not be "suitable" for a
disease or a patient, but if it is not "suitable" for the
therapist, it is not going to be used. And vice versa: no
matter how unsuitable an instrument (such as the Rorschach) may
be shown to be for purposes of learning something useful (useful
to either therapist or patient), it is going to be used if the
therapist finds it suitable for his own purpose (such as
_feeling_ as if he knows something about the patient, regardless
of data). If you like horror stories of this sort, read "House
of Cards" by Robyn Dawes. Another book showing how personal
needs of the psychologist, and politics, too, get all mixed up
with psychological theory is "Constructing the Subject" by Kurt
Danziger.

I have read lots of articles in journals about projects
undertaken in industry to improve morale or productivity or
profitability or some other variable that managers care about.
One type of study (of which there are hundreds of examples)
collects data from a dozen or three dozen or 500 firms and looks
to see how many of them managed in manner X are high on outcome Z
and how many low, and similarly for those managed in manner Y or
at least non-X. If the proportion high on outcome Z managed in
manner X is greater than the proportion managed in manner Y, then
the writer invariably concludes that if you want lots of outcome
Z, you should do X, not Y. Again, whose purposes do we care
about, and whose purposes were served by manners X and Y and
outcome Z? And what other purposes were obstructed, and what
purposes of whom will be facilitated next month or next year by
the environmental resources produced or destroyed by outcome Z?
Those questions are not to be answered by fine slicing.

What shall I do if I want outcome Z but do not want to use manner
X, because using it makes it difficult for me to control other
variables in my company for which I have internal standards?
That is, suppose that getting Z via manner X puts me in internal
conflict. Regardless of the majority shown in the study, I am
not going to follow the advice of the writer. Furthermore, there
are lots of ways to get to outcome Z other than manner X. Just
because manner X is the manner that the writer may have thought
most likely, or most suitable to the theory the writer loves
best, that fact need not prevent me from thinking up my own
successful manner. Most social scientists, it seems to me, think
the people they study cannot possibly be as clever as they are.
But any reader of the literature soon finds out that even rats
now and then outwit the experimenters.

Let me use the analogy of a map. It is an environmental resource
many people use to further their purposes. You can use a map for
many things, but a common use is to let you lay out a sequence,
or a program, for getting someplace you want to go. Will the map
get you there? No, you can get yourself there if you use the
right map in the right way. If you want ten people to go to
Chicago, it will not be sufficient to hand them copies of a map
with Chicago indicated on it. It will not be sufficient even if
you write "Please go to Chicago" on the margins of the maps. The
necessary condition for them to go to Chicago is that they must
want to go there. (Some may simultaneously _not_ want to go
there; they may be suffering conflict. But going there must
produce less error than not going.) But whether the map is going
to be useful depends on more than wanting to go to Chicago. The
person must be able to read the map. (And must be able to do a
lot of other things, too, not associated with the map. But let's
stick to the map, because we are talking about evaluating the
efficacy of the map.) So you give copies to ten people, and only
three get to Chicago. Is the map ineffective? Is giving the map
to the ten persons ineffective? Is the map invalid?

Well, maybe five didn't want to go to Chicago. The map, however,
was useful; they used it to wrap up the garbage. The other two
wanted to go to Chicago, but couldn't read, and didn't know that
the map had anything to do with Chicago. There are lots of maps
showing Chicago and lots of people owning those maps. But lots
of those people never get to Chicago. Does that show that those
maps are unreliable for getting to Chicago? That they are
invalid?

What does "valid" mean here? Correct? Well, we certainly want
the map to be correct. But that is not what social scientists
mean by validity. (I suppose you know of that clever pastiche on
"Hiawatha" about shooting arrows at a target and never hitting
it, but having the average position of the arrows right on the
bull's eye.) Standard references for test validity are the early
writings of Cronbach (I've lost the earliest reference). Campbell
and Fiske (1959) set the tone for study validity. I am supposing
most members of CSGnet know those traditions. A much more
sophisticated conception was that of Cronbach, Rajaratnam, and
Gleser (1963), and a treatment to out-sophisticate all
sophistication was that of Brinberg and McGrath (1985). That
last book, entitled "Validity and the Research Process,"
postulates several validities within each of three types:
valuation validities, correspondence validities, and
generalization validities. Yet none of those writings say
anything about accuracy or correctness. "Accuracy" does not
appear in the index of Brinberg and McGrath. The reason social
scientists pay so little attention to accuracy, I think, is that
they so rarely gather quantitative data that can tell you how
close something comes to something else in any way detectable by
sense organs and conceivable at a low level of the hierarchy
(like inches).
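
A small sketch of the "Hiawatha" point, with invented arrow positions
measured in inches from the bull's eye. The coordinates are
hypothetical and chosen only to show that an average can sit on the
target while no single arrow comes near it.

```python
import math

# Hypothetical landing points, in inches, with the bull's eye at (0, 0).
arrows = [(10.0, 0.0), (-10.0, 0.0), (0.0, 10.0), (0.0, -10.0)]

# Average position of the arrows.
mean_x = sum(x for x, _ in arrows) / len(arrows)
mean_y = sum(y for _, y in arrows) / len(arrows)

# How far each individual arrow landed from the bull's eye.
misses = [math.hypot(x, y) for x, y in arrows]

print("average position:", (mean_x, mean_y))
print("closest single arrow, in inches:", min(misses))
```

The average position lands on the bull's eye, yet every arrow misses
by the same ten inches. Accuracy in the sense used above is a
statement about the individual distances, not about the average.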

Anyway, I hope you can see why I think it reasonable to speak of
the accuracy of a map, and to speak of the ways a map can be
useful to a person who is looking this way and that for help in
getting to Chicago, and why I think it is meaningless to talk
about the validity of a map. None of the conceptions of validity
in all those deeply-thought books is of any help whatever to
someone who wants to get to Chicago (nor to someone who does not
want to go there, either). Unless, of course, you are using
"valid" merely as a synonym for "accurate" (which those books
don't).

I suppose the analogy is not strict, but I am trying to tell you
why I think we mislead ourselves when we ask what a device or a
program or a test will do or yield or produce in interaction with
human purposes. Those programs or specifications or procedures
or plans or etc etc don't _do_ anything. We humans _do_ things,
and sometimes (when it suits our purposes) we make use of those
programs or specifications, etc etc etc. And when we choose to
use something as a means to an end -- to use something such as a
hammer or a program of organizational development, or MOL -- we
do not, if we have our wits about us, set the thing in motion and
then stand back, hoping for something wonderful to happen.
Instead, we watch every moment to be sure things are going as we
want them to go. If we think a program or procedure can be set
in motion and that the program or procedure will then _itself_ do
what we want, we find ourselves in the position of the Sorcerer's
Apprentice.

That reasoning applies to MOL, too, no matter how anybody feels
while using the strategy (using, actually, his or her adaptation
of someone else's description of it).

And it also follows that you cannot do research on the validity
or efficacy of MOL or any other program or procedure, despite all
the books on that sort of topic. That is, no matter how you go
about such "research," you are not going to end with anything
useful to say to someone contemplating the use of the program or
procedure, any more than you would if you did research on the
validity or efficacy of a map.

You can do head counts of what people achieve when they use (in
some manner or other) a Slammer hammer or a Glopper golf club or
an ETS Entrance Exam or a Phonics curriculum or a Blank
sentence-completion inventory or an MOL. (Sometimes a head count
is useful; I won't take space here for how that can come about.)
But you are hitting the duck over the head if you try to
ascertain the validity of one of those _things_.

Am I saying we should give up using maps, because you cannot
ascertain their validity? No, I am saying that validity is the
wrong conception. Accuracy is the better idea. (Physics books
do pay attention to accuracy.) But what about the fact that some
people buy maps and fail to get to Chicago? Doesn't that show,
at the very least, that the maps need to be improved? If the
maps are wrong, yes. But if the maps show Chicago in the right
place, then don't blame the maps if some people fail to get to
Chicago. As far as I am concerned, the people who wrapped their
garbage in their maps were making just as valid use of the maps
as the people who went to Chicago.

I fear I have rambled in one direction and another here, but I
have already spent a couple of hours editing this, and I don't
want to spend two more hours paring it down. I hope there is
something here useful to somebody.