[From Bill Powers (941006.0645 MDT)]
Martin Taylor (941006.1345) --
Now that you are working with a slow link, I got your post no more
than 6 hours after your time stamp. It could have been less; I haven't
looked at my mail for 6 hours.
I agree with you about the "no counterexamples" criterion of Popper;
taken literally it rules out all theories. But I have a positive
attitude toward his falsifiability criterion, at least as I interpret
it: that a test should actually be a test. A test that can't be failed
is no test of a theory.
A lot depends on how a theory is stated. If you say "I propose that
grade performance in college depends on ACT scores," you're proposing a
"theory" that is next to unfalsifiable. If you don't get a significant
correlation on your first try, just keep doubling the size of the test
population until you do get significance. If that doesn't work, accept a
less stringent significance criterion. You can hardly lose, whichever
way the correlation goes.
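To put numbers on that, here is a toy sketch (in Python, with an
invented underlying correlation of 0.1 between made-up ACT scores and
grades; nothing here comes from real data) of how simply enlarging the
sample manufactures "significance":

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    rho = 0.1   # assumed weak underlying correlation, purely illustrative

    # Keep doubling the "test population" and watch the p-value sink
    # below .05 even though the correlation itself stays feeble.
    for n in (50, 100, 200, 400, 800, 1600):
        act = rng.normal(size=n)
        gpa = rho * act + np.sqrt(1 - rho**2) * rng.normal(size=n)
        r, p = stats.pearsonr(act, gpa)
        print(f"n = {n:4d}   r = {r:+.3f}   p = {p:.4g}")

There is no sample size at which the proposal is required to fail, which
is exactly why it is next to unfalsifiable.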
Then look at this theory: "A person in a tracking task behaves as a
negative feedback control system does." That alone leaves a lot of room
(you could be talking about heat dissipation), but when you add that a
model based on control theory should duplicate a subject's performance
within a few percent, with 1800 samples of data in an experiment, the
prediction suddenly becomes far more risky. Even allowing for bandwidth
limitations and consequent coherence within the data, you're talking
about a probability of producing this behavior by chance that is
trillions to one against -- and a correspondingly large chance that any
important change in the details of the model would lead to a wrong
prediction. And instead of relying on large numbers of people to reveal
the effect, you're claiming that it will be seen in a single individual.
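As a rough illustration of what that kind of model test looks like (a
sketch only: toy waveforms, a made-up gain, an assumed 30 samples per
second, and a simulated "subject" standing in for real handle data), one
fits a single loop gain and asks how closely the model reproduces the
1800 samples:

    import numpy as np

    dt, n = 1/30, 1800          # assumed 30 samples/s for a one-minute run
    rng = np.random.default_rng(1)

    def smoothed_noise(n, rng, width=60):
        # slowly varying random waveform, stand-in for target and disturbance
        x = rng.normal(size=n + width)
        return np.convolve(x, np.ones(width) / width, mode="valid")[:n]

    target = smoothed_noise(n, rng)
    disturbance = smoothed_noise(n, rng)

    def control_model(k):
        # negative feedback loop: handle velocity proportional to the
        # difference between target and cursor
        handle = np.zeros(n)
        for i in range(1, n):
            cursor = handle[i - 1] + disturbance[i - 1]   # what is on the screen
            handle[i] = handle[i - 1] + k * (target[i - 1] - cursor) * dt
        return handle

    # Simulated "subject": the same kind of loop plus a little motor noise.
    subject = control_model(8.0) + 0.005 * rng.normal(size=n)

    # Fit the one free parameter, then measure model-versus-subject error.
    ks = np.arange(1.0, 20.0, 0.25)
    best_k = min(ks, key=lambda k: np.sqrt(np.mean((control_model(k) - subject) ** 2)))
    rms = np.sqrt(np.mean((control_model(best_k) - subject) ** 2))
    print(f"fitted gain k = {best_k:.2f}")
    print(f"RMS model-subject difference = {rms:.4f} "
          f"({rms / np.std(subject):.1%} of the handle's RMS excursion)")

With real data the subject's handle record comes from the experiment
rather than from a second copy of the model, of course; the point is
only that a fit with one or two free parameters over 1800 samples leaves
the model nowhere to hide.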
Even putting aside the question of cooperation (any other theory would
have the same problem), we can work out what other kinds of models would
predict, and even test behavior under conditions requiring other models,
to show that there are in fact ways in which the control-theoretic model
could be falsified. For example, I have set up open-loop tracking
experiments where the subject can see the target but must move the
cursor without seeing it, relying only on the sense of arm or mouse
position. Even in easy situations, the RMS tracking errors are 10 times
as large, meaning about 10 standard deviations larger than the errors in
a normal control experiment. Choosing between these models is not a
matter of statistical significance any more: it is a matter of picking
the most probable explanation where the chance of mistaking one type of
behavior for the other is about 10^9 to one against. There is no reason
in principle why the behavior could not be that of an open-loop system,
so we can definitely say that the control model is falsifiable. But with
such odds we can make the claim that it simply is not falsified, whereas
the rival model is, beyond any reasonable doubt, wrong.
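A companion sketch (the same toy loop as above, again with invented
numbers) shows why the open-loop condition is such a severe test:
deprived of sight of the cursor, the same mover can no longer cancel the
disturbance, and the tracking error balloons:

    import numpy as np

    dt, n, k = 1/30, 1800, 8.0
    rng = np.random.default_rng(2)

    def smoothed_noise(n, rng, width=60):
        x = rng.normal(size=n + width)
        return np.convolve(x, np.ones(width) / width, mode="valid")[:n]

    target = smoothed_noise(n, rng)
    disturbance = smoothed_noise(n, rng)

    def rms_tracking_error(see_cursor):
        handle = np.zeros(n)
        for i in range(1, n):
            cursor = handle[i - 1] + disturbance[i - 1]
            felt = cursor if see_cursor else handle[i - 1]  # open loop: arm only
            handle[i] = handle[i - 1] + k * (target[i - 1] - felt) * dt
        return np.sqrt(np.mean((handle + disturbance - target) ** 2))

    print("closed-loop RMS tracking error:", round(rms_tracking_error(True), 4))
    print("open-loop   RMS tracking error:", round(rms_tracking_error(False), 4))

The exact ratio depends on the disturbance bandwidth and the loop gain,
but the two conditions produce unmistakably different kinds of behavior,
which is what makes the comparison a real test.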
Furthermore, we can say (after considerable experience) that if in the
normal tracking experiment a subject were to show errors as much as two
or three times the normal tracking errors for that person, something
must be seriously wrong with the person. Falsification of the model
would almost certainly be diagnostic of the person rather than the
model. In this case, if we knew that something was seriously wrong with
the person, yet the same model as before with the same parameters
continued to predict that person's behavior just as accurately as
before, we would take this "failure of the model to fail" as showing
something wrong with the model!
It seems to me that when we start talking about quantitative model-based
predictions of behavior, we move into a different type of theorizing
altogether. All the old statistical manipulations are predicated on the
assumption that the phenomenon under investigation will be swamped by
noise, so it can't be seen accurately or at all without statistical
analysis. It's assumed that we will do well to prove the _existence_ of
a relationship; determining its form is normally impossible.
What is meant by falsifiability is very different in the two kinds of
theories.
You say " All, and I mean ALL, theories actually proposed are false in
some respect, and we know that as solidly as we know any fact." This is
a technically true statement, but there is a vast difference between a
theory that is falsified by one in 20 of the data sets used to test
it, and one that is falsified by one in 10^9 data sets. There is surely
something very different going on when we speak of the truth of Newton's
laws of dynamics and the truth of Bandura's laws of self-efficacy. Yes,
Newton's laws are no longer "true," but one has to ask about the
practical significance of that fact when under most conditions our most
sensitive instruments still can't detect any deviation of predictions
from observations. It's only in imagination that "true" and "false" have
values of 1 and 0.
Between differences from prediction of 2 standard deviations and
differences of 5 standard deviations, we go from a probable occurrence
of 5% to a probable occurrence of 5 x 10^-5%. By reducing the standard
deviation of prediction error by a factor of only 2.5, we achieve an
increase in predictability of 100,000 times. It seems to me that this
region of predictive error represents a gulf between two approaches to
understanding phenomena, and a giant step separating two ways of doing
science. And to get from the one region to the other, we only have to
reduce our predictive errors to 40% of their previous values!
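The arithmetic is easy to check (a two-sided normal tail is assumed
here):

    from scipy.stats import norm

    for z in (2, 5):
        # probability of a deviation of z or more standard deviations,
        # in either direction
        print(f"{z} SD:", 2 * norm.sf(z))

    # prints about 0.0455 (roughly 5%) for 2 SD and about 5.7e-07
    # (roughly 5 x 10^-5 %) for 5 SD: a factor of about 100,000 in
    # predictability from shrinking the error scale to 40%.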
Come on, you statistics guys. What am I doing wrong here?
--------------------------------------------------------------------
Best,
Bill P.