more harping

[Hans Blom, 940509]

(Bill Powers (940508.1800 MST)) On explanations, theories, and models:
harping on a theme.

Theories in psychology and other similar pursuits are commonly
expressed as a statement to the effect that y depends on x. ...
                                                       The most
common way of testing theories of this sort is to do an experiment
in which the independent variable is varied, and its relation to the
dependent variable is subjected to a statistical analysis.

To do _an_ experiment? In real life, the choice of exactly which experiment
to do is often crucial. Experiments have costs, in terms of money, time,
and even human lives. If the effect of a new cancer drug X appears to be
beneficial in the very first stages of the experiment, the experiment will
be aborted and from that moment on the control group will also receive
drug X. The reliability of the outcome is then far lower than what seekers
of 'hard' quantitative truths demand. We have recently seen such problems
with AIDS-related drugs. This is the fate of many experiments in medicine
(which I am more familiar with than psychology), but I assume that the
argument holds more generally: the more elaborate -- and thus costly --
the experiment, the more certainty is obtained. If cost (or time; time is
money) is to be reckoned with, less than absolute certainty will often be
an acceptable compromise.

What about humans as control systems? I often operate under conditions of
less-than-certain knowledge. Yet I may have to act with some urgency. What
to do? Perform experiments first in order to collect additional knowledge?
But while I do that, I cannot control, and that may be costly. Control,
but with the knowledge that my control will probably be suboptimal? That
may be costly as well. We have a real dilemma here. In psychology, this
dilemma is visible in the people we call perfectionists. In extreme cases,
their search for a "perfect" solution takes so much time that they are
doomed to a life of total inertia. To some degree this style of behavior
shows up in a great many scientists. (For those who might want to confront
this dilemma: one branch of control theory, which goes under the name of
"dual control theory", explicitly focuses on finding the overall
least-costly solution. Incidentally [or not], dual control theory has much
in common with information theory.)
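To make the dilemma concrete, here is a minimal numeric sketch of the
trade-off (my own illustration with invented numbers, not dual control
theory proper): compare the expected cost of acting at once on uncertain
knowledge with the expected cost of first running an experiment that
sharpens the knowledge but has a price of its own.

  # Illustrative only: all numbers are invented.
  loss = 1000.0            # cost of choosing the worse action
  p_wrong_now = 0.30       # chance of a wrong choice with current knowledge
  p_wrong_after = 0.05     # chance of a wrong choice after the experiment
  experiment_cost = 200.0  # cost of the experiment (money, time, lost control)

  cost_act_now = p_wrong_now * loss
  cost_experiment_first = experiment_cost + p_wrong_after * loss

  print("act now:          ", cost_act_now)            # 300.0
  print("experiment first: ", cost_experiment_first)   # 250.0
  # With these numbers experimenting first is the cheaper gamble; raise
  # experiment_cost above 250 and acting on uncertain knowledge wins.

(Python; whichever expected cost is lower is the "overall least-costly"
choice in this toy setting.)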

In most statistical tests of theories, the only conclusion to come
out of it is that y is or is not dependent on x.

In this case, the information (in the information theory sense) in the
experimental data is only one bit. One bit is the amount of information
necessary to make a dichotomous choice: Do I go north or south (or any 180
degree regions)? Two bits would allow me to decide between north, south,
east or west (or any 90 degree regions). And so on. In medicine, one bit
often suffices: Do I take drug X or not? Do I take drug X or drug Y? I see
nothing wrong with this approach if the data do not supply enough
information to know more. You would like to know _how much_ of drug X to
take (correlation, regression line), and so would I, but medicine -- due
to the difficulty of experiments -- must often be satisfied with standard
dosages (per kg of body weight, as custom dictates). Sometimes more is
known about the response to a drug, yet most frequently your family
doctor will prescribe either one, two, or three pills a day or some
similarly coarse alternative. The accuracy of medical prescriptions is
often on the order of 20 to 50 percent, I would guess, corresponding to
1 to 3 bits of information per decision.
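A small sketch of the bookkeeping behind such bit counts (Python; the
dosage figures are only the rough guesses made above):

  import math

  def bits(n_alternatives):
      # Information needed to single out one of n equally likely alternatives.
      return math.log2(n_alternatives)

  print(bits(2))   # take drug X or not           -> 1.0 bit
  print(bits(4))   # north, south, east or west   -> 2.0 bits
  print(bits(3))   # one, two or three pills      -> about 1.6 bits
  print(bits(5))   # five distinguishable dosages -> about 2.3 bits
  # A prescription accurate to roughly 20-50 percent distinguishes on the
  # order of 2 to 5 dosage levels, i.e. very roughly 1 to 3 bits per decision.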

            People who take one aspirin per day either are or
are not protected against heart attacks. There is no room for
equivocation in statistical results: the quality of the results
generally would not support any statements about the degree of an
effect, other than the statement that it exists or does not exist.

Indeed. Sometimes the experimental data do not support higher accuracies.

Remember that statistics -- and all theories that are based on it, such as
optimal control theory -- started out as the theory of gambling. Much of
behavior can be seen as gambling: do I take path X or path Y? PCT deals
only with situations where behavior is both unconstrained (except maybe by
extremes) and immediately goal-directed: the path to the goal must be
there to perceive. The position of a handle can have any value within a
range and you can see both the handle's representation on the screen and
the target. Neither is true in many real life cases. In medicine, there
are only so many drugs; inventing or manufacturing a new one for every new
case is out of the question. Moreover, you are not certain that the drug
will work for a particular individual because he/she may be insensitive or
allergic to it. In chess, only so many moves are possible in any position,
and the outcome of any one move is something of a gamble. In bridge, only
one of so many cards can be played, and you have even less information
about where to go/what to do. In poker, the choice is between playing a
card and forcing a showdown in a situation where lying (mis-information)
is explicitly allowed. PCT is not the theory for these uncertain and/or
one-out-of-a-few choices, but other theories are (more or less). On the
other hand, those other theories cannot express what PCT can. Conflicts
between theories? Sure. Is that a problem? No. Unless you pretend that PCT
is the one-and-only theory to describe _all_ behavior.

A major consequence of the unusability of the regression line as a
predictive model is that explanation and prediction become estranged
from each other.

I maintain that _every_ theory is a description only. An 'explanation' is
a description in terms of already known concepts. I see a theory as just a
concise summary of observations or experimental data. Some theories
describe gross features only, and therefore have large errors. A
regression line through a cloud of points is an example. Other theories
have a better signal-to-noise ratio. But there will always be _some_
noise: a summary is never the whole thing, the map is not the territory.
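As an illustration of "a concise summary with some noise" (Python, with
invented data): fit a regression line through a cloud of points and look
at what the two-number summary leaves out.

  import numpy as np

  rng = np.random.default_rng(0)
  # Invented data: a linear trend plus individual variation ("noise").
  x = rng.uniform(0, 10, size=50)
  y = 2.0 * x + 1.0 + rng.normal(scale=3.0, size=50)

  # The "theory": two numbers (slope, intercept) summarizing fifty points.
  slope, intercept = np.polyfit(x, y, deg=1)
  residuals = y - (slope * x + intercept)

  print("summary:", slope, intercept)
  print("residual spread (what the summary ignores):", residuals.std())
  # The fitted line captures the gross feature; the residual spread is the
  # part of the data the summary does not describe -- the map, not the
  # territory.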

Prediction becomes possible only if you have the additional knowledge that
the future will be like the past, that the relationships between variables
will remain unchanged over time. This is particularly untrue for systems
that learn (i.e. that do things differently at different times), such as
adaptive control systems and (most :-)) humans.

                                                 So in defense of
statistical theory, we have the anecdote which illustrates how the
theory works when it seems to work, accompanied by a blank lack of
interest in counterexamples.

Actually the opposite is true in science, as Popper has indicated: one
counterexample destroys a theory, whereas an infinity of positives cannot
make it true. Regrettably, if a theory has a noise term, the situation
becomes much more complex, because in such cases the theory says something
like "in X percent plus or minus Y percent of the cases Z is true". Test-
ing such propositions requires some education in statistics and may be too
costly -- see above.
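A rough sketch of why this gets costly (Python; normal approximation to
the binomial, percentages arbitrary): the number of independent cases
needed to pin "X percent of the time" down to within plus or minus Y
percent grows quickly as Y shrinks.

  import math

  def trials_needed(p, margin, z=1.96):
      # Cases needed to estimate a proportion p to within +/- margin,
      # at roughly 95 percent confidence (normal approximation).
      return math.ceil(z**2 * p * (1 - p) / margin**2)

  for margin in (0.10, 0.05, 0.01):
      print(margin, trials_needed(0.6, margin))
  # roughly 93, 369 and 9220 cases: halving the allowed error about
  # quadruples the number of cases, and in medicine every case costs
  # money, time, and sometimes lives.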

The next stage in the estrangement is complete divorce of
explanation from prediction. Once the custom of supporting a theory
by referring to positive instances of it is well-established, the
concept of predicting can quietly be dropped. Now what matters is
whether the explanation seems plausible, appealing, beautiful,
ingenious, or mathematically sound.

You seem to vacillate between the position that prediction is hardly
possible at all (as I understand some of your previous postings) and the
demand that a theory must have high predictive power. Have I lost you
here?

     Sometimes theory A works best, and sometimes theory B works
best. You use whichever one works.

Yes, that is the currently most prevalent view of what theories are.
Theories are _tools_, and just as we have only a limited set of tools in
our toolbox, of words in our language, of muscles in our body, and of
reflexes in our motor system, the task is to accept those constraints and
come to an "optimal" solution within them. Those limitations are what
makes modelling a twelve-degrees-of-freedom arm so difficult, yet they are
real, so we'd better take them into account.

The scientific revolution brought into being the concept of a nature
that has properties independent of the observer.

That is a pretty "noise-free" model in a great many cases. It breaks down
when the things to be observed become so small that they are disturbed by
the very tools we use to observe them. The opposite position, that what I
see is different from what you see, may be far more true, but it conflicts
with our inborn human urge to have others see the world as we see it.
Science tries to go up a level above this urge and build "objectivity",
i.e. it invents concepts that we can agree about. Sometimes with success
(the "simple" objects of physics), sometimes not (the "complex" subjects
of psychology).

By the way, the relationship between Euclidean and non-Euclidean geometry
is much more straightforward than you assume. As you know, a geometry is
based on a small set of axioms. Euclidean and non-Euclidean geometry have
_all_ axioms in common except for _one_: the parallel postulate. That is
all. Euclidean geometry assumes that through a point not on a given line
exactly one parallel can be drawn; the non-Euclidean geometries deny this
(in elliptic geometry any two lines intersect, in hyperbolic geometry
there are infinitely many such parallels). Such a small detail, and
inaccessible to observation to boot; yet such large consequences...

                                             This viewpoint has
some obvious problems, but along with it came the principle that
pure thought could not establish the existence of a property of
matter and energy. The only way to defend a proposition about nature
was to cast a theory in a form from which a prediction could be
made, and then do an experiment to test the prediction.

That is not quite true. Logic has given us a number of tools that allow us
to derive additional knowledge from given facts. If I know that a die is a
cube with six faces which is to land on a flat table top, I can predict
that it will land with one of its faces up. Performing the experiment --
under a cup and without looking at _which_ side comes up -- does not give
me additional information. If I have all the axioms of Euclidean geometry
and a procedure for linking axioms into valid theorems, I could derive the
whole infinity of theorems of Euclidean geometry -- if I had the time. If
I have all the words of a language and its syntax and semantics, I can say
anything that can be said. More generally, a finite set of tools allows
you to do an infinite number of things. Or, rephrased into the terminology
that you use above, a few givens (axioms; things that my senses indicate I
can believe in, without yet being able to "prove" them) allow pure thought
to establish the existence of an infinity of new concepts, including
"mass", "energy" and an infinity of equally valid theories.

Explanation and prediction have to be put back together. Any
explanation of a phenomenon should be followed immediately by a
prediction: if the independent variable is manipulated in a certain
way, the dependent variable should behave in a certain way. And if
it does not, the prediction should be accepted as wrong, and the
model behind it as wrong.

As to the immediacy of being able to predict, I remain in doubt. The
theory states which aspects should be observed; it does _not_ state what
can be neglected and what cannot. The experimental conditions matter. The
law of falling bodies -- the distance covered grows quadratically with
time -- was initially demonstrated by Galileo by letting a ball roll
slowly down a slightly inclined gully. Questions remaining were, of
course: Does the gully matter? Does the inclination matter? Does the color
of the ball matter? Does the size of the ball matter? An infinity of
questions. Genius (or luck?) is required to think of the question whether
the presence of air might matter.
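For reference, the regularity being demonstrated, as an idealized sketch
(Python; no air, no friction, a sliding rather than rolling body -- a
rolling ball is slower by a constant factor, which leaves the quadratic
law intact):

  import math

  g = 9.81                    # m/s^2
  angle = math.radians(5.0)   # a slightly inclined gully
  a = g * math.sin(angle)     # acceleration along the incline

  for t in (1.0, 2.0, 3.0, 4.0):
      print(t, 0.5 * a * t**2)
  # Distances go as 1 : 4 : 9 : 16 -- quadratic in time -- whatever the
  # inclination; the inclination only rescales the constant a.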

One weakness of PCT lies in its proposal of The Test as the tool to decide
as to which variable is controlled. The Test is impractical, since it
would need to test an infinity of possibilities, some plausible, others
less so. A task never truly finished...

The basic problem with such a law is that it is never the only one that
is operative. The experimental conditions must be chosen so that other
"laws" do not simultaneously exert their influence.

That is the source of another weakness of The Test in PCT: in general, we
simultaneously control a great many interacting variables. There is in PCT
no prescription as to how to create the right experiments that tease those
interactions apart so that just one controlled variable can be discovered.

I imagine that it is these two weaknesses that make some people say that
PCT is at most a meta-science, a good idea that is in search of a theory
("theory" in the explicit formula or "law" sense of the sciences).

                    The only standard that will, in the long run,
result in knowledge worth knowing is that our predictions should be
accurate within the limits of measurement of the variables. If we
can detect any error at all between our predictions and what
actually happens, we should take the error as information for
improving the model.

This does not help. Both relativity theory and quantum theory are accurate
within the limits of measurement of the variables. Yet _logically_ they
are in conflict (for one thing, the one quantizes everything, while the
other works with continuous variables, zeroes and infinities --
"singularities"). Hence the search for "grand unifying" theories.

                                    If the variables we measure
are simple enough, and our predictions are simple enough, we can
achieve the degree of predictivity that physics demonstrates.

Remember that physics is about inanimate matter. As soon as learning,
habituation, adaptation or what have you play a role, accurate modelling
may become impossible because of the immensity of the task to establish,
for a particular individual, that _all possible past and present
variables_ have been taken into account. Physics depends on the
reproducibility of experiments, on objects that do not change over time,
and on classes of identical objects. Organisms are both genetically
(nature) and environmentally (nurture) unique. Do you really believe that
this uniqueness is
so inconsequential that the methods of physics apply?

So we should go to kindergarten and learn how to make simple
predictions that are always borne out by observation. There is no
short-cut.

Philosophers have always seen it as their task to come up with a number of
statements or explanations that _everyone_ could agree with. This has
proved to be an impossible task, even if only truthful people "in their
right mind" (these value judgments form a weakness, of course) are allowed
to participate in the debate: different people have different opinions.
Even Bishop "if I don't perceive you, you don't exist" Berkeley, whose
position is quite unpopular (probably for personal reasons, because people
don't like the idea that their existence depends upon someone else's
perceptions), cannot be rejected because his point of view is logically
unassailable.

A science can be built only on predictions that are essentially
always borne out by observation. By learning to make simple
predictions that are always right according to crude measures, we
can learn to make more complex predictions that are always right
according to more discriminative measures.

Such a breakdown (reductionism) fails as soon as gambling plays a role.
Your first ticket in the sweepstakes doesn't pay off, nor your second or
third or hundredth. Your "complex prediction" would logically be that no
ticket will ever pay off. Philosophers-logicians will tell you that
induction is not an infallible tool. By insisting on certainty you focus
your attention, I think, on such a small segment of reality as to become
uninteresting. You gambled when you chose a job: would it be the most
satisfying one? How could you know? You gambled when you chose a partner,
decided to have children, picked a place to live, and what not. Or were
you really driven by an unfailing sense of certainty in all those cases?
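The sweepstakes arithmetic, as a sketch (Python; the winning probability
is invented): a long string of losses is exactly what one should expect,
so the inductive conclusion "no ticket will ever pay off" does not follow.

  p_win = 0.001   # invented probability that a single ticket pays off

  for n in (1, 100, 1000, 5000):
      print(n, (1 - p_win) ** n)
  # 1000 straight losses still has a probability of about 0.37 -- entirely
  # unremarkable -- even though every single observation was a failure.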

                                 We still honor the discoverer
and formalizer of the fact that the elongation of a spring is
proportional to the applied force: Hooke's Law.

Hooke's Law is a _tool_, a useful approximation that works under certain
conditions. It is not A Truth.

The gross mistake made by conventional behavioral scientists was to
think that any piece of knowledge, however uncertain, is better than
no knowledge at all. It is not; it is far worse than total
ignorance.

This is where gamblers of all types will disagree with you. Any tiny piece
of knowledge that someone else does not have improves your odds. What is
your favorite game? Chess more likely than bridge or even poker...

          If we know we are ignorant, we will keep trying to better
our theories and methods. But if we think that in discerning a faint
trace of regularity we have reached the goal of understanding living
systems, we will settle for less than a science, declare our methods
to be good enough, and think we have become wise.

Wisdom is, I think, having a good set of tools. If you have only a hammer,
the whole world looks like a nail. Even with good tools some problems
cannot be solved well. But maybe just good enough...

I've been chided for telling people that they have wasted a life's
work. But I never tell anyone that. I merely present my analysis of
methods and knowledge as I am doing here. If I have made my case
clearly, then people will be able to draw their own conclusions
about their own life's work, both what it has been and what it will
be in the future.

I admire your beautiful hammer. I just wish that you could see the value
of some other people's tools as well, even though you are unfamiliar with
their use. I have no idea how realistic such a wish is...

Greetings,

Hans