Judge Ito; demonstrating reinforcement

[From Bill Powers (950609.1600 MDT)]

Bruce Abbott (950609.1145 EST) --
Rick Marken (950608.2130) --

The Judge Ito metaphor is appropriate to the argument between you and
Rick. In this system of determining truth, each side takes a position
and then looks selectively for all possible evidence to support it,
while also seeking to hide or suppress all evidence against it.

This is not, of course, how science is supposed to work, although it
often does work this way.

Probably the least important "scientific" argument of all goes like
this:

"Thousands of brilliant people have believed in my position, including
Nobel Prize winners and other people of impeccable scientific
credentials. To say that this position is wrong is to say that all these
people have been wrong, all of this time. That is so unlikely that we
have to assume it can't be true. Do you really think that you can find
some flaw that these skilled scientists have not already considered and
dealt with? Are you saying you are right, and all these thousands and
thousands of scientists are wrong?"

The problem with this argument is twofold.

First, thousands and thousands of scientists are even more likely to be
wrong than a single scientist, because most of them are simply following
the leaders, adopting their interpretations cookbook style and repeating
the catchy phrases they make up. If they copy the experiments, they also
copy the interpretations. They are also likely to be anywhere from a
month to a decade out of date with respect to what the leaders are now
saying. Furthermore, only a tiny handful of scientists will actually
have done the original experiments and the original interpretation of
the data; all that the rest of them know about is whatever conclusions
were published. So if the original handful was in error in some way, the
main result of spreading their ideas around is to amplify the error. The
likelihood of discovering the error decreases as the number of
scientists who believe there can't be any error increases.

Second, this argument says absolutely nothing about the substance of
whatever position is being taken: the validity of the data, the validity
of the interpretation, or the validity of the application to any real
case.

This is, of course, the principle of the Expert Witness. The more expert
witnesses you can call and the more impressive their credentials, the
more the jury is presumably impressed with the correctness of your
position. Unfortunately, the other side counters with an equal or greater
number of expert witnesses, with equal or better credentials, testifying
in support of exactly the opposite position. While this should tell us
something about expert witnesses, apparently it does not.

Scientific beliefs are determined by a vote. Scientific truths are not.

-----------------------------------------------------------------------
Rick tells me, Bruce, although the post hasn't reached here, that you
have referred back to the E. coli programs to show a successful
application of the principle of reinforcement. I rather suspected you
would, and hoped you wouldn't.

The problem with that demonstration is that the basic model had already
been developed, so most of the explanation of the observed behavior had
been worked out. All that was left was for you to fill in a mechanism
whereby discriminative stimuli and reinforcements would lead to the
right answer, which was already known: the organism should tumble sooner
when going down the gradient, and later when going up it. The role of
the period of straight-line swimming was also already built into the
model.

Since that was already known, you could call any variable you liked a
discriminative stimulus, and say that any change of conditions you liked
was a reinforcer. The only constraint was that the outcome must be a
shortening of the delay before a tumble when going the wrong way, and so
forth. With this amount of freedom, almost any model at all could be
made to fit the behavior. Since you knew that when swimming the wrong
way the probability of a tumble per unit time must increase, you knew
that whatever reinforcer there was, it should increase that probability.
Then all you had to do was see what conditions held among the other
variables when swimming went the wrong way, and assign them
"reinforcing" or "punishing" properties as needed. Those properties
included the ability to change a probability of a tumble, but how that
was accomplished was not modeled. The effect was just calculated.

In our model of E. coli, there was no assumed component that required us
to postulate that control exists. There were no nonphysical mechanisms
proposed. The OUTCOME of the model's organization was that it produced
control, but the capacity to control was not among the premises. This is
the difference between your approach and ours: yours required building
in the very phenomenon whose existence you are trying to demonstrate.
---------------------------------
I had a great deal of trouble understanding your E. coli model. I now
realize that it was because I was looking for the origin of the
reinforcing effect or the discriminative effect, and couldn't find it --
yet the model seemed to work. So subtly was the question begged that the
point where it was begged slid right past me unnoticed.

If we forget about labels such as reinforcer and discriminative
stimulus, what we are left with in your model is a perceptual system
that detects a logical condition involving two variables, and a
mechanism involving continuous calculation of probability density as the
means for altering the delay between tumbles, along with a link that can
change the probability density. Of the four possible logical conditions,
two have an effect in the wrong direction, but are outweighed by the
other two. If we eliminate the inappropriate connections we are reduced
to a simple on-off perceptual system, and can combine the two separate
mechanisms for altering the delay between tumbles into one simpler one.
After substituting for your magical connection a more physiological
mechanism for changing the delay between tumbles, we are left with a
simple control system of the kind that Rick and I had proposed.
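
To make the reduced model concrete, here is a minimal sketch in Python of
an on-off system of that kind. It is not the original E. coli program:
the nutrient field, the step size, the two delay values, and every name
in it are assumptions chosen only for illustration. The perceptual
function reports a single condition (is the concentration rising or
not?), and that condition selects the delay before the next tumble.
Control is not written into the code as a premise; it can only show up
as an outcome of running it.

```python
import math
import random


def concentration(x, y):
    # Hypothetical nutrient field: highest at the origin, falling off
    # linearly with distance. Any single-peaked field would do.
    return -math.hypot(x, y)


def final_distance(steps=20000, seed=1, short_delay=1, long_delay=20):
    """Swim for `steps` unit steps and return the final distance
    from the peak of the nutrient field (the origin)."""
    rng = random.Random(seed)
    x, y = 100.0, 0.0                       # start well away from the peak
    heading = rng.uniform(0.0, 2.0 * math.pi)
    since_tumble = 0
    prev_c = concentration(x, y)

    for _ in range(steps):
        # Straight-line swimming along the current heading.
        x += math.cos(heading)
        y += math.sin(heading)

        # On-off perception: is the concentration rising or not?
        c = concentration(x, y)
        going_up = c > prev_c
        prev_c = c

        # The perceived condition sets the delay before the next tumble:
        # short when going down the gradient, long when going up it.
        since_tumble += 1
        delay = long_delay if going_up else short_delay
        if since_tumble >= delay:
            # A tumble picks a new heading at random; it is not aimed.
            heading = rng.uniform(0.0, 2.0 * math.pi)
            since_tumble = 0

    return math.hypot(x, y)


if __name__ == "__main__":
    print("final distance from peak:", final_distance())
```

Note that nothing in the loop steers toward the peak, and no tumble
probability is adjusted by an outside agency. The only organized link is
the one from the perceived direction of change to the delay, and the
approach to the peak follows from that.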

My conclusion is that your E. coli model did not demonstrate that
reinforcement is a real phenomenon; it simply assumed that it was.

I hope that in the case of the simple data tables I posted earlier, your
argument in favor of the reinforcement interpretation will not assume
that this interpretation is correct as a premise of the argument.
----------------------------------------------------------------------
Best,

Bill P.