Linguistics facts

[From Bill Powers (930315.1300)]

Bruce Nevin (930311.1520) --

What's underneath our dispute here is a basic difference of
interests.

You're an expert on how people usually or normally use language;
I shouldn't even try to argue on that level because it's too easy
for you to show that I'm ignorant. I'm interested in building a
model of how the brain works, filling it in with as many facts
from observed behavior as possible. You would think that I would
welcome getting information about how people actually behave, but
from my standpoint it's not easy to get the kind of information I
need. I can't build a brain model that leads to 80% of the people
doing one thing and 20% doing something quite different. A model
does only one thing: what its design says it must do. This means
that the model can only make one prediction at a time. If it
predicts wrong an unpredictable 20% of the time, then it's just
wrong.

The kind of data I need concern what people ALWAYS do. I don't
think this kind of data is common in linguistics or any other
conventional science of life.

When I say "always," I make reasonable allowances. But I do mean
that exceptions should be very rare. I also mean that I expect
the author of the data to have tried to discount it before
publishing it -- why should I have to try to think up ways it
could have been contaminated? Good data has already survived that
process in the hands of the person who obtained it, the person
most likely to think of things that are wrong with it.

Your report on Labov's data is, on the surface, the sort I like
to get.

In the lower-class department store, he uniformly got one
pronunciation, no exceptions; in the upper-middle class
department store he uniformly got the other pronunciation, no
exceptions.

But the more I thought about that, the more questions I had.
Perhaps Labov answered them and you just didn't get into that
much detail.

For example, it seems to be an astonishing run of luck that
NOBODY encountered in the "lower-class" department store was from
out of town. Were there no commuters from New Jersey or the
suburbs? From Brooklyn, or Queens, or White Plains, or Long
Guyland? Were these people interviewed afterward to see what they
actually had in common other than their pronunciation and an
assumed (by association) membership in the "lower class?"

Another problem is in the upper-middle class store. Did Labov
make sure that the salespeople he interviewed had not been
systematically selected for speaking in a certain way, and
specifically not in another? An interview with the personnel
manager might have revealed that manner of speech was an
important hiring criterion, so that Vassar graduates who spoke
with the wrong accent would be kindly encouraged to seek
employment elsewhere, or in a back room where they didn't meet
the public -- or Labov. And followup interviews with the
interviewees might have shown that some of them also worked in
lower-class stores where they spoke differently, or that they
lived in the same place where employees of the lower-class stores
lived, or that they had upper-class accents for totally
unexpected reasons.

An even more disturbing aspect of the data is the fact that Labov
knew, when he applied his fine-tuned linguist's ear to his memory
of the interview, where he was. He knew he was in what he
categorized as a lower-class store, or an upper-middle class
store. He could see the face and the clothing and the body
language of the respondent. And he did his own evaluations.

In a case like this, where a highly subjective judgment is to be
made, a common procedure is to have one person record the
interviews and to have the expert do the evaluations in a
different setting, without knowing where the recordings were
obtained and without being influenced by the appearance of the
interviewee. This eliminates at least any tendency to hear what
one expects to hear on the basis of the surroundings and with
knowledge of the theses one is trying to prove. It would be even
better to rely on an expert who does not know or care whether the
expectation is for the "class" of the store to have an effect on
pronunciation or to have no effect.

I don't know what precautions Labov took or how deeply he looked
into the origins and other characteristics of his interviewees. I
don't know how much skeptical investigation he did, looking for
selection factors that would change the meaning of the data. I
don't know what he expected to find when he did this experiment.
I don't know what he or other linguists would have made of the
pronunciations if they had encountered a clerk from one store
temporarily transplanted to the other store as a test for the
effects of expectations.

If Labov discusses all these things and shows how he circumvented
the implied problems, then I would definitely consider his data
to be of interest. Otherwise it's just another guy trying to
demonstrate that he's right -- or it might be. I wouldn't trust
the data until I knew more about how it was obtained.

When I get into arguments with psychologists, I'm often bombarded
with facts. "How do you explain ...?" they ask and cite some
finding about human behavior. Psychologists are loaded with
facts, which they have put together in their heads into some sort
of coherent picture of human behavior (different pictures, of
course, depending on the school). I get the same thing from
linguists and practically everyone else associated with human
behavior.

If I had the authority, I would ask all these people to gather up
all the facts they have accumulated, trace them back to their
sources, and put the papers describing the studies into three
piles:

Pile 1: Studies in which, of the total tested population, at
least 20% of the subjects failed to show the main effect or
showed an opposite effect.

Pile 2: Studies in which between 5% and 20% of the subjects
failed to behave according to hypothesis.

Pile 3: Studies in which 95% or more of the subjects behaved
according to hypothesis.

I would pick my candidates for useful facts from Pile 3. The
studies from Pile 2 I would send back to their authors with a
request to try to find out why the deviant persons didn't behave
as the rest did or to change their hypothesis, and to do another
study to try to make it into Pile 3. Pile 1 would go in the
wastebasket.
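The three-pile sorting rule can be written out mechanically. Here is a
minimal sketch in Python, assuming each study reports the fraction of
subjects who failed to show the hypothesized effect; the function name
and sample figures are illustrative, not from the text:

```python
def assign_pile(deviant_fraction):
    """Sort a study by its fraction of non-conforming subjects."""
    if deviant_fraction >= 0.20:
        return 1  # at least 20% deviant: wastebasket
    elif deviant_fraction > 0.05:
        return 2  # between 5% and 20% deviant: send back to the author
    else:
        return 3  # 95% or more behaved as hypothesized: candidate fact

# Hypothetical studies with their reported deviant fractions
studies = {"study_A": 0.25, "study_B": 0.10, "study_C": 0.03}
piles = {name: assign_pile(f) for name, f in studies.items()}
print(piles)  # {'study_A': 1, 'study_B': 2, 'study_C': 3}
```

The boundary cases follow the wording above: exactly 20% deviant falls
in Pile 1 ("at least 20%"), and exactly 5% deviant falls in Pile 3
("95% or more").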

This would leave me with a small number of candidates for real
facts. Those, if interesting, would be close enough to facts to
justify putting in some labor trying to get rid of that last 5%
of the deviants, probably by redesigning the experiment and the
hypothesis and repeating the study a number of times. When the
number of deviants was down to, say, 1% (or whatever the level
where I got tired of being a good scientist), I would put the
result in Pile 4: facts usable in a scientific model of behavior.

Only then would I put any serious effort into trying to find a
model that could reproduce the reported behavior. Why go to the
great labor of developing a model that will accurately predict
the average behavior of a population in which you KNOW that 5% or
20% or 50% of the predictions made by this model will be wrong? I
could build a model right now that would fit the prediction that
"Mothers hold their babies on the left." All I would have to do
would be to program it so it would never use its right arm to
hold a baby. Then the model would perfectly fit the observed
"fact" as remembered by the psychologist. Unfortunately, it would
not fit the behavior of 20% or so of the mothers.

When you tell me linguistic facts, it would be helpful to my
attitude if I knew which pile they were in.

Best,

Bill P.

[Martin Taylor 930315 19:15]
(Bill Powers 930315.1300)

Here we go again ;-)

> I can't build a brain model that leads to 80% of the people
> doing one thing and 20% doing something quite different. A model
> does only one thing: what its design says it must do. This means
> that the model can only make one prediction at a time. If it
> predicts wrong an unpredictable 20% of the time, then it's just
> wrong.

I don't understand this. You've said similar things many times. But
you have also talked about reorganization of the hierarchy as being
necessarily random, continuing until the hierarchy satisfactorily
controls in the conditions to which it is exposed. Why should you
assume that all people have reorganized the same way, or that your
model is wrong because it has one set of connections while not all of
your experimental subjects share the same pattern? Why should you even
assume that any of your subjects have completed all reorganization and
will control any particular variable? Some will, some won't, I would
guess.

> The kind of data I need concern what people ALWAYS do. I don't
> think this kind of data is common in linguistics or any other
> conventional science of life.

There isn't much that anyone ALWAYS does, let alone that everyone does,
if for no other reason than that what people choose to control varies
from moment to moment (I AM aware of the sloppy use of the word
"choose"; it will suffice). Add to that the probable variation among
people in how they control, when they do control what an observer might
take to be the same thing, and I would think it no discredit to even a
very good model if it failed to predict some things 20% of the time.

> When I say "always," I make reasonable allowances. But I do mean
> that exceptions should be very rare.

"Reasonable," like so much else, is in the eye of the beholder. You
perceive yourself as reasonable. I propose that your excessive stringency
is due to the spectacular success you have had in your low-level modelling.
You say you have never tried to fit a four-level hierarchy to data, yet you
theorize (wonderfully) about an 11-level one. I know you apologize sometimes
for doing so, and say it isn't science. But I venture to guess that in
the more complex hierarchies, whose form is not dictated by the natural
circumstances that face all humans most of their lives, you will never
achieve the precision you seek, even though the models may be exactly
correct.

There are two issues: is the model correct? And do I have the right
parameters for this particular event I am modelling? I think it will
not, even in principle, be possible to get the parameters correct for
any reasonably complex hierarchic model.

Just to head off a possible misdirected counter: I am not saying that
you cannot expect precise prediction of control from a high-level ECS
that forms an element of a complex hierarchy (if such things really do
exist). I am arguing that you will never be able to get precise
prediction from a complex hierarchy as a whole.

Martin