[From Bill Powers (2009.01.03.0858 MST)]

Martin Taylor(various) --

I've been looking over the two references to Bayesian logic that you posted, and the conundrum that Mike Acree posted: what is the probability that John is the Pope given that John is Catholic, and what is the probability that John is Catholic given that John is the Pope? I said that one Catholic in 1000 is the Pope, and 1 person in 10 is Catholic, and whaddaya know, Bayes' theorem works:

p(P)p(C|P) = p(C)p(P|C)

(0.0001)(1) = (0.1)(0.001)

C: John is Catholic; P: John is Pope; p: probability
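As a quick numerical check, the identity can be verified with the made-up numbers above (a sketch in Python; both sides come out to 0.0001):

```python
# Toy numbers from the text: 1 person in 10 is Catholic, 1 Catholic in
# 1000 is the Pope, and every Pope is Catholic.
p_C = 0.1            # p(John is Catholic)
p_P_given_C = 0.001  # p(Pope | Catholic)
p_C_given_P = 1.0    # p(Catholic | Pope)

# p(P) over the whole population: only Catholics contribute,
# so p(P) = p(C) * p(P|C).
p_P = p_C * p_P_given_C

# Bayes' theorem: p(P) p(C|P) = p(C) p(P|C)
assert abs(p_P * p_C_given_P - p_C * p_P_given_C) < 1e-15
```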

But there's something funny about this, or my internal translation of it. It has something to do with writing p(P). The calculation implies that after we choose any person from the total population, there is a certain probability that this person is the Pope. But that probability actually applies only within the subset of Catholics; outside that subset, the probability that a randomly chosen person will be the Pope is zero. p(P) is defined only within the set of Catholics -- there isn't really a uniform probability within the total population. We have a value for p(P|C), but the value of p(P|not-C) is always zero. That doesn't seem to be explicit in the notation, nor would it seem to be necessarily true in all cases.

I guess this is just an echo of the "disorder" discussion. I think of probabilities in a more Bayesian way, perhaps, because I always think in terms of "the number or proportion of people who..." rather than "the chance that any given person will ...". We are always talking about subpopulations, it seems to me: some might do it, some don't and never would. No non-Catholic ever will be Pope, even though the population statistics for N people say there is one chance in N (or 2 chances, if we include Eastern Orthodox) for every person. It's not that everyone has an equal tendency or risk, as implied by the probability; it's that only a certain subset has any probability at all of showing the effect. The probability of being Pope is measured relative to the size of the subset, not as the "strength" of a "tendency" in the whole population. The choosing of a person from the whole population is random, but the condition itself is not.
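The point that all of p(P) is carried by the Catholic subset can be written out as the law of total probability; a minimal sketch with the same toy numbers:

```python
# Same toy numbers; the conditional probability outside the subset is zero.
p_C = 0.1
p_P_given_C = 0.001
p_P_given_notC = 0.0   # no non-Catholic is ever Pope

# Law of total probability: the population-level p(P) comes entirely
# from the Catholic subset; the rest of the population contributes nothing.
p_P = p_P_given_C * p_C + p_P_given_notC * (1.0 - p_C)
assert p_P == p_P_given_C * p_C
```

The population-wide "one chance in N for every person" is just this sum averaged over people who in fact have zero chance and people who have a nonzero one.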

Maybe that's what you've been trying to say.

In a somewhat different context:

You can have a perfectly regular relationship between two variables, y = f(x), yet when you observe y as a function of x, it appears to have a random component. The random component is not a property of the functional relationship f, but of the method of observation. It's possible that the observation itself disturbs the observed variable in some unknown and therefore random-looking way, as per Heisenberg, but it's at least as likely that the observation is generated in a way that is only partly dependent on the actual states of the variables -- that we are observing the wrong variable. Like trying to see the temporal relationship between the way the violins play and the gestures of the conductor's right elbow.
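A toy simulation (all numbers hypothetical, nothing from the original) makes the point: a perfectly deterministic y = f(x), observed through a proxy that only partly depends on y, shows an apparent random component that belongs to the observation, not to f:

```python
import random

random.seed(0)

# A perfectly regular relationship: y = f(x), no randomness anywhere in f.
def f(x):
    return 2.0 * x + 1.0

xs = [i / 100.0 for i in range(200)]
ys = [f(x) for x in xs]

# But the observation depends only partly on y; the rest tracks an
# unrelated hidden variable h -- we are measuring the wrong variable.
def observe(y):
    h = random.gauss(0.0, 1.0)
    return 0.7 * y + 0.3 * h

obs = [observe(y) for y in ys]

# The scatter of the observations around 0.7*y is the "random component";
# it is a property of the method of observation, not of f.
residuals = [o - 0.7 * y for o, y in zip(obs, ys)]
assert max(residuals) - min(residuals) > 0.0
```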

When I was learning practical electronics, one of the first things we were taught was that measuring a voltage with an ordinary 20000-ohm-per-volt voltmeter drew enough current from a circuit to alter the voltage being measured. The thing we were taught next was how to compensate for that error, by figuring out the impedance of the measured circuit and using theory to calculate what the correct reading was. I always wondered why we couldn't do that for position and momentum measurements, but my physics professors would look offended and say that the uncertainty was in the position and momentum, not in the measurements. I never believed them.
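The compensation described here amounts to inverting a voltage divider. A sketch with assumed values (a 20,000-ohm-per-volt meter on its 10 V range, and a hypothetical 50 kilohm source impedance):

```python
# Assumed values: a 20,000-ohm-per-volt meter on its 10 V range presents
# 200 kilohms to the circuit; the circuit's Thevenin impedance is 50 kilohms.
R_meter = 20_000 * 10      # ohms (ohms-per-volt times full-scale volts)
R_source = 50_000          # ohms (hypothetical)
V_true = 10.0              # volts: the unloaded voltage we want to know

# Connecting the meter forms a divider, so the reading comes out low:
V_reading = V_true * R_meter / (R_meter + R_source)
assert abs(V_reading - 8.0) < 1e-9   # the meter reads 8 V, not 10 V

# Knowing the circuit impedance, invert the divider to recover the truth:
V_corrected = V_reading * (R_meter + R_source) / R_meter
assert abs(V_corrected - V_true) < 1e-9
```

The correction works because the disturbance caused by the measurement is lawful and calculable, which is exactly the contrast being drawn with the Heisenberg case.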

What's on my mind now is the idea that perhaps there are two completely different contexts in which to apply probability ideas. One is the context of qualitative events: things that either happen or don't happen (which can be seen as logical propositions that are either true or not true). The other is the context of continuous quantitative relationships, where we measure things rather than counting them.

When we measure things, probability does not apply to the act of measuring, but to the degree of uncertainty in the measurement. A voltmeter gives a reading of 10.0 volts, plus or minus 0.1. The probability enters with the 0.1, not the 10.0. But when we predict events, such as rain tomorrow, the 0.1 is all there is: either there is detectable rain tomorrow, or there is not. The laws of probability make sense in the latter qualitative case, but are relatively unimportant in the former quantitative case. There are, of course, gray areas where both are somewhat important: Yes it did rain, but you said there would be a downpour and all we got was a tenth of an inch.

This applies to perception. In some cases, qualitative probability enters: was that a dog I caught a glimpse of, or a coyote? But in ordinary everyday life, most perceptions are seen clearly and for long enough to leave only a small bit of uncertainty about the amount, and practically none about the identity or logical state. I don't think I have ever mistaken my glasses for a pencil and picked up the wrong thing. I drive a car without wandering from edge to edge of the road, and stop when the light is red but never when it's green. To measure uncertainty in perception, it seems to me, you have to set up pretty special conditions, with very low signal levels and masking noise or other distractions to make it difficult to see, hear, or what have you. Most of the time in real life, the world of perception is stable and repeatable, with uncertainty in the second to third decimal place, not in the first to second.

I think this pretty much wraps up my reluctance to think of information theory and uncertainty as being relevant to the way most of our control processes work. It's not that I think they don't apply, but that I don't think they can add much to the kinds of measurements we make in, for example, the tracking demo.

I can't say for sure, but I don't think that perceptual signals in the brain are any noisier than our experiences of them. When I say we may not be measuring them right, I mean that there may be parallel signals involved in what we experience as "a" signal, or that there are smoothing processes involved in the control loop that we don't see when we look at single-fiber activities, or that the signal levels are higher in most ordinary experiences than those used in neurological experiments, or that while the signal frequencies may be unpredictable, they change in smooth ways that show some regular analog process must be involved. I still say that in the records of spike trains that I have seen in many places, I see very little randomness in the way the frequencies seem to change. I really don't see the uncertainties that people write about. You just don't see those pulse train frequencies jumping randomly from one frequency to another: they speed up or slow down. About the only place where random-looking signals come up is in cases such as electromyograms, where we're really seeing the effects of many parallel signals that are not synchronized, so on very fast time scales we see what looks like noise. But the individual signals in single axons show smoothly changing frequencies, as far as I have ever seen.
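The electromyogram point can be illustrated with a toy simulation (all numbers hypothetical): each of many parallel trains has a smoothly varying rate, but their unsynchronized superposition, sampled on a fast time scale, jumps around like noise:

```python
import math
import random

random.seed(1)

# Fifty parallel "axons": each has a smoothly modulated firing rate,
# identical except for a random (unsynchronized) phase.
def rate(t, phase):
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * t + phase)

phases = [random.uniform(0.0, 2.0 * math.pi) for _ in range(50)]
ts = [i / 1000.0 for i in range(1000)]

# A single fiber's rate changes smoothly: adjacent samples differ very little.
one = [rate(t, phases[0]) for t in ts]
max_step_single = max(abs(b - a) for a, b in zip(one, one[1:]))

# The pooled spike count per bin (a crude Bernoulli model of each fiber)
# jumps by whole spikes from bin to bin -- it looks like noise.
pooled = [sum(1 for ph in phases if random.random() < 0.01 * rate(t, ph))
          for t in ts]
max_step_pooled = max(abs(b - a) for a, b in zip(pooled, pooled[1:]))

assert max_step_single < 0.01   # smooth individual signal
assert max_step_pooled >= 1     # noisy-looking integer jumps in the sum
```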

The main uncertainty here, as far as I'm concerned, is what people are talking about when they say neural signals have a lot of uncertainty in them. We really need some data.

Best,

Bill P.
