Subjective and Objective probabilities (was Statistics: what is it about?)

[Martin Taylor 2009.01.14.15.38]

We have been talking around each other in a non-communicative fashion about subjective and objective probabilities, so I'm trying a different tack, to ask what is meant by "probability" in the first place.

First, what is it not. It is not an observation (a set of data). Observations are perceptions, and as such are assured. They have happened, and no probability can be assigned to them. They are the basic data. From the observations (including all ancillary available evidence) we must discover whatever probability interests us.

Probability necessarily relates to future possible observations. If a probability is assigned, it is always assigned to an observation not yet made. Some possible values of a possible future observation are more likely than others. Where does this "more likely" comparison exist? Does it exist in the world that is observed? Does it exist in the entity that will do the observing? Where? The probability exists, but the observation does not, as yet. So where is the probability to be found? My answer is that it is in a mind. It is "subjective".

What is "objective" probability? Is it the "true" likelihood of the future observation? It can't be in the already taken observations, because they are fixed and their values are not in question. So where is it? To what can it be applied? If it is factually true, but unknown to anyone, that there are a million swans in the world (of which you have seen 10), and in all the world there are 16 black swans, all in Togoland, what does it mean to ask what the probability is that the next swan you see will be white? Is it (1,000,000 - 16)/1,000,000? Or is it 1, because all the black swans are where you won't ever see them? Or is it much less, because your observations have encountered only 10 swans? What is the probability that a random swan of all swans in the world is white? How could you know? Surely there must be ONE "objective probability" for this -- or must there?
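A rough numerical sketch of why the swan question has no single obvious answer. The function names are invented for illustration. The third figure rests on two assumptions that are not in the post: that the 10 swans seen so far were all white, and that Laplace's rule of succession (a uniform prior over the unknown proportion of white swans) is an acceptable model. A different model would give a different number, which is rather the point.

```python
from fractions import Fraction

TOTAL_SWANS = 1_000_000  # from the post: true, but unknown to anyone
BLACK_SWANS = 16         # from the post: all of them in Togoland
SEEN_SO_FAR = 10         # from the post; ASSUMED here to have all been white


def world_proportion() -> Fraction:
    """Proportion of white swans among all swans in the world."""
    return Fraction(TOTAL_SWANS - BLACK_SWANS, TOTAL_SWANS)


def reachable_proportion() -> Fraction:
    """Proportion of white swans among those you could ever encounter,
    if every black swan is somewhere you will never go."""
    return Fraction(1)


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's rule of succession: (s + 1) / (n + 2).
    ASSUMPTION: uniform prior over the unknown proportion."""
    return Fraction(successes + 1, trials + 2)


candidates = {
    "all swans in the world": world_proportion(),
    "only the swans you can reach": reachable_proportion(),
    "rule of succession from 10 sightings": rule_of_succession(SEEN_SO_FAR, SEEN_SO_FAR),
}

for label, p in candidates.items():
    print(f"{label}: {p} (about {float(p):.6f})")
```

Each figure follows from a different model of how the "next swan" is selected, and none of them is read off the observations alone.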

To what can "objective probability" be applied? Can it be to the probability that a fair die will come up "4"? Surely not. To determine the value 1/6 (which we presumably agree is the "correct" value) for that depends on a mental model of what it means to have a fair die, and mental models are subjective. Oh, that's no problem. The die is real and is really fair, really and truly, out there in the "objective" world. So that's an "objective probability". Or is it? Why do we think it is? Is it because we each postulate the same mechanisms, and that mechanism (model) results in the value 1/6? If so, how is it an "objective" (true in the real world that we assume to exist) probability?

I heard a noise. It sounded like a bump or a muffled bang. Is someone knocking at the door? Did something fall? Is someone working on the roof next door? I think it was more probably someone knocking at the door than either of the others. "I think", but what is the true "objective" probability of each possibility? How would I find out? Am I perceiving a probability? If not, just what is it that feels as though I am perceiving a probability comparison? My feeling about the probability I think I perceive changes when I go to the door and find nobody there. What "objective probability" changed? My perception is that my probabilities for the three possibilities changed, but that's not objective, is it? What is the objective probability that someone was there but went away before I got to the door?
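One way to make the change of "my probabilities" concrete is a Bayesian update over the three hypotheses. This is only a sketch: every number below is invented for illustration, and the helper name bayes_update is hypothetical. The 0.1 for "knock" stands for the chance that someone was there but went away before I got to the door.

```python
def bayes_update(prior: dict, likelihood: dict) -> dict:
    """Return the posterior over hypotheses given a prior and the
    likelihood of the new observation under each hypothesis."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}


# ASSUMED subjective probabilities right after hearing the noise.
prior = {"knock": 0.5, "fall": 0.3, "roof": 0.2}

# ASSUMED probability of finding nobody at the door under each hypothesis.
likelihood_nobody_at_door = {"knock": 0.1, "fall": 1.0, "roof": 1.0}

posterior = bayes_update(prior, likelihood_nobody_at_door)

for hypothesis in prior:
    print(f"{hypothesis}: {prior[hypothesis]:.2f} -> {posterior[hypothesis]:.2f}")
```

Nothing in the world changed between the two columns of numbers. What changed is the evidence available to the person holding the model.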

The concept of "objective probability" seems fraught with so many problems, starting with the fact that it is always about something in the future, and cannot ever be measured or observed, that I am bewildered by the apparent fact that other people believe it to be a useful scientific concept.

Martin

(Gavin Ritz 2008.01.15.11.19NZT)
[Martin Taylor 2009.01.14.15.38]

Martin

That is a very erudite response to the question of probability.

I do not know the answer to your questions but I can say that all of human
scientific endeavour is a projection of mind. No concept ever created by
man can in fact be proved to be based in any form of reality
(whatever that may mean).

Modern science is a mental projection. If this is not so, then show me how
I can measure the concept of energy directly with a measuring instrument.
The answer is you can't. It's an abstraction.

Is there really such a thing as H2O?

If I take Elliott Jaques's Requisite Organization theory then probability
resides at a level of abstraction (related to a time forward quantum) where
variables are bi-conditional (if-and-only-if) and in parallel and
interlinked. This means that it is nigh impossible to actually determine a
specific outcome. In fact the whole of human judgement and decision making
is ineffable.

If I look at PCT and its top level hierarchy of "systems concepts" this is
really saying the same thing. PCT is nested in itself as a systems concept.
The fact that behaviour is the control of perception really highlights this
notion.

All high level concepts like QM or PCT or Geometrodynamics are circular in
their notions. All are actually systems concepts, which makes them
probabilistic in nature.

And that makes it all the harder even to discuss the issue.

Regards
Gavin

[From Bill Powers (2009.01.14.1547 MST)]

Martin Taylor 2009.01.14.15.38 –

We have been talking around each
other in a non-communicative fashion about subjective and objective
probabilities, so I’m trying a different tack, to ask what is meant by
“probability” in the first place.

An excellent idea and a most eloquent post.

First, what is it not. It is not
an observation (a set of data). Observations are perceptions, and as such
are assured. They have happened, and no probability can be assigned to
them. They are the basic data. From the observations (including all
ancillary available evidence) we must discover whatever probability
interests us.

Probability necessarily relates
to future possible observations. If a probability is assigned, it is
always assigned to an observation not yet made. Some possible values of a
possible future observation are more likely than others. Where does this
“more likely” comparison exist? Does it exist in the world that
is observed? Does it exist in the entity that will do the
observing? Where? The probability exists, but the observation does not,
as yet. So where is the probability to be found? My answer is that it is
in a mind. It is “subjective”.

Very nice development, as solid as it can be.

Let’s also introduce “why” here. Why do we want to know the
probability that something is going to happen in the future? Obviously,
because if it happens we want to know what to do about it, and that
depends on which of several things that could happen, about which we
care, does happen. So the idea enters of preparing ourselves for some
occurrence.

What is “objective”
probability? Is it the “true” likelihood of the future
observation? It can’t be in the already taken observations, because they
are fixed and their values are not in question. So where is it? To what
can it be applied? If it is factually true, but unknown to anyone, that
there are a million swans in the world (of which you have seen 10), and
in all the world there are 16 black swans, all in Togoland, what does it
mean to ask what the probability is that the next swan you see will be
white? Is it (1,000,000 - 16)/1,000,000? Or is it 1, because all the
black swans are where you won’t ever see them? Or is it much less,
because your observations have encountered only 10 swans? What is the
probability that a random swan of all swans in the world is white? How
could you know? Surely there must be ONE “objective
probability” for this – or must there?

Again, nothing but agreement. Clearly, “objective” can’t be
allowed to imply knowledge of the true nature of reality. But perhaps it
can still have a useful meaning. Here are some things we can continue to
demand of “objective” observations, even knowing that they are
perceptions.

First, the observation has to be something that we can’t affect by our
actions prior to the observation. This tells us that the perception is
probably being caused by something else independent of us.

Second, there must be no collusion with someone else who can affect the
observation – this is really just another way of saying we should not
affect the observation.

Third, no (important) part of the perception, including the lower-level
perceptions of which it is made, can be imaginary – generated inside us
rather than being derived from the senses. Since we can alter perceptions
by adding imagined components to them, this is a third way of saying we
should try not to influence the observation.

Fourth, under some circumstances it is important that someone else not in
communication with us record the same observation for later comparison.
This is just one of several possible ways of trying to avoid
illusions.

The intention behind these conditions is to be as sure as possible that what
we want or intend or dislike and so on has no way of influencing what we
perceive, consciously or unconsciously. This isn’t easy to achieve, but
we are most of the way there if we recognize the importance of achieving
objective observation and want the observation to be uncontaminated. And
we want to achieve this because we want to know that the observation is
an indicator of something happening in the world we can’t directly
experience. If we influence the observation we lose that
opportunity.

This adds something to the definition of objective. We can’t know what
our perceptions really come from, out there in the quark soup. The next
best thing we can do is try to make sure the ones of interest are coming
from Out There rather than In Here. This is an ideal, of course, an
aspiration.

I think you are saying something very similar:

To what can “objective
probability” be applied? Can it be to the probability that a fair
die will come up “4”? Surely not. To determine the value 1/6
(which we presumably agree is the “correct” value) for that
depends on a mental model of what it means to have a fair die, and mental
models are subjective. Oh, that’s no problem. The die is real and is
really fair, really and truly, out there in the “objective”
world. So that’s an “objective probability”. Or is it? Why do
we think it is? Is it because we each postulate the same mechanisms, and
that mechanism (model) results in the value 1/6? If so, how is it an
“objective” (true in the real world that we assume to exist)
probability?

Merge this with what I have been saying. The fall of the die, as we
perceive it, is objective because it happens without (we hope) our
influence. The die is fair because to the best of our knowledge we can’t
influence it, and there is no other influence that we know of that would
favor one orientation. Our definition of a fair die is simple: it’s a die
that will show each number an equal number of times if we keep throwing
it in a complex enough way. If we can’t find a complex enough way beyond
our ability to control to cause the averages to approach 1/6, the die is
not fair. This doesn’t require us to reify the die or the
probability.

Observations and evidence are perceptions, but they are a subset of all
perceptions. They are the subset which, for a variety of reasons, we’re
prepared to trust as being uncontrolled by us. To the degree that we
trust them in that way, we call them objective.

I heard a noise. It sounded like
a bump or a muffled bang. Is someone knocking at the door? Did something
fall? Is someone working on the roof next door? I think it was more
probably someone knocking at the door than either of the others. “I
think” but what is the true “objective” probability of
each possibility? How would I find out? Am I perceiving a probability? If
not, just what is it that feels as though I am perceiving a probability
comparison? My feeling about the probability I think I perceive changes
when I go to the door and find nobody there.

What “objective probability” changed? My perception is that my
probabilities for the three possibilities changed, but that’s not
objective, is it? What is the objective probability that someone was
there but went away before I got to the door?

Objective probability can be defined as an observable kind of relative
frequency of occurrence of perceptions. What Mike Acree calls
“aleatory” probability, based on models, can’t be objective
unless the models have that property. The ultimate determinant of
objectivity is to demonstrate that the distribution of occurrences,
states, or relationships is repeatable and is beyond our ability to
control. See? Nothing up my sleeves!
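The relative-frequency idea can be sketched as a simulation. This is only an illustration under a loud assumption: Python's pseudo-random generator stands in for a die thrown "in a complex enough way", and the function name relative_frequencies and the seed are hypothetical choices, not anything from the discussion.

```python
import random
from collections import Counter


def relative_frequencies(n_throws: int, seed: int = 0) -> dict:
    """Simulate n_throws of a six-sided die and return the observed
    relative frequency of each face.
    ASSUMPTION: random.Random stands in for a real, uninfluenced die."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}


freqs = relative_frequencies(60_000)

for face, freq in freqs.items():
    print(f"face {face}: {freq:.4f} (1/6 is about {1 / 6:.4f})")
```

Rerunning with other seeds gives slightly different frequencies that stay close to 1/6, which is the sense of "repeatable" used above. What the code records is a set of observed proportions.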

This way of defining objectivity doesn’t solve all problems having to do
with fooling others and ourselves, but it does carry with it warnings
about how easily we can violate the conditions of objectivity.

The concept of “objective
probability” seems fraught with so many problems, starting with the
fact that it is always about something in the future, and cannot ever be
measured or observed, that I am bewildered by the apparent fact that
other people believe it to be a useful scientific
concept.

For me both words cause a problem. “Objective” can be handled
without asserting that we have impossible knowledge.
“Probability” gives different problems, but mostly verbal ones.
The first difficulty I see is that it immediately suggests the objective
existence of actual probabilities Out There, but I think we have a handle
on that. The other main one is the idea that we can quantify the
perception of probability in some way that has validity from one person
to another.

Your way of illustrating subjective probability shows that it can be cast
in terms of trying various hypotheses to explain observations, which we
human beings always like to do. We are looking for the best fit of a
mental model to perceptions in the class we call observations. That thump
actually occurred; you didn’t make it; you didn’t imagine it. So, your
brain demands, explain it. Build me a mental model of some unseen
occurrence which, if it really happened, would explain every detail of
the experience: when, where, why, and in what manner it happened. The
explanation that covers the most details is the one we consider most
probable – most believable, most testable, most reliable.

An inspiring post!

Best,

Bill P.

(Gavin Ritz 2008.01.15.13.29NZT)

[From Bill Powers
(2009.01.14.1547 MST)]

Martin Taylor 2009.01.14.15.38 –

Well, this is what Richard Dawkins has to say about probability. I’m
not sure he’s totally correct, as this Being resides in the perceptions of
the thinker (Martin’s subjective).
Is it out “There” (Martin’s objective)? Probably not. In “Here”? Probably
yes.

[Attachment: new-bus-slogans[1].jpg]

[Martin Taylor 2009.01.19.09.37]

[From Bill Powers (2009.01.14.1547 MST)]

Martin Taylor 2009.01.14.15.38 --

We have been talking around each other in a non-communicative fashion about subjective and objective probabilities, so I'm trying a different tack, to ask what is meant by "probability" in the first place.

What is "objective" probability? Is it the "true" likelihood of the future observation? It can't be in the already taken observations, because they are fixed and their values are not in question. So where is it? To what can it be applied? If it is factually true, but unknown to anyone, that there are a million swans in the world (of which you have seen 10), and in all the world there are 16 black swans, all in Togoland, what does it mean to ask what the probability is that the next swan you see will be white? Is it (1,000,000 - 16)/1,000,000? Or is it 1, because all the black swans are where you won't ever see them? Or is it much less, because your observations have encountered only 10 swans? What is the probability that a random swan of all swans in the world is white? How could you know? Surely there must be ONE "objective probability" for this -- or must there?

Again, nothing but agreement. Clearly, "objective" can't be allowed to imply knowledge of the true nature of reality. But perhaps it can still have a useful meaning. Here are some things we can continue to demand of "objective" observations, even knowing that they are perceptions.

I'm afraid I don't see your objective (or distinguish "objective") in what follows.

First, the observation has to be something that we can't affect by our actions prior to the observation. This tells us that the perception is probably being caused by something else independent of us.

Second, there must be no collusion with someone else who can affect the observation -- this is really just another way of saying we should not affect the observation.

Third, no (important) part of the perception, including the lower-level perceptions of which it is made, can be imaginary -- generated inside us rather than being derived from the senses. Since we can alter perceptions by adding imagined components to them, this is a third way of saying we should try not to influence the observation.

Fourth, under some circumstances it is important that someone else not in communication with us record the same observation for later comparison. This is just one of several possible ways of trying to avoid illusions.

The intention behind these conditions is to be as sure as possible that what we want or intend or dislike and so on has no way of influencing what we perceive, consciously or unconsciously.

This set of requirements is the rationale behind "double-blind" experiments, but I can't see its relation to the perception of probability. The last sentence, however, reiterates what I have been saying, that humans often fail to conform to Jaynes's desideratum IIIb in forming overt (conscious) probability judgements.

Observations can do no more than provide data toward making a probability perception, can they? One never observes a probability. One observes a proportion, perhaps, but unless one has some kind of a model about what lies behind the observed proportion (such as, for example, that the process that generated the proportion observed will continue to generate the same proportion in future), one cannot use the observed proportion to make a probability assessment.

Why should it matter that we cannot influence the process that generates the observations? We perceive what we perceive, whether it is being controlled or not. The probabilities of different values of perceptions are different if we control or let alone, but that makes them no more and no less "objective" or "subjective". Observations already made are fixed, and can be used along with some kind of formal or informal model in the creation of our perception of probability. If the model includes that we control the perception, so be it.

Merge this with what I have been saying. The fall of the die, as we perceive it, is objective because it happens without (we hope) our influence. The die is fair because to the best of our knowledge we can't influence it, and there is no other influence that we know of that would favor one orientation. Our definition of a fair die is simple: it's a die that will show each number an equal number of times if we keep throwing it in a complex enough way. If we can't find a complex enough way beyond our ability to control to cause the averages to approach 1/6, the die is not fair. This doesn't require us to reify the die or the probability.

Observations and evidence are perceptions, but they are a subset of all perceptions. They are the subset which, for a variety of reasons, we're prepared to trust as being uncontrolled by us. To the degree that we trust them in that way, we call them objective.

I'm sorry, but I can't see the use of "objective" as being restricted to uncontrolled perceptions.

Objective probability can be defined as an observable kind of relative frequency of occurrence of perceptions.

I would call this not "objective probability" but "observed proportion".

What Mike Acree calls "aleatory" probability, based on models, can't be objective unless the models have that property. The ultimate determinant of objectivity is to demonstrate that the distribution of occurrences, states, or relationships is repeatable and is beyond our ability to control. See? Nothing up my sleeves!

In retrospect, it was repeatable, but you can't demonstrate that it will be. In any case, it is a very low probability event that the die will give exactly 1/6 proportion for each number. What makes it fair is that you have a mental model of how it is constructed, and the results it gives are sufficiently likely given that model.

This way of defining objectivity doesn't solve all problems having to do with fooling others and ourselves, but it does carry with it warnings about how easily we can violate the conditions of objectivity.

Yes. To paraphrase Jaynes's Desideratum IIIb: "Take into account all of the evidence relevant to a question; do not ignore some of the information and base your conclusion on what remains; be completely non-ideological." That's what I would call being objective, rather than basing it on whether or not you are controlling the perception in question.

In [From Bill Powers (2009.01.16.0410 MST)] ("One last bit before leaving.") you say that you do not believe the perception of probability can be represented by a scalar number. Why not, when you claim every other perception, from the intensity of red to the democratic value of the nation, can be represented by a scalar number? Why do you think it uniquely improper to represent this one kind of perception by a scalar number (a neural signal, if you want to make that equivalence)?

Martin

[From Rick Marken (2008.01.19.1120)]

Martin Taylor (2009.01.19.09.37)--

Bill Powers (2009.01.14.1547 MST)--

Again, nothing but agreement. Clearly, "objective" can't be allowed to
imply knowledge of the true nature of reality. But perhaps it can still have
a useful meaning. Here are some things we can continue to demand of
"objective" observations, even knowing that they are perceptions.

First, the observation has to be something that we can't affect by our
actions prior to the observation.

Second, there must be no collusion with someone else who can affect the
observation

Third, no (important) part of the perception, including the lower-level
perceptions of which it is made, can be imaginary

Fourth, under some circumstances it is important that someone else not in
communication with us record the same observation for later comparison.

The intention behind these conditions is to be as sure as possible that what
we want or intend or dislike and so on has no way of influencing what we
perceive, consciously or unconsciously.

This set of requirements is the rationale behind "double-blind" experiments,
but I can't see its relation to the perception of probability.

These requirements define (quite nicely, I think) what can be
considered "objective" in all science, not just "double blind"
experiments. The relation to probability is that this helps us see
what might be meant by objective probability. I would presume that if
these four conditions were met in an experiment aimed at determining
the probability of getting "heads" in a series of coin tosses, for
example, the probability (or proportion of "heads") observed could be
considered objective.

Observations can do no more than provide data toward making a probability
perception, can they? One never observes a probability. One observes a
proportion, perhaps, but unless one has some kind of a model about what lies
behind the observed proportion (such as, for example, that the process that
generated the proportion observed will continue to generate the same
proportion in future), one cannot use the observed proportion to make a
probability assessment.

I think you are defining "probability" in a way that makes it
unobservable. I'm comfortable saying that, if the proportion of times
heads comes up in a large number of coin tosses is ~.25 then the
probability of heads for the coin is ~.25 (assuming objective
observations as per Bill's criteria).

Why should it matter that we cannot influence the process that generates the
observations?

If someone is influencing (or, more importantly, controlling) the
observations then what we observe would not be considered objective.
If someone set the coin down (controlled its orientation on the
table) 100 times and had it showing heads 1/2 the time then I could
see that the proportion of heads was .5 but I would not, then,
conclude, that the probability of the coin turning up heads is .5.

We perceive what we perceive, whether it is being controlled
or not. The probabilities of different values of perceptions are different
if we control or let alone

This is not at all the case. If we (or someone) is controlling the
perception of "heads" then the proportion of times heads comes up is
not an objective indication of probability.

Merge this with what I have been saying. The fall of the die, as we
perceive it, is objective because it happens without (we hope) our
influence. The die is fair because to the best of our knowledge we can't
influence it, and there is no other influence that we know of that would
favor one orientation. Our definition of a fair die is simple: it's a die
that will show each number an equal number of times if we keep throwing it
in a complex enough way. If we can't find a complex enough way beyond our
ability to control to cause the averages to approach 1/6, the die is not
fair. This doesn't require us to reify the die or the probability.

Observations and evidence are perceptions, but they are a subset of all
perceptions. They are the subset which, for a variety of reasons, we're
prepared to trust as being uncontrolled by us. To the degree that we trust
them in that way, we call them objective.

I'm sorry, but I can't see the use of "objective" as being restricted to
uncontrolled perceptions.

Well, then we really must play poker sometime.

What Mike Acree calls "aleatory" probability, based on models, can't be
objective unless the models have that property. The ultimate determinant of
objectivity is to demonstrate that the distribution of occurrences, states,
or relationships is repeatable and is beyond our ability to control. See?
Nothing up my sleeves!

In retrospect, it was repeatable, but you can't demonstrate that it will be.
In any case, it is a very low probability event that the die will give
exactly 1/6 proportion for each number.

As the number of rolls increases the observed proportions for each
number on the die will stabilize; I would consider these very good
objective measures of the probability of getting a particular number
on each roll.
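
The stabilizing of observed proportions can be illustrated with a small simulation. This is only a sketch: the function name `observed_proportions` and the seed are hypothetical, and the simulated die is fair by construction, so the run shows what a fair-die model predicts rather than anything about a physical die.

```python
import random
from collections import Counter

def observed_proportions(n_rolls, seed=0):
    """Roll a simulated six-sided die n_rolls times and return
    the observed proportion of each face (1..6)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

# Report how far the worst face strays from 1/6 as the number of rolls grows.
for n in (60, 6_000, 600_000):
    props = observed_proportions(n)
    worst = max(abs(p - 1 / 6) for p in props.values())
    print(f"{n:>7} rolls: largest deviation from 1/6 = {worst:.4f}")
```

The deviation shrinks as the number of rolls grows, but only because the generator was built to be fair; the simulation assumes the very model whose status is in dispute here.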

Best

Rick


--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.01.19.14.48]


[From Rick Marken (2008.01.19.1120)]
Observations can do no more than provide data toward making a probability
perception, can they? One never observes a probability. One observes a
proportion, perhaps, but unless one has some kind of a model about what lies
behind the observed proportion (such as, for example, that the process that
generated the proportion observed will continue to generate the same
proportion in future), one cannot use the observed proportion to make a
probability assessment.
I think you are defining "probability" in a way that makes it
unobservable.

No more unobservable than is any other perception, so far as I can see.
Anyway, since a probability is always either about some future event(s)
or about something present but not presently perceptible, it is pretty
well guaranteed to be unobservable to an external observer. It exists
only in a person’s mind. I think Bill agreed to that, but I gather you
don’t.

I'm comfortable saying that, if the proportion of times
heads comes up in a large number of coin tosses is ~.25 then the
probability of heads for the coin is ~.25 (assuming objective
observations as per Bill's criteria).

On what grounds do you say this? (Note: this is a trick question. I’ll
explain the trick at the end.)


Why should it matter that we cannot influence the process that generates the
observations?
If someone is influencing (or, more importantly, controlling) the
observations then what we observe would not be considered objective.

How is it less objective than is the perception of position when we
observe the position of a spoon we are placing in a dinner arrangement?

If someone set the coin down (controlled its orientation on the
table) 100 times and had it showing heads 1/2 the time then I could
see that the proportion of heads was .5 but I would not, then,
conclude, that the probability of the coin turning up heads is .5.

I ask the same trick question. On what grounds do you say this?


We perceive what we perceive, whether it is being controlled
or not. The probabilities of different values of perceptions are different
if we control or let alone
This is not at all the case. If we (or someone) is controlling the
perception of "heads" then the proportion of times heads comes up is
not an objective indication of probability.

Of probability of what? Again I ask the question: On what grounds do
you say this?

I'm sorry, but I can't see the use of "objective" as being restricted to
uncontrolled perceptions.
Well, then we really must play poker sometime.

What’s the relevance of that comment? If it is meant to suggest that I
deny that one can get probability estimates from counting, you are
mistaken. If you think people get the probabilities of different poker
hands by counting, you are also mistaken.


What Mike Acree calls "aleatory" probability, based on models, can't be
objective unless the models have that property. The ultimate determinant of
objectivity is to demonstrate that the distribution of occurrences, states,
or relationships is repeatable and is beyond our ability to control. See?
Nothing up my sleeves!
In retrospect, it was repeatable, but you can't demonstrate that it will be.
In any case, it is a very low probability event that the die will give
exactly 1/6 proportion for each number.
As the number of rolls increases the observed proportions for each
number on the die will stabilize; I would consider these very good
objective measures of the probability of getting a particular number
on each roll.

In 6,000 rolls, what do you think the probability is that exactly 1000
cases of each number will have turned up? In 6 million rolls, what is
the probability that exactly one million of each has occurred? If there
happened to have been 993,000 ones, 1,000,460 twos, … would you then
say the die was biased?
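
The two questions above can be put in numbers, on the explicit assumption of a fair die with independent rolls. The sketch below uses hypothetical helper names: `log10_prob_equal_counts` evaluates the multinomial probability n!/((n/6)!)^6 * (1/6)^n in logarithms, and `standardized_deviation` expresses an observed count as a distance from n/6 in binomial standard deviations.

```python
import math

def log10_prob_equal_counts(n_rolls):
    """log10 of P(every face appears exactly n_rolls/6 times), assuming a fair die."""
    if n_rolls % 6:
        raise ValueError("n_rolls must be divisible by 6")
    k = n_rolls // 6
    # Multinomial term n! / (k!)^6 * (1/6)^n, computed in logs to avoid overflow.
    log_p = math.lgamma(n_rolls + 1) - 6 * math.lgamma(k + 1) - n_rolls * math.log(6)
    return log_p / math.log(10)

def standardized_deviation(count, n_rolls, p=1 / 6):
    """Distance of an observed count from n*p, in binomial standard deviations."""
    return (count - n_rolls * p) / math.sqrt(n_rolls * p * (1 - p))

for n in (6_000, 6_000_000):
    print(f"{n:>9} rolls: log10 P(exactly equal counts) = {log10_prob_equal_counts(n):.2f}")
print("993,000 ones in 6,000,000 rolls:",
      f"{standardized_deviation(993_000, 6_000_000):.2f} standard deviations from 1,000,000")
```

Both numbers are conditional on the fair-die model. Exactly equal counts are very improbable under that model and become more so as the number of rolls grows, while a count of 993,000 ones lies several standard deviations from what the model expects.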


[Frmo Rick Marken (2009.01.19.1345)]

Martin Taylor (2009.01.19.14.48) --

Rick Marken (2008.01.19.1120) --

>I think you are defining "probability" in a way that makes it
> unobservable.

No more unobservable than is any other perception, so far as I can see.

OK.

Anyway, since a probability is always either about some future event(s) or
about something present but not presently perceptible

I think we are talking about two different probabilities: theoretical
and empirical. Probability theory deals with probability as a
theoretical concept. But for me, when I talk about perceiving
probability I'm talking about empirical probability: the observed
(under objective conditions) relative frequency of an event.

, it is pretty well
guaranteed to be unobservable to an external observer. It exists only in a
person's mind. I think Bill agreed to that, but I gather you don't.

I guess in the sense you're talking about it -- as this kind of
unobservable theoretical property of events -- then I agree that it's
unobservable; but it is certainly perceptible (imaginations are
perceptions so this imagined unobservable theoretical notion called
probability is a perception).

I'm comfortable saying that, if the proportion of times
heads comes up in a large number of coin tosses is ~.25 then the
probability of heads for the coin is ~.25 (assuming objective
observations as per Bill's criteria).

On what grounds do you say this? (Note: this is a trick question. I'll
explain the trick at the end.)

On the grounds that probability, for me, is relative frequency.

If someone is influencing (or, more importantly, controlling) the
observations then what we observe would not be considered objective.

How is it less objective than is the perception of position when we observe
the position of a spoon we are placing in a dinner arrangement?

It might not be less objective (where objectivity is defined by Bill's
criteria). It depends on what you are trying to find out about the
observation. If I am studying how someone other than myself positions
the spoon on the table, then if I have positioned the spoon on the
table, the position of the spoon is not an objective observation.
Bill's criteria for objectivity presume, by the way, that "it's all
perception". That is, they assume that there is no such thing as
"objectivity" in the sense of being able to make observations of what
is "on the other side" of one's own perceptual experience.

If someone set the coin down (controlled its orientation on the
table) 100 times and had it showing heads 1/2 the time then I could
see that the proportion of heads was .5 but I would not, then,
conclude, that the probability of the coin turning up heads is .5.

I ask the same trick question. On what grounds do you say this?

On the basis of the empirical definition of probability as relative
frequency (obtained under objective circumstances, as per Bill's
objectiveness criteria) with the "goodness" of this definition
increasing as the size of the denominator increases.

This is not at all the case. If we (or someone) is controlling the
perception of "heads" then the proportion of times heads comes up is
not an objective indication of probability.

Of probability of what?

Probability of getting heads under objective circumstances (without
someone influencing the result).

I'm sorry, but I can't see the use of "objective" as being restricted to
uncontrolled perceptions.

Well, then we really must play poker sometime.

What's the relevance of that comment?

The relevance is that I could then presumably control the cards (stack
the deck) and that would not affect your estimates of the "true"
probability of getting various hands. I think empirical and
theoretical (analytic) definitions of probability assume that the
events for which we are calculating probabilities are independent of
the observer and, also not under control by some outside agency.

In retrospect, it was repeatable, but you can't demonstrate that it will be.
In any case, it is a very low probability event that the die will give
exactly 1/6 proportion for each number.

As the number of rolls increases the observed proportions for each
number on the die will stabilize; I would consider these very good
objective measures of the probability of getting a particular number
on each roll.

In 6,000 rolls, what do you think the probability is that exactly 1000 cases
of each number will have turned up?

To calculate that I would have to make assumptions about the "true"
probability (the relative frequency in an infinite number of rolls) of
each number. I was just talking about empirical probability.

In 6 million rolls, what is the
probability that exactly one million of each has occurred? If there happened
to have been 993,000 ones, 1,000,460 twos, ... would you then say the die
was biased?

That's what statistical significance tests are designed to answer.
It's not a question I was addressing. I was just saying that the
relative frequency of each number after a very large number of rolls
would provide (for me) an acceptable measure of the probability of
each number coming up on a throw of the die. I think it's highly
unlikely that the results for each number would be precisely 1/6; the
1/6 probability is a theoretical value derived from the axioms of
probability theory (which allows you to assume a "perfect" cube with
perfect weight distribution and frictional properties for the die,
etc).

Best

Rick


--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.01.19.17.16]

[Frmo Rick Marken (2009.01.19.1345)]

I know Dr. Rick Marken, but which religion offers the Frmo (Brother of
most?) designation?


Anyway, since a probability is always either about some future event(s) or
about something present but not presently perceptible
I think we are talking about two different probabilities: theoretical
and empirical. Probability theory deals with probability as a
theoretical concept. But for me, when I talk about perceiving
probability I'm talking about empirical probability: the observed
(under objective conditions) relative frequency of an event.

I would call that “observed proportion” or, as you did, “observed
relative frequency”, not “probability”.


, it is pretty well
guaranteed to be unobservable to an external observer. It exists only in a
person's mind. I think Bill agreed to that, but I gather you don't.
I guess in the sense you're talking about it -- as this kind of
unobservable theoretical property of events -- then I agree that it's
unobservable; but it is certainly perceptible (imaginations are
perceptions so this imagined unobservable theoretical notion called
probability is a perception).

Precisely.


I'm comfortable saying that, if the proportion of times
heads comes up in a large number of coin tosses is ~.25 then the
probability of heads for the coin is ~.25 (assuming objective
observations as per Bill's criteria).
On what grounds do you say this? (Note: this is a trick question. I'll
explain the trick at the end.)
On the grounds that probability, for me, is relative frequency.

Of the observed past?

I won’t deny that for the right situation, relative frequency of future
events is a reasonable measure of probability, but considering that it
is about the future, it cannot yet have been observed. If you are
talking about the observations you have already made, probability
doesn’t seem an appropriate concept to apply. You have already made the
observation, so you know the result.


If someone is influencing (or, more importantly, controlling) the
observations then what we observe would not be considered objective.
How is it less objective than is the perception of position when we observe
the position of a spoon we are placing in a dinner arrangement?
It might not be less objective (where objectivity is defined by Bill's
criteria). It depends on what you are trying to find out about the
observation.

Of course. I rather think that’s one of the points I have been making
all along.

If I am studying how someone other than myself positions
the spoon on the table, then if I have positioned the spoon on the
table, the position of the spoon is not an objective observation.

No. If you are studying the distribution of X, an observation of Y is
unlikely to be helpful, unless you have a mental model that specifies
some relation between X and Y. That wouldn’t make an observation of Y
any the less “objective”, would it? It is just an observation of Y, not
of X.

Bill's criteria for objectivity presume, by the way, that "it's all
perception". That is, they assume that there is no such thing as
"objectivity" in the sense of being able to make observations of what
is "on the other side" of one's own perceptual experience.

Something we all agree on, I think.

If someone set the coin down (controlled its orientation on the
table) 100 times and had it showing heads 1/2 the time then I could
see that the proportion of heads was .5 but I would not, then,
conclude, that the probability of the coin turning up heads is .5.
I ask the same trick question. On what grounds do you say this?
On the basis of the empirical definition of probability as relative
frequency (obtained under objective circumstances, as per Bill's
objectiveness criteria) with the "goodness" of this definition
increasing as the size of the denominator increases.

But if you have the belief that the person would keep on doing what
they have been doing, why would you not say that the probability of the
coin continuing to turn up heads is 0.5?


This is not at all the case. If we (or someone) is controlling the
perception of "heads" then the proportion of times heads comes up is
not an objective indication of probability.
Of probability of what?
Probability of getting heads under objective circumstances (without
someone influencing the result).

No, I agree. If you observe X, it does not tell you much about Y
(unless you have a mental model that indicates that it does).


I'm sorry, but I can't see the use of "objective" as being restricted to
uncontrolled perceptions.
Well, then we really must play poker sometime.
What's the relevance of that comment?
The relevance is that I could then presumably control the cards (stack
the deck) and that would not affect your estimates of the "true"
probability of getting various hands. I think empirical and
theoretical (analytic) definitions of probability assume that the
events for which we are calculating probabilities are independent of
the observer and, also not under control by some outside agency.

Why would I assume you would cheat? Is that part of your mental model
of me? Or do you want it to be? Or is it just that you think that if I
had a mental model that indicated you would cheat I would still expect
the card hands to have the same probabilities as they would if my
mental model said you would not cheat? Check back on my postings about
conditionals to see what I have said about probabilities depending on
them.


In retrospect, it was repeatable, but you can't demonstrate that it will be.
In any case, it is a very low probability event that the die will give
exactly 1/6 proportion for each number.
As the number of rolls increases the observed proportions for each
number on the die will stabilize; I would consider these very good
objective measures of the probability of getting a particular number
on each roll.
In 6,000 rolls, what do you think the probability is that exactly 1000 cases
of each number will have turned up?
To calculate that I would have to make assumptions about the "true"
probability (the relative frequency in an infinite number of rolls) of
each number. I was just talking about empirical probability.

You mean “observed proportion”, don’t you?


In 6 million rolls, what is the
probability that exactly one million of each has occurred? If there happened
to have been 993,000 ones, 1,000,460 twos, ... would you then say the die
was biased?
That's what statistical significance tests are designed to answer.

Oh, are you still a believer in significance tests? To me, they have
much the same scientific respectability as Ptolemaic epicyclic orbits
for the planets.

It's not a question I was addressing.

Was it not? You said it was. I just quoted the bit in which you did:
“As the number of rolls increases the observed proportions for each
number on the die will stabilize; I would consider these very good
objective measures of the probability of getting a particular number on
each roll.” I assumed from what you said that you were going to use the
observed proportion as your estimate of the probability of getting each
number in future. The numbers I provided would indicate that you would
think the die was biased, if I take what you said to be what you meant.

I was just saying that the
relative frequency of each number after a very large number of rolls
would provide (for me) an acceptable measure of the probability of
each number coming up on a throw of the die.

Yes, you say it again.

Let’s suppose you went to Vegas and wanted to get into a game of craps.
Would you insist on taking each die and throwing it a few thousand
times to find its “true” probability of turning up fair? Would you be
inside the casino after the tenth throw or the hundredth? Would you
play craps with dice you had not tested? If so, why would you, having
no perception of the “objective” probability that each die was fair?

I think it's highly
unlikely that the results for each number would be precisely 1/6; the
1/6 probability is a theoretical value derived from the axioms of
probability theory (which allows you to assume a "perfect" cube with
perfect weight distribution and frictional properties for the die,
etc).

So do I, but then I wouldn’t stake my estimate of the probability of a
4 coming up on the count of fours in a few thousand throws of the die
that I would have to make before allowing it to be used in a gambling
game. I would base my probability estimate on that physical model, and
on my estimates of the probability that the supplier of the die was
honest, and that the thrower would not substitute a different die. If I
thought that latter probability was too low, I would make observations
of how the thrower manipulated the die before throwing. I wouldn’t go
throwing the die a few thousand times, counting the number of times
each number came up.

Would you, before you judged the probability of a 4 to be 1/6 in a die
used by a Las Vegas casino?

Martin

[From Bill Powers (2009.01.19.1723 MST)]

Martin Taylor 2009.01.19.14.48 –

[Rick Marken] If someone is influencing (or, more importantly, controlling)
the observations then what we observe would not be considered objective.

[Martin Taylor] How is it less objective than is the perception of position
when we observe the position of a spoon we are placing in a dinner
arrangement?

I think Rick and I are offering a definition of objective probability
that takes more into account than your proposal, which seems to apply the
concept simply on the basis that someone can’t predict something and
feels uncertain about it. While we all agree that probability is a
perception, Rick and I both want to say that some perceptions occur and
change in ways we can’t affect, and so are accepted as indicating
something important about the outside world. That includes perceptions of
probability, given certain safeguards needed to assure us that someone
isn’t trying to present a false appearance, or that we’re not fooling
ourselves.

I’m sure you can see that any scientific experiment intended to test a
model would be invalidated if the results were being manipulated by the
experimenter. This applies just as well to observed frequencies of two
occurrences, A and B. If we find that, in any extended series of
observations, A occurs nearly twice as often as B, but that the
sequence of As and Bs seldom or never repeats, we can claim that there is
some influence or property out there seeing to it that this ratio
continues to be seen even if there is no other regularity we can
recognize.

However, if this regularity exists because during or after any series of
observations, the observer adds just enough As or Bs to bring the ratio
close to 2:1, we would no longer think the observer has found a fact of
nature. We would say he was cheating, and he would have to retract his
published papers announcing discovery of this ratio. That is why it’s
important to show that the observer has no control over the observed
ratio of occurrences – not only in science, but at casinos, too. We want
the observer to show not just that he refrained from affecting the
results but that he could not have done so.

If probability were merely a sense of uncertainty, as you seem to be
proposing, then of course it wouldn’t matter why a person was unable to
predict a sequence of events. If the A’s and B’s alternated according to
a Fibonacci series, but the observer didn’t know about that series, the
sequences of occurrences would appear random and probabilities would have
to be invoked to find any order at all. The observer would be very
confused and uncertain about the observations; they would seem to be
unpredictable. So simple unpredictability or feelings of uncertainty
can’t be the sole criterion for claiming that probability calculations
can be legitimately applied to the observations.

Best,

Bill P.

P.S. Hilton Head is miserably cold and snow is now being predicted. But
the IAACT meeting is going well, solidly committed to understanding PCT
and the method of levels.

[Martin Taylor 2009.01.19.13.25]

[From Bill Powers (2009.01.19.1723 MST)]

I thought you were away until the 21st. Welcome back.

Martin Taylor 2009.01.19.14.48 --

[Rick Marken] If someone is influencing (or, more importantly, controlling) the observations then what we observe would not be considered objective.

[Martin Taylor] How is it less objective than is the perception of position when we observe the position of a spoon we are placing in a dinner arrangement?
I think Rick and I are offering a definition of objective probability that takes more into account than your proposal, which seems to apply the concept simply on the basis that someone can't predict something and feels uncertain about it.

Actually, I think you are offering a definition that is more restricted, and moreover one that refers not to probability, but to observed (and therefore already known) proportion. You cannot assert that previously observed proportion is the same as probability without accepting that seeing 100% white among all the swans you see is proof that all swans are white. Put another way, you MUST have additional assumptions to go with your observations before you can derive a probability.

While we all agree that probability is a perception, Rick and I both want to say that some perceptions occur and change in ways we can't affect, and so are accepted as indicating something important about the outside world.

So they do. Some other perceptions occur and change in ways we can and do affect. Those also indicate something important about the outside world. I don't see why uncontrolled perceptions should be more important than controlled ones; in fact the opposite seems more commonly to be the case. (I hope you won't say that the values of controlled perceptions are always exactly and perfectly known in advance, which would make them inappropriate for probability judgements :-).

If, for some reason, you want to consider only perceptions of things you don't control and of which you have no evidence anyone else controls, then by all means go to it. Just don't require them to be the only perceptions that relate to perceptions of probability.

That includes perceptions of probability, given certain safeguards needed to assure us that someone isn't trying to present a false appearance, or that we're not fooling ourselves.

I'd amend the beginning of that to say "That includes some perceptions of probability..." It includes perceptions of probabilities about things for which you have a mental model that "someone isn't trying to present a false appearance, or that we're not fooling ourselves."

How do you propose to include the probability that someone isn't trying to present a false appearance, if your only data are the frequencies you count?

I'm sure you can see that any scientific experiment intended to test a model would be invalidated if the results were being manipulated by the experimenter. This applies just as well to observed frequencies of two occurrences, A and B. If we find that, in any extended series of observations, A occurs nearly twice as often as B, but that the sequence of As and Bs seldom or never repeats, we can claim that there is some influence or property out there seeing to it that this ratio continues to be seen even if there is no other regularity we can recognize.

If your only available data is observed frequency (I prefer "proportion"), then your hypotheses must concern only the proportion. But by restricting yourself to this constrained kind of data, you are, by your own choice, using blinkers.

One point of the Bayes procedure is that you can provide an infinite number of hypotheses that describe how the data might have come about, given whatever conditionals you think appropriate, and can then use the data to determine which of the hypotheses is most likely, as well as how much more likely it is than any one of the others. If several of your hypotheses include methods by which someone could have faked your data, they must be included in the set. Remember, however, that prior probabilities get swamped only by relevant data, and frequency count data are not usually relevant to your estimate of the probability that someone cheated. (Though they can be, as witness a recent situation here, when a seller of lottery tickets won considerably more often than seemed reasonable. He was accused of checking people's tickets and telling them they lost, and then taking the winning tickets as his own. The rules were changed so that you have to sign your tickets before having them checked.)

However, if this regularity exists because during or after any series of observations, the observer adds just enough As or Bs to bring the ratio close to 2:1, we would no longer think the observer has found a fact of nature. We would say he was cheating, and he would have to retract his published papers announcing discovery of this ratio. That is why it's important to show that the observer has no control over the observed ratio of occurrences -- not only in science, but at casinos, too. We want the observer to show not just that he refrained from affecting the results but that he could not have done so.

If, for some assessment of probability, that's the situation you want, then you ought to add a conditional to that effect: "If the observer does not influence the data". And if you want your conditional to refer to the "real world", you will need to make other observations in addition to the observed frequency to see that your conditional is factually true, and not just a hypothetical case.

You might want to determine, for example, how probable it is that the observer cheated. One way of doing that is simply to have a mental model that the observer will not cheat. That's a pretty unreliable model, especially as it has long been known that observers will cheat themselves without knowing it. It's called "the experimenter effect". So, you probably would want to observe (perhaps by controlling that perception) that the observer does not have the opportunity to cheat, as you say. All appropriately part of your hypothesis.

Given observations of the frequency data, and only those observations, you are stuck in a perfect "White Swan" situation. You have nothing on which to base your estimate of probability until you add a conditional, which might be something like "if things go on the way they have been so far". If you don't add that conditional, or something like it, you have no way to suggest that the next swan is more likely to be of any one colour rather than another, even if you have seen thousands of swans, all white. Only with the conditional can you turn an observed frequency into a probability perception.

To have only the conditional "if things go on as they have been" is pretty weak, I think. It's much better to have some kind of model in the mind as part of your hypothesis, such as that the dice are fair because you believe with high probability that the Casino couldn't afford to be caught with loaded dice. I would take that to be a much better reason to believe that the probability of a 4 is 1/6 than I would get from watching the dice for weeks at a time and counting how often each number came up.


---------------

(Sorry, somehow the rest of Bill's message got erased before I reached it in my response, and I can't figure out how to tell Thunderbird that what I copy below from Bill's original message is a quote)

[Bill said]
If probability were merely a sense of uncertainty, as you seem to be proposing, then of course it wouldn't matter why a person was unable to predict a sequence of events.

[Martin]
"A sense of uncertainty" is a very weak way of putting it. "A perception of likelihood, quantifiable in the same way as is the perception of the size or weight of something" is closer. But it is indeed true that it doesn't matter why a person is unable to predict a sequence of events. It matters only how able they are.

[Bill]
If the A's and B's alternated according to a Fibonacci series, but the observer didn't know about that series, the sequences of occurrences would appear random and probabilities would have to be invoked to find any order at all. The observer would be very confused and uncertain about the observations; they would seem to be unpredictable.

[Martin]
True. They would not just "seem to be unpredictable", they would _be_ unpredictable to that observer. But I don't know why the observer should be any more "confused" than when presented with any other apparently random series of A's and B's. After all, all "random" means is that you have no way of predicting what comes next, because you don't know the mechanism that produces it, or if you do, you can't emulate the mechanism fast enough to have come to a conclusion before "next" has happened.

[Bill]
So simple unpredictability or feelings of uncertainty can't be the sole criterion for claiming that probability calculations can be legitimately applied to the observations.

[Martin]
This seems a total non-sequitur. That someone ELSE knows what comes next doesn't mean you do. And if you are trying to say that any perception is illegitimate if it doesn't correspond to the real real world, I think I give up on PCT :-)

Suppose the sequence of A's and B's is based on a Fibonacci sequence, and you are not told so to begin with. At first, the sequence of observations seems random. Then, for unknown reasons, the hypothesis "It's a Fibonacci sequence" pops into your head. That hypothesis gives some fairly stringent predictions on what will come next, under the conditional "the sequence will continue as it has been going so far". Set against the alternative "the next one is very likely to be the same as the last" (which would be an appropriate hypothesis if the sequence has gone more than a few alternations), P(H|D) will soon be much higher for the Fibonacci hypothesis than for the simple correlation hypothesis.
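
A minimal sketch of that comparison, under assumptions that are not in the message itself: "based on a Fibonacci sequence" is read here as run lengths of 1, 1, 2, 3, 5, ... starting with A; the Fibonacci hypothesis is allowed a small per-symbol error rate eps; and the rival hypothesis says each symbol repeats the previous one with probability q. The function names and the values of eps and q are illustrative only.

```python
import math

def fibonacci_runs(n_symbols):
    """A/B string whose run lengths follow 1, 1, 2, 3, 5, ... (assumed reading)."""
    out, a, b, symbol = [], 1, 1, "A"
    while len(out) < n_symbols:
        out.extend(symbol * a)          # append a run of length a
        a, b = b, a + b                 # next Fibonacci run length
        symbol = "B" if symbol == "A" else "A"
    return "".join(out[:n_symbols])

def log_lik_fibonacci(seq, eps=0.01):
    """log P(D | H_fib): each symbol matches the prediction with probability 1 - eps."""
    predicted = fibonacci_runs(len(seq))
    return sum(math.log(1 - eps if s == p else eps)
               for s, p in zip(seq, predicted))

def log_lik_repeat(seq, q=0.8):
    """log P(D | H_rep): first symbol is 50/50, later ones repeat the last with probability q."""
    total = math.log(0.5)
    for prev, cur in zip(seq, seq[1:]):
        total += math.log(q if cur == prev else 1 - q)
    return total

def posterior_log_odds(seq, prior_odds=1.0):
    """log of P(H_fib | D) / P(H_rep | D), given prior odds for H_fib over H_rep."""
    return math.log(prior_odds) + log_lik_fibonacci(seq) - log_lik_repeat(seq)

# Even with prior odds of 1:1000 against the Fibonacci hypothesis,
# a long enough run of matching observations favours it.
observed = fibonacci_runs(50)
print(posterior_log_odds(observed, prior_odds=1e-3) > 0)
```

With these assumed numbers the log odds grow as more symbols are observed, which is the sense in which P(H|D) "will soon be much higher" for the Fibonacci hypothesis. A sequence that does not follow the pattern pushes the odds the other way.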

Inauguration day tomorrow. What is the probability that someone who got on a bus in Montreal this evening to go to it will actually see the ceremony?

Martin

[From Bill Powers (2009.01.20.0202 EST)]

Martin Taylor 2009.01.19.13.25 --

Actually, I think you are offering a definition that is more
restricted, and moreover one that refers not to probability, but to
observed (and therefore already known) proportion. You cannot assert
that previously observed proportion is the same as probability
without accepting that seeing 100% white among all the swans you see
is proof that all swans are white. Put another way, you MUST have
additional assumptions to go with your observations before you can
derive a probability.

I would never conclude that ALL swans are white simply because of
having seen only white swans. I wouldn't say anything about ALL swans
-- only about all the swans I have seen. Of course as you know I do
make use of theoretical models to suggest what might be seen, but
instead of then basing my expectations on the model, I make
predictions and then test them to see if they actually hold true as
far as I can tell. If they pass the tests I accept them
provisionally, and the longer I go on testing them and they go on
passing, the smaller is my subjective uncertainty about their truth.
For me, the method of observing relative frequencies of occurrence
takes precedence over all theories.

I don't believe predictions just because I make them. The method of
relative frequencies remains my main way of determining
probabilities, and always will, because theories are not even
indirect observations of nature. They are propositions about possible
facts, not observations. You, with your Zeno's Paradox of
Probability, apparently don't believe that observations can tell us
anything useful about the distribution of natural occurrences.

Your argument that observations are not probabilities is, of course,
literally correct. However, by your way of using this fact, there is
no such thing as a property of anything. Just observing that all
mercury in the past has had an observed density close to 13 g/cm^3,
according to you, is no reason to assume that the next observation
will have a similar value. I have to agree that the concept of
density involves probabilities and conditionals, but the chances of
error in the predictions, by now, are incalculably tiny by any normal
statistical analysis. We can speak of measurement errors by stating
the size of a standard deviation from the mean, but probable error
shrinks with the number of observations until for all practical
purposes we can assume the measurement to be what it appears to be.
To continue to insist that there is always some probability that a
swan with no head, or that is bright red, might swim by is just
stubbornness. If a bright red swan swam by, I'd drive upstream
looking for the joker with the food coloring dye. If you think that
one bright red swan materially changes any probabilities about swan
color, I have a very quiet parrot to sell you. Theories can make us
suspicious but they are no substitute for observations.

Some other perceptions occur and change in ways we can and do
affect. Those also indicate something important about the outside world.

Not if we're claiming that the perceptions reveal something about the
world that is independent of us. Casinos do not, for good reason,
allow manual placement of the dice. Are you deliberately ignoring the
point that some facts of nature are supposed to exist independently
of the observer?

Of course many important observations are under our control. But we
don't claim that the values of the controlled variables are
determined by something other than our own actions when in fact our
actions are almost their only determinant.

I don't see why uncontrolled perceptions should be more important
than controlled ones; in fact the opposite seems more commonly to be
the case. (I hope you won't say that the values of controlled
perceptions are always exactly and perfectly known in advance, which
would make them inappropriate for probability judgements :-).

Anything is appropriate for probability judgments if you insist that
there is an important difference between a probability of 0.999 and
one of 0.9999. But you'll have to search a long time to find a
fellow-fanatic interested in that degree of nit-picking. Despite your
disclaimer, I'm sure you can see why uncontrolled perceptions can be
more important than controlled ones; you're just refusing to admit
that class of uncontrolled perceptions into the discussion. We call
them "facts of nature" and mean by that facts which exist whether we
are here to observe them or not.

If, for some reason, you want to consider only perceptions of things
you don't control and of which you have no evidence anyone else
controls, then by all means go to it. Just don't require them to be
the only perceptions that relate to perceptions of probability.

No, of course not. But if I see you controlling something, the
easiest way to find out what its next value will be is to ask you,
not to concoct elaborate theories to predict the next value. We don't
try, yet, to predict reference values when they're set by higher
systems we can't yet model. Well, some people obviously do, but I
don't waste much time on that sort of guessing.

That includes perceptions of probability, given certain safeguards
needed to assure us that someone isn't trying to present a false
appearance, or that we're not fooling ourselves.

I'd amend the beginning of that to say "That includes some
perceptions of probability..." It includes perceptions of
probabilities about things for which you have a mental model that
"someone isn't trying to present a false appearance, or that we're
not fooling ourselves."

Yes. And before I believe an estimate of that particular apparent
probability, I want to make sure that the probability of that
conditional being true is extremely high. When p(A) is close to 1,
p(A)*p(B|A) gets pretty uninteresting. It's always just about the same as p(B).

All of this is gradually circling back toward a subject we've visited
before. Just how noisy is perception, anyway? If it turned out that
most ordinary perceptions had a noise level of just a few per cent,
speaking in terms of probabilities would become overkill, not to
mention monotonous. So just what is the so-called evidence that
neural signals corresponding to perceptions have a high random
component? As I've said before, I haven't seen any recordings of
impulse trains that looked particularly random to me, in terms of
frequency of firing. They looked to me pretty much like what I'd
expect of a frequency-modulated signal indicating normal variations
in perceptual variables. If all you mean is that some neurologist
without any workable theory of perception (or any experience with
artificial sensors) would have trouble understanding what the signals
might represent, that would tell us something about the neurologist's
uncertainties, but nothing about how that signal would be experienced
by the owner of the brain in which it was found. So unless some
evidence of real -- that is, objective, observer-independent --
uncertainty in neural signals can be established, this whole argument
becomes hypothetical, and if you like, of improbable relevance.
...

Inauguration day tomorrow. What is the probability that someone who
got on a bus in Montreal this evening to go to it will actually see
the ceremony?

I have no idea how such a probability might be arrived at, even if it
were very important to do so. One can always come up with
hypothetical conditionals and look at various probabilities that
result, but to verify that they have anything to do with reality
would take longer than the bus trip. A premise is not a fact. I have
very little interest in hypothetical conditionals until they've been
checked against observation. There are too many possible premises to
waste one's life on the untested ones, which far outnumber those we
can believe.

Best,

Bill P.


[Martin Taylor 2009.01.20.09.58]

[From Bill Powers (2009.01.20.0202 EST)]

Martin Taylor 2009.01.19.13.25 --

Actually, I think you are offering a definition that is more restricted, and moreover one that refers not to probability, but to observed (and therefore already known) proportion. You cannot assert that previously observed proportion is the same as probability without accepting that seeing 100% white among all the swans you see is proof that all swans are white. Put another way, you MUST have additional assumptions to go with your observations before you can derive a probability.

I would never conclude that ALL swans are white simply because of having seen only white swans. I wouldn't say anything about ALL swans -- only about all the swans I have seen.

Yes, you MUST have additional assumptions, as you next say you do.

Of course as you know I do make use of theoretical models to suggest what might be seen, but instead of then basing my expectations on the model, I make predictions and then test them to see if they actually hold true as far as I can tell. If they pass the tests I accept them provisionally, and the longer I go on testing them and they go on passing, the smaller is my subjective uncertainty about their truth. For me, the method of observing relative frequencies of occurrence takes precedence over all theories.

So, you are informally using Bayesian methods. You are doing exactly what I have been talking about. You combine a model with your frequency data to make sense of them. You make predictions from the model, and then see how well the data agree with those predictions.

I don't believe predictions just because I make them.

That's good. To some extent, then, you are trying to obey Jaynes's Desideratum IIIb.

The method of relative frequencies remains my main way of determining probabilities, and always will, because theories are not even indirect observations of nature. They are propositions about possible facts, not observations.

Theories are propositions about possible facts. Precisely.

You, with your Zeno's Paradox of Probability, apparently don't believe that observations can tell us anything useful about the distribution of natural occurrences.

What nonsense you do attribute to me! And how divorced can this nonsense get from what I have been saying for the last few weeks? Why do you do this? What are you controlling for? Have you even read Episodes 1 through 3?

Your argument that observations are not probabilities is, of course, literally correct. However, by your way of using this fact, there is no such thing as a property of anything. Just observing that all mercury in the past has had an observed density close to 13 g/cm^3, according to you, is no reason to assume that the next observation will have a similar value.

It is true that JUST observing that property gives you no legitimate reason to assume the next observation will have a similar value. You MUST have background assumptions (a.k.a. mental models, formal or informal).

I have to agree that the concept of density involves probabilities and conditionals, but the chances of error in the predictions, by now, are incalculably tiny by any normal statistical analysis. We can speak of measurement errors by stating the size of a standard deviation from the mean, but probable error shrinks with the number of observations until for all practical purposes we can assume the measurement to be what it appears to be.

That's entirely irrelevant to the issue, which is that without some mental model, even one as simple as "things will continue as they have been in respect of the observations I am making", you simply can NOT get a probability distribution for your next measurement from the observations already taken.

To continue to insist that there is always some probability that a swan with no head, or that is bright red, might swim by is just stubbornness.

Who is making such idiotic suggestions? Are you imputing them to me? If so, why?

I was intending to say "Let me make clear..." but it seems obvious that I cannot. However, I will try once more. If you have a mental model of what constitutes a swan, and what are the abilities of a swan, then you have a set of prior probabilities that say P(object with no head swims by | object is a swan) ~= 0. If you saw what seemed to be a headless swan swimming by, you would probably give more credence to the generic hypothesis "object is not a swan", or possibly to the more specific hypothesis "Somebody has made a mechanical model of a swan but its head has fallen off". Without a mental model of what a swan is and can do, you could not judge how likely it is that the next swan to swim by would have no head.

Theories can make us suspicious but they are no substitute for observations.

Why say this to ME of all people, given how often I have made the point that observations (I have used the word "data" more often) swamp prior probabilities? Let's reiterate the Bayes equation. It shouldn't be necessary so far into the discussion, but apparently it is. I'll use words instead of symbols and say "Observations" in place of "Data" since you prefer that word for the concept:

Posterior probability(Hypothesis given Observations and Conditionals) = Prior probability(Hypothesis given only the Conditionals) times Probability (that the Observations will occur if the Hypothesis is true, given only the Conditionals) divided by Probability (Observations given only the Conditionals)

I find the symbolic formula easier to follow, so I'll repeat it.

P(H | O, C) = P(H | C) * P(O | H, C) / P(O | C)

If you have two hypotheses to compare (even if one of them is just "theory T is wrong"), then you can use the formula to compare how much more likely one is than the other.

P(H1 | O, C) / P(H2 | O, C) = [P(H1 | C) / P(H2 | C)] * [P(O | H1, C) / P(O | H2, C)]

Put that into common language: "If you have two possible hypotheses that you want to distinguish, observe enough data, and you will be able to tell which better describes the world, regardless of your prior bias."
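
The odds form of the formula can be sketched numerically. The example below is an illustration under stated assumptions, not something from the thread: H1 says A occurs with probability 2/3 (the 2:1 ratio discussed earlier), H2 says A and B are equally likely, and the conditional C is that the observations are independent. The function name and the counts are hypothetical. Note that a prior of exactly 0 or 1 can never be swamped, so the prior odds here must be finite and positive.

```python
import math

def posterior_log_odds(n_a, n_b, prior_odds, p1=2/3, p2=1/2):
    """log [P(H1|O,C) / P(H2|O,C)] after observing n_a A's and n_b B's.

    H1: P(A) = p1, H2: P(A) = p2, observations assumed independent.
    prior_odds is P(H1|C) / P(H2|C).
    """
    log_likelihood_ratio = (n_a * math.log(p1 / p2)
                            + n_b * math.log((1 - p1) / (1 - p2)))
    return math.log(prior_odds) + log_likelihood_ratio

# A strong prior bias against H1 (odds 1:1000) survives 30 observations...
print(posterior_log_odds(20, 10, prior_odds=1e-3) < 0)
# ...but not 300 observations in the same 2:1 proportion.
print(posterior_log_odds(200, 100, prior_odds=1e-3) > 0)
```

The same holds in the other direction: with 150 A's and 150 B's, an observer who started with odds of 1000:1 in favour of H1 ends up favouring H2.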

Some other perceptions occur and change in ways we can and do affect. Those also indicate something important about the outside world.

Not if we're claiming that the perceptions reveal something about the world that is independent of us.

As I did say, did I not? That conditional is important. It says what conditions apply to your hypothesis and the observations. All I said in the bit you quote is that sometimes (often, I think) you are interested in aspects of the world that are not independent of us.

Casinos do not, for good reason, allow manual placement of the dice. Are you deliberately ignoring the point that some facts of nature are supposed to exist independently of the observer?

Of course many important observations are under our control. But we don't claim that the values of the controlled variables are determined by something other than our own actions when in fact our actions are almost their only determinant.

Oh, I understand. Control is always perfect. There's no lag between output and effect on perception in any control loop. There's no difference between reference and perception, ever. Even in conflict conditions, even in changing environments. Nobody ever slips and falls on ice. I see.

Inauguration day tomorrow. What is the probability that someone who got on a bus in Montreal this evening to go to it will actually see the ceremony?

I have no idea how such a probability might be arrived at, even if it were very important to do so.

It's certainly important to the people who boarded the bus (and there were lots of such buses from Toronto and Montreal, and probably elsewhere in Canada). Here's how I would start to judge such a probability: I'd start with the question of weather and road conditions. How likely is it that the bus would arrive in Washington before the inauguration? Then I would ask about the road conditions and access permissions in the Washington perimeter. What is the probability the bus would be allowed close enough to the area for people to get there in time? Then I would ask about the probability that a person could get through the crowd to a permissible viewing point. All of those have at least some reasonably accessible estimates on the Web, or at least I should expect that to be the case. I would have prior probabilities, but data obtained from the Web might get me to modify them.
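
As a rough sketch of that chain of questions: if each step is judged conditionally on the previous one succeeding, the overall probability is the product of the step probabilities. The numbers below are placeholders chosen only to show the arithmetic; they are not estimates of anything, and the function name is hypothetical.

```python
def chain_probability(*conditional_probs):
    """Multiply a chain of probabilities, each conditional on the steps before it."""
    result = 1.0
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        result *= p
    return result

# Placeholder values, NOT real estimates:
p_arrive = 0.9        # bus reaches Washington before the ceremony
p_access = 0.7        # bus is allowed close enough, given it arrived
p_viewpoint = 0.5     # person reaches a viewing point, given access

p_see_ceremony = chain_probability(p_arrive, p_access, p_viewpoint)
print(p_see_ceremony)
```

Revising any one of the three judgements in the light of new data changes the product accordingly, which is all that "modifying the priors" amounts to in this simple form.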

The important point is that prior probabilities come from your mental models, posterior probabilities from your models modified by your observations. If your observations are sufficient, your prior probabilities (your personal biases) become unimportant. That is true whether your observations consist of frequency estimates, measurements, or anything else derived from sensory input. What DOES matter is that the import of the observations depends critically on the models (hypotheses/theories) and the conditionals that apply to the models and to the observations.

Martin

[From Rick Marken (2009.01.20.1340)]

Martin Taylor (2009.01.20.09.58) --

So, you are informally using Bayesian methods. You are doing exactly what I
have been talking about. You combine a model with your frequency data to
make sense of them. You make predictions from the model, and then see how
well the data agree with those predictions.

It sounds like Bayesian methods are just the scientific method.
Discussing this in terms of probability seems like unnecessary
verbiage. In science we have models (like PCT) from which we make
predictions. We test these predictions against properly collected data
by seeing how well the data match the predictions. I do this all the
time, with great success, knowing nothing of Bayesian methods.

P(H | O, C) = P(H | C) * P(O | H, C) / P(O | C)

If you have two hypotheses to compare (even if one of them is just "theory T
is wrong"), then you can use the formula to compare how much more likely one
is than the other.

It seems to me that it would be a lot more useful to evaluate the
hypothesis, H, in terms of goodness of fit to observations rather than
in terms of probability. This is like the distinction between
significance testing vs measuring proportion of variance explained.
Significance tests tell the probability of O given Ho (the null
hypothesis); if the probability is low, then reject Ho and accept H.
Measures of proportion of variance explained tell how well H fits the
data, O. The latter measure shows that even if a result is
significant, H can do a lousy job of accounting for O and an
alternative H (model) should be sought. Measuring proportion of
variance explained (goodness of fit) is like doing science as you
describe it above (" You make predictions from the model, and then see
how well the data agree with those predictions").

Put that into common language: "If you have two possible hypotheses that you
want to distinguish, observe enough data, and you will be able to tell which
better describes the world, regardless of your prior bias."

Again, I think it's better to approach it as fitting models to data
rather than as selecting between hypotheses. The latter seems so
limited.

Best

Rick


--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.01.20.17.53]

[From Rick Marken (2009.01.20.1340)]
It seems to me that it would be a lot more useful to evaluate the
hypothesis, H, in terms of goodness of fit to observations rather than
in terms of probability.

You really don’t get it, do you?

P(D|H), the starting point of the Bayesian analysis, IS the goodness of
fit.

What do the symbols say? They say “the probability of getting those
data given that hypothesis.” We START from the goodness of fit!

From there, we go on to evaluating the credibility of the hypotheses
under consideration, whether there is just one or an infinite number of
them. And whether you can believe in the hypotheses, or which is the
most believable, given the data you got, is usually what you really
want to know.

This is like the distinction between
significance testing vs measuring proportion of variance explained.
Significance tests tell the probability of O given Ho (the null
hypothesis); if the probability is low, then reject Ho and accept H.

Precisely what the Bayesian method opposes. Mike Acree’s essay, and my
“Dangers of Significance testing” went into
that at some length. Significance testing is not only philosophically
insupportable, but is also dangerous in practice.

Measures of proportion of variance explained tell how well H fits the
data, O.

Well, that’s one way, and it’s a way that works well if the system is
linear and the “misfit” distribution Gaussian, even though you never
really know whether accounting for 95% of the variance is really good
or not as good as it might be. Often, the linear-Gaussian
simplification is not a bad approximation even when those assumptions
fail, if they don’t fail too badly.

However, P(D|H) as a measure of fit works not only under those
conditions, but also whenever you can have a model of how the error
(misfit) might be expected to be distributed, given your hypothesis and
the relevant conditionals. You don’t need linearity and Gaussian
distributions, but you can use them if appropriate. And you can always
fall back on them if you don’t have a good model of the expected error
distribution.

Put that into common language: "If you have two possible hypotheses that you
want to distinguish, observe enough data, and you will be able to tell which
better describes the world, regardless of your prior bias."
Again, I think it's better to approach it as fitting models to data
rather than as selecting between hypotheses. The latter seems so
limited.

Fitting models to data IS selecting among hypotheses. When you optimize
your parameter values, you are selecting among an infinite number of
possible hypotheses. I demonstrated how the Bayesian approach does it
in the curves of Episode 2 [Martin Taylor 2009.01.04.15.32]. But you
can also have hypotheses that differ in the structure of the control
system, simultaneously fit the best parameter set to each of the
candidate structures, and wind up with which structure is overall most
probable. Or, you can assess the parameters in one experiment and test
the credibility of those parameter settings for a related experiment or
one repeated later (as used to be done annually by, I think, Tom
Bourbon).
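
A minimal sketch of "fitting is selecting among hypotheses", using a stand-in one-parameter model rather than a control-system model. Everything here is an assumption made for illustration: the model y = k * x, the Gaussian misfit with known sigma, the flat prior over a grid of candidate k values, and the made-up data. Each candidate k is treated as one hypothesis H_k.

```python
import math

def log_likelihood(k, xs, ys, sigma=1.0):
    """log P(D | H_k): Gaussian misfit around the prediction y = k * x."""
    return sum(-0.5 * ((y - k * x) / sigma) ** 2
               - math.log(sigma * math.sqrt(2 * math.pi))
               for x, y in zip(xs, ys))

def grid_posterior(xs, ys, k_grid, sigma=1.0):
    """P(H_k | D) for each k on the grid, assuming a flat prior over the grid."""
    logs = [log_likelihood(k, xs, ys, sigma) for k in k_grid]
    top = max(logs)                                # subtract max for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

k_grid = [i / 100 for i in range(100, 301)]        # candidate k from 1.00 to 3.00
posterior = grid_posterior(xs, ys, k_grid)
best_k = k_grid[posterior.index(max(posterior))]
print(best_k)
```

With a flat prior and Gaussian misfit, the most probable k coincides with the least-squares fit, but the posterior also says how much less credible the neighbouring values are. Swapping in a different misfit distribution only means changing log_likelihood.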

What the Bayes approach gives you, if you want to use it overtly as an
analytic method, is flexibility combined with power, as compared to the
restrictive approach of seeing how well one particular hypothesis
(model) fits the data when you make the assumption that the effects are
linear and the errors uniformly Gaussian.

What it gives you when looking at human reasoning is a benchmark
against which you can compare the effectiveness of the human.

Martin

[From Rick Marken (2009.01.21.1230)]

Martin Taylor (2009.01.20.17.53)

Rick Marken (2009.01.20.1340)

It seems to me that it would be a lot more useful to evaluate the
hypothesis, H, in terms of goodness of fit to observations rather than
in terms of probability.

You really don't get it, do you?

I'm tryin'.

P(D|H), the starting point of the Bayesian analysis, IS the goodness of fit.

I don't see how this could be. P(D|H) is similar to the random
variable used in significance testing, where H would be the null
hypothesis. This probability is not a measure of goodness of fit; it
is a basis for deciding whether or not to reject H. If P(D|H) is
sufficiently low we reject H: the results are "significant". A
"sufficiently low" value of P(D|H) is traditionally <.01 or .05. So if
P(D|H) = .009 we say that the result is significant at the .01 level
(we reject H). But the value of P(D|H) should not be taken as a
measure of "significance". A value of P(D|H) = .000001 is not "more
significant" than a value of P(D|H) = .009. This is because P(D|H)
depends not only on the data but also on the degrees of freedom used
to compute D; the number of data points. This is why I caution my
students not to take "highly significant" results, ones with very
small reported P(D|H), as evidence of the goodness of fit of D to H.

The appropriate measure of goodness of fit is eta2 or r2, which is
based on the data, D (which, in the case of eta2, is normalized as t
or F) and the degrees of freedom (df) used to compute D. These
measures of goodness of fit (eta2 and r2) are not probabilities but,
rather, are more like RMS error. Actually, they measure the proportion
of variance in D that is accounted for by H (which in psychology is
the general linear model which says that D = k*IV).
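
The dependence on the number of data points can be sketched for the
simplest case, a test of zero correlation. This is an illustration
only, using the standard relation t = r * sqrt((n - 2) / (1 - r^2));
the function name and the example values of r and n are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for testing zero correlation, from sample correlation r and n points."""
    return r * math.sqrt((n - 2) / (1 - r * r))

r = 0.3                      # same proportion of variance explained in both cases
r_squared = r * r            # 0.09, i.e. 9% of the variance

t_small = t_statistic(r, 20)      # small sample
t_large = t_statistic(r, 2000)    # large sample

# Identical goodness of fit, very different test statistics.
print(r_squared, t_small, t_large)
```

The proportion of variance explained is the same in both cases, while
the test statistic grows roughly with the square root of n, so the
reported probability shrinks with sample size alone.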

What do the symbols say? They say "the probability of getting those data
given that hypothesis." We START from the goodness of fit!

I still don't think P(D|H) is a measure of goodness of fit. It is the
basis for a binary decision: reject or don't reject H.

This is like the distinction between
significance testing vs measuring proportion of variance explained.
Significance tests tell the probability of O given Ho (the null
hypothesis); if the probability is low, then reject Ho and accept H.

Precisely what the Bayesian method opposes. Mike Acree's essay, and my
"Dangers of Significance testing" went into that at some length.
Significance testing is not only philosophically insupportable, but is also
dangerous in practice.

I have no objection to significance testing per se. What I object to
is the reliance on significance testing as a basis for understanding
phenomena. A far better approach to understanding behavioral
phenomena is measuring goodness of fit of model to data. This is
pretty easy to do in a non-Bayesian way; just measure the average
deviation of model behavior from the actual behavior of the system
under study. If Bayesian methods provide a way to measure goodness of
fit I would sure like to know how this is done.
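
As one concrete reading of "average deviation", the sketch below computes the RMS deviation and the squared correlation between the Model and Data series posted in this thread. It is only an illustration in Python: the function names are made up here, and r2 is taken to mean the squared Pearson correlation, which is an assumption.

```python
import math

# Series posted in this thread (23 samples each).
data = [3, 5, 6, 6, 7, 8, 10, 10, 7, 7, 6, 5, 5, 5, 4, 5, 6, 6, 10, 11, 12, 11, 9]
model = [2, 5, 7, 5, 8, 8, 9, 6, 3, 3, 5, 5, 6, 6, 8, 9, 10, 12, 15, 15, 11, 7, 4]

def rms_deviation(model, data):
    """Root-mean-square difference between model and data."""
    squared = [(m - d) ** 2 for m, d in zip(model, data)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(model, data):
    """Squared Pearson correlation between model and data (assumed meaning of r2)."""
    n = len(data)
    mean_m = sum(model) / n
    mean_d = sum(data) / n
    cov = sum((m - mean_m) * (d - mean_d) for m, d in zip(model, data))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_d = sum((d - mean_d) ** 2 for d in data)
    return cov * cov / (var_m * var_d)

print("RMS deviation:", round(rms_deviation(model, data), 3))
print("r squared:", round(r_squared(model, data), 3))
```

Neither number is a probability; both summarize how far the model track is from the data track.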

Anyway, it is the fact that measures of goodness of fit (to the
general linear or causal model of behavior) are so low in conventional
psychology (on average about .3 -- only 30% of the variance in the
data is accounted for by the model) that leads me to reject this model
in favor of a closed loop model like PCT which regularly accounts for
98% of the variance in the data.

Measures of proportion of variance explained tell how well H fits the
data, O.

Well, that's one way, and it's a way that works well if the system is linear
and the "misfit" distribution Gaussian, even though you never really know
whether accounting for 95% of the variance is really good or not as good as
it might be.

This I don't understand at all. Why does the system have to be linear?
I've used r2 to analyze non-linear control systems. And why does the
misfit (it's called "residual") distribution have to be Gaussian? The
distribution of the residuals might give a hint about how to change
the model but there is no need for it to be gaussian unless you are
doing significance tests on the measure of goodness of fit itself.

However, P(D|H) as a measure of fit works not only under those conditions,
but also whenever you can have a model of how the error (misfit) might be
expected to be distributed, given your hypothesis and the relevant
conditionals.

Please show me how to calculate P(D|H). If I can see how to calculate
it then maybe I can see why it is a measure of goodness of fit. Since
probabilities are in the interval (0,1) then I presumed that goodness
of fit is good if P(D|H) is close to 1. So how do I compute P(D|H)?
Here is some example data that you can use to demonstrate the
calculation:

Data: 3 5 6 6 7 8 10 10 7 7 6 5 5 5 4 5 6 6 10 11 12 11 9
Model: 2 5 7 5 8 8 9 6 3 3 5 5 6 6 8 9 10 12 15 15 11 7 4

Think of these numbers as samples (at equal intervals) of handle
position in a tracking task. The data is the behavior of the system;
the model is the behavior of a model of that system. What is the
goodness of fit of the model to the data in terms of P(D|H)? (Please
show your work).

Fitting models to data IS selecting among hypotheses. When you optimize your
parameter values, you are selecting among an infinite number of possible
hypotheses. I demonstrated how the Bayesian approach does it in the curves
of Episode 2 [Martin Taylor 2009.01.04.15.32].

OK. But could you show me how it's done using the data above? I'm a
very concrete thinker, I'm afraid.

What the Bayes approach gives you, if you want to use it overtly as an
analytic method, is flexibility combined with power, as compared to the
restrictive approach of seeing how well one particular hypothesis (model)
fits the data when you make the assumption that the effects are linear and
the errors uniformly Gaussian.

Sounds great. But I would find it more interesting if I knew how to
use this Bayes magic myself. So, again, I'd appreciate it if you could
illustrate its use with the data above.

Thanks

Rick

--
Richard S. Marken PhD
rsmarken@gmail.com

[Martin Taylor 2009.01.21.17.45]

[From Rick Marken (2009.01.21.1230)]
Martin Taylor (2009.01.20.17.53)
Rick Marken (2009.01.20.1340)
It seems to me that it would be a lot more useful to evaluate the
hypothesis, H, in terms of goodness of fit to observations rather than
in terms of probability.
You really don't get it, do you?
I'm tryin'.
P(D|H), the starting point of the Bayesian analysis, IS the goodness of fit.
I don't see how this could be. P(D|H) is similar to the random
variable used in significance testing, where H would be the null
hypothesis.

No. Firstly, it’s not a random variable at all. “H” is any hypothesis,
“D” is the data. P(D|H) is the probability of getting the precise data
given the hypothesis, and that’s not random. When P(D | H) is used for
fitting, H combines your control model and some model of how consistent
the person is, and D is the actual track made by the person (or a
hypothetical track you use to test out your algorithms). If the person
is extremely consistent, then the deviations from your control model
will matter because P(D|H) will be lower than it might be with a better
control model, even if the control model accounts for 95% of the
variance. On the other hand, if the person is quite variable, you might
well find you have a quite good fit even if the control model accounts
for only 50% of the variance. Of course, in this latter case, you would
have a hard time finding a better control model, whereas in the former
it might be easy. The point is that finding the control model to
account for X% of the variance does not, in itself, tell you how good a
fit your model is to the human.
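
The dependence on the subject's consistency can be made concrete under one simplifying assumption that is not in the post: independent Gaussian errors with a common standard deviation `sigma`. Under that assumption the log likelihood ratio of two models depends only on their sums of squared errors and on `sigma`. All numbers below are invented for illustration.

```python
import math

def log_likelihood_ratio(sse_a, sse_b, sigma):
    """ln[P(D|A) / P(D|B)] for independent Gaussian errors with s.d. sigma.

    The normalising constants cancel because both models are scored on the
    same samples with the same sigma.
    """
    return (sse_b - sse_a) / (2.0 * sigma ** 2)

# Hypothetical sums of squared errors for two candidate control models.
sse_a, sse_b = 20.0, 30.0

for sigma in (0.5, 3.0):  # a consistent subject, then a variable one
    llr = log_likelihood_ratio(sse_a, sse_b, sigma)
    print(f"sigma={sigma}: odds for A over B = {math.exp(llr):.3g}")
```

With a consistent subject (small `sigma`) the same difference in squared error separates the two models sharply; with a variable subject it barely separates them, whatever percentage of variance either model accounts for.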

Ideally, your model of how consistent the person is should come from
some other experiment, but as with most ideals, that is usually not
realistic. You probably will have to do it from data contained in the
track itself. In your normal procedure, you use the RMS difference
between the person’s track and the model’s track, but that’s not
proper, since you are using this difference to assess the goodness of
fit. It is, however, not unreasonable as a fallback position.

You could, of course, also test a different hypothesis, such as “the
subject is not controlling”, which would be the “null hypothesis” you
suggest above.



Please show me how to calculate P(D|H). If I can see how to calculate
it then maybe I can see why it is a measure of goodness of fit. Since
probabilities are in the interval (0,1) then I presumed that goodness
of fit is good if P(D|H) is close to 1. So how do I compute P(D|H)?
Here is some example data that you can use to demonstrate the
calculation:
Data: 3 5 6 6 7 8 10 10 7 7 6 5 5 5 4 5 6 6 10 11 12 11 9
Model: 2 5 7 5 8 8 9 6 3 3 5 5 6 6 8 9 10 12 15 15 11 7 4
Think of these numbers as samples (at equal intervals) of handle
position in a tracking task. The data is the behavior of the system;
the model is the behavior of a model of that system. What is the
goodness of fit of the model to the data in terms of P(D|H)? (Please
show your work).

For reasons I mentioned above, it isn’t easy to find the reliability of
the subject from these data (which I assume are made up, rather than
from a real experiment). Anyway, I did the best I could, though someone
else might be able to do better. Here’s what I did to make a rough
guess at the subject’s reliability (using Excel 2008, spreadsheet
attached)

  1. I made a graph of the model, the track, and the error.

  2. By eye, it seemed clear that the model did not have as much lag as
    the track, so I shifted the track by one sample at a time and got an
    error series for all shifts between 0 and 4 samples (rows 3-7). For
    lack of a better measure, I took the sum of squared errors for each lag
    differential to get a rough estimate of the most reasonable one. Since
    each shift meant one sample had no comparison sample in the model, I
    used only those model samples for which all lags gave a proper error
    value (rows 9-13). Since lags 2 and 3 both gave the same minimum sum of
    squared errors, I used both for the next stage.

  3. Using the lagged errors for lags 2 and 3, I counted the number of
    samples for which the error was -5, -4, …,+5, took each value as a
    percentage, and plotted the result, which was quite irregular (as one
    might expect with so few samples to work with).

  4. In order to get a more realistic estimate of the subject’s
    reliability, I needed to smooth the error distribution. That is because
    I had as part of my hypothesis a prior that said it was highly
    improbable that the subject would have a non-monotonic error
    distribution. To smooth the distribution, I summed the error totals
    that were symmetric positive and negative (added the errors of -1 to
    those of +1, for example). The curve was still non-monotonic, so I did
    a by-eye smoothing to create a “final” error distribution representing
    the subject’s inconsistencies that I would use along with the model
    track to find P(D|H).
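
Steps 2 to 4 can be sketched in Python as follows. The shift direction (each data sample is compared with the model value `lag` samples earlier), the choice of common window, and all function names are assumptions made for this sketch, so the sums it prints are not expected to reproduce the spreadsheet's figures. The by-eye smoothing of step 4 is not attempted; only the folding of positive and negative errors is shown.

```python
from collections import Counter

data = [3, 5, 6, 6, 7, 8, 10, 10, 7, 7, 6, 5, 5, 5, 4, 5, 6, 6, 10, 11, 12, 11, 9]
model = [2, 5, 7, 5, 8, 8, 9, 6, 3, 3, 5, 5, 6, 6, 8, 9, 10, 12, 15, 15, 11, 7, 4]
MAX_LAG = 4

def lagged_errors(lag):
    """Errors data[t] - model[t - lag] over the samples valid for every lag."""
    return [data[t] - model[t - lag] for t in range(MAX_LAG, len(data))]

# Step 2: sum of squared errors for each added lag.
sse = {lag: sum(e * e for e in lagged_errors(lag)) for lag in range(MAX_LAG + 1)}
best_lag = min(sse, key=sse.get)

# Step 3: relative frequency of each error value, pooling the two lags
# named in the post.
pooled = lagged_errors(2) + lagged_errors(3)
signed_freq = {e: n / len(pooled) for e, n in sorted(Counter(pooled).items())}

# Step 4 (first half): fold the distribution about zero by pooling -e with +e.
folded_freq = {}
for e, f in signed_freq.items():
    folded_freq[abs(e)] = folded_freq.get(abs(e), 0.0) + f

print(sse, best_lag)
print(folded_freq)
```

With so few samples the folded frequencies remain irregular, which is why some further smoothing is needed before they can serve as an error distribution.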

Comment at this point: The error distribution found in this way is
likely to overestimate the subject’s underlying inconsistency, but that
is good, because it gives a conservative estimate of the relative
goodness of fit for different possible models. You don’t want to assume
the subject is more precise than you have justification for, because if
you do, deviations from your model track that are due to subject
inconsistency will tend to show up as meaningful failures of the model.

The next thing is to find P(D|H), given the derived error distribution
for the subject. At this point, you probably can see why it is
inadvisable to derive the subject’s consistency from the track you are
trying to evaluate. Nevertheless, I had to do it in this case, because
of the lack of other useable data. Being able to try different lags
helps a little, but not much. So, we press on, assuming that the
derived error distribution is not terribly much too wide.

When we find P(D|H) sample by sample, the derived probability is going
to be VERY low. Before we start, let’s see why this is so, and why it
doesn’t matter for what you want to know, which may be how this model
compares to other possible models that might be tested against the same
disturbance pattern, or possibly how much information the model gives
you about the track.

Looking at the derived error distribution, the most probable error
value is zero, but this has a probability of only about 0.37. So, over
a series of 18 samples, the highest possible probability for the entire
series is (0.37)^18 = 1.7 *10^-8. Since the errors are distributed at
least as widely as the derived error curve suggests (they are due to
both model mismatch and subject error), the actual probability of any
realistic track will be much lower.
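
The bound quoted above is a one-line calculation; the variable names are illustrative.

```python
p_zero_error = 0.37   # probability of the most probable error value (zero)
n_samples = 18        # number of matched samples

# Best case: every sample lands on the most probable error value.
upper_bound = p_zero_error ** n_samples
print(f"{upper_bound:.1e}")   # about 1.7e-08
```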

Why does this not matter? Because it will be true for any model, and
the denominator of the Bayes formula, which is P(D), is of the same
order of magnitude, though smaller (and probably undiscoverable). The
real problem is that with only one possible model to choose from, its
probability of being the right one is by definition 1.0. For this
reason, I’m going to do the analysis using six models for comparison,
five being the model track with added lags of zero to four samples, and
the sixth being the “null hypothesis” that the subject is not
controlling. I label this “null hypothesis” H0.

  5. Using the derived distribution of subject variability, we find the
    probability for each sample. For example, for the zero lag case, the
    first sample has an error of -1. The derived subject variability curve
    assigns an error of -1 a probability of 0.30, so, for that datum
    we enter 0.3. Similarly for all the samples in the set of 18 that get
    matched at all lags (spreadsheet lines 22-26).

  6. To get a raw P(D|H) we multiply the probabilities for each data
    sample, to get the values in cells U22-U26. These are, as forecast,
    VERY small numbers, ranging from about 10^-13 to 10^-17. P(D|H0) is
    even smaller (as it should be if the subject was really controlling) at
    about 3*10^-18.
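
A minimal sketch of steps 5 and 6 in Python. The error table here is hypothetical: only the value 0.37 for zero error is taken from the post, and every other entry is an invented placeholder chosen so that the table is symmetric and sums to one. Summing logarithms instead of multiplying raw probabilities is a design choice of the sketch, made because the products become very small.

```python
import math

# Hypothetical symmetric error distribution p(error). Only p(0) = 0.37 comes
# from the post; the other entries are invented placeholders.
ONE_SIDE = {1: 0.15, 2: 0.08, 3: 0.04, 4: 0.025, 5: 0.015, 6: 0.005}
P_ERROR = {0: 0.37}
for size, p in ONE_SIDE.items():
    P_ERROR[size] = p
    P_ERROR[-size] = p

def log_likelihood(errors, p_error=P_ERROR):
    """ln P(D|H), treating the samples as independent."""
    return sum(math.log(p_error[e]) for e in errors)

def likelihood(errors, p_error=P_ERROR):
    """P(D|H) as a plain product of the per-sample probabilities."""
    return math.exp(log_likelihood(errors, p_error))

example_errors = [-1, 0, 1, 4]   # hypothetical per-sample errors
print(likelihood(example_errors))
```

An error outside the table raises a `KeyError`, which is deliberate: the table has to cover every error value that can occur before the product means anything.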

The next step should incorporate prior probabilities, but since these
can be different for different people interpreting the data, they can’t
be publicly reportable. What can be publicly reported are the relative
likelihood ratios. To put this in symbols, the Bayes way of fully
comparing two hypotheses is given (ignoring the necessary conditionals)
by

    P(H1|D)     P(D|H1) P(H1)
    -------  =  -------------,
    P(H2|D)     P(D|H2) P(H2)


but when we have no publicly reportable priors, we are left with the
ratios of P(D|Hn), the values we computed above. In a useful
experiment, these ratios get large enough to swamp most reasonable
(unideological) priors, as they do in this hypothetical example.

  7. We look at the relative likelihoods, by dividing all of them by the
    most likely (in this case the hypothesis that the model should have an
    added lag of 2 samples). Lag 3 is almost as likely, the bet being only
    10 to 9 on Lag 2 being a better bet than Lag 3. But it’s about 1000 to
    1 against the “true” lag being zero and 12 to 1 against the true lag
    being 1 sample. The model as presented is a better bet than the null
    hypothesis by 24 to 1 (which I suppose would be called a “pretty good
    fit”), but the model with an added lag of 2 samples is a better bet
    than the “null hypothesis” by 36,000 to 1. You would be fairly safe in
    saying that the subject was actually controlling, if not controlling
    precisely the way the presented model suggests.

Only if you were previously prepared to bet 1000 to 1 against the
correct model having any added lag would you now be justified in
holding on to that possibility as a reasonable one.
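
The betting arithmetic can be sketched in a few lines of Python. The likelihood values are hypothetical round numbers, not the spreadsheet's, and the function names are made up for this illustration.

```python
import math

def odds_for_best(log_likelihoods):
    """Odds in favour of the most likely hypothesis over each hypothesis."""
    best = max(log_likelihoods.values())
    return {name: math.exp(best - ll) for name, ll in log_likelihoods.items()}

def posterior_odds(likelihood_ratio, prior_odds):
    """Bayes in odds form: posterior odds = likelihood ratio * prior odds."""
    return likelihood_ratio * prior_odds

# Hypothetical likelihoods, stored as logs to avoid underflow.
log_l = {"lag 0": math.log(1e-16), "lag 2": math.log(1e-13)}

ratio = odds_for_best(log_l)["lag 0"]          # about 1000 to 1 for lag 2
# A prior of 1000 to 1 against any added lag brings the bet back to
# roughly even odds.
print(posterior_odds(ratio, 1.0 / 1000.0))
```

Only a prior that extreme cancels a likelihood ratio of this size; any milder prior leaves the lagged model ahead.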

Comment: The numbers above are subject to the usual caveat when I use
spreadsheets, that I may well have made some stupid typo, so the
numbers may not be correct. They could be wildly wrong, but you can
check if you want, on the spreadsheet. The method should be correct,
however, and should suffice to show the principles involved.

Sounds great. But I would find it more interesting if I knew how to
use this Bayes magic myself. So, again, I'd appreciate it if you could
illustrate its use with the data above.

I don’t know if this example is adequate. I hope it helps to illustrate
the process. The biggest problem was to derive the subject’s
variability from the track data, which is not really legitimate, even
if it may often be necessary. I made quite a few assumptions I could
not properly justify, but I think they are conservative, in that they
probably make the subject variance seem larger than it probably should
be. In a real life situation, you might want to take samples more
widely separated, to remove the correlations among successive samples,
but that probably doesn’t matter much when computing the likelihood
ratios.

If you don’t even have the track data, as in the mindreading demo you
want me to analyze, then you make do with what you do have. The problem
in the mindreading demo is that without the track data it will be very
hard to get an estimate of the subject’s variability just from the
ongoing correlations. But I’ll do what I can with it.

Anyway, I hope this helps. The idea was not to solve the particular
example, but to show how P(D|H) can be a measure of model fit, and how
even though the fit for one model may be “good” (betting 24 to 1
against the null hypothesis), yet it may be much worse than the fit of
an alternative model.

Martin

ModelFit.xls (61 KB)

[Martin Taylor 2009.01.22.10.03]

When I do calculations after midnight (despite what the time-stamp
below says) I should wait till morning before posting !

[Martin Taylor 2009.01.21.17.45]

[From Rick Marken (2009.01.21.1230)]


Please show me how to calculate P(D|H). If I can see how to calculate
it then maybe I can see why it is a measure of goodness of fit. Since
probabilities are in the interval (0,1) then I presumed that goodness
of fit is good if P(D|H) is close to 1. So how do I compute P(D|H)?
Here is some example data that you can use to demonstrate the
calculation:
Data: 3 5 6 6 7 8 10 10 7 7 6 5 5 5 4 5 6 6 10 11 12 11 9
Model: 2 5 7 5 8 8 9 6 3 3 5 5 6 6 8 9 10 12 15 15 11 7 4
Think of these numbers as samples (at equal intervals) of handle
position in a tracking task. The data is the behavior of the system;
the model is the behavior of a model of that system. What is the
goodness of fit of the model to the data in terms of P(D|H)? (Please
show your work).

In the following calculations I mentioned, but did not take into
account, the fact that the data for successive samples are correlated. If
on one sample the error is, say, +2, then the error in the next sample
is more likely to be close to +2 than is the error taken at another
random sample somewhere else along the track far from the first sample.
The effect of taking this into account is discussed below.

For reasons I mentioned above, it isn’t easy to find the reliability of
the subject from these data (which I assume are made up, rather than
from a real experiment). Anyway, I did the best I could, though someone
else might be able to do better. Here’s what I did to make a rough
guess at the subject’s reliability (using Excel 2008, spreadsheet
attached)

  1. I made a graph of the model, the track, and the error.

  2. By eye, it seemed clear that the model did not have as much lag as
    the track, so I shifted the track by one sample at a time and got an
    error series for all shifts between 0 and 4 samples (rows 3-7). For
    lack of a better measure, I took the sum of squared errors for each lag
    differential to get a rough estimate of the most reasonable one. Since
    each shift meant one sample had no comparison sample in the model, I
    used only those model samples for which all lags gave a proper error
    value (rows 9-13). Since lags 2 and 3 both gave the same minimum sum of
    squared errors, I used both for the next stage.

  3. Using the lagged errors for lags 2 and 3, I counted the number of
    samples for which the error was -5, -4, …,+5, took each value as a
    percentage, and plotted the result, which was quite irregular (as one
    might expect with so few samples to work with).

  4. In order to get a more realistic estimate of the subject’s
    reliability, I needed to smooth the error distribution. That is because
    I had as part of my hypothesis a prior that said it was highly
    improbable that the subject would have a non-monotonic error
    distribution. To smooth the distribution, I summed the error totals
    that were symmetric positive and negative (added the errors of -1 to
    those of +1, for example). The curve was still non-monotonic, so I did
    a by-eye smoothing to create a “final” error distribution representing
    the subject’s inconsistencies that I would use along with the model
    track to find P(D|H).

Comment at this point: The error distribution found in this way is
likely to overestimate the subject’s underlying inconsistency, but that
is good, because it gives a conservative estimate of the relative
goodness of fit for different possible models. You don’t want to assume
the subject is more precise than you have justification for, because if
you do, deviations from your model track that are due to subject
inconsistency will tend to show up as meaningful failures of the model.

The next thing is to find P(D|H), given the derived error distribution
for the subject. At this point, you probably can see why it is
inadvisable to derive the subject’s consistency from the track you are
trying to evaluate. Nevertheless, I had to do it in this case, because
of the lack of other useable data. Being able to try different lags
helps a little, but not much. So, we press on, assuming that the
derived error distribution is not terribly much too wide.

When we find P(D|H) sample by sample, the derived probability is going
to be VERY low. Before we start, let’s see why this is so, and why it
doesn’t matter for what you want to know, which may be how this model
compares to other possible models that might be tested against the same
disturbance pattern, or possibly how much information the model gives
you about the track.

Looking at the derived error distribution, the most probable error
value is zero, but this has a probability of only about 0.37. So, over
a series of 18 samples, the highest possible probability for the entire
series is (0.37)^18 = 1.7 *10^-8. Since the errors are distributed at
least as widely as the derived error curve suggests (they are due to
both model mismatch and subject error), the actual probability of any
realistic track will be much lower.

The value of P(D|H) is indeed going to be very low, but not as low as
this calculation, which is based on the idea that successive samples
are independent, would suggest. To get a better estimate of
probability, we have to determine the probability value for a sample
given not only the hypothesis, but also the values of preceding
samples. Alternatively, losing some of the power potentially available,
we can determine from the track a rough separation of samples that
allows us to choose effectively independent ones.

In the revised spreadsheet, I have computed the autocorrelation of the
error values for the zero-lag model; it falls to zero at a separation of
about 3.5 samples. For ease of computation I will take it as 3.0, and average
successive 3-sample windows in computing the relative probabilities,
giving us six independent sample values rather than the 18 independent
values used in last night’s analysis. Doing so makes a BIG difference
in the likelihood ratios. This is very rough, and it can be done more
precisely, but we are interested here in the method, not the result.
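
A rough Python sketch of the autocorrelation step, assuming the errors are the zero-lag differences over all 23 posted samples (the spreadsheet used a shorter matched set, so the crossing point found here need not be 3.5). The function names and the linear interpolation of the zero crossing are choices made for this sketch.

```python
data = [3, 5, 6, 6, 7, 8, 10, 10, 7, 7, 6, 5, 5, 5, 4, 5, 6, 6, 10, 11, 12, 11, 9]
model = [2, 5, 7, 5, 8, 8, 9, 6, 3, 3, 5, 5, 6, 6, 8, 9, 10, 12, 15, 15, 11, 7, 4]
errors = [d - m for d, m in zip(data, model)]   # zero-lag errors

def autocorrelation(x, max_sep):
    """Normalised autocorrelation of x for separations 0..max_sep."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / var
            for k in range(max_sep + 1)]

def first_zero_crossing(acf):
    """Separation at which the autocorrelation first reaches zero, or None."""
    for k in range(1, len(acf)):
        if acf[k] <= 0:
            # Linear interpolation between separations k-1 and k.
            return (k - 1) + acf[k - 1] / (acf[k - 1] - acf[k])
    return None

acf = autocorrelation(errors, 10)
crossing = first_zero_crossing(acf)
print([round(a, 2) for a in acf])
print(crossing)
```

Samples spaced at least this far apart can be treated as roughly independent, which is the spacing the windowing below relies on.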

Why does this not matter? Because it will be true for any model, and
the denominator of the Bayes formula, which is P(D), is of the same
order of magnitude, though smaller (and probably undiscoverable). The
real problem is that with only one possible model to choose from, its
probability of being the right one is by definition 1.0. For this
reason, I’m going to do the analysis using six models for comparison,
five being the model track with added lags of zero to four samples, and
the sixth being the “null hypothesis” that the subject is not
controlling. I label this “null hypothesis” H0.

  5. Using the derived distribution of subject variability, we find the
    probability for each sample. For example, for the zero lag case, the
    first sample has an error of -1. The derived subject variability curve
    assigns an error of -1 a probability of 0.30, so, for that datum
    we enter 0.3. Similarly for all the samples in the set of 18 that get
    matched at all lags (spreadsheet lines 22-26).

  6. To get a raw P(D|H) we multiply the probabilities for each data
    sample, to get the values in cells U22-U26. These are, as forecast,
    VERY small numbers, ranging from about 10^-13 to 10^-17. P(D|H0) is
    even smaller (as it should be if the subject was really controlling) at
    about 3*10^-18.

6a. To take into account the autocorrelation of the error values across
samples, average the values for sample windows long enough to permit
successive averaged samples to be effectively independent (these
windows should not be the rectangular ones I’m using, but should be
shaped to account for the shape of the autocorrelation function). The
averages for successive non-overlapping windows of 3 samples are in
cells 38-42. To get the values of P(D|H), we do the same for these as
noted above for step 6. We still get small numbers (on the order of
10^-4 or 10^-5) but they are orders of magnitude bigger than the
probabilities we get by assuming each sample is independent.
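
One reading of step 6a, sketched in Python: the per-sample probabilities are averaged over non-overlapping rectangular windows of three samples, and the six averages are then multiplied as in step 6. Whether the spreadsheet averages probabilities or errors is not stated, so that reading is an assumption, and the per-sample values below are invented.

```python
import math

def window_means(values, width=3):
    """Means of successive non-overlapping windows; a ragged tail is dropped."""
    n_windows = len(values) // width
    return [sum(values[i * width:(i + 1) * width]) / width
            for i in range(n_windows)]

# Hypothetical per-sample probabilities for 18 matched samples.
per_sample = [0.30, 0.37, 0.30, 0.05, 0.05, 0.05,
              0.30, 0.37, 0.30, 0.30, 0.05, 0.05,
              0.05, 0.02, 0.03, 0.05, 0.30, 0.05]

windowed = window_means(per_sample)
p_independent = math.prod(per_sample)   # all 18 samples treated as independent
p_windowed = math.prod(windowed)        # six effectively independent values
print(len(windowed), p_independent, p_windowed)
```

Because only six factors are multiplied instead of eighteen, the windowed product is orders of magnitude larger, and the ratios between hypotheses shrink accordingly.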

The next step should incorporate prior probabilities, but since these
can be different for different people interpreting the data, they can’t
be publicly reportable. What can be publicly reported are the relative
likelihood ratios. To put this in symbols, the Bayes way of fully
comparing two hypotheses is given (ignoring the necessary conditionals)
by

    P(H1|D)     P(D|H1) P(H1)
    -------  =  -------------,
    P(H2|D)     P(D|H2) P(H2)


but when we have no publicly reportable priors, we are left with the
ratios of P(D|Hn), the values we computed above. In a useful
experiment, these ratios get large enough to swamp most reasonable
(unideological) priors, as they do in this hypothetical example.

  7. We look at the relative likelihoods, by dividing all of them by the
    most likely (in this case the hypothesis that the model should have an
    added lag of 2 samples). Lag 3 is almost as likely, the bet being only
    10 to 9 on Lag 2 being a better bet than Lag 3. But it’s about 1000 to
    1 against the “true” lag being zero and 12 to 1 against the true lag
    being 1 sample. The model as presented is a better bet than the null
    hypothesis by 24 to 1 (which I suppose would be called a “pretty good
    fit”), but the model with an added lag of 2 samples is a better bet
    than the “null hypothesis” by 36,000 to 1. You would be fairly safe in
    saying that the subject was actually controlling, if not controlling
    precisely the way the presented model suggests.

Only if you were previously prepared to bet 1000 to 1 against the
correct model having any added lag would you now be justified in
holding on to that possibility as a reasonable one.

7a. With the corrected numbers for P(D|H) for the different hypotheses,
these betting ratios become much less extreme. Now Lag 3 is the most
likely, a 5:3 favourite over Lag 2, and about 10:1 over Lag 0 (the
original model). The bet as to whether the subject was controlling is
better than 40:1, and even the original model is nearly 5:1 favourite
over “not controlling”.

Comment: The numbers above are subject to the usual caveat when I use
spreadsheets, that I may well have made some stupid typo, so the
numbers may not be correct.

That certainly was true, though the “typo” was a conceptual brain fart!
It may still be true in this revision.

They could be wildly wrong, but you can
check if you want, on the spreadsheet. The method should be correct,
however, and should suffice to show the principles involved.

Sounds great. But I would find it more interesting if I knew how to
use this Bayes magic myself. So, again, I'd appreciate it if you could
illustrate its use with the data above.

I don’t know if this example is adequate. I hope it helps to illustrate
the process. The biggest problem was to derive the subject’s
variability from the track data, which is not really legitimate, even
if it may often be necessary. I made quite a few assumptions I could
not properly justify, but I think they are conservative, in that they
probably make the subject variance seem larger than it probably should
be. In a real life situation, you might want to take samples more
widely separated, to remove the correlations among successive samples,
but that probably doesn’t matter much when computing the likelihood
ratios.

This last “that probably doesn’t matter much” was the brain fart. It
matters a lot, and I knew that. Pure sleepiness at 2am (I hope).

If you don’t even have the track data, as in the mindreading demo you
want me to analyze, then you make do with what you do have. The problem
in the mindreading demo is that without the track data it will be very
hard to get an estimate of the subject’s variability just from the
ongoing correlations. But I’ll do what I can with it.

Anyway, I hope this helps. The idea was not to solve the particular
example, but to show how P(D|H) can be a measure of model fit, and how
even though the fit for one model may be “good” (betting 24 to 1
against the null hypothesis), yet it may be much worse than the fit of
an alternative model.

Martin

I hope the revision makes more sense, and that either version is
helpful.

Martin

ModelFit1.xls (72 KB)

[From Bill Powers (2008.01.22.1043 MST)]

Martin Taylor 2009.01.22.10.03 --

I'll try to understand your way of analyzing Rick's data, but while I
do that I attach a file of real data generated by the "TrackAnalyze"
program that comes with the new book and is discussed in chapter 4.

Default.txt (94.9 KB)

Default.par (144 Bytes)

============================================================================
In default.txt, the space-delimited format is

  title
  difficulty disturbance_number
  time_60ths_of_sec cursor_position target_position
   (repeated a total of 3600 times)

in default.par (a text file readable with notepad) the
space-delimited format is

  title
  path_and_filename
  difficulty disturbance_number
  delay_60ths_of_sec gain damping ref level [best fit values]
  model_fit_error_% tracking_error_%
   [percentages are percent of max - min target range]
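
A minimal Python sketch of a reader for the default.txt layout described above. It assumes every numeric field can be read as a float and that blank lines can be skipped; the function name, the dictionary keys and the three-row sample are all made up for illustration.

```python
def parse_track_file(text):
    """Parse the space-delimited track layout into a dictionary."""
    lines = [line for line in text.splitlines() if line.strip()]
    title = lines[0].strip()
    difficulty, disturbance_number = lines[1].split()[:2]
    times, cursor, target = [], [], []
    for line in lines[2:]:
        t, c, p = line.split()[:3]
        times.append(float(t))      # time in 60ths of a second
        cursor.append(float(c))     # cursor position
        target.append(float(p))     # target position
    return {
        "title": title,
        "difficulty": float(difficulty),
        "disturbance_number": float(disturbance_number),
        "time": times,
        "cursor": cursor,
        "target": target,
    }

# Invented three-row sample in the same layout (a real run has 3600 rows).
sample = """demo run
3 1
0 10 12
1 11 13
2 12 13
"""
run = parse_track_file(sample)
print(run["title"], len(run["time"]))
```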

Here is the model with some added annotations:

procedure TAnalysisForm.RunModel;
var
   T: Integer;
begin
   FillChar(DelayBuffer,SizeOf(DelayBuffer),0); {prepare delay buffer}
   Damping := 0.001*DampSet; {DampSet = displayed damping}
   Gain := 0.1*GainSet; {GainSet = displayed gain}
   TimeLag := Round(DelaySet); {DelaySet = displayed delay}
   ModelRef := Round(RefSet/10); {RefSet = displayed ref lev}
   mHand := MouseVal[1]; {first model mouse position set to first real mouse pos}
   mCurs := mHand; {cursor = mouse position}
   T := 1;
     {MODEL RUN STARTS HERE}
   while T <= LastData do {LastData = 3600}
   begin
     mPerc := (mCurs - TargetVal[T]); {model perception = cursor - target}
     mDelP := TransportLag(mPerc,TimeLag); {delayed perception}
     mErr := ModelRef - mDelP; {error = reference - del percep}
      {mHand, model hand position, = integral(error) minus leakage factor}
     mHand := mHand +(Gain*mErr - Damping*mHand)*dt; {dt is 1/60 sec}
          { prevent crash if model gets unstable}
     if mHand > 1000 then mHand := 1000
     else if mHand < -1000 then mHand := -1000;

     mCurs := mHand; {model cursor pos = model hand position}
     ModelPercep[T] := mPerc; { Save data in arrays for output}
     ModelDelPerc[T] := mDelP;
     ModelCursor[T] := mCurs;
     ModelHandle[T] := mHand;
     FitErr[T] := mHand - MouseVal[T]; { compare mouse positions}
     PredictErr[T] := MouseVal[T] - TargetVal[T]; {really is tracking error}
     Inc(T); { next 60th of a second}
   end;
end;
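
For readers without Delphi, the loop above can be sketched in Python. This is a hedged port, not the TrackAnalyze code: the parameters are passed already scaled, the rounding of the reference level is left out, and `TransportLag` is assumed to return the perception from `delay` samples earlier, starting from a buffer of zeros.

```python
from collections import deque

def run_model(target, first_hand, gain, damping, delay, ref=0.0, dt=1.0 / 60.0):
    """Return the model hand position for each target sample."""
    # Assumed behaviour of TransportLag: a FIFO pre-filled with zeros.
    lag_buffer = deque([0.0] * delay)
    hand = first_hand
    cursor = hand
    hands = []
    for targ in target:
        perception = cursor - targ
        lag_buffer.append(perception)
        delayed = lag_buffer.popleft()          # perception `delay` samples ago
        error = ref - delayed
        # Leaky integration of the error, as in the Delphi listing.
        hand += (gain * error - damping * hand) * dt
        hand = max(-1000.0, min(1000.0, hand))  # guard against instability
        cursor = hand
        hands.append(hand)
    return hands

# Hypothetical run: a constant target held for 60 seconds.
track = run_model([100.0] * 3600, first_hand=0.0, gain=5.0, damping=0.05, delay=8)
print(round(track[-1], 3))
```

With a constant target the hand settles where `gain * error` balances `damping * hand`, a little short of the target.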

For purposes of computation you can no doubt simplify the equations.
The data file (.txt) is generated in this format for every tracking
run, with the file name determined while setting up for the run (see
chapter 4). If you skip assigning a file name, the "default.txt" file
is overwritten. The parameter file (.par) is generated after
analyzing a run, using the appropriate root file name.

I assume that you can automate your method of applying Bayes' Theorem
to this data (which is the real data for the plot I transmitted
yesterday). My method is quite simple: for each parameter in turn,
I start with an assumed value, then increase it by fixed steps until
the fit error begins to increase, at which time the size of the fixed
step is divided by five and its sign is reversed. The initial step
size is set to half the estimated range of each parameter (larger
than any observed range). The process terminates when the absolute
value of the step size is less than 0.001 or 20 reversals have taken
place. This whole process is repeated in the order gain, reference,
delay, damping, and that pattern is repeated five times (found
sufficient to reach asymptotic values). I checked the parameters
obtained this way with a Vensim simulation, which contains a "Powell
Optimizer", and got the same results to three decimal places.
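
The search described above can be sketched in Python as follows. It is a simplified reading, not the TrackAnalyze code: a trial step that raises the fit error is rejected rather than kept, and the fit-error function, the parameter names and the ranges in the example are hypothetical stand-ins.

```python
def fit_parameters(fit_error, start, ranges, order,
                   passes=5, min_step=0.001, max_reversals=20):
    """One-parameter-at-a-time search with a shrinking, reversing step."""
    params = dict(start)
    best = fit_error(params)
    for _ in range(passes):
        for name in order:
            step = ranges[name] / 2.0          # half the estimated range
            reversals = 0
            while abs(step) >= min_step and reversals < max_reversals:
                trial = dict(params)
                trial[name] += step
                err = fit_error(trial)
                if err < best:
                    params, best = trial, err  # keep going in this direction
                else:
                    step = -step / 5.0         # reverse and shrink the step
                    reversals += 1
    return params, best

# Hypothetical fit error with its minimum at gain = 3, reference = -1.
def toy_error(p):
    return (p["gain"] - 3.0) ** 2 + (p["reference"] + 1.0) ** 2

fitted, err = fit_parameters(
    toy_error,
    start={"gain": 0.0, "reference": 0.0},
    ranges={"gain": 10.0, "reference": 10.0},
    order=["gain", "reference"],
)
print(fitted, err)
```

In a real run `fit_error` would execute the model with the trial parameters and return the RMS difference between model and real mouse positions.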

Best,

Bill P.

[Martin Taylor 2009.01.22.14.14]

[From Bill Powers (2008.01.22.1043 MST)]

Martin Taylor 2009.01.22.10.03 --

I'll try to understand your way of analyzing Rick's data, but while I do that I attach a file of real data generated by the "TrackAnalyze" program that comes with the new book and is discussed in chapter 4.

Thanks. Could you re-send it including the sample-by-sample disturbance values? Or just send the disturbance values (it will be easier for me if all three are in the same file).

My Excel programming is not up to Rick's standard, but maybe I can automate it. Meanwhile, these data will be very helpful.

It occurs to me that I could use my old sleep-study tracks, too, if I can set up an Excel program to do it. Maybe there is something yet to be found in those old data.

Martin