Chapter 10: The Second Attack on Bayesianism and a Response to It
A number of math formulas have been
devised to represent the Bayesian way of explaining how we learn, and then know,
a new theory of reality. They look complicated, but they really aren’t that
hard. I have chosen one of the more intuitive ones to use in my discussion of
the main theoretical flaw that critics think they see in Bayesian Confirmation
Theory.
The Bayesian model of how a human’s way of
thinking evolves can be broken down into a few basic parts. When I, a typical
human, examine a new way of explaining what I see going on in the world – a
theory – I try to judge how useful a picture of the world this new theory may
give me. I look for ways to test it, ways that will show me whether this new
theory will help me to get useful results in the real world. I want to
understand events in my world better and respond to them more effectively: get real, physical-world results; avoid pain.
For Bayesians, any theory worth
investigating must imply specific hypotheses that can be tested in the real
world. I can’t test the Theory of Gravitation by moving the moon around, but I
can drop balls from towers here on Earth to see whether they take as long to
fall as the theory says they will. I can’t watch the evolution of the living
world for three billion years, but I can observe a particular insect species
that is being sprayed with a new pesticide every week and, drawing on the Theory of Evolution, predict that these insects will be resistant
to the pesticide by the end of this summer.
When I encounter a real-world situation in
which I can formulate a specific hypothesis based on the theory, and then I test
that hypothesis, I am more inclined to believe the hypothesis and the theory
underlying it if it enables me to make accurate predictions. I tend
toward discarding it if the predictions it leads me to make keep turning out
wrong. I am especially inclined to believe the theory if all my other theories
are silent or inaccurate when it comes to predicting and explaining the test
results.
In short, I tend to believe a new idea
more and more if it successfully explains what I see. This model of how we learn
and come to know new ideas can be expressed in a math formula.
Let Pr(H/B) be the probability I assign to a hypothesis H based just on
the background beliefs B that I had before I considered this new
hypothesis. If the hypothesis seems far-fetched to me, this term will be small,
maybe under 1%.
Now, let Pr(E/B) be
the degree to which I expected to see the evidence E based only on
my old familiar background ideas B about how reality works. If
I am about to see an event that I will find hard to believe, then my expectation
before that event Pr(E/B) will also be low, again likely under 1%.
Note that these terms are not fractions.
The forward slash is not being used in its usual arithmetic sense; it is read as "given", so Pr(H/B) means the probability of H given B.
The term Pr(H/B) is called my prior confidence in
the hypothesis. It stands for my estimate of the probability that the
hypothesis is correct if I base that estimate only on how
well the hypothesis fits my familiar old set of background assumptions about
reality. It doesn’t say anything like “hypothesis divided by background”.
The term Pr(E/H&B) is
my estimate of the probability that the evidence will happen if I assume –
just for the sake of this term – that my background assumptions and
this new hypothesis are both true, i.e. if for a short while I
try to think as if the hypothesis, along with the theory it comes from, is right.
The most important part of the equation
is Pr(H/E&B). It represents how much I am starting to believe
that the hypothesis H must be right, now that I’ve seen this
new evidence, all the while assuming the evidence E is as I saw it, not a trick
of some kind, and the rest of my old beliefs B are still in place.
Thus, the whole probability formula that
describes this relationship can be expressed in the following
way:
Pr(H/E&B) = [Pr(E/H&B) × Pr(H/B)] / Pr(E/B)
While this formula looks daunting, it
actually says something fairly simple. A new hypothesis/theory that I am trying
to understand seems more likely to be correct the more I keep encountering
evidence that the hypothesis can predict and that my old models of reality
can’t predict. When I set the values of these terms – as probabilities expressed
by percentages – I will assume, for the time being, that the evidence E
is as I saw it, not a mistake or trick, and that I still accept the rest of my
background ideas B about reality as being valid, which I have
to do if I am to make sense of what I see in my surroundings at all.
I tend more and more to believe a hypothesis is true the bigger Pr(E/H&B) gets and the smaller Pr(E/B) gets.
In other words, I increasingly tend to
believe that a new way of explaining the world is true the more it works to
explain evidence I keep encountering in real world experiments and studies, and
the less I can explain that evidence if I don’t accept the new hypothesis.
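To make the arithmetic concrete, here is a minimal sketch in Python. The numbers are purely illustrative assumptions of mine (a far-fetched hypothesis with a 2% prior, evidence the hypothesis predicts with 90% confidence, and evidence my old background beliefs led me to expect only 5% of the time), but the calculation is exactly the formula above.

```python
def posterior(prior_h, likelihood_e_given_h, expectation_e):
    """Bayes' formula: Pr(H/E&B) = [Pr(E/H&B) * Pr(H/B)] / Pr(E/B).

    prior_h              -- Pr(H/B): my prior confidence in the hypothesis
    likelihood_e_given_h -- Pr(E/H&B): how strongly the hypothesis (with my background) predicts the evidence
    expectation_e        -- Pr(E/B): how strongly my old background alone led me to expect the evidence
    """
    return likelihood_e_given_h * prior_h / expectation_e


# Illustrative numbers (my assumptions, not values from the chapter):
# a far-fetched hypothesis (2% prior), evidence it predicts strongly (90%),
# evidence my old beliefs barely expected (5%).
print(round(posterior(prior_h=0.02, likelihood_e_given_h=0.90, expectation_e=0.05), 2))
# -> 0.36: one surprising but well-predicted observation lifts my confidence
#    in the hypothesis from 2% to 36%.
```

Run it and the confidence jumps from 2 percent to 36 percent: a surprising observation that the new hypothesis predicts well pulls my belief strongly toward that hypothesis.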
So far, so good.
Now, all of this may begin to seem
intuitive, but once we have a formula set down it is open to attack; the
critics of Bayesianism see a flaw in it that they consider fatal. The flaw they
see is called “the problem of old evidence”.
One of the ways a new hypothesis/theory
gets more respect among experts in the field the hypothesis covers is by its
ability to explain old evidence that old theories in the field have so far been
unable to explain. For example, physicists all over the world felt that the
probability that Einstein’s General Theory of Relativity was correct took a huge jump
upward when he used his theory to account for the gradual precession of the orbit
of the planet Mercury – a shift that was familiar to physicists, but that had
defied explanation by the old Newtonian model of the universe and the equations
it offers.
The constant shift in Mercury’s orbit had
baffled astronomers since they had first acquired telescopes that enabled them
to detect that shift. The shift could not be explained by any pre-Relativity
models. But Relativity Theory could explain this shift and make extremely
accurate predictions about it.
Other examples of theories that worked to
explain old evidence in many other branches of Science could easily be listed.
Kuhn gives lots of them.1
What is wrong with Bayesianism, according
to its critics, is that it can’t explain why we give more credence to a theory
when we realize it can be used to explain unexplained old evidence. Critics say
when the formula above is applied in this situation, Pr(E/B) has to
be considered equal to 100 percent (absolute certainty) because the old
evidence has been seen so many times by so many people.
For the same reasons, Pr(E/H&B) has
to be thought of as equal to 100 percent again because the evidence has been reliably
observed and recorded many times – since long before we ever had this new
theory/hypothesis to consider.
When these two 100 percent probabilities
are put into the equation, it becomes this:
Pr(H/E&B) = Pr(H/B)
This new version of the formula emerges
because Pr(E/B) and Pr(E/H&B) are now both equal to 100
percent, or a probability of 1.0, and thus they can be cancelled out of the
equation.
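The critics' objection can be checked with the same arithmetic. In the sketch below (the 2% prior is again an illustrative assumption of mine), setting both Pr(E/B) and Pr(E/H&B) to 1.0, as the critics say old evidence forces us to do, makes the two terms cancel, and the posterior is simply the prior.

```python
# Old evidence, as the critics read it: the evidence is already certain,
# so Pr(E/B) = Pr(E/H&B) = 1.0.  (The 2% prior is an illustrative assumption.)
prior_h = 0.02        # Pr(H/B)
likelihood = 1.0      # Pr(E/H&B)
expectation = 1.0     # Pr(E/B)

posterior_h = likelihood * prior_h / expectation   # Pr(H/E&B)
print(posterior_h)    # -> 0.02: identical to the prior, so no boost in confidence at all
```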
But that means when I realize this new
theory that I’m considering can be used to explain some nagging old problems in
my field, my confidence in the new theory does not rise at all. Or to put the
matter another way, after seeing the new theory explain some troubling old
evidence, I trust the theory not one jot more than I did before I realized it
might explain that old evidence.
This is simply not what happens in real
life. When we realize that a new model or theory that we are considering can be
used to explain some old evidence that previously had not been explainable, we
are definitely more inclined to believe that the new theory is correct.
Pasteur in his laboratory (artist: Albert Edelfelt) (credit: Wikimedia Commons)
Indifference to a new theory's ability to explain confusing old evidence is simply not what we observe.
When physicists around the world realized that the Theory of Relativity
could be used to explain the shift in the orbit of Mercury, their confidence
that the theory was correct shot up. Most humans are not just persuaded but
exhilarated when a new theory they are beginning to understand gives them solutions
to unsolved old problems.
Hence, the critics say, Bayesianism is
obviously not adequate as a way of describing human thinking. It can’t
account for some of the ways of thinking that we know we use. We do indeed test
new theories against puzzling, old evidence all the time, and we do feel much
more impressed with a new theory if it can account for that same evidence when
all the old theories can’t.