Chapter 7
Bayesianism: A Major Theoretical Criticism And A Response
Part A
The Bayesian way of explaining how we think
about, test, and then adopt a new model of reality has been given a number of
mathematical formulations. They look complicated, but they really aren’t that
hard. I have chosen one of the more intuitive ones below because I intend to
use it to discuss the theoretical criticism of Bayesianism which I mentioned in
the last chapter.
The Bayesian model of how a human being’s
thinking evolves can be broken down into a few basic components. When I, as a
typical, modern human, examine a new way of explaining what I see going on in
the world, I am considering a new hypothesis. As I try to judge just how
true – and therefore how useful – a picture of the world this new hypothesis
may give me, I look for ways of testing the hypothesis that will tend to show
decisively, one way or the other, whether this new hypothesis and the model of
reality that it is based on really work. What I am trying to determine is whether
or not this hypothesis will help me to understand, anticipate, and respond
effectively to events in my world.
When I encounter a test situation that
fits within the range of events that the hypothesis is supposed to be able to
explain and make predictions about, I tend to become more convinced that the
hypothesis is a true one if it does indeed enable me to make accurate predictions. (And I tend to be
more likely to discard the hypothesis if the predictions that it leads me to
make keep failing to be realized.) I am especially inclined to accept the
hypothesis, and the model of reality that it is based on, if it enables me to make
reliable predictions about the outcomes of these test situations while, at the
same time, all of my other theories and models are silent or inaccurate when
it comes to explaining my observations of these same test situations.
It is worth noting again here that this
same process occurs in a whole nation when some citizens become convinced that
a new way of doing things – one that is making the rounds in their society and is
starting to push some old ways of doing those same things aside – is effective,
and is better than the status quo practices for achieving the desired
results. In other words, both individuals and societies as a whole do learn,
grow, and change by the Bayesian model.
In the case of the whole society, the
clusters of ideas that the individual sorts through and tries to work into a
more coherent system are simply replaced by clusters of citizens, arguing as members
of factions within society for the way of thinking that each faction, in its turn,
favors. The leaders of each faction search for reasoning and evidence to
support their positions in ways that are closely analogous to the ways in which
the various biases in an individual mind struggle to establish their hegemony.
The difference is that the individual usually does not settle very heated internal
debates by blinding his right eye with his left hand.
In societies, factions sometimes
work out their differences, reach consensus, and move on without violence. But
sometimes, of course, as noted above, they seem to have to fight it out. Then
violence between factions within society, or violence with the neighboring
society that is perceived as being the carrier of the threatening new ideas,
settles the matter. But Bayesian calculations are always in play in the minds
of the participants, and these same calculations almost always eventually
dictate the outcome. One side gives in and learns the new ways. The most
extreme alternative, one tribe’s complete and genocidal extermination of the other,
is only rarely the final outcome.
But back to the so-called theoretical flaw
in the formula for Bayesian decision-making.
Mathematically, the Bayesian situation can
be represented if we let Pr(H/B) be the degree to which we trusted the hypothesis
before we observed a bit of new evidence,
Pr(E/H&B) be the degree to which we expected the
evidence if, for the sake of argument, we briefly assumed that the hypothesis
was true, and Pr(E/B) be the degree to which we expected this
evidence to happen based on what we knew before we ever met this
hypothesis (using our old, familiar background models, in other words).
Note that these terms are not fractions in
the normal algebraic sense at all. The term Pr(H/B) is called my “prior expectation” and should be
read “my estimate of the probability that the hypothesis is a correct one if I
base my estimate just on how well the hypothesis fits together with my whole
familiar set of background assumptions about the world.”
The term Pr(E/H&B) should be read “my estimate of the
probability that the evidence will happen if I assume just for the sake of this
term that my background assumptions and
this new hypothesis are both true”. Finally, the term Pr(E/B)
can be read “my estimate of the probability
that the evidence (the event that the hypothesis predicts) will occur if I base
my estimate only on my ordinary set of background assumptions and do not use
the new hypothesis at all”.
The really important term in the formula
comes now, and it is Pr(H/E&B) – my “posterior probability”. It stands for how much I am now inclined to
believe that the hypothesis gives a correct picture of reality after I have
seen this new bit of evidence, while taking as a given that the evidence is as
I saw it - not a trick or illusion of some kind - and that the rest of my
background beliefs are still in place.
Thus, the whole probability formula that
describes this relationship can now be expressed in the following form:

Pr(H/E&B) = [Pr(E/H&B) x Pr(H/B)] / Pr(E/B)
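As a minimal sketch of how this updating rule works in practice – with probabilities invented purely for illustration, not taken from the text – the formula is a one-line computation:

```python
def bayes_update(prior_h, likelihood_e_given_h, prob_e):
    """Pr(H/E&B) = Pr(E/H&B) x Pr(H/B) / Pr(E/B)."""
    return likelihood_e_given_h * prior_h / prob_e

# Invented numbers: before seeing the evidence I gave the hypothesis
# a 20% chance (my prior), the hypothesis predicted the evidence with
# 90% confidence, and my old background models expected the evidence
# only 30% of the time.
posterior = bayes_update(prior_h=0.2, likelihood_e_given_h=0.9, prob_e=0.3)
print(f"{posterior:.2f}")  # 0.60
```

Because the hypothesis predicted the evidence far better (0.9) than the old background models did (0.3), belief in the hypothesis rises from 0.2 to 0.6 – exactly the pattern the chapter describes.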
Now this formula
looks daunting, but it actually says something fairly simple. A new hypothesis
that I am thinking about, and trying to understand, seems more and more likely
to be correct the more that I keep encountering new evidence that the
hypothesis can explain and that I can’t explain using any of the models
of reality that I already have in my background stock of ideas. When I set the values
of these terms, I will assume, at least for the time being, that the evidence
that I saw (E) was as I saw it, not some mistake or trick or delusion, and that
the rest of my background ideas/beliefs about reality (B) are valid.
I tend more and
more, then, to believe that a hypothesis is a true one the bigger Pr(E/H&B) gets and the smaller Pr(E/B)
gets.
In other words, I more and more tend to
believe that a new way of explaining the world is true, the more it can be used
to explain the evidence that I keep encountering in this world, and the less I
can explain that evidence if I don’t accept this new hypothesis into my total
set of ways of explaining and understanding the world.
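The cumulative pattern just described – confidence climbing each time the new hypothesis explains evidence that the old background models cannot – can be sketched by applying the formula repeatedly, with each posterior becoming the next prior. (The numbers below are invented, and the sketch expands Pr(E/B) by the standard rule of total probability, a step the text does not spell out.)

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    # Expand Pr(E/B) over the two cases: hypothesis true vs. false.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# Start out quite skeptical; in each trial the hypothesis predicts the
# observed evidence well (0.9) while the old models predict it poorly (0.3).
belief = 0.05
for trial in range(1, 5):
    belief = update(belief, p_e_given_h=0.9, p_e_given_not_h=0.3)
    print(f"after trial {trial}: belief = {belief:.3f}")
# belief climbs: 0.136, 0.321, 0.587, 0.810
```

After only four such surprising successes, belief in the hypothesis has gone from near-dismissal to strong acceptance – the mechanism by which, on the Bayesian model, both individuals and societies change their minds.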
Now all of this is
beginning to seem intuitive, but once we have the formula set down, it is also open
to criticism and attack, and the critics of Bayesianism see a flaw in it that
they think is fatal. The flaw that they point to is usually called “the problem
of old evidence”.