Monday, 21 April 2014

Chapter 7 

Bayesianism:  A Major Theoretical Criticism And A Response

Part A


The Bayesian way of explaining how we think about, test, and then adopt a new model of reality has been given a number of mathematical formulations. They look complicated, but they really aren’t that hard. I have chosen one of the more intuitive ones below because I intend to use it to discuss the theoretical criticism of Bayesianism which I mentioned in the last chapter.

The Bayesian model of how a human being’s thinking evolves can be broken down into a few basic components. When I, as a typical, modern human, am examining a new way of explaining what I see going on in the world, I am considering a new hypothesis. As I try to judge just how true – and therefore how useful – a picture of the world this new hypothesis may give me, I look for ways of testing it that will tend to show decisively, one way or the other, whether the hypothesis and the model of reality that it is based on really work. What I am trying to determine is whether or not this hypothesis will help me to understand, anticipate, and respond effectively to events in my world.

When I encounter a test situation that falls within the range of events that the hypothesis is supposed to explain and make predictions about, I tend to become more convinced that the hypothesis is a true one if it does indeed enable me to make accurate predictions. (And I tend to be more likely to discard the hypothesis if the predictions that it leads me to make keep failing to be realized.) I am especially inclined to accept the hypothesis, and the model of reality that it is based on, if it enables me to make reliable predictions about the outcomes of these test situations while, at the same time, all of my other theories and models are silent or inaccurate when it comes to explaining my observations of those same situations.

It is worth noting again here that this same process occurs in a whole nation when some citizens become convinced that a new way of doing things – one that is making the rounds in their society and starting to push some old ways of doing those same things aside – is effective, and is better than the status quo practices for achieving the desired results. In other words, both individuals and societies as wholes learn, grow, and change by the Bayesian model.

In the case of the whole society, the clusters of ideas that the individual sorts through and tries to work into a more coherent system are simply replaced by clusters of citizens, arguing as members of factions within society for the way of thinking that each faction, in its turn, favors. The leaders of each faction search for reasoning and evidence to support their positions in ways that are closely analogous to the ways in which the various biases in an individual mind struggle to establish their hegemony. The difference is that the individual usually does not settle very heated internal debates by blinding his right eye with his left hand. 

In societies, factions sometimes work out their differences, reach consensus, and move on without violence. But sometimes, of course, as noted above, they seem to have to fight it out. Then violence between factions within society, or violence with the neighboring society that is perceived as being the carrier of the threatening new ideas, settles the matter. But Bayesian calculations are always in play in the minds of the participants, and these same calculations almost always eventually dictate the outcome. One side gives in and learns the new ways. The most extreme alternative, one tribe’s complete and genocidal extermination of the other, is only rarely the final outcome.
But back to the so-called theoretical flaw in the formula for Bayesian decision-making.
  
Mathematically, the Bayesian situation can be represented as follows. Let Pr(H/B) be the degree to which we trusted the hypothesis before we observed a bit of new evidence; let Pr(E/H&B) be the degree to which we expected the evidence if, for the sake of argument, we briefly assumed that the hypothesis was true; and let Pr(E/B) be the degree to which we expected this evidence to happen based on what we knew before we ever met this hypothesis (using our old, familiar background models, in other words).

Note that these terms are not fractions in the normal algebraic sense at all. The term Pr(H/B) is called my “prior expectation” and should be read “my estimate of the probability that the hypothesis is a correct one if I base my estimate just on how well the hypothesis fits together with my whole familiar set of background assumptions about the world.” 

The term Pr(E/H&B) should be read “my estimate of the probability that the evidence will happen if I assume just for the sake of this term that my background assumptions and this new hypothesis are both true”. Finally, the term Pr(E/B) can be read “my estimate of the probability that the evidence (the event that the hypothesis predicts) will occur if I base my estimate only on my ordinary set of background assumptions and do not use the new hypothesis at all”.

The really important symbol in the equation comes now, and it is Pr(H/E&B). It stands for how much I now am inclined to believe that the hypothesis gives a correct picture of reality after I have seen this new bit of evidence, while taking as a given that the evidence is as I saw it - not a trick or illusion of some kind - and that the rest of my background beliefs are still in place.

Thus, the whole probability formula that describes this relationship can now be expressed in the following form:



Pr(H/E&B) = [Pr(E/H&B) x Pr(H/B)] / Pr(E/B)
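As a concrete illustration, the formula can be computed with ordinary arithmetic. Here is a minimal sketch in Python; the function name and all of the probability values are hypothetical, chosen only to show how the three terms combine:

```python
# A numeric sketch of the Bayesian update formula.
# All probability values below are hypothetical, for illustration only.

def bayesian_update(prior_h, likelihood_e_given_h, prob_e):
    """Return Pr(H/E&B) given Pr(H/B), Pr(E/H&B), and Pr(E/B)."""
    return likelihood_e_given_h * prior_h / prob_e

prior = 0.2          # Pr(H/B): modest initial trust in the hypothesis
likelihood = 0.9     # Pr(E/H&B): the hypothesis strongly predicts the evidence
prob_evidence = 0.3  # Pr(E/B): the evidence is surprising under my old models

posterior = bayesian_update(prior, likelihood, prob_evidence)
print(round(posterior, 2))  # prints 0.6 -- trust in the hypothesis triples
```

Because the evidence was likely under the hypothesis (0.9) but unlikely under the old background models alone (0.3), one observation raises the degree of belief from 0.2 to 0.6.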


       Now this formula looks daunting, but it actually says something fairly simple. A new hypothesis that I am thinking about, and trying to understand, seems more and more likely to be correct the more that I keep encountering new evidence that the hypothesis can explain and that I can’t explain using any of the models of reality that I already have in my background stock of ideas. When I set the values of these terms, I will assume, at least for the time being, that the evidence that I saw (E) was as I saw it, not some mistake or trick or delusion, and that the rest of my background ideas/beliefs about reality (B) are valid.
               
        I tend more and more, then, to believe that a hypothesis is a true one the bigger Pr(E/H&B) gets and the smaller Pr(E/B) gets. 

        In other words, I more and more tend to believe that a new way of explaining the world is true, the more it can be used to explain the evidence that I keep encountering in this world, and the less I can explain that evidence if I don’t accept this new hypothesis into my total set of ways of explaining and understanding the world.

               
         Now all of this is beginning to seem intuitive, but once we have a formula set down it also is open to criticism and attack, and the critics of Bayesianism see a flaw in it that they think is fatal. The flaw that they point to is usually called “the problem of old evidence”. 
