Saturday 31 October 2015

Chapter 7: The Second Attack on Bayesianism and a Response to It

The Bayesian way of explaining how we think about, test, and then adopt a new model of reality has been given a number of mathematical formulations. They look complicated, but they really aren’t that hard. I have chosen one of the more intuitive ones below to discuss the theoretical criticism of Bayesianism.

The Bayesian model of how a human being’s thinking evolves can be broken down into a few basic components. When I, as a typical human, am examining a new way of explaining what I see going on in the world, I am considering a new hypothesis, and as I try to judge how true—and therefore how useful—a picture of the world this new hypothesis may give me, I look for ways of testing it that will show decisively whether it and the model of reality it is based on really work. I am trying to determine whether this hypothesis will help me to understand, anticipate, and respond effectively to events in my world.

When I encounter a test situation that fits within the range of events that the hypothesis is supposed to be able to explain and make predictions about, I tend to become more convinced the hypothesis is a true one if it enables me to make accurate predictions. (And I tend to be more likely to discard the hypothesis if the predictions it leads me to make keep failing to be realized.) I am especially more inclined to accept the hypothesis and the model of reality it is based on if it enables me to make reliable predictions about the outcomes of these test situations and if all my other theories and models are silent or inaccurate when it comes to explaining my observations of these same test situations. 

In short, I tend to believe a new idea more and more if it fits the things I’m seeing. This is especially true when none of my old ideas fit the events I’m seeing at all. All Bayes’ Theorem does is try to express this simple truth mathematically. 

It is worth noting again that this same process can also occur in an entire nation when increasing numbers of citizens become convinced that a new way of doing things is more effective than the status-quo practices. Of the ideas that become popular, the few that really work are the ones that last. In other words, both individuals and whole societies really do learn, grow, and change by the Bayesian model.

In the case of a whole society, the clusters of ideas an individual sorts through and shapes into a larger idea system become clusters of citizens forming factions within society, each faction arguing for the way of thinking it favours. The leaders of each faction search for reasoning and evidence to support their positions in ways that are closely analogous to the ways in which the various biases in an individual mind struggle to become the idea system that the individual follows. The difference is that the individual usually does not settle heated internal debates by blinding his right eye with his left hand. That is, we usually choose to set aside unresolvable internal disputes rather than letting them make us crazy. Societies, on the other hand, have revolutions or wars.

In societies, factions sometimes work out their differences, reach consensus, and move on without violence. But sometimes, as noted in the previous chapter, they seem to have to fight it out. Then violence settles the matter, whether between factions within a society or between a given society and one of its neighbouring societies that is perceived as being the carrier of the threatening new ideas. But Bayesian calculations are always in play in the minds of the participants, and these same calculations almost always eventually dictate the outcome: one side gives in and learns the new ways. The most extreme alternative, one tribe’s complete, genocidal extermination of the other, is only rarely the final outcome.

But let’s get back to the so-called flaw in the formula for Bayesian decision making.

Suppose I am considering a new way of explaining how some part of the world around me works. The new way is usually called a hypothesis. Then suppose I decide to do some research and I come up with a new bit of evidence that definitely relates to the matter I’m researching. What kind of process is going on in my mind as I try to decide whether this new bit of evidence is making me more likely to believe this new hypothesis is true or less likely to do so? This thoughtful, decision-making time of curiosity and investigation, for Bayesians, is at the core of how human knowledge forms and grows.

Mathematically, the Bayesian situation can be represented if we set the following terms: let Pr(H/B) be the degree to which I trust the hypothesis just based on the background knowledge I had before I observed any bit of new evidence. If the hypothesis seems like a fairly radical one to me, then this term is going to be pretty small. Maybe less than 1%. This new hypothesis may sound pretty crazy to me.

Then let Pr(E/B) be the degree to which I expected to see this new evidence occur based only on my old familiar background models of how reality works. This term will be quite small if, for example, I’m seeing some evidence that at first I can’t quite believe is real because none of my background knowledge had prepared me for it.

These terms are not fractions in the normal sense. The slash is not a division sign. The term Pr(H/B), for example, is called my “prior expectation”. The term refers to my estimate of the probability (Pr) that the hypothesis (H) is correct if I base that estimate only on how well the hypothesis fits together with my old, personal, already established, familiar set of background assumptions about the world (B).

The term Pr(E/H&B) means my estimate of the probability that the evidence will happen if I assume just for the sake of this term that my background assumptions and this new hypothesis are both true.

The most important part of the equation is Pr(H/E&B). It represents how much I now am inclined to believe that the hypothesis gives a correct picture of reality after I’ve seen this new bit of evidence, while assuming that the evidence is as I saw it and not a trick or illusion of some kind, and that the rest of my background beliefs are still in place.

Thus, the whole probability formula that describes this relationship can be expressed in the following form:                                     



$$
\Pr(H/E\&B) = \frac{\Pr(E/H\&B) \times \Pr(H/B)}{\Pr(E/B)}
$$
          

While this formula looks daunting, it actually says something fairly simple. A new hypothesis that I am thinking about and trying to understand seems to me increasingly likely to be correct the more I keep encountering new evidence that the hypothesis can explain and that I can’t explain using any of the models of reality I already have in my background stock of ideas. When I set the values of these terms, I will assume, at least for the time being, that the evidence I saw (E) was as I saw it, not some mistake or trick or delusion, and that the rest of my background ideas/beliefs about reality (B) are valid.
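To make the arithmetic concrete, here is a minimal Python sketch of a single Bayesian update. The function name `posterior`, its parameter names, and the numbers fed into it are illustrative assumptions, not anything taken from the chapter. The check for inconsistent inputs is also an addition: since Pr(E/B) can never be smaller than Pr(E/H&B) x Pr(H/B), a combination that pushes the result above 1 cannot describe a coherent set of beliefs.

```python
def posterior(prior, likelihood, evidence):
    """Return Pr(H/E&B) given Pr(H/B), Pr(E/H&B), and Pr(E/B).

    prior      -- Pr(H/B): trust in the hypothesis before the new evidence
    likelihood -- Pr(E/H&B): how strongly the hypothesis predicts the evidence
    evidence   -- Pr(E/B): how expected the evidence was on background alone
    """
    for name, value in (("prior", prior), ("likelihood", likelihood)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if not 0.0 < evidence <= 1.0:
        raise ValueError("evidence must be greater than 0 and at most 1")

    result = likelihood * prior / evidence
    # Allow a tiny tolerance for floating-point rounding.
    if result > 1.0 + 1e-12:
        raise ValueError("inconsistent inputs: Pr(E/B) is too small")
    return min(result, 1.0)


# Hypothetical numbers: a radical hypothesis (1% prior) that strongly
# predicts the evidence (90%), compared under two values of Pr(E/B).
surprising = posterior(prior=0.01, likelihood=0.9, evidence=0.05)
unsurprising = posterior(prior=0.01, likelihood=0.9, evidence=0.5)

print(f"evidence was surprising:   {surprising:.3f}")    # roughly 0.18
print(f"evidence was unsurprising: {unsurprising:.3f}")  # roughly 0.018
```

With these made-up values, the same hypothesis and the same evidence leave me far more convinced when my background beliefs had given me little reason to expect that evidence.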

Increasingly, then, I tend to believe that a hypothesis is a true one the bigger Pr(E/H&B) gets and the smaller Pr(E/B) gets.


In other words, I increasingly tend to believe that a new way of explaining the world is true, the more it can be used to explain the evidence that I keep encountering in this world, and the less I can explain that evidence if I don’t accept this new hypothesis into my set of ways of understanding the world.
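The way belief builds up as confirming evidence keeps arriving can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The names `update` and `run_updates` are hypothetical. Pr(E/B) is not chosen freely here but computed with the law of total probability, which needs one extra quantity the chapter does not name: the chance of seeing the evidence if the hypothesis is false. The sketch also assumes each observation is independent of the others once the truth or falsity of the hypothesis is fixed.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Apply Bayes' Theorem once and return the new degree of belief.

    prior           -- Pr(H/B) before this observation
    p_e_given_h     -- Pr(E/H&B)
    p_e_given_not_h -- chance of the evidence if H is false (an assumption
                       added for this sketch)
    """
    # Law of total probability: Pr(E/B) covers both "H true" and "H false".
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e


def run_updates(prior, p_e_given_h, p_e_given_not_h, observations):
    """Return the list of beliefs after 0, 1, ..., n confirming observations."""
    beliefs = [prior]
    for _ in range(observations):
        # Yesterday's posterior becomes today's prior.
        beliefs.append(update(beliefs[-1], p_e_given_h, p_e_given_not_h))
    return beliefs


# Hypothetical numbers: a 1% prior, evidence that the hypothesis predicts
# 90% of the time and that my old models would allow only 10% of the time.
beliefs = run_updates(0.01, 0.9, 0.1, observations=3)
for count, belief in enumerate(beliefs):
    print(f"after {count} confirming observations: {belief:.3f}")
```

With these invented values, belief climbs from 1% to roughly 88% after three confirmations, which is the "more and more" pattern described above.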
