
 

Chapter 10: The Second Attack on Bayesianism and a Response to It



 

A number of math formulas have been devised to represent the Bayesian way of explaining how we learn, and then know, a new theory of reality. They look complicated, but they really aren’t that hard. I have chosen one of the more intuitive ones to use in my discussion of the main theoretical flaw that critics think they see in Bayesian Confirmation Theory.

 

The Bayesian model of how a human’s way of thinking evolves can be broken down into a few basic parts. When I, a typical human, examine a new way of explaining what I see going on in the world – a theory – I try to judge how useful a picture of the world this new theory may give me. I look for ways to test it, ways that will show me whether this new theory will help me to get useful results in the real world. I want to understand events in my world better and respond to them more effectively. Get real, physical world results. Avoid pain.

 

For Bayesians, any theory worth investigating must imply specific hypotheses that can be tested in the real world. I can’t test the Theory of Gravitation by moving the moon around, but I can drop balls from towers here on Earth to see whether they take as long to fall as the theory says they will. I can’t watch the evolution of the living world for three billion years, but I can observe a particular insect species that is being sprayed with a new pesticide every week and, by reasoning from the Theory of Evolution, predict that this insect population will be largely resistant to the pesticide by the end of this summer.

 

When I encounter a real-world situation in which I can formulate a specific hypothesis based on the theory, and then I test that hypothesis, I am more inclined to believe the hypothesis and the theory underlying it if it enables me to make accurate predictions. I tend toward discarding it if the predictions it leads me to make keep turning out wrong. I am especially inclined to believe the theory if all my other theories are silent or inaccurate when it comes to predicting and explaining the test results. 

 

In short, I tend to believe a new idea more and more if it successfully explains what I see. This model of how we learn and come to know new ideas can be expressed in a math formula. 

 

Let Pr(H/B) be the probability (Pr) I assign to a hypothesis H based just on the background beliefs B that I had before I considered this new hypothesis. If the hypothesis seems far-fetched to me, this term will be small, maybe under 1%.

 

Now, let Pr(E/B) be the degree to which I expect to see the evidence E based only on my old, familiar background ideas B about how reality works. If I am about to see an event that I will find hard to believe, then my expectation before that event, Pr(E/B), will also be low, again likely under 1%.

 

Note that these terms are not fractions; the forward slash is not being used as a division sign. It should be read as “given”: Pr(H/B) means the probability of H given B. This term is called my prior confidence in the hypothesis. It stands for my estimate of the probability that the hypothesis is correct if I base that estimate only on how well the hypothesis fits my familiar old set of background assumptions about reality. It doesn’t say anything like “hypothesis divided by background”.

 

The term Pr(E/H&B) is my estimate of the probability that I will see the evidence E if I assume – just for the sake of this term – that my background assumptions and this new hypothesis are both true, i.e. if for a short while I try to think as if the hypothesis, along with the theory it comes from, is right.

 

The most important term in the equation is Pr(H/E&B), my posterior confidence in the hypothesis. It represents how strongly I now believe that the hypothesis H is right, given that I’ve seen this new evidence, all the while assuming that the evidence E is as I saw it, not a trick of some kind, and that the rest of my old beliefs B are still in place.

 

Thus, the whole probability formula that describes this relationship can be expressed in the following way:             

 

 

                        

                          Pr(H/E&B)  =  [ Pr(E/H&B)  ×  Pr(H/B) ]  /  Pr(E/B)

          

 

While this formula looks daunting, it actually says something fairly simple. A new hypothesis/theory that I am trying to understand seems more likely to be correct the more I keep encountering evidence that the hypothesis can predict and that my old models of reality can’t predict. When I set the values of these terms – as probabilities expressed as percentages – I will assume, for the time being, that the evidence E is as I saw it, not a mistake or trick, and that I still accept the rest of my background ideas B about reality as valid, which I have to do if I am to make sense of what I see in my surroundings at all.
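To make the arithmetic concrete, here is a minimal sketch in Python of the update the formula describes. Everything in it is an illustrative assumption rather than anything taken from the text: the function name bayes_update is mine, the 1% prior stands for a hypothesis that seems far-fetched, the 95% for evidence that is very likely if the hypothesis is true, and the 5% for evidence my old background beliefs alone would not lead me to expect.

    # A minimal sketch of the Bayesian update described above.
    # All numbers are illustrative assumptions, not values from the text.
    def bayes_update(pr_h_given_b, pr_e_given_hb, pr_e_given_b):
        """Return Pr(H/E&B): my confidence in H after seeing the evidence E."""
        return (pr_e_given_hb * pr_h_given_b) / pr_e_given_b

    # A far-fetched hypothesis (1% prior), evidence very likely if it is true (95%),
    # but unlikely under my old background beliefs alone (5%).
    posterior = bayes_update(pr_h_given_b=0.01, pr_e_given_hb=0.95, pr_e_given_b=0.05)
    print(f"Pr(H/E&B) = {posterior:.2f}")   # prints 0.19

Even with those modest numbers, a single piece of evidence moves my confidence in the hypothesis from 1 percent to 19 percent.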

 

I more and more tend to believe a hypothesis is true the bigger Pr(E/H&B) gets and the smaller Pr(E/B) gets.

 

In other words, I increasingly tend to believe that a new way of explaining the world is true the more it works to explain evidence I keep encountering in real world experiments and studies, and the less I can explain that evidence if I don’t accept the new hypothesis.
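The same sketch, under the same assumption of a fixed 1% prior, can display the trend just described: the posterior climbs as Pr(E/H&B) rises and Pr(E/B) falls. The three pairs of numbers are invented purely to show that movement.

    # Illustrative only: the posterior grows as Pr(E/H&B) rises and Pr(E/B) falls.
    prior = 0.01  # Pr(H/B): the hypothesis still seems unlikely at the outset

    for likelihood, expectation in [(0.50, 0.50), (0.80, 0.20), (0.95, 0.05)]:
        posterior = likelihood * prior / expectation
        print(f"Pr(E/H&B)={likelihood:.2f}  Pr(E/B)={expectation:.2f}  ->  Pr(H/E&B)={posterior:.2f}")

    # Pr(E/H&B)=0.50  Pr(E/B)=0.50  ->  Pr(H/E&B)=0.01
    # Pr(E/H&B)=0.80  Pr(E/B)=0.20  ->  Pr(H/E&B)=0.04
    # Pr(E/H&B)=0.95  Pr(E/B)=0.05  ->  Pr(H/E&B)=0.19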

 

So far, so good. 

 

Now, all of this may begin to seem intuitive, but once we have a formula set down it is open to attack; the critics of Bayesianism see a flaw in it that they consider fatal. The flaw they see is called “the problem of old evidence”.

 

One of the ways a new hypothesis/theory gains respect among experts in the field it covers is by its ability to explain old evidence that the established theories in the field have so far been unable to explain. For example, physicists all over the world felt that the probability that Einstein’s General Theory of Relativity was correct took a huge jump upward when he used the theory to account for the gradual shift in the orbit of the planet Mercury – the slow precession of its perihelion – a shift that was familiar to physicists, but that had defied explanation by the old Newtonian model of the universe and the equations it offers.

 

The shift in Mercury’s orbit had baffled astronomers ever since their observations became precise enough to detect it. It could not be explained by any pre-Relativity model, but Relativity Theory could explain it and make extremely accurate predictions about it.

 

Other examples of theories that gained acceptance by explaining old evidence in many other branches of science could easily be listed. Kuhn gives lots of them.1

 

What is wrong with Bayesianism, according to its critics, is that it can’t explain why we give more credence to a theory when we realize it can be used to explain previously unexplained old evidence. Critics say that when the formula above is applied in this situation, Pr(E/B) has to be considered equal to 100 percent (absolute certainty), because the old evidence has already been seen many times by many people.

 

For the same reason, Pr(E/H&B) also has to be treated as equal to 100 percent, because the evidence has been reliably observed and recorded many times – since long before we ever had this new theory/hypothesis to consider.

 

When these two 100 percent probabilities are put into the equation, it becomes this:

 

 

                                             Pr(H/E&B) = Pr(H/B)

 

 

This new version of the formula emerges because Pr(E/B) and Pr(E/H&B) are now both equal to 100 percent, or a probability of 1.0, so the factor Pr(E/H&B)/Pr(E/B) equals 1 and simply drops out of the equation.
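A short check, using an assumed prior of 25% picked only for illustration, shows why the critics say the formula goes flat for old evidence: once both evidence terms are set to 1, the update leaves my confidence exactly where it started.

    # Illustrative sketch of the old-evidence problem: the evidence has been seen
    # so often that Pr(E/B) and Pr(E/H&B) are both treated as 1.
    prior = 0.25                      # Pr(H/B): an assumed level of confidence in H
    posterior = (1.0 * prior) / 1.0   # Pr(E/H&B) = Pr(E/B) = 1 for old, familiar evidence
    print(posterior == prior)         # True -- the formula says my confidence has not moved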

 

But that means when I realize this new theory that I’m considering can be used to explain some nagging old problems in my field, my confidence in the new theory does not rise at all. Or to put the matter another way, after seeing the new theory explain some troubling old evidence, I trust the theory not one jot more than I did before I realized it might explain that old evidence.

 

This is simply not what happens in real life. When we realize that a new model or theory that we are considering can be used to explain some old evidence that previously had not been explainable, we are definitely more inclined to believe that the new theory is correct. 

 

 


                      

     


 

      Pasteur in his laboratory (artist: Albert Edelfelt) (credit: Wikimedia Commons)

 




 

An indifferent reaction to a new theory’s being able to explain confusing old evidence is simply not what happens in real life. When physicists around the world realized that the Theory of Relativity could be used to explain the shift in the orbit of Mercury, their confidence that the theory was correct shot up. Most humans are not just persuaded but exhilarated when a new theory they are beginning to understand gives them solutions to unsolved old problems. 

 

Hence, the critics say, Bayesianism is obviously not adequate as a way of describing human thinking. It can’t account for some of the ways of thinking that we know we use. We do indeed test new theories against puzzling, old evidence all the time, and we do feel much more impressed with a new theory if it can account for that same evidence when all the old theories can’t.

