Saturday, 30 November 2019


Chapter 10       The Second Attack On Bayesianism And A Response To It


A number of Math formulas have been devised to try to represent the Bayesian way of explaining how we learn a new theory of reality. They look complicated, but they really aren’t that hard. I have chosen one of the more intuitive ones to use in my discussion of the main theoretical flaw that critics think they see in Bayesian Confirmation Theory.

The Bayesian model of how a human’s way of thinking evolves can be broken down into a few basic parts. When I, a typical human, examine a new way of explaining what I see going on in the world, I’m considering a new theory. As I try to judge how true and useful a picture of the world this new theory may give me, I look for ways of testing it, ways that will show me decisively whether this new model/theory of reality helps me to get good results, i.e. whether or not it works. I’m trying to decide whether the theory will help me to understand events in my world better and respond to them more effectively.

For Bayesians, theories worth investigating always enable us to form more specific hypotheses that can be tested in the real world. I can’t test the Theory of Gravitation by manipulating Mars or the moon, but I can drop objects from towers here on Earth to see whether they take as long to fall as the theory’s hypothesis for each case leads me to expect. I can’t observe the evolution of the living world in all its amazing detail for three billion years, but I can observe a particular insect species that is being sprayed with a new pesticide every week and predict, based on the Theory of Evolution and this species’ mutation rate, that it will be immune to the pesticide by the end of this summer.

When I encounter a real-world situation that will let me formulate a specific hypothesis based on the theory, and then test that hypothesis, I tend to lean more toward believing the hypothesis and the theory underlying it if it enables me to make accurate predictions. I lean toward discarding it if the predictions it leads me to make keep failing to turn out right. I am especially inclined to believe the hypothesis, and the theory of reality it’s based on, if all my other theories are silent or inaccurate when it comes to explaining the test results. 

In short, I tend to believe a new idea more and more if it explains what I see. This picture of how we form and revise our beliefs can be expressed in a Math formula: Bayes’ Theorem.

It is worth noting again that this same process can occur in a whole nation when increasing numbers of citizens become convinced that a new way of doing things is more effective than the status-quo way. Popular ideas that work get followers who use the idea to get more effective work done faster. Then, they multiply. Thus, both individuals and societies learn and change by the Bayesian model.

In the case of a society, the clusters of memories and theories about them that an individual sorts through and shapes into his/her whole idea system are analogous to clusters of citizens forming factions within society, each faction arguing for the way of thinking it favors. The leaders of each faction search for reasoning and evidence to support their positions. They do this in ways that are closely analogous to the ways in which the varied ideas in one person’s mind struggle to become the ones that the individual will use to handle life. The difference is that a normal individual does not settle his internal debates by blinding his right eye with his left hand. As individuals, we usually choose to set aside unresolvable internal debates rather than let them make us crazy. On the other hand, societies sometimes do harm themselves.

In societies, factions sometimes work out their differences, reach consensus, and move on without violence. But sometimes, as noted in the previous chapter, on core value matters, they fight it out. Then violence settles the matter - whether between factions in a society or between societies. But Bayesian calculations are always in play in the minds of the participants, and these same calculations almost always eventually dictate the outcome: one side wins and the other side loses, gives in, and accepts large parts of the other’s culture. The most extreme option, one tribe’s extermination of the other, is only rarely the final outcome.

But let’s get back to the flaw the critics see in Bayesian Confirmation Theory.

Suppose I am considering a new way of explaining how some part of the world around me works. The new way is usually called a theory. Then, suppose I see a way to form a specific hypothesis based on the theory and I decide to do some research to see whether real world results will provide me with evidence that definitely relates to the matter I’m studying. What kind of process goes on in my mind as I try to decide whether this new bit of evidence is making me more likely to believe my hypothesis or less likely? This time of curiosity and testing, for Bayesians, is the core of their model of how human knowledge grows. 

Mathematically, Bayesian Theory can be represented if we set the following terms: let Pr(H/B) be the degree to which I trust the hypothesis H based just on the background beliefs that I had before I began to consider this new theory and the hypothesis it has led me to formulate. If the hypothesis and its theory seem fairly radical to me, then this term is going to be small, maybe less than 1%. Given my background beliefs, the new theory and its hypothesis may seem pretty far-fetched to me.

Then let Pr(E/B) be the degree to which I expected to see this new evidence E based only on my old familiar background models B of how reality works. This term will be quite small if, for example, I see some evidence that at first, due to my old set of beliefs still dominating my thinking, I can’t believe is real. None of my old background knowledge B had prepared me for seeing this evidence.

These terms are not fractions in the normal sense. The forward slash in the way they’re written is not working in its usual sense here. For example, the term Pr(H/B) is called my prior expectation. The term refers to my estimate of the probability Pr that the hypothesis H is correct if I base that estimate only on how well the hypothesis fits my familiar old set of background assumptions, B, about reality. It doesn’t say anything like “hypothesis divided by background”.

The term Pr(E/H&B) means my estimate of the probability that the evidence will happen if I assume just for the sake of this term that my background assumptions and this new hypothesis are both true, i.e. if for a short while, I try to think as if the hypothesis and its base theory are true.

The most important part of the equation is Pr(H/E&B). It represents how much I am starting to believe that the hypothesis H must be right, now that I’ve seen this new evidence, all the while assuming that the evidence E is as I saw it, not an illusion of some kind, and that the rest of my old beliefs B are still in place.

Thus, the whole probability formula that describes this relationship can be expressed in the following way:             


                        
                               Pr(H/E&B) = [ Pr(E/H&B) × Pr(H/B) ] / Pr(E/B)
          


While this formula looks daunting, it actually says something fairly simple. A new hypothesis that I am trying to understand seems more likely to be correct the more I keep encountering new evidence that the hypothesis can explain and that my old models of reality can’t explain. When I set the values of these terms – probabilities that we’d normally express as percentages – I will assume, for the time being, that the evidence E is as I saw it, not a mistake or trick, and that I still accept the rest of my background ideas, B, about reality as being valid so that I can think and try to make sense of what I’m seeing at all.

I tend more and more to believe that a hypothesis is true the bigger Pr(E/H&B) gets and the smaller Pr(E/B) gets.

In other words, I increasingly tend to believe that a new way of explaining the world is true the more it works to explain evidence I keep encountering in the world, and the less I can explain that evidence if I don’t accept the new hypothesis and its base theory.
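For readers who like to see the arithmetic, here is a minimal sketch of one such update, written out in Python. The three input probabilities are invented for illustration only; they are not taken from any real case.

    # Bayes' theorem exactly as in the formula above, with invented values.
    prior_H_given_B = 0.02        # Pr(H/B): the new hypothesis seems far-fetched at first
    likelihood_E_given_HB = 0.90  # Pr(E/H&B): the evidence is strongly expected if H is true
    prob_E_given_B = 0.05         # Pr(E/B): the evidence is surprising under my old beliefs alone

    posterior_H_given_EB = likelihood_E_given_HB * prior_H_given_B / prob_E_given_B
    print(round(posterior_H_given_EB, 2))   # 0.36 -- confidence in H jumps from 2% to 36%

One surprising piece of evidence, well explained by the new hypothesis and badly explained by my old beliefs, is enough to move my confidence a long way.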

So far, so good. 




Spatially bound orbit in the Schwarzschild space-time surrounding the sun. The precession of the perihelion is clearly visible.

                            

             Perihelion precession of Mercury (credit: Wikimedia Commons)

Now, all of this may begin to seem intuitive, but once we have a formula set down it also is open to attack, and the critics of Bayesianism see a flaw in it that they consider fatal. The flaw they see is called “the problem of old evidence.”

One of the ways a new hypothesis gets more respect among experts in the field the hypothesis covers is by its ability to explain old evidence that old theories in the field have been unable to explain. For example, physicists all over the world felt that the probability that Einstein’s theory of relativity was right took a huge jump upward when he used his theory to account for the regular changes in the orbit of the planet Mercury – changes that were familiar to physicists, but that had long defied explanation by the old Newtonian model of the universe.

The constant shift in Mercury’s orbit had baffled astronomers since they had first acquired telescopes that enabled them to detect that shift. The shift could not be explained by pre-relativity models. But Relativity Theory could describe the gradual shift and make predictions about it that were extremely accurate.

Other examples of theories that worked to explain old evidence in many other branches of Science could easily be listed. Kuhn gives lots of them.1

What is wrong with Bayesianism, according to its critics, is that it can’t explain why we give more credence to a theory when we realize it can be used to explain old evidence that had long defied explanation by the established theories in the field. When the formula above is applied in this situation, critics say Pr(E/B) has to be considered equal to 100 percent, or absolute certainty, since the old evidence E has been accepted as real for a long time.

For the same reasons, Pr(E/H&B) has to be thought of as equal to 100 percent because the evidence has been reliably observed and recorded many times – since long before we ever had this new theory to consider.

When these two 100% probabilities are put into the equation, it looks like this:


                                             Pr(H/E&B) = Pr(H/B)


This new version of the formula emerges because Pr(E/B) and Pr(E/H&B) are now both equal to 100 percent, or a probability of 1.0, and thus they can be cancelled out of the equation. But that means that when I realize this new theory that I’m considering adding to my mental programming can be used to explain some nagging old problems in my field, my confidence in the new theory does not rise at all. Or to put the matter another way, after seeing the new theory explain some troubling old evidence, I trust the theory not one jot more than I did before I realized it might explain that old evidence.
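In the same arithmetic, the critics’ point looks like this (the prior is again an invented value). Treating the old evidence as completely certain makes the two evidence terms cancel, and the posterior collapses to the prior:

    # The "old evidence" objection: if E is treated as certain, nothing changes.
    prior_H_given_B = 0.02        # Pr(H/B): initial confidence in the new hypothesis
    likelihood_E_given_HB = 1.0   # Pr(E/H&B): the old evidence taken as certain
    prob_E_given_B = 1.0          # Pr(E/B): the old evidence taken as certain

    posterior_H_given_EB = likelihood_E_given_HB * prior_H_given_B / prob_E_given_B
    print(posterior_H_given_EB)   # 0.02 -- Pr(H/E&B) = Pr(H/B): no boost at all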

This is simply not what happens in real life. When we suddenly realize that a new theory or model can be used to solve some old problems that previously had been not solvable, we are impressed and definitely more inclined to believe that this new theory or model of reality is true. 


                      

        Pasteur in his laboratory (artist: Edelfelt) (credit: Wikimedia Commons)



An indifferent reaction to a new theory’s being able to explain confusing old evidence is simply not what happens in real life. When physicists around the world realized that the Theory of Relativity could be used to explain the shift in the orbit of Mercury, their confidence that the theory was correct shot up. Most humans are not just persuaded but exhilarated when a new theory they are beginning to understand gives them solutions to unsolved old problems. 

Hence, the critics say, Bayesianism is obviously not adequate as a way of describing human thinking. It can’t account for some of the ways of thinking that we’re certain we use. We do indeed test new theories against old, puzzling evidence all the time, and we do feel much more impressed with a new theory if it can account for that same evidence when all the old theories can’t.

The response in defense of Bayesianism is complex, but not that complex. What the critics seem not to grasp is the spirit of Bayesianism. In the deeply Bayesian way of seeing reality and our relationship to it, everything in the human mind is morphing and floating. The Bayesian picture of the mind sees us as testing, reassessing, and restructuring all our ways of understanding reality all the time.

In the formula above, the term for my degree of confidence in the evidence, when I take only my background beliefs as true – i.e. Pr(E/B) – is never 100%. Not even for very familiar old evidence. Nor is the term for my degree of confidence in the evidence if I include the hypothesis in my set of mental assumptions – i.e. Pr(E/H&B) – ever equal to 100%. I am never perfectly certain of anything, not of my background assumptions and not even any physical evidence I have seen repeatedly with my own eyes.

To closely consider this situation in which a hypothesis is used to try to explain old evidence, we need to examine the kinds of things that occur in the mind of a researcher in both the situation in which the new hypothesis does fit the old evidence and the one in which it doesn’t.

When a hypothesis successfully explains some old evidence, what the researcher is affirming is that, in the term Pr(E/H&B), the evidence fits the hypothesis, the hypothesis fits the evidence, and the background assumptions can be integrated with the hypothesis in a comprehensive way. She is delighted to see that, if she commits to this hypothesis and the theory underlying it, she can feel reassured that the old evidence did happen in the way in which she and her colleagues observed it. In short, she can feel reassured that they did the work well. She did not make any mistakes. She did see what she thought she saw.

Sloppy observing is a haunting fear for all scientists. It's nice to learn that you didn't mess up.  

All these logical and psychological factors raise her confidence that this new hypothesis and the theory behind it must be right.

This insight into the workings of Bayesian confirmation theory becomes even clearer when we consider what the researcher does when she finds that a hypothesis does not successfully account for the old evidence. In research, only rarely does a researcher in this situation simply drop the new hypothesis. Instead, she examines the hypothesis, the old evidence, and her background assumptions to see whether any or all of them may be adjusted, using new concepts or new calculations involving newly proposed variables or closer observations of the old evidence, so that all the elements in the Bayesian equation may be brought into harmony again. She gives the hypothesis really careful consideration. Every chance to prove itself.

When the old evidence is examined in light of the new hypothesis, if the hypothesis does successfully explain that old evidence, the scientist’s confidence in the hypothesis and her confidence in that old evidence both go up. Even if her prior confidence in that old evidence was really high, she can now feel more confident that she and her colleagues – even ones in the distant past – did observe that old evidence correctly and did record their observations well.

The value of this successful application of the new hypothesis to the old evidence may be small. Perhaps it raises her confidence in E, and thus the value of the term Pr(E/H&B), by only a fraction of 1 percent. But that is still a positive increase in the value of the whole term and therefore a kind of proof of the explicative value, rather than the predictive value, of the hypothesis being considered.
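To make that last point concrete, here is the same arithmetic once more, with invented values. If committing to the hypothesis lifts my confidence in the old evidence only slightly, from a Pr(E/B) of 0.99 to a Pr(E/H&B) of 0.999, the posterior confidence in the hypothesis still edges upward:

    # The Bayesian reply: near-certain old evidence still leaves room for an update.
    prior_H_given_B = 0.02         # Pr(H/B)
    likelihood_E_given_HB = 0.999  # Pr(E/H&B): E almost certain once H is assumed true
    prob_E_given_B = 0.99          # Pr(E/B): E very familiar, but never fully certain

    posterior_H_given_EB = likelihood_E_given_HB * prior_H_given_B / prob_E_given_B
    print(round(posterior_H_given_EB, 4))  # 0.0202 -- up slightly from the prior of 0.02

The boost is tiny, but it is positive, which is all the defense of Bayesianism needs.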

Meanwhile, Pr(H/E&B), i.e. the scientist’s degree of confidence in the new hypothesis, also goes up another notch as a result of the increase in her confidence in the evidence. A scientist, like all of us, finds reassurance in the feeling of mental harmony she gets when more of her perceptions, memories, and concepts about the world are brought into consonance with each other. (She feels relieved as her cognitive dissonance now goes down.)

A human mind experiences much cognitive dissonance when it keeps observing evidence that does not fit any of its models. The person attempting to explain evidence that is inconsistent with his world view clings to his background beliefs and shuts out the new theory his colleagues are discussing. He keeps insisting that this new evidence can’t be correct. Some systemic error must be leading those other researchers to think they have observed E, but they must be wrong. E is not what they say it is. “That can’t be right,” he says.

In the meantime, his more subversive colleague down the hall, even if only in her own mind, is arguing “I know what I saw. I know how careful I’ve been. E is right. Thus, the probability of H, at least in my mind, has grown. It’s such a relief to see a way out of all the cognitive dissonance I’ve been experiencing for the last few months. I get it now. Wow, this feels good!” Settling a score with a stubborn bit of old evidence that refused to fit into any of a scientist’s models of reality is a bit like finally whipping a bully who picked on her in elementary school – not really logical, but still very satisfying.

Normally, testing a new hypothesis involves performing an experiment that will generate new evidence. If the experiment delivers evidence that was predicted by the hypothesis, but not by our background concepts, then the hypothesis, as a way of explaining the real world, seems more likely or probable to us. The new evidence confirms the hypothesis. That’s Bayesianism and it fits us exactly.

But I may also decide to try to use a hypothesis and the theory it is based on to explain some problematic old evidence. I’ll be studying whether the hypothesis and its predictions do in fact fit the old evidence situations. If I find that the hypothesis and the theory it is based on do successfully explain that problematic old evidence, what I’m confirming is not just the hypothesis and the theory it is based on, but also a new consistency between the evidence, the hypothesis, and all, or nearly all, of my background set of concepts. (I likely will have to drop a few of my old ways of thinking to make room for the new theory.)



  
                               Levitation (Is it real?) (credit: Wikimedia Commons)



And no, it is not obvious that evidence seen with my own eyes is 100 percent reliable, not even if I’ve seen a particular phenomenon repeated many times. Neither my longest-held, familiar background concepts nor the ordinary sense data I see in everyday experiences are trusted that much. If they were, then I and anyone who trusts gravity, light and human anatomy would be unable to watch a good magic show without having a nervous breakdown. Elephants disappear, men float, and women get sawn in half. By pure logic, if my most basic concepts were believed at the 100 percent level, then either I would have to gouge my eyes out or go mad. 

But I know the magic is all a trick of some kind. And I choose, for the duration of the show, to suspend my desire to harmonize all my sense data with my set of background concepts. It is supposed to be a performance of fun and wonder. If I explain how the trick is done, I ruin my grandkids’ fun … and my own.

It’s important to point out here that the idea behind H&B, the set of the new hypothesis/theory plus my background concepts, is more complex than the equation can capture. This part of the formula should be read: “If I integrate the hypothesis into my whole background concept set.” The formula attempts to capture in symbols something that is almost not capturable. This is because the point of positing a hypothesis, H, is that it doesn’t fit neatly into my background set of beliefs. It is built around a new way of comprehending reality, and thus, it will only be fully integrated into my old background set of concepts and beliefs if some of those concepts are adjusted, by careful, gradual tinkering, and then, some are removed entirely.

Similarly, in the term Pr(H/E&B), the E&B part is trying to capture something no math term can capture. E&B is trying to say: “If I take both the evidence and my set of background beliefs to be 100% reliable.” 

But that way of stating the E&B part of the term merely highlights the issue of problematic old evidence. This evidence is problematic because I can’t make it consistent with my set of background concepts and beliefs, no matter how I tinker with them.

All the whole formula really does is try to capture the gist of human thinking and learning. It is a useful portrayal,  a kind of metaphor, but we can’t become complacent about this formula for the Bayesian model of human thinking and learning any more than we can become complacent about any of our concepts. And that thought is consistent with the spirit of Bayesianism. It tells us not to become too blindly attached to any of our concepts; any of them may have to be radically updated and revised at any time.

Thus, on closer examination, the criticism which says the Bayesian model can’t explain why we find it reassuring when a hypothesis fits some problematic old evidence turns out not to be fatal. It is more a useful tool, one that we may use to deepen our understanding of the Bayesian model of human thinking.

We can hold onto the Bayesian model if we accept that all the concepts, thought patterns, and patterns of neuron firings in the brain – hypotheses, evidence, and assumed background concepts – are forming, reforming, aligning, realigning, and floating in and out of one another all the time – even concepts as basic as the ones we have about gravity, matter, space, and time. This whole view of the scary idea called “Bayesianism” arises if we simply apply Bayesianism to itself.

In short, Bayesianism says we keep adjusting our thinking until we die. 

The Bayesian way of thinking about our own thinking requires us to be willing to float all our concepts, even our most deeply held ones. Some are more central, and we use them more often with more confidence. A few we may believe almost absolutely. But in the end, none of our concepts is irreplaceable.

For humans, the mind is our means of surviving. It will adapt to almost anything. Let war, famine, plague, economics, technology and so on do what they may, rattle living styles and ways to their foundations. We choose to go on.

We gamble heavily on the concepts we routinely use to organize our sense data and memories of sense data. I use my concepts to organize the memories already stored in my brain and the new sense data that are flooding into my brain all the time. I keep trying to acquire more concepts – including concepts for organizing other concepts – that will enable me to utilize my memories more efficiently to make faster and better decisions and to act increasingly effectively. In this constant, restless, searching mental life of mine, I never trust anything absolutely. If I did, a simple magic show would mesmerize and paralyze me. Or reduce me to catatonia.

But I choose to stand by my concepts in almost every such case, not because I am certain they’re perfect, but because they’ve been tested and found effective over so many trials and for so long that I’m willing to keep gambling on them. At least until someone proposes something even more promising to me. I don’t know for certain that the theories of the real world that my culture has gained via the scientific method are sure bets; they just seem very likely to be the most promising options available to me now. And I need at least some theories about reality in place every day. I have to see, recognize, and act. I can’t sit catatonic.



                             
                              Harry Houdini with his “disappearing” elephant, Jennie 
                                             (credit: Wikimedia Commons)



Life is constantly making demands on me to move and keep moving. I have to gamble on some models of reality just to live my life; I go with my best horses, my most successful and trusted concepts. And sometimes, I change my mind.

This flexibility on my part is not weakness or lack of discipline; it is just life. Bayesianism echoes Kuhn’s thesis in The Structure of Scientific Revolutions: we are constantly adjusting all our concepts as we try to make our ways of dealing with reality more effective.

And when a researcher begins to grasp a new hypothesis and the theory it is based on, the resulting experience is like a religious “awakening” – profound, even life-altering. Everything changes when we accept a new model or theory because we change. How we perceive and think changes. In order to “get it”, we have to change. We have to eliminate some old beliefs from our familiar background belief set and literally see in a new way.

And what of the shifting nature of our view of reality and the gambling spirit that is implicit in the Bayesian model? The general tone of all our mental experiences tells us that this overall view of our world and ourselves – though it may seem scary or maybe, for confident individuals, challenging – is just life.

We have now arrived at a point where we can feel confident that Bayesianism gives us a good base on which to build further reasoning. Solid enough to use and so to get on with all the other thinking that must be done. It can answer its critics – both those who attack it with real-world counterexamples and those who attack it with pure logic. And it outperforms Rationalism and Empiricism every time.

Bayesianism is not logically unshakable. But in a sensible view of our world and ourselves, Bayesianism serves well. First, because it makes sense when it is applied to our real problem-solving behavior; second, because it works even when it is applied to itself; third, because we must have a foundational belief of some kind in place in order to get on with building a universal moral code; and, finally, because – as was shown earlier – we have to build that code. That task is required of us. Without it, we aren’t going to …anything.

We are now at a good place to pause to summarize our case so far. The next chapter is devoted to that summing up.




Notes

1.       Thomas Kuhn, The Structure of Scientific Revolutions (Chicago: The University of Chicago Press, 3rd ed., 1996).








Monday, 25 November 2019


Chapter 9    The Practical Criticism of Bayesianism

In the first place, say its critics, Bayesianism simply can’t be an accurate model of how humans think because humans violate Bayesian principles of rationality every day. Every day, we commit acts that are at odds with what both reasoning and experience have shown is rational. Some societies still execute murderers. Men continue to bully and exploit women. Some adults still spank children. We fear people who look different from us on no other grounds than that they look different from us. We shun them even when we have evidence showing there are many trustworthy individuals in that other group and many untrustworthy ones in the group of people who look like us. We do these things even when experience indicates such behaviors and beliefs are counterproductive.

Over and over, we act in ways that are illogical by Bayesian standards. We stake the best of our human and material resources on ways of behaving that both reasoning and evidence say are not likely to work, and in fact, are often counterproductive. Can Bayesianism account for these glaring bits of evidence that are inconsistent with its model of human thinking?

The answer to this critique is disturbing. The problem is not that the Bayesian model doesn’t work as an explanation of human behavior and thinking. The problem is rather that the Bayesian model of human thinking and the behaviors driven by that thinking works too well. The irrational, un-Bayesian behaviors individuals engage in are not proof of Bayesianism’s inadequacy, but rather parts of a larger proof of how it applies to the thinking, learning, and behavior not just of individuals, but of whole communities and even whole nations.

Societies continually evolve and change because every society contains at least a few people who are naturally curious. Curious people constantly imagine and test new ideas and new ways of doing daily things like getting food, raising kids, fighting off invaders, healing the sick – any of the things the society must do in order to carry on. Often, other subgroups in society view any new concept or way of doing things as threatening to their most deeply held beliefs. If adherents of the new idea keep demonstrating that their idea works and that the reactionaries’ ways are obsolete, then the larger society usually marginalizes the less effectual members and their ideas. In this way, a society mirrors what an individual does when he finds a better way of growing corn or teaching kids or easing Grampa’s arthritic pain. In this way, we adapt – as individuals, but more profoundly, as societies – to changes in our environments, and to new lands and markets and new technologies such as vaccinations, cars, televisions, computers, and so on. Farmers, carpenters, teachers, healers, etc. who cling to obsolete ways are simply passed by, eventually even by their own grandchildren.

But then there are the more disturbing cases, the ones that caused me to write in my last chapter that we are almost completely devoid of any unshakable beliefs. Sometimes large minorities or even majorities of citizens do hang on to obsolete concepts and ways, in spite of mounds of evidence which say those ideas don’t work as well as the new ones others are using.

The Bayesian model of human thinking works well, most of the time, to explain how individuals form and evolve their basic idea systems. Most of the time, the model also can explain how a whole community, tribe, or nation can grow and change its sets of beliefs, thinking styles, customs, and practices. But can it account for the times when majorities in a society do not embrace a new way, in spite of the Bayesian observations and calculations showing the idea is sound and useful? In short, can the Bayesian model explain the dark side of tribalism?
                    
                              


        Nazi party rally, 1934. Tribalism at its worst (credit: Wikimedia Commons)




As we saw in our last chapter, for the most part, individuals become willing to drop a set of ideas that seems to be losing its effectiveness when they encounter a new set of ideas that looks more promising. They embrace the new ideas that perform well, that more effectively guide the individual, the family, or even their whole society through the challenges and hazards of real life. 

At the tribal level, whole societies usually drop paradigms, and the ways of thinking and living based on those paradigms, when citizens repeatedly see that the old ideas are no longer working and that a set of new ideas is getting better results. When your neighbors are producing bigger crops, you want to know how and why, and you want to implement new practices that work.

Sometimes, on the level of social change, this mechanism can cause societies to marginalize or ostracize subcultures that refuse to let go of the old ways. Cars and "car people" marginalized the horse culture within a generation. Assembly line factories brought the unit cost of goods down until millions who had once thought that they would never have a car or an icebox bought one on credit and owned it in a year. When assembly line factories came in, old, small-scale shops in which teams of ten men made whole cars, one at a time, were obsolete.

The point is that when a new subculture with new beliefs and ways keeps getting good results, and the old subculture keeps proving ineffectual by comparison, the majority usually do make the switch to the new way – of chipping flint, growing corn, spearing fish, making arrows, weaving cloth, building ships, forging gun barrels, dispersing capital to the enterprises with the best growth potential, or connecting a computer to the worldwide net.

It is also important to note here that, for most new paradigms and practices, the tests applied to them only confirm that the old way is still better. Most new ideas are tested and found to be less effective than the established ones. Only rarely does a superior one come along.

But the crucial insight into why humans sometimes do very un-Bayesian things is the one that comes next.

Sometimes, if a new paradigm challenges a tribe’s core beliefs, Bayesian calculations about what a society will do next break down. Sometimes tribes continue to adhere to obsolete beliefs. The larger question here is whether the Bayesian model of human thinking, when taken up to the level of human social/cultural evolution, can account for these un-Bayesian choices and actions.

Our most deeply held beliefs are those that guide our interactions with other humans – family, friends, neighbors, colleagues, fellow citizens, and foreigners. These are the parts of our lives that we usually see as being guided not by reason but by deep moral beliefs – beliefs grounded in sources much more profound than our beliefs about the physical world. In anthropological terms, these are the beliefs that enable the members of the tribe to achieve social solidarity – to live together without violence, interact, achieve teamwork, and get along.

The continued exploitation of women and execution of murderers described above are both irrational, but are both consequences of the fact that, in spite of our worries about the failures of our moral code in the last hundred years, much of that code lingers on. In many aspects of our lives, we are still drifting with traditional ways, even though our confidence in those ways is eroding steadily. We don’t know what else to do. In the meantime, these traditional ways are so deeply ingrained and familiar as to seem to us to be “natural”, in spite of mounds of evidence showing that they are counterproductive.

When we study the deepest and most profound of these “traditional” beliefs, we are dealing with those beliefs that are most powerfully programmed into every growing child by nearly all of his tribe’s adult members. These beliefs aren’t subject to the Bayesian models that usually govern the learning processes of the individual human. In fact, they are almost always viewed by the individual as being the most crucial parts of his culture and himself. They are guarded in the mind by programmed emotions of fear and anger. We get scared and mad when we think our values are being threatened. They are the beliefs that our parents, teachers, storytellers, and leaders enjoined us to hang on to at all cost. In fact, for most people in most societies, these beliefs and the mores that grow from them are seen as being “normal”. Varying from them is viewed as “abnormal”.

For centuries, in the West, our moral meta-belief – that is to say, our belief about our moral beliefs – was that they had been set down by God and, thus, were universal and eternal. When we took that view, we were in effect placing our moral beliefs in a separate category from the rest, a category meant to guarantee their inviolability. Non-Western societies do parallel things. 
                                                               



                                       John Stuart Mill (credit: Wikimedia Commons) 




But are our moral beliefs really different from our beliefs in areas like Science, Athletics, farming, cooking, or automotive mechanics?

The answer is “yes and no”. We are eager to learn better farming practices and medical procedures, and to win at track meets. But, in their attitudes about the executing of our worst criminals or the exploitation of women, many in our society are reluctant to change. Historical evidence shows societies can change in these areas, but only grudgingly. (J. S. Mill, a nineteenth-century British philosopher, discussed the obstinacy of old ways of thinking about women, for example, in the introduction to his essay, The Subjection of Women.1)

So, do these core beliefs – our values – still operate under the Bayesian model? Yes. But in a very harsh way. Sometimes, the moral beliefs that humans hold most deeply only get changed in an entire nation when experience shows by pain that the old values no longer work, i.e. when the values fail to provide guidelines by which the humans who hold them can make choices, act, and live their lives effectively. In the extreme cases, the values fail so totally that the people who hold the old values begin to die out. They become ill and die young, or they fail to reproduce, or they fail to program their values into their young. Or the whole tribe may be overrun. By one of these mechanisms, a tribe’s entire culture and value system can die out. The tribe’s genes may go on in children born from the merging of two tribes, the victors and the losers, but most of the losing tribe’s values, beliefs, and mores – i.e. its culture – become footnotes in History.

And so it is that, as the critics of Bayesianism point out, humans often do behave in ways that seem irrational by purely Bayesian standards. We fly in the face of what reason and evidence say would be our best policy. 

Even in our time, some adults still spank kids. Some men still bully women. Some states still execute their worst criminals. Research based on observation and analysis of these patterns of behavior says that they don’t work; these behaviors do not achieve the results that they aim for. In fact, they reduce the chances that we will achieve those results. These patterns of behavior and the beliefs underlying them exactly fit the term counterproductive.

Why? Because our culture’s most profound programming institutions – the family, the schools, and the media – continue to indoctrinate us with these values so deeply that once we are adults, we refuse to examine them. Instead, our programming causes us to bristle, and then to defend our “good old ways”, violently if need be. If the ensuing lessons are harsh enough, and there is a reasonable amount of available time, a whole society can sometimes learn, change its ways, and then adapt. But deep social change is always difficult. Alfred North Whitehead, in his 1927 essay Symbolism: Its Meaning and Effect, wrote:
“… the major advances in civilization are processes which all but wreck the societies in which they occur.”2
                



                                 Lethal injection room, used to execute criminals  
                                              (credit: Wikimedia Commons) 




                                 
       
                                                       Alfred North Whitehead
                                    (credit: Internet Encyclopedia of Philosophy) 





It is also worthwhile to say the obvious here, however politically incorrect it may be: all our obsolete-but-obstinate beliefs, values, and behavior patterns did serve useful ends at one time. That is why we acquired them in the first place. 

For example, in some but not all early societies, women were taught to be submissive, first to their fathers and brothers, then to their husbands. The majority of men in such societies were thus rendered more likely to help to nurture the children of their socially sanctioned marriages because each man was confident the children born to “his” women were biologically his own.

Raising kids is hard work. In early societies, if both parents were committed to the task, the odds were better that those children would grow up, marry, have kids of their own, and then program into those kids the values and roles that the parents themselves had been raised to believe in. Other non-patriarchal societies taught other roles for men and women and other designs for the family, but they weren’t as prolific as patriarchy was over the long haul.

Patriarchy isn’t fair. But it makes babies who become adult citizens. Workers. Soldiers. Lots of them. This view of patriarchy is harsh, but real.

Traditional beliefs about male and female roles didn’t work to make people happy. But they did give some tribes numbers and power. They are obsolete today, partly because child nurturing has been largely taken over by the state (public schools), partly because no society in a post-industrial, knowledge-driven economy can afford to stifle half of its human resources (i.e. the female half), and partly because there are too many humans polluting this planet now. 

Population growth is no longer a wise goal because it no longer brings a nation power. In today’s world, millions of poor are more likely to be a liability than an asset for a nation. If they suffer too much, they might even start a violent revolution and unravel their own way of life, i.e. overthrow their government. 

Like patriarchy, all our traditional values, mores, and roles once served useful purposes. Many of them don’t anymore. But it is like pulling teeth without anaesthetic to get the reactionaries among us to admit that many of their cherished “good old ways”, in today's world, are only in the way.

But in general, in all areas of our lives, even those areas we think of as sacred, traditional, and timeless, we humans do change our beliefs, values, and patterns of behavior over time by the Bayesian way. The change may take a generation or two, but we eventually adopt a new view of reality and the human place in it if that new view is more coherent with the facts we are observing, and especially if our lives clearly do improve when we switch over to the new way (of growing food, making tools, curing diseases, etc.). Societies that won’t change die out.

We’ve come a long way in the West, for example, in our treatment of women and minorities. Values do evolve. Our justice systems aren’t race or gender neutral yet, but they’re much better than they were even sixty years ago.

The larger point can be reiterated. For deep social change, we undergo the Bayesian decision process, but only in the most final of senses. Sometimes it’s not the individual who has to learn to adopt new beliefs, values, and mores; sometimes it is a whole community or even nation. And once in a while, a nation that simply gets culturally overwhelmed - by too much change too fast - dies out, as a nation/culture, completely.

The El Molo ethnic group in Kenya is almost gone. The Canaanite, Bo, Anasazi, and Beothuk peoples are gone. Troy and Carthage are gone.

None of this is fair. It’s just over.


                              


                                                Demasduit, one of the last of the Beothuk
                                                (credit: Wikimedia Commons) 




In the more gradual adjustments that some societies manage to achieve, it also sometimes happens that subcultures within a society die out without the whole tribe dying out. Thus, some values and beliefs in the culture disappear while the larger culture itself, after sustaining trauma and healing, adjusts and goes on.

For example, Hitler and his Nazi cronies ranted until their last hour that their “race” should fight on until they all went down in a sea of blood because, in the most vital of arenas, namely war, they had shown themselves weaker than the Russians. Hitler sincerely believed his Nazi philosophy. In the same era, the Japanese cabinet and High Command contained members who were adamant in arguing that the Japanese people should fight on, even in the face of hopeless odds. To do anything other than fight on was inconceivable to these men. (Yukio Mishima’s case was a curious last gasp of Japanese imperialism.3) Fortunately, people who could face reality and adapt prevailed, in both Germany and Japan.

                                      


                                   Yukio Mishima (credit: Wikimedia Commons) 




In a Computing Science metaphor, a culture is just the software of a nation. Or in another metaphor, we can say a culture evolves and survives, or else falls behind and dies, in ways that are analogous to the ways in which a genome thrives or dies. If a nation’s culture – that is, its software – gets good practical results over generations, its carriers multiply; if not, they don’t, and then they and it fade out of Homo sapiens’ total population and culture pool.

What was sad but true for centuries was that a culture’s fitness was sometimes tested by famine or epidemic or natural disaster, but most often it was tested by war. For centuries, when a tribe, operating under its culture, was no longer tough enough to hold its territory against invasions by neighbouring tribes, it fought and lost. Its men were killed, its women and children were carried off by the enemy; its way of life dwindled and was absorbed, or in some cases, vanished entirely. Thus, Joshua smote Hazor, the ancient Greeks crushed Troy, the Romans crushed Carthage. The examples could go on.

                  



                                         Ruins of Carthage in modern Tunisia 
                                                  (credit: Wikimedia Commons)





But was Hitler right? Is war inevitable? Even desirable? It depends. The key question is whether we will ever rise above our present, mainly war-driven system of cultural evolution into living by something even more effective. I think it is clear that we have to. Our weapons have grown too big. We have to learn a new way if our species is to live. By reason or suffering or both, we are going to have to arrive at a new way of regularly updating our values and our patterns of group behavior. Either war is obsolete, or we are.

Changes in our circumstances are always coming at us. Some of them we even cause. We can cushion our way of life against them for a while, but over time, reality demands that we either evolve or die out, and in this case, “evolve” means “update our culture”. However, for now, I will leave the war digression and the sociocultural mechanism of human evolution to be more thoroughly discussed in later chapters.

For now, then, let’s settle for saying that the point critics of Bayesianism make about the way in which some human behaviors do not seem to be based on Bayesian types of calculations only looks at first like a successful criticism. If we study the matter more deeply, we see that we do indeed have attachments to some of our most counterproductive values and mores, but there are reasons for those attachments. These attachments are repulsive, warmongering programs that lie embedded deep in us. They are software design features that have become design flaws because they have long since fallen out of touch with the physical reality that surrounds us and with the dilemma in which we find ourselves. As Kennedy said, “Mankind must put an end to war or war will put an end to mankind.”4
                                                

                           



                           John F. Kennedy, 35th president of the United States
                                                (credit: Wikimedia Commons) 





The point to be drawn from this chapter then is simply this: the Bayesian model of human thinking still holds. Bayesianism can explain why humans hold on to backward, obsolete ideas. Deeply held beliefs and mores do get changed by the Bayesian way, but nearly always such change comes by national-scale pain – war, famine, or plague – over generations. In modern times, we must do better.

I will have more to say on these matters in later chapters. The first big criticism of Bayesianism has been dealt with. The Bayesian model, when it is applied at the tribal level of human behavior, can fully account for the apparently un-Bayesian behaviors of individuals. 

I now must go on to deal with the second large criticism of Bayesianism, the theoretical one.

And perhaps this is the point at which I should also say that the next chapter is fairly technical, and it isn’t essential to my case. If you want to skip a chapter, it is one you can skip and still not lose the train of thought leading to the conclusion of the full argument.



Notes

1. John Stuart Mill, The Subjection of Women (1869 essay). The Constitution Society website. http://www.constitution.org/jsm/women.htm.

2. Alfred North Whitehead, Symbolism: Its Meaning and Effect (University of Virginia: Barbour-Page Lectures, 1927).

3. Biography of Yukio Mishima, Wikipedia, the Free Encyclopedia. Accessed April 8, 2015. http://en.wikipedia.org/wiki/Yukio_Mishima.

4. John F. Kennedy, Address to the United Nations General Assembly, New York, NY, September 25, 1961.