Great Clarendon Street, Oxford, OX2 6DP, United Kingdom
Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this work in any other form and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press 198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data Data available
Library of Congress Control Number: 2021937968
ISBN 978–0–19–886902–3
DOI: 10.1093/oso/9780198869023.001.0001
Printed in Great Britain by Bell & Bain Ltd., Glasgow
Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.
Individual–Collective
Randomness–Meaning
Foresight–Hindsight
Uniformity–Variability
Disruption–Opportunity
To Beverley, my wife. For sharing the journey, with all its challenges and chances.
PREFACE
Writing about uncertainty during a pandemic
The bulk of this book was written during 2019, before the outbreak of the virus that led to the Covid-19 pandemic which made 2020 such an extraordinarily difficult year. To someone writing a book about how people understand risk and uncertainty, this posed a challenge. How much of the text would have to be rewritten? I could see myself adapting examples, passages, even adding whole chapters to accommodate the disease and our response to it. On reflection, though, I decided that wholesale revision was a bad idea. If the ideas I had expressed on chance and uncertainty were right before the Covid-19 outbreak, then they were right after it, too.
Perhaps the world is now better informed about risks and more sensitive to the kinds of issues that can arise from different ways of understanding risk. But there is still little consensus on how to assess levels of risk, and the most appropriate personal and public responses. In my opinion, the pandemic has only increased the need for a better appreciation of chance in public affairs and shown how difficult clear thinking about uncertainty can be. Nowhere is this more evident than in considering the choices to be made between individual rights and collective responsibility. One of the themes in this book is the difficulty of reconciling contrasting ways of thinking about probability. Should I act in a way that best serves my own interests, exercising my own judgement and evaluation of risk, taking into account personal factors, or should I rather follow the rules, approximate and imperfect though they may be, even if I feel that my circumstances are exceptional? These issues have been highlighted by the choices we have all needed to make, and they will still be relevant when the crisis has passed. We will always have to deal with the challenges posed by uncertainty. I did not want this book to become primarily about this pandemic or for its general themes to be overshadowed by the particular issues the year 2020 has raised.
While there has been no wholesale revision of the book, Covid-19 has certainly left its mark. Where it seemed necessary to add something, I have made small changes to the text to include references to the virus in examples and in analysis. The major change is at the beginning of the chapter ‘What’s there to worry about?’. It seemed fitting to devote that opening to a snapshot description of the state of the outbreak in late February 2020, and my personal worries at that moment in light of the themes of the book. I have not revised that section: it can stand as an example of how we regard risks differently in prospect and in retrospect.
Peper Harow July 2020
INTRODUCTION
Living in an Uncertain World
The laws of probability, so true in general, so fallacious in particular.
Edward Gibbon
Sucker bet or sure thing?
You’re standing at the hotel bar. Not far along the counter is a well-dressed gentleman with a whisky in front of him who is quietly flicking through a deck of cards.
‘Funny thing,’ he says idly.
‘What?’
‘Did you know that the court cards—the king, the queen and the jack—are rather pushy characters and always turn up more often than it seems they should?’
What nonsense, you think to yourself. ‘Are you sure?’ you say. ‘That can’t be true.’
‘Oh yes,’ he responds, ‘look here. I’ll take the hearts from this pack. Thirteen cards. Ace to 10, and the jack, queen and king. I’ll shuffle them. Now, you choose any three cards from these 13. If you can avoid all three court cards, you win. If jack, queen or king shows up, I win. Of course, there’s a chance you might win, but I’m betting you’ll lose.’
Three cards, you think. That should be easy. The odds are definitely with me at the start, but even after I’ve drawn two cards, there will still be eight noncourt cards and only three court cards left. I’d still have a great chance of avoiding those, even on the third draw. Seems like it might be a good bet.
Well, do you take the bet or not? What do you reckon the chances are?
Answers at the end of this introduction.
A matter of chance; a matter of life or death
In 1999, a British solicitor, Sally Clark, was convicted of killing her two children in 1996 and 1998, at the ages of 11 weeks and 8 weeks respectively. The first child’s death had been ascribed to Sudden Infant Death Syndrome (SIDS), but when a second death occurred in the family, the mother was charged, and after a long and involved trial, she was convicted.
The medical evidence presented was confusing and sometimes contradictory. One piece of non-medical evidence attracted attention. A consultant paediatrician, Sir Roy Meadow, presented as evidence a calculation that the chance of two SIDS deaths in the same family was around 1 in 73 million. He likened this to the chance of winning a bet on a rank outsider running in the Grand National horse race for four years in succession. This probability—or rather improbability—when presented as a compelling and dramatic comparison stood out in contrast to the confusing medical evidence, and is thought to have had considerable influence on the jury.
There was just one problem: it was wrong. The Royal Statistical Society later prepared a report on the case, pointing out the statistical errors. When further information came to light that a forensic pathologist had withheld evidence, Sally Clark’s conviction was quashed in 2003. Four years later, she died of acute alcohol poisoning.
The misleading statistical evidence had arisen from an elementary error in reasoning about probability. Meadow had taken a single statistic from a study that reported that, for a household like Sally Clark’s, the prevalence of SIDS was 1 in 8,543. He interpreted this as a chance of 1 in 8,543 that a child in such a household would suffer SIDS, and to calculate the odds of this happening twice in the same household, he multiplied that figure by itself, to get to a chance of 1 in 72,982,849. Several errors of statistical reasoning were made, but the most important is one that any student of statistics should easily have spotted. It is correct that, to reckon the combined probability of two independent events occurring, you should multiply together their individual probabilities. But take note: this rule only applies to independent events, where there are no plausible linking factors. Although the underlying cause or causes of the deaths of the two infants was not, and is not, known, it is strongly plausible that there was some common linking factor (or factors) in that family, or that household, that contributed to the deaths. The events were not independent, and so the calculation of the probability of two deaths as presented was incorrect. And because of the colourful way it was explained, it carried weight.
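The arithmetic behind this error can be made concrete with a minimal sketch. The first calculation reproduces the figure quoted above. The second uses an invented conditional chance (1 in 100), chosen purely for illustration and not drawn from any study, to show how sharply the answer moves once the two events are allowed to be linked.

```python
from fractions import Fraction

# Figure quoted in the text: 1 in 8,543 for a single SIDS death
# in a household of this type.
p_single = Fraction(1, 8543)

# Multiplying the figure by itself is valid only if the two deaths
# are independent events.
p_if_independent = p_single * p_single
print(f"Assuming independence: 1 in {p_if_independent.denominator:,}")

# HYPOTHETICAL value, for illustration only: suppose some shared factor
# meant that, given one death, the chance of a second was 1 in 100.
p_second_given_first = Fraction(1, 100)
p_if_linked = p_single * p_second_given_first
print(f"With the illustrative link: 1 in {p_if_linked.denominator:,}")
```

Under these assumptions the first line prints 1 in 72,982,849 and the second 1 in 854,300. The point is not the second number itself, which is made up, but that the headline figure depends entirely on an independence assumption that was never justified.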
Sir Roy Meadow got his statistics wrong. Perhaps more worrying is that he was allowed to get his statistics wrong, and the error was neither detected by the judge, nor by the prosecution team of lawyers, nor even by Sally Clark’s defence team. Even at the first appeal to the conviction, the statistical errors were dismissed by the appeal judges as of little significance.
Reasoning about chance is difficult. It is a subject where your intuition can easily be misled, and where there are few natural touchpoints to validate your conclusions. A definite calculation, flashily and confidently presented, carries a lot of weight. But if we want to be smarter in the way we reason about uncertainty, and avoid making errors, we need to understand where the difficulties arise.
Chance is everywhere
The true logic of this world is in the calculus of probabilities.
James Clerk Maxwell
Even when we have a measurement of chance, it is not always easy to understand what the numbers mean. Some examples:
• On the eve of the 2016 US Presidential Election, the polling aggregator website FiveThirtyEight reported that their election model rated Donald Trump’s chance of victory as around 29 per cent. At the same time, the Princeton Election Consortium had Trump’s chances at around 7 per cent. Were the websites wrong? What does it mean to assign a percentage chance to a one-off event like this? After all, an election is not a repeatable event like the flipping of a coin.1
1 In fact, running the election very many times is close to what the polling models at FiveThirtyEight.com do. Not the actual election, of course, but their constructed mathematical model of the election, including many chance elements.
• I look at the weather app on my smartphone. For 11.00 tomorrow it notes a 20 per cent chance of rain. What does this mean? Is it a 1 in 5 chance that at least one drop of rain will fall in my locality in the hour after 11.00, or does it mean that I can expect 12 minutes of rain in that hour? Or does it mean something else entirely?
• In the UK National Lottery main draw there is a 1 in 45 million chance that the numbers that come up will match my chosen six numbers. What am I to make of such a huge number? How should I understand such a small probability?
• The Covid-19 pandemic generated a flood of information about the risks involved, and what might need to be done to keep people safe. How could I tell reliable information from rumour? What was I to make of it when different authorities issued different advice?
• We hear that many species of insect are at risk of extinction. But how big is that risk? What are the consequences? How sure are the scientists? How worried should we be? And, perhaps more to the point, what actions might have a chance of addressing the problem?
We engage with uncertainty everywhere. We play childish (and not-so-childish) games that rely on chance. The stock market behaves (at least in part) as a random process. Medical diagnoses are couched in terms of probability and treatments are recommended based on statistical measures of their efficacy. Discoveries and artistic inventions happen when random influences bump up against each other.
Striving to be rational and numerate, we attempt to get a grip on this by measuring the uncertainty and assigning numbers to the chances. But, as the Sally Clark case illustrates, few non-specialists are thoroughly confident in their understanding of how chance works. Our theory of probability is relatively recent. There was no coherent way to measure uncertainty in the ancient world, and it was not until the sixteenth and seventeenth centuries that the mathematical foundations of probability were first clearly articulated. Systematic thinking about chance does not come naturally, so perhaps it should not be surprising that it still remains difficult for most of us.
For there is more to the way we think about chance than simply understanding the numbers: our feelings about uncertainty turn into hopes and fears. The statistician in me says that such emotive responses represent barriers to clear thinking. But I know, and perhaps we all know, how our own hopes and fears can distort our capacity for rational assessment. We live in a time where manipulation of sentiment is rife, and injurious to public debate. ‘Fake news’ deliberately plays on those hopes and fears.
Anti-vaccination campaigners, to take one example, exploit uncertainty; they amplify parents’ understandable fears of doing harm to their children while downplaying the risk to the health of the population at large. Paradoxically, this increases the danger for all children. We see tensions of this kind over and over again when we look at the public presentation of uncertainty. News programmes clearly understand that the testimonies of individuals draw better audiences than the summaries of statisticians.
Modern politics is driven by fear, often based on uncertainty. A measured argument made in cautious terms, expressing a reasonable degree of doubt, is dismissed with the words ‘but you can’t be sure, can you?’ Journalists tempt their interviewees into expressing greater certainty than is wise: ‘Minister, can you guarantee that…?’ These attempts to elicit absolute guarantees are in vain. The minister will either hedge, and appear weak, or they will lie, and risk being exposed when their guarantees prove to be worthless. Most of the time, complete certainty is impossible. It is better by far to develop a clear-sighted appreciation of risk and chance, and understand how they may be measured and weighed.
When an argument depends on clear facts, it should be straightforward to confirm the facts and settle the matter. But much public debate goes beyond clear facts. Predictions of rising global temperatures are probabilistic in nature. Concerns for the extinction of species are based on statistical surveys. The evidence may be strong, clear, and compelling, but few experts would claim to know absolute truth, or to be able to predict the future perfectly. In part, this is because how things turn out depends on our actions: a disaster forestalled by informed action does not mean an incorrect prediction.
Where arguments are based on likelihoods, probabilities, and predictions, there is never any chance of final certainty, and everything is a question of balanced judgement. Clear reasoning about uncertain matters is essential, and for that we need to quantify uncertainty. The scientific method always leaves room for doubt and challenge: that is part of its strength. But falling short of absolute certainty is sometimes presented as a fatal weakness. We need to be able to understand the workings of chance, how to quantify risks and opportunities, how much to trust what we hear, and so how to successfully navigate this uncertain world.
Thinking about uncertainty
I am often asked why people tend to find probability a difficult and unintuitive idea, and I reply that, after forty years researching and teaching in this area, I have finally concluded that it is because probability really is a difficult and unintuitive idea.
David Spiegelhalter
This book ranges through a variety of topics in exploring different ways that chance operates. Despite the varied contexts, though, in trying to understand why chance is so tricky to grasp, there are some patterns of thinking that keep recurring. Here’s how I’ve organized them for this book.
Five dualities
Often, when thinking about chance, we find two points of view tugging our thoughts in different directions. These dualities (as I term them) are confusing, since the contrasting perspectives may both be valid. The tensions that are created go some way to explaining why chance can be so slippery to grasp. It’s not that one way of thinking is right and the other is wrong: to think clearly about chance, you need to be aware of both perspectives, and hold them both in your mind at the same time.
Individual–Collective
Probability makes most sense when big numbers are involved. To understand the patterns that lie behind the uncertainties of life, we need data—and the more of it, the better. And yet we live our lives as individuals, as a succession of unique moments. It is hard, sometimes, to reconcile conclusions drawn from collective statistics with individual circumstances.
Randomness–Meaning
We love to make sense of what happens, and we baulk at the thought that some things happen randomly, just by chance. We want stories that explain the world, and we want ways to control what happens. Religion and science are both responses to our urge first to explain and then to control what would otherwise seem to be the meaningless operation of pure chance.
Foresight–Hindsight
Things that in prospect seemed uncertain and unpredictable can, in retrospect, appear inevitable. Hindsight is a powerful filter that distorts our understanding of chance. We are bad at prediction and very good at after-the-fact explanation: but having those explanations doesn’t always mean that we will be able to avoid the next random shock.
Uniformity–Variability
The world is complex, and to understand it, we need to generalize and simplify. We often pay insufficient attention to the variety of everything around us, and are then surprised when things don’t always conform to our neat, averaged expectations. We need to understand better what falls within normal ranges of variation, and which outliers genuinely merit our attention.
Disruption–Opportunity
We usually experience chance through the negative way it often affects our lives. Accidents and mistakes disrupt our plans and thwart our expectations, and this can be costly. But there are many ways in which chance can benefit us. Chance breaks stale patterns of behaviour and experience. Chance makes new opportunities, and allows new combinations. And wouldn’t life be dull if all was certain?
I’ll focus on each of these dualities in its own right in short sections tucked between the longer chapters of the book. And throughout the text I’ll point out where one of them seems to throw particular light on the subject at hand.
What this book is not, and what it is
This is not a textbook. It won’t turn you into an expert wrangler of statistics and probability. The aim is rather to develop your intuitive feel for the operation of chance. We often need to make quick decisions based on gut reactions. But when it comes to thinking about chance, our intuition is often wrong. That’s why this book contains a series of explorations of the way in which chance works, the way it affects us, and the way we can use it to our benefit.
Nor is this a very mathematical book. Chance is studied by mathematicians, in the form of probability theory, but chance is also part of our lives in many other ways. Chance is embedded in our language: how we talk about it affects the way we think about it, and so I have indulged my enthusiasm for etymology and paid attention to the words we use when we talk about uncertainty.
Probability and statistics often go hand in hand, but in this book, the focus is on probability. That’s not to say that I’ve entirely neglected statistical thinking; after all, it is an essential part of how we make sense of the randomness around us, but the weight of attention here falls on probability.
I touch on many subjects, all linked by the common factor of chance. There’s a lot about gambling, but I hope you won’t treat this as an invitation to gamble, still less as an instruction manual. There is discussion of chance in financial markets, but this is not a guide to investment. There’s mention of disease and medical conditions, but any medical content should not be taken as authoritative. There’s a section on genetics and evolution, but written from an angle that highlights the role that chance plays. I touch on how computer algorithms exploit ever-growing data sets to arrive at more or less useful conclusions that are not certain but merely probable, though this is no recipe book for machine learning.
I am treading on the turf of dozens of specialisms, and I can only hope that, in my enthusiasm to follow the trail of chance and randomness, I have not offended too many specialists. For one of the purposes of the book is precisely to go cross-country, to show that there are connected ways of thinking that disrespect boundaries and cut across the domains of finance and gambling, and genetics, and creativity, and futurism. I hope to take you to a few vantage points from which you can get a broad view of the landscape, and see how these different areas of life and knowledge are connected.
Most importantly, however serious the subject, I hope that you will find the book an entertaining read, and that it might provoke you to think about the role that chance plays in our lives, how we can strike a balance between hope and anxiety, how we can avoid the risks that are avoidable, and how we can make the most of the chances that life offers.
Getting technical
The aim of this book is to explore the subject of chance, how it affects our world and our lives, and how we think about it. I hope that almost all of the material will be easily understandable by most readers. I want to show the ways of thinking that lead to an understanding of chance, and not just a bunch of formulas and calculations. So wherever possible I will provide an intuitive and reasoned explanation rather than a technical one.
Still, there will be times when a technical explanation will provide a little something extra for the interested reader, and in this book, they will be marked by ‘Getting technical’ sections like this one. Feel free to skip over these technical sections: you will lose nothing essential to the arguments being made. But if you want a little more of a technical diversion, dive in!
Talking about chance
Like other occult techniques of divination, the statistical method has a private jargon deliberately contrived to obscure its methods from non-practitioners.
G. O. Ashley
It sometimes seems as if the language of chance is designed to confuse. Chances are, it is. Probability is a slippery fish. People at various times and for various purposes have tried to catch it in a net of words, and they’ve gone about it in different ways. The words they use to describe chance may come from different vocabularies, and because the ideas can be rather hard to grasp, they acquire a mystique. The language used becomes a protective jargon, shared by those in the know and obscure to the outsider.
A statistician’s ‘expected value’ may not be what you expect, and their ‘high level of significance’ might not relate to anything that seems very significant. A financial trader will talk of ‘call’ and ‘put’ options when they might just as well say ‘buy’ and ‘sell’. A physicist might characterize the result of an experiment in terms of ‘sigmas’ as a measure of confidence. In this book I will try to navigate these shoals of language, by using language that is as plain as possible and trying to explain specialist terms when we need them.
The way we choose to express probability depends on the context. If we’re thinking about predicting the future, be it in regard to the weather or an impending election, it’s often in terms of percentages. On the other hand, if we are thinking of populations and of how many cats prefer one brand of cat food to all others, we often talk in terms of proportions using whole numbers: 9 out of 10. If you have a gambling frame of mind, you might talk in terms of odds. A medical researcher may talk about the results of her experiment in terms of a p-value. When the particle physicists at CERN talk about how certain they are of having discovered something new, they count sigmas (standard deviations): convention says that you need five sigmas to announce a discovery. Forecasts of future global temperatures will show widening ‘funnels of doubt’ that express how uncertainty increases the further into the future the projection extends. Scientists attach error bars to their published results to show their degree of confidence in the results.
Representing chances
If we want to compare how chance works across different fields, it will be helpful to use consistent methods of expressing and representing probability. Let’s take a concrete example. Imagine yourself at a roulette wheel in a casino, and you bet on a single number. What are the chances that you will win on any spin of the wheel?2 Here are some ways of showing the chances.
• Probability: 0.02632
• Percentage: 2.632%
• Proportion: 1 in 38
• Fair betting odds: 37 to 1
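The four representations above are the same quantity written in different ways, and converting between them is mechanical. A minimal sketch follows; the function name `describe_chance` is a hypothetical helper introduced here for illustration, and the 38 pockets assume the American wheel used throughout the book.

```python
from fractions import Fraction

def describe_chance(wins: int, outcomes: int) -> dict:
    """Express 'wins out of outcomes equally likely results' in four ways."""
    p = Fraction(wins, outcomes)
    return {
        "probability": round(float(p), 5),
        "percentage": f"{float(p) * 100:.3f}%",
        "proportion": f"{p.numerator} in {p.denominator}",
        # Fair odds compare the ways of losing with the ways of winning.
        "fair_odds": f"{p.denominator - p.numerator} to {p.numerator}",
    }

# A single-number bet on a wheel with 38 pockets (1 to 36, 0 and 00).
single_number = describe_chance(1, 38)
for name, value in single_number.items():
    print(f"{name}: {value}")
```

Note the difference between the last two forms: the proportion counts wins against all outcomes (1 in 38), while fair odds count losses against wins (37 to 1).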
Sometimes it will be helpful to show this graphically:
[Figure: the proportion 1 in 38 and the fair betting odds 37 to 1, shown graphically]
Another way to illustrate this is to show its effect when applied to a number of items, multiple times. For example, suppose 100 independent roulette wheels were spun 50 times. How many times would you expect your number to come up?
2 Wherever roulette is mentioned in the book, I have used for my examples the so-called American wheel, which includes both a ‘0’ and a ‘00’. This wheel was the original form used in Europe, and the double zero was dropped to make a more attractive game (for the players) by François and Louis Blanc, managers of a casino in Bad Homburg, in 1843. The modern European or French wheel has a single zero, which more or less halves the advantage that the casino has.
The chances are:
• For 100 wheels, over 50 spins, each with probability 2.632%, on average there will be 131.6 wins.
Then, as a simulated example:
• 100 wheels × 50 spins = 5,000 chances for wins
• Observed wins in this example: 139 = 2.780% (expected average: 131.6 = 2.632%)
In the diagram above, the 100 vertical columns might represent the 100 roulette wheels, while the 50 horizontal rows represent the 50 spins for each wheel. The dark red cells represent wins on single-number bets, theoretically averaging 1 in 38. In the example simulation shown, as chance would have it, there are 139 wins, which is about 1 in 36—slightly more than average. The relative density of red cells gives a visual idea of how common or rare wins are, while the unevenness of the distribution shows how randomness can easily appear surprisingly ‘clumpy’. Random does not mean evenly mixed. Note that in all these illustrations, the ratio of red to grey represents the ratio of wins to losses at the roulette wheel.
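A grid of this kind is easy to simulate. The sketch below is not the program used to produce the book's illustration; it is a minimal version under the same assumptions (100 wheels, 50 spins, 38 equally likely pockets), with a hypothetical function name and an arbitrary seed. Each run with a different seed gives a different count of wins, scattered unevenly around the average of 131.6.

```python
import random

def simulate_grid(wheels=100, spins=50, pockets=38, seed=None):
    """Return a spins x wheels grid of booleans: True marks a win."""
    rng = random.Random(seed)
    # Treat pocket 0 as 'your number'; any single pocket is equally likely.
    return [[rng.randrange(pockets) == 0 for _ in range(wheels)]
            for _ in range(spins)]

grid = simulate_grid(seed=1)  # seed chosen arbitrarily, for repeatability
wins = sum(sum(row) for row in grid)
expected = 100 * 50 / 38

# Wins per wheel (per column) show how 'clumpy' randomness can look.
wins_per_wheel = [sum(row[w] for row in grid) for w in range(100)]

print(f"Observed wins: {wins} of 5,000 ({wins / 5000:.3%})")
print(f"Expected average: {expected:.1f}")
print(f"Wheels with no wins at all: {wins_per_wheel.count(0)}")
print(f"Most wins on a single wheel: {max(wins_per_wheel)}")
```

Comparing the luckiest wheel with those that never pay out gives a numerical feel for the unevenness that the diagram shows visually.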
Sometimes the random events we are interested in are not ones that can be repeated, and we want to show what happens when a chance ‘hit’ is fatal. It’s reckoned that in the Battle of Britain, each time a pilot flew a sortie, there was a 3 per cent chance that they would not make it back. Here’s how we can show examples of probability of this kind, where repetition is out of the question: one hit removes everything remaining in the column. In this example, we assume that each pilot flies a maximum of 30 missions:
Read this diagram upwards: starting at the bottom with 100 pilots, the red cells mark the fatalities, the grey cells represent missions survived. So the grey columns that extend all the way to the top mark those 41 pilots who (in this simulation) survived all 30 missions.
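The survival figure can be checked directly. If each sortie carries a 3 per cent chance of loss, and the sorties are treated as independent (a simplifying assumption of this sketch), then the chance of surviving all 30 is 0.97 multiplied by itself 30 times. The simulation function and seed below are hypothetical, added only to show how a single run can differ from the average.

```python
import random

P_LOSS = 0.03   # chance of not returning from any one sortie (from the text)
MISSIONS = 30   # maximum number of missions assumed in the example

# Chance that one pilot survives every mission, assuming independence.
p_survive_all = (1 - P_LOSS) ** MISSIONS
print(f"Chance of surviving all {MISSIONS} missions: {p_survive_all:.1%}")

def simulate_survivors(pilots=100, seed=None):
    """Count pilots who survive all missions in one simulated run."""
    rng = random.Random(seed)
    return sum(
        all(rng.random() >= P_LOSS for _ in range(MISSIONS))
        for _ in range(pilots)
    )

survivors = simulate_survivors(seed=1)  # arbitrary seed, for repeatability
print(f"Survivors in this simulated run: {survivors} of 100")
```

The calculated chance is about 40 per cent, so a run in which 41 of 100 pilots survive is close to what the arithmetic leads us to expect.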
Notable chances
In my book Is That A Big Number? I introduced the notion of a ‘landmark number’, a number worth remembering because it provides a mental landmark. These landmarks are useful as comparisons or mental measuring sticks to help put other numbers into context.
Understanding probability is more complex than simply grasping a single measurement, so in this book I use an adaptation of that idea. At various points in the text, I identify what I call ‘notable chances’, and use some of the techniques above to show the probability in numbers, words, and illustrations.
‘Chance’
English offers us many words to talk about the uncertainties around us: risk, hazard, fortune, luck, peril, randomness, chaos, uncertainty. Each has its nuances and its specific connotations. For this book I have settled upon ‘chance’ as the central concept. It’s broad in its application and evenly balanced between good and bad associations. So:
chance (n.)
‘something that takes place, what happens, an occurrence’ (good or bad, but more often bad), especially one that is unexpected, unforeseen, or beyond human control, also ‘one’s luck, lot, or fortune’, good or bad, in a positive sense ‘opportunity, favourable contingency’; also ‘contingent or unexpected event, something that may or may not come about or be realised’. From cheance (Old French) ‘accident, chance, fortune, luck, situation, the falling of dice’. From cadentia (Vulgar Latin) ‘that which falls out’, a term used in dice, from cadere ‘to fall’. From kad- (Proto-Indo-European root) ‘to fall’
So I see that the language itself is guiding me to a natural starting point. For it all begins with the playing of games.
Answer to:
Sucker bet or sure thing?
You have a shuffled deck of 13 cards before you, containing all the hearts: one each of the ranks ace to 10, jack, queen, and king. For you to win this bet, three events must happen.
• First, you must select 1 card from the 13, and it must not be the jack, queen, or king.
○ The chances of this happening are 10 in 13 (roughly 76.9%)
• If that happens, then you must select 1 card from the remaining 12, and it too must not be the jack, queen, or king.
○ The chances of this happening are 9 in 12 (75%)
• If both of those happen, then you must select yet another 1 card from the remaining 11, and it too must not be the jack, queen, or king.
○ The chances of this happening are 8 in 11 (approximately 72.7%)
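The three chances listed above can be combined exactly with a few lines of code. This is a minimal sketch using exact fractions. The second calculation is an independent cross-check added here, not part of the original argument: it counts the ways of choosing 3 cards from the 10 non-court cards against the ways of choosing any 3 from all 13.

```python
from fractions import Fraction
from math import comb

# Each factor is the chance of avoiding a court card on that draw,
# given that the earlier draws also avoided one.
p_win = Fraction(10, 13) * Fraction(9, 12) * Fraction(8, 11)

# Cross-check by counting: safe three-card hands / all three-card hands.
p_win_by_counting = Fraction(comb(10, 3), comb(13, 3))

print(f"Chance you win:  {p_win} = {float(p_win):.1%}")
print(f"Chance you lose: {1 - p_win} = {float(1 - p_win):.1%}")
```

Both routes give 60/143, a little under 42 per cent, so the gentleman at the bar wins about 58 per cent of the time.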
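These events are not independent of one another: each of the chances above already assumes that the earlier draws avoided the court cards. To work out the chances of all three of them happening we must multiply these conditional probabilities together. Either multiply the chances expressed as fractions: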