The Signal and The Noise: Why So Many Predictions Fail – but Some Don’t by Nate Silver is the 2012 best-seller from the then New York Times columnist who now runs the FiveThirtyEight election analysis and prediction site in the US.
TL;DR – This is a great read that explains why and how you need to mix statistics and experience in your predictions and decision-making. It can feel a little technical and heavy on the math at first glance, but Silver explains everything clearly, making it an easy read, given the subject matter.
Who is Nate Silver?
From his Amazon bio: “Nate Silver is a statistician, writer, and founder of The New York Times political blog FiveThirtyEight.com. Silver also developed PECOTA, a system for forecasting baseball performance that was bought by Baseball Prospectus. He was named one of the world’s 100 Most Influential People by Time magazine. He lives in New York.”
Prediction versus forecast
A vital distinction Silver makes in the book, and one we should pay more attention to as risk managers, is between predictions and forecasts. In the chapter on earthquakes, Silver notes the US Geological Survey (USGS) differentiates between the two:
A prediction is a definitive and specific statement about when an earthquake will strike: a major earthquake will hit Kyoto, Japan, on June 28.
A forecast is a probabilistic statement usually over a long period of time: there is a 60% chance of an earthquake in Southern California over the next 30 years.
Being clear on this distinction will stand us in good stead as we discuss risks, because it is too easy to mistake an evaluation of a threat (which would be a forecast) for a prediction. This is why the term ‘a 100-year storm’ causes confusion: it’s too easy for people to think that we are counting down to some point 100 years in the future.
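To make the 100-year storm idea concrete (my own illustration, not an example from the book): a ‘100-year storm’ means roughly a 1% chance in any given year, not a countdown. A quick sketch of what that implies over a multi-decade window, assuming each year is independent:

```python
# A "100-year storm" has ~1% probability of occurring in any given year.
# It is NOT a countdown clock: the chance of seeing at least one such
# storm over a 30-year window follows from the complement rule.
annual_prob = 0.01          # 1-in-100-year event
years = 30

# P(at least one storm) = 1 - P(no storm in any of the years)
p_at_least_one = 1 - (1 - annual_prob) ** years
print(f"Chance of at least one '100-year storm' in {years} years: "
      f"{p_at_least_one:.0%}")  # roughly 26%
```

The independence assumption is a simplification (real climate risks can cluster), but the point survives: a rare annual event becomes quite likely over a long enough horizon, which is exactly what a probabilistic forecast is trying to convey.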
We’ll return to the importance of probabilistic data and prior histories shortly when we discuss the section on Bayes’ theorem, but making this distinction between predictions and forecasts is essential.
Fundamental errors we make
A large proportion of the book explains how and why we get predictions wrong. Silver uses a range of case studies to show how biases, misunderstandings, ignorance, and even lousy math can cause us to be wildly inaccurate. The explanations of biases and heuristics are brief and assume that you have some understanding of how these affect decision-making already. (For a deeper dive into how we think about risk and make decisions, I’d recommend Thinking Fast and Slow by Daniel Kahneman and the work of Gary Klein.)
Where the book really shines, in my opinion, is in how Silver tackles the statistical aspects of decision-making, which is unsurprising as this is his area of real expertise, both professionally (as the founder of the FiveThirtyEight election prediction site) and personally (as a top-tier poker player and baseball stats nerd). Importantly, you don’t need to be a maths whiz to keep up with his explanations as he skates over the top of a complex field. And it’s worth persevering, as some of the examples he uses describe situations we might all be confronted with at work. His account of how the banks miscalculated the risks associated with CDOs* in the run-up to the 2008 financial crisis is both clear and terrifying. (*Collateralized debt obligations that pooled vast numbers of subprime mortgages.)
The pros and cons of data
This kind of example reflects another broad trend in the book: how data can be equally helpful and dangerous in prediction and forecasting. Well-curated, thoughtfully contextualized data will provide the signals we need to make more accurate predictions. Conversely, poorly compiled or incomplete data sets will be useless or, worst of all, lead us in the wrong direction. Sometimes these errors are made despite the best intentions, but data can also be cherry-picked to support an existing narrative.
Making matters even more complicated is that the signal is often buried in the noise, which grows exponentially as societies and businesses become more complex. Worst of all, Silver observes that the critical signals are often only noticeable in retrospect, as they can be drowned out by other signals at the time.
If you’re not a fan of math(s), the section on Bayes’ theorem might be something you’d prefer to skip or at least skim quickly.
Don’t do that.
I’ve heard Bayes’ theorem mentioned in all kinds of situations without any attempt to explain it, and when I looked it up online, the explanations were bewildering and left me none the wiser. Thankfully, that’s not what happens here. Silver explains the underlying philosophy and the mathematical framework simply and understandably.
If you recall, we noted earlier that probabilistic data is helpful for forecasting, but with Bayes’ theorem we also see how it affects predictions. In short, the frequency of prior events has a powerful influence on the probability that a similar event will occur under similar conditions. That means a question that looks like a 50/50 split between two options may be significantly skewed in one direction once precedent is taken into account. Equally, an outcome that appears highly likely should be revised sharply downward if the prior probability, based on the historical record, is low.
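Bayes’ theorem itself is compact enough to sketch in a few lines. The following is my own toy illustration (a hypothetical screening test, not an example drawn from the book) of exactly the effect described above: a low prior drags a seemingly convincing result down to a surprisingly modest posterior.

```python
def bayes_posterior(prior, p_evidence_given_true, p_evidence_given_false):
    """Posterior probability via Bayes' theorem:
    P(H|E) = P(E|H) * P(H) / P(E)."""
    p_evidence = (p_evidence_given_true * prior
                  + p_evidence_given_false * (1 - prior))
    return p_evidence_given_true * prior / p_evidence

# Hypothetical screening test: the condition is rare (1% prior),
# the test catches 90% of true cases but also flags 10% of healthy people.
posterior = bayes_posterior(prior=0.01,
                            p_evidence_given_true=0.90,
                            p_evidence_given_false=0.10)
print(f"Probability given a positive result: {posterior:.1%}")  # ~8.3%
```

Despite a test that sounds 90% accurate, the positive result only lifts the probability from 1% to about 8%, because true positives are swamped by false positives from the much larger healthy population. That is the prior doing its work.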
Understanding this methodology is essential for decision-makers who use mathematical models or those of us who have to interpret the forecasts we’re hearing. Similar to the CDO example above, understanding the underlying math is essential to determining the quality of the result so you can use the data appropriately.
Again, even if you aren’t a ‘fan’ of math, spend some time in this section and enjoy a clear, simple explanation of something that, up until now, has never been explained clearly (to me, anyway).
(As an aside, this is another section of the book where I want to spend more time thinking about the simple application of this model in risk assessments and analysis.)
The need to merge statistics and emotions
Silver’s message is that we need to merge rigorous statistical analysis with emotional insight, and he pulls this off well using some excellent examples. From how baseball teams combine statistical models with the observations of talent scouts, to how sports bettors use Bayesian models of a team’s performance just as much as they scour the players’ social media accounts, he shows how you can merge these two approaches effectively. And while this isn’t necessarily novel, he makes a strong case for not relying on one system alone, even in instances where being 100% data- or instinct-driven would seem the best approach.
He also explains how, when making predictions for US political races, he couples the statistical model with the insights he gains from interviewing candidates and observing their behavior.
Getting this balance between objective data and subjective analysis right is the key to making successful predictions but, in the end, there is no secret formula. Silver admits the difficulties in getting this balance right but does note that the quality of your predictions can improve over time if you record and review your predictions regularly. This helps you see where the signals and analysis were pointing you in the right direction and where you misinterpreted things, allowing you to correct your analysis in the future. Notably, he reminds us that a prediction can still be ‘good’ even if the predicted event doesn’t occur. After all, a 70% chance of something happening still means there’s a 30% chance it won’t.
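The advice to record and review your predictions can be made mechanical. One standard scoring rule for this (my suggestion; the book doesn’t prescribe it by name in this section) is the Brier score, which rewards well-calibrated probabilities rather than simple hits and misses:

```python
def brier_score(forecasts):
    """Mean squared error between forecast probabilities and outcomes.
    Lower is better; 0.25 is what you'd get by always saying 50/50."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# Each entry: (forecast probability, what actually happened: 1 or 0).
# Note that the 70% forecast that didn't pan out isn't a "bad" call on
# its own -- the score only means something averaged over many predictions.
log = [(0.7, 1), (0.7, 0), (0.9, 1), (0.2, 0)]
print(f"Brier score: {brier_score(log):.4f}")  # 0.1575
```

Kept over time, a log like this shows exactly what Silver recommends: whether your 70% calls really come true about 70% of the time, and where your reading of the signals needs correcting.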
(As an aside, there may be some way to determine the right mix of objective / subjective based on the situation. Something more objective and statistically ‘clean’, say how often aircraft crash, could benefit from being more data-heavy. In contrast, voting patterns, which are heavily influenced by our emotions, would be better predicted with a higher proportion of subjective weighting. This is just an initial thought and I need to do more research and thinking about how to strike this balance.)
For all of its depth, The Signal and The Noise is a relatively easy read and moves fairly quickly with well-told stories and interesting anecdotes. I’m not a fan of books that are too story-heavy, and Silver gets the mix about right here as the stories support and illustrate the theories instead of being loosely associated. I enjoyed his simplification of the mathematical aspects of making bets and playing poker, as I was completely ignorant of both.
No doubt there are parts that are not absolutely correct technically, and a lot of nuance will have had to be jettisoned, but that is the price of simplification (of which I am obviously a proponent).
The ending is fairly abrupt, and the conclusion doesn’t wrap up the book’s big ideas as I had expected, so if you were hoping to get a good sense of the book from the last 20 pages, you’d be disappointed. And, as the book is almost a decade old, some comments on the likelihood of a global pandemic or the degree of political polarization in the US seem almost quaint. Nevertheless, it’s an excellent read for any risk manager or decision-maker who wants to see how to use both data and emotions when making predictions. Quants will get an insight into how our brains process decisions, while social scientists will better appreciate how to use and interpret data effectively.
Want more book reviews and recommendations? See my recommended reading list and lots more reviews here.