Book Review of The Signal and The Noise by Nate Silver


This book review of The Signal and the Noise by Nate Silver is brought to you by Chris Duin of the Titans of Investing.

Genre: Business Planning and Forecasting
Author: Nate Silver
Title: The Signal and the Noise


The Signal and the Noise by Nate Silver is centered on the art of prediction and understanding the necessity of isolating the quality data a researcher needs. He examines the art of prediction in various sectors such as baseball statistics, poker, terrorism, and the economy, to name a few.

Within each sector, Silver explains how predictions are typically made and what the resulting pros and cons are. He will very often offer up his own interpretation because Silver himself claims to be a great forecaster. In addition, he interviews many esteemed names in the world of forecasting in order to get their ideas on prediction.


Silver conveys the idea of two types of forecasters through the analogy of foxes and hedgehogs. Hedgehogs tend to focus on big ideas, while foxes focus on the more granular details.

Hedgehogs make for good television because they tend to make more radical predictions, but these predictions are often faulty and skewed because hedgehogs overlook critical data and inject their own personal beliefs. Foxes predict far more accurately and therefore make much better forecasters than hedgehogs.

The reason for this accuracy is that foxes don’t overlook important details just because they are small. Although many foxes will not be seen on television, it is important to note that in order to make precise predictions, one must think like a fox.

Another concept that is clearly emphasized by Silver is Bayes’s theorem. While this Brief thoroughly discusses Bayes’s Theorem, essentially it conveys to its audience the necessity to always evaluate new evidence as it arises and then adjust the currently used probability estimates accordingly. As you will see, Silver applies this method to nearly every situation where a prediction must be or will be made.

Silver considers a plethora of fields in which prediction is used, and the state of prediction is much better in some fields than in others. Weather prediction is one of the great success stories in this book.

From where the industry has come since the 1970s to where we are today, weather prediction has drastically improved and is quite reliable. On the other hand, predicting earthquakes and the effects of global warming are quite the opposite. At this time we cannot make accurate predictions in these two sectors and most likely won’t be able to until there are new scientific discoveries of some sort.

Predicting the stock market is somewhat of a gray area with respect to its effectiveness. Silver explains that you can make reasonable predictions, but it’s difficult because the market is always changing. If you think you’ve found a discrepancy in the market, chances are lots of other people have also found it and the collective reaction to that discrepancy eliminates any benefit you could have achieved from it. This is why it’s difficult to beat the market, but Silver contends that it is possible.

With regard to forecasting the economy, baseball, chess, politics, and the other fields Silver examines, all the data needed to make accurate predictions is available; we just need the insight to recognize it.

All data sets used in prediction are made up of signal and noise. The signal is described as the data a researcher is looking for to make a prediction. This is the data that will enable the most accurate prediction possible to be made. The noise, however, is the exact opposite. It is the data that is available but has no bearing on the outcome of the prediction and therefore decreases the prediction accuracy.

Ultimately, the goal in prediction is to try and find the signal amidst all of the noise, and although this can be an extremely difficult task, the key to making accurate predictions is utilizing the given data correctly after filtering out the noise.

That is, of course, much more easily said than actually done. Ultimately, Silver has three guidelines for making good predictions. Think probabilistically, have a good base knowledge of whatever field your prediction resides in, and spend lots of time practicing making predictions.

A Catastrophic Failure of Prediction

Silver contends there were four main prediction failures that added to the 2007 financial crisis. The first of which was the housing bubble. Homeowners expected housing prices to continue to rise which caused more homes to be sold and subsequently more people severely affected by the financial crisis.


The second was the rating agencies’ failure to accurately assess the risk associated with mortgage-backed securities. Standard and Poor’s rated collateralized debt obligations (CDOs) at AAA, which implied a 0.12% chance of default, when in reality 28% of CDOs defaulted.
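To put the size of that miss in perspective, the two figures quoted above can be compared directly. This is only a back-of-the-envelope sketch; the function name is made up for illustration.

```python
# Figures quoted in the review: the default rate implied by the AAA rating
# versus the default rate that actually occurred.
predicted_default_rate = 0.0012  # 0.12%
actual_default_rate = 0.28       # 28%


def underestimate_factor(predicted: float, actual: float) -> float:
    """Return how many times larger the realized rate was than the predicted one."""
    return actual / predicted


factor = underestimate_factor(predicted_default_rate, actual_default_rate)
print(f"Realized defaults were roughly {factor:.0f} times the rated probability")
```

In other words, the realized default rate was more than two hundred times what the rating implied.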

Silver argues that it wasn’t ignorance by the rating agencies but rather that the models were “Full of faulty assumptions and false confidence about the risk that a collapse in housing prices might present” (42). The third failed prediction was the failure to foresee how the housing crisis could spur a global financial crisis.

The market was extremely levered toward the housing market, with $50 invested for every $1 that Americans were willing to invest in a new home.

With that much leverage in the market, even a small percentage change could be catastrophic and the failure to see that was a major reason why the financial crisis became a global event. Finally, there was a failure to predict how long the recession was going to last.

Policymakers believed that with the stimulus package of $800 billion, the unemployment rate would reach a maximum of 8% and begin to come down in mid-2009. In reality, the unemployment rate rose to 10.1% in late 2009 and declined much slower than originally anticipated.

Silver explains that with each of these four predictions, there was an important piece of information that was essentially universally ignored. With regards to housing prices, homeowners and lenders were too confident in housing prices, which Silver assumes is due to the lack of a decline in housing prices in recent years.

With respect to the poor credit ratings, banks had high confidence in the rating agency’s ability to accurately rate CDOs, but these agencies had never encountered anything like a CDO before. Thirdly, economists believed that our economy could withstand a housing crisis, but never in history had so many people bet on the housing market.

Finally, policymakers believed there would be a swift recovery to the recession, which was probably based on some of the recent recessions that had rapid recoveries. The problem with comparing past recessions was the fact that previous downturns did not stem from a financial crisis.

Are You Smarter Than a Television Pundit?

Silver also describes a show called The McLaughlin Group where there is a panel of political pundits and at the end of every show they are asked to make predictions about a variety of different political events. Silver watched hundreds of episodes of this show and recorded each pundit’s respective predictions. It turned out that each person was correct only about 50% of the time.

In searching for why these so-called experts’ predictions were no better than a coin flip, Silver came across a paper by Philip Tetlock, a professor of political science and psychology. In his paper, Tetlock had surveyed hundreds of political scientists and asked them to predict political events.

His results showed that many of the political scientists were also only correct about 50% of the time, but there were some that did much better.

He also asked a series of psychological questions to determine if there was any correlation between those predictors who did better or worse. He classified all of the survey participants as either foxes or hedgehogs, with foxes being much better at prediction.

He describes hedgehogs as “Type A personalities who believe in big ideas” and foxes as “Scrappy creatures that believe in a plethora of little ideas and in taking a multitude of approaches toward a problem” (53). Hedgehogs make for better television because they make big radical predictions, but foxes make better predictions because they take into account all of the little details. Another problem with hedgehogs is that they take information and spin it to align with their particular biases. This is because hedgehogs have difficulty distinguishing between the facts and their rooting interest.

Silver has his own website where he made predictions about the general election in 2008.

He describes his predictions as very fox-like and that they all follow three key principles. The first principle is to think probabilistically. Each prediction that Silver made didn’t show just one result, but rather a range of results.

He does this in order to acknowledge the uncertainty related to any real world prediction. Silver also states the importance of learning from your mistakes, as it can help you make better predictions. The second principle is to change your predictions as new data becomes available. Silver says, “You should make the best forecast possible today, regardless of what you said last week, last month, or last year” (65).

The willingness to change your prediction will only result in more accurate predictions. Lastly, the final principle is to look for consensus. If there are a large number of models that are all predicting a much different outcome than your model, there might be something wrong. There are always outliers, but for the most part a group of people is more accurate than an individual.

Silver also talks about the need to account for both quantitative and qualitative data.

An example he uses is predicting races in the House of Representatives. In these races, there are oftentimes candidates who are not well known or not even politicians by profession. In these situations it is difficult to find quantitative data to base your predictions on since the candidates have no prior political experience.

Here you can increase the accuracy of your prediction by taking into account the qualitative data such as the candidate’s demeanor, likeability, or reputation in the community. The main point is that you cannot always rely on quantitative data to make good predictions.

Silver does caution us that when using qualitative data, a judgment must be made and therefore your prediction is exposed to personal bias. He states that it is important to take a fox-like approach to forecasting, especially when dealing with qualitative data.

All I Care About is W’s and L’s

Silver states that a good baseball prediction system must take into account the context of a player’s statistics, changes in performance over time, and the distinction between luck and skill. The context of a player’s statistics refers to a player’s batting average at home versus his average at the other stadiums in the league or other statistics of that nature.

Silver explains that the key to distinguishing between luck and skill is to determine what statistics are least susceptible to luck.

In other words, which statistics contain the least noise? He gives the example of predicting a pitcher’s win/loss record. If a pitcher strikes a batter out, the batter can’t get on base and therefore can’t score a run. If the opposing team can’t score any runs, then it won’t win. With each of these statistics, as you move closer to the big picture, you introduce more and more noise into your prediction.

In trying to predict a pitcher’s win/loss record, his record from last year is a much worse predictor than his number of strikeouts and number of walks. When explaining how a player’s performance changes as he ages, Silver introduces his own baseball prediction system called PECOTA. Other forecasters relied on the aging curve, which said that a player peaked at age 27.

Silver recognized an immense amount of noise associated with that average because many players either peaked earlier in their careers or later. He came up with a different system that grouped players together based on a plethora of different statistics. With these groups, Silver could more accurately predict when a player would reach his peak performance based on when other players in the same group peaked.
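The idea of forecasting from similar players, rather than from one league-wide average, can be sketched as a nearest-neighbor lookup. This is not PECOTA itself: the two statistics, the player data, and the function name below are all hypothetical and chosen only to illustrate the grouping idea.

```python
import math

# Hypothetical historical players: (strikeout rate, walk rate, age at peak season).
HISTORICAL_PLAYERS = [
    (0.15, 0.10, 26),
    (0.16, 0.11, 27),
    (0.25, 0.06, 30),
    (0.27, 0.05, 31),
    (0.20, 0.08, 28),
]


def predict_peak_age(strikeout_rate: float, walk_rate: float, k: int = 2) -> float:
    """Average the peak ages of the k most statistically similar historical players."""

    def distance(player):
        # Straight-line distance in the two-statistic space.
        return math.hypot(player[0] - strikeout_rate, player[1] - walk_rate)

    nearest = sorted(HISTORICAL_PLAYERS, key=distance)[:k]
    return sum(player[2] for player in nearest) / k


# Two prospects with different profiles get different predicted peaks.
print(predict_peak_age(0.26, 0.055))
print(predict_peak_age(0.155, 0.105))
```

A real system would use many more statistics and far more players, but the design choice is the same: the forecast comes from the prospect's closest comparables instead of from a single aging curve.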

At the time when Silver created PECOTA, there was heated debate as to whether scouts or statistics predicted the future success of baseball players more accurately. Silver put his system up against Baseball America (a top scouting agency) at ranking the top 100 minor league prospects.

It ended up that Baseball America was slightly more accurate at predicting the future top players in the league. Silver says this happened because they were using a hybrid approach. The scouts could look at the same statistics that Silver could, but they could also identify certain physical or mental traits by talking to and watching a prospect that PECOTA could not.

Silver then talks about the standard way scouts would analyze prospects, which was the five tools: hitting for power, hitting for average, speed, arm strength, and defensive range. These are the standard metrics, but Silver interviews a long-time scout named John Sanders who explains that players also need mental tools.

These tools are: work ethic, concentration and focus, competitiveness and self-confidence, stress management and humility, and adaptiveness and learning ability. Silver suggests that these are applicable to more than just baseball players.

In almost any profession, having specific mental tools will help you succeed.

Silver uses the example of Dustin Pedroia. Pedroia is a very successful player for the Red Sox who was written off by every scout due to his lack of size. Silver’s system PECOTA actually predicted that Pedroia would become a successful player. Pedroia had all of the fundamentals and good statistics but was deemed a bust because he didn’t fit the mold of your prototypical professional baseball player.

This illustrates a problem that many people have with prediction. When something doesn’t fit the mold, we tend to disregard it simply because we don’t know how to classify it. Silver was able to accurately predict Pedroia’s success because his system groups players by similarity rather than having just one prototypical idea of what a successful baseball player is like.

In the world of prediction, information is king. As Silver says, “The key to making a good forecast is not limiting yourself to quantitative information. Rather it’s having a good process for weighing the information appropriately” (100).

For Years You’ve Been Telling Us Rain is Green

Silver introduces the idea of chaos theory, which basically states that certain types of systems are hard to predict. This theory applies to systems that are both dynamic and nonlinear. With these types of systems even the slightest difference in an input could cause an enormous error in your calculation.

Silver gives an example of a meteorologist named Edward Lorenz who was working on a forecasting model.

His forecasting program predicted thunderstorms in one simulation and clear skies in the next, with what his team thought was exactly the same data. It turned out that in the clear-skies run the barometric pressure was off by 1/100th. This is why it is so difficult to predict the weather. Even with this difficulty, weather prediction has come a long way. Both hurricane and precipitation forecasts have been drastically improved.
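The sensitivity described above is easy to reproduce in a toy system. The sketch below uses the logistic map, a standard textbook example of a dynamic, nonlinear system; it is a stand-in chosen for illustration and is not the weather model from the book. The starting values and step count are arbitrary assumptions.

```python
def logistic_trajectory(x0: float, steps: int, r: float = 4.0) -> list:
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    values = [x0]
    for _ in range(steps):
        values.append(r * values[-1] * (1.0 - values[-1]))
    return values


# Two runs whose starting points differ by only one part in ten thousand.
run_a = logistic_trajectory(0.4, 50)
run_b = logistic_trajectory(0.4001, 50)

# Gap between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]
largest_gap = max(gaps)
print(f"initial gap: {gaps[0]:.4f}, largest gap within 50 steps: {largest_gap:.2f}")
```

Even though the two runs start almost identically, they soon bear no resemblance to each other, which is the same behavior that makes long-range weather forecasts so fragile.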

Silver also describes the competition between the public and private weather forecasting operations in order to determine which forecasting system is better. Silver explains an essay written by Allan Murphy, a meteorologist at Oregon State University, in which he attempts to define what makes a good forecast.

He says there are three ways to judge a forecast. The first is, was the forecast correct? The second is, was that the best forecast he or she could give? This goes back to the idea of the hedgehog where they would manipulate the facts in order to achieve a more favorable outcome for themselves or their alliances. “The third way to define a good forecast is to understand the economic value.

In other words, did it help the public make a better decision?” With Murphy’s essay in mind, Silver set out to determine who had the best forecast. It turns out that the Weather Channel, AccuWeather, and the government forecasts were all pretty equal.

The alarming fact was that forecasts made ten days in advance were actually worse than the historical average of conditions on a particular date in a particular area.

This means that a ten-day forecast is worse than what the average person could do with an almanac and a calculator. The reason for this is that weather is both dynamic and nonlinear. In Silver’s search for the best forecast, he found that the National Weather Service forecasts were the best calibrated, meaning that if they predicted a 30% chance of precipitation, it rained 30% of the time.
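Calibration in this sense can be checked by grouping past forecasts by their stated probability and comparing each group with how often it actually rained. The sketch below uses a made-up forecast record; the data and function name are assumptions for illustration only.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Map each stated probability to the observed frequency of rain on those days."""
    buckets = defaultdict(list)
    for probability, rained in zip(forecasts, outcomes):
        buckets[probability].append(rained)
    return {p: sum(days) / len(days) for p, days in sorted(buckets.items())}


# Hypothetical record: ten days forecast at 30% and ten days forecast at 20%.
# An outcome of 1 means it rained, 0 means it stayed dry.
forecasts = [0.3] * 10 + [0.2] * 10
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0] + [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

table = calibration_table(forecasts, outcomes)
for stated, observed in table.items():
    print(f"forecast {stated:.0%} -> rained {observed:.0%} of the time")
```

In this invented record the 30% forecasts are well calibrated, while the 20% forecasts overstate the chance of rain.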

The Weather Channel, on the other hand, was not as well calibrated in that when they predicted a 20% chance of precipitation, it only rained 5% of the time. Silver explains, though, that the Weather Channel does this on purpose. If the Weather Channel predicts a chance of rain and it doesn’t rain, the public is pleasantly surprised.

If the Weather Channel does not predict rain and there ends up being a downpour, the public is furious. The Weather Channel predicts as it does for economic reasons. Silver also stresses the importance of accuracy in a forecast. Some government officials downplayed the need to evacuate New Orleans during Hurricane Katrina, which led to many people losing their lives.

There was a poll taken of people who stayed through Katrina and a third of the residents said the evacuation instructions were unclear, and another third said they didn’t even hear them at all. As Silver says, “It is forecasting’s original sin to put politics, personal glory, or economic benefit before the truth of the forecast” (141).

Desperately Seeking Signal

Unlike weather prediction, earthquake prediction hasn’t advanced much since the ninth century. Many scientists have tried to determine a precursor to earthquakes, or some kind of phenomenon to give an indication of when one will hit but none have been successful.

Silver contacted the United States Geological Survey and met with Dr. Susan Hough, who is one of the top seismologists in the nation.

She told Silver that it is impossible to predict an earthquake, but that they can be forecasted. In the seismology world, a prediction is a “Specific statement about when and where an earthquake will strike” (149). A forecast on the other hand is a “Probabilistic statement, usually over a longer time scale” (149).

An example of a forecast would be that there is a 50% chance an earthquake will happen in Japan in the next 50 years. One such forecasting tool is the Gutenberg-Richter law. This law states that the frequency of earthquakes versus their magnitude follows a power-law distribution, meaning that as magnitude increases, frequency decreases exponentially.

This is useful because it allows seismologists to forecast the number of large earthquakes based on the number of smaller ones. “For every increase of one point in magnitude, an earthquake becomes about ten times less frequent” (151). The main flaw with this law is that it gives only a large time frame for when the events will actually occur.
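The "ten times less frequent per magnitude point" rule can be turned into a small extrapolation. The sketch below assumes a hypothetical region with ten magnitude-5 earthquakes per year; that reference count and the function names are illustrative assumptions, not data from the book.

```python
def expected_annual_count(magnitude: float, reference_magnitude: float,
                          reference_count: float, b: float = 1.0) -> float:
    """Extrapolate yearly earthquake counts from a reference magnitude.

    With b = 1.0, each additional magnitude point makes an earthquake
    ten times less frequent.
    """
    return reference_count * 10 ** (-b * (magnitude - reference_magnitude))


def recurrence_interval_years(annual_count: float) -> float:
    """Average number of years between events occurring at the given yearly rate."""
    return 1.0 / annual_count


# Hypothetical region: ten magnitude-5 earthquakes per year.
for magnitude in (6.0, 7.0, 8.0, 9.0):
    count = expected_annual_count(magnitude, 5.0, 10.0)
    years = recurrence_interval_years(count)
    print(f"magnitude {magnitude}: about one every {years:,.0f} years")
```

Note what the output does and does not say: it gives a long-run rate, not a date, which is exactly the limitation described above.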

Susan Hough describes the “Holy Grail” of seismology as a time-dependent forecast.

There have been many attempts at creating such a model by looking at foreshocks and aftershocks. The idea is to try and find some type of pattern in foreshocks that led up to the main event, which could help forecast the timing of when another large event might occur.

Unfortunately, there has been no decisive pattern that fits all scenarios. One idea from David Bowman, chair of Geological Sciences at Cal State Fullerton, was to determine the root cause of earthquakes. This is just like the chaos theory employed in weather prediction, but weather prediction is quite successful while earthquake prediction is not.

Silver also talks about the problem of overfitting in earthquake prediction.

There is typically little data to work with, and scientists are desperate to identify some kind of pattern that will help their predictions. Oftentimes, they will mistake noise for signal, which makes a model look better on past data but ends up giving worse predictions. One such example is the devastating earthquake that hit Japan in 2011.

Japanese officials built the Fukushima nuclear reactor to withstand an earthquake of magnitude 8.6. They overfit the data, showing that the chance of a magnitude 9 earthquake was once every 13,000 years. Following the Gutenberg-Richter law, the chance was actually once every 300 years. There is still a small chance, but Japanese officials might have built a stronger reactor had they employed this data set.

Overall the Holy Grail or time-dependent forecast will most likely never be achieved for earthquakes. There is promising data on forecasting the location of aftershocks from major earthquakes, but right now the science of forecasting earthquakes is still extremely difficult.

How to Drown in Three Feet of Water

In 1997, the city of Grand Forks, North Dakota, was flooded, and nearly the entire town was destroyed. The town had levees that could withstand up to 51 feet of water, and the National Weather Service predicted the water would rise to 49 feet, so the residents thought they would be fine.

In reality, the margin of error on the forecast was plus or minus nine feet, but the forecasters didn’t convey that. In the end, the water reached 54 feet, and the town flooded. Had the uncertainty in the forecast been articulated, more people would have been better prepared for a worse outcome.
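A rough calculation shows how much risk that margin of error concealed. The sketch below assumes the forecast error is normally distributed and that "plus or minus nine feet" describes a 90% interval; both are assumptions made here for illustration, not details stated in the review.

```python
import math


def prob_exceeds(threshold: float, forecast: float, sigma: float) -> float:
    """Probability that a normally distributed outcome ends up above the threshold."""
    z = (threshold - forecast) / sigma
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# Assumption: +/- 9 feet is a 90% interval, so sigma is 9 / 1.645 feet.
sigma = 9.0 / 1.645
p_overtop = prob_exceeds(threshold=51.0, forecast=49.0, sigma=sigma)
print(f"Chance the river tops the 51-foot levees: {p_overtop:.0%}")
```

Under these assumptions the chance of the levees being overtopped is a little better than one in three, which is far from the safe outcome that a bare "49 feet" suggests.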

Expressing uncertainty in predictions is important, but it is something that economists tend not to do.

Silver explains that in late 2007, when there were numerous signs of trouble, such as the number of foreclosures doubling, economists in the Survey of Professional Forecasters predicted the economy would grow at a rate of 2.4% in 2008. In reality, the GDP shrank by 3.3%. Economists said there was only a one in 500 chance that the GDP would shrink by more than 2%.

When they made this prediction, economists were very confident about it. Since they were so wrong, this leads one to assume that they were making biased forecasts. Silver states, “If they’re making biased forecasts, perhaps this is a sign that they don’t have much incentive to make good ones?” (184). He interviews Jan Hatzius, chief economist at Goldman Sachs.

Hatzius actually predicted that a recession might happen and that unemployment would rise even after the stimulus package went into effect. He states that one of the main reasons why economic forecasting is so difficult is because the business cycle is always changing. Economic behavior that is true in one cycle might not be true in another.

A 2003 article in Inc. listed the top seven economic indicators for the then-current cycle. Only two of those indicators played a major role in the recession of 2007. To describe this, Silver uses the phrase “correlation without causation,” meaning that just because two variables are statistically related to each other doesn’t mean one is the cause of the other.

Another difficulty with forecasting the economy is that you also have to take into account political decisions.

Silver explains that if the government takes action to increase housing prices, an economist not only has to account for that in their forecast but also housing prices are no longer a useful variable in economic prediction because they are now artificially inflated.

One theory on why economists did so poorly in predicting the financial crisis is that up until that point we had been in a period known as the Great Moderation. From 1983 to 2006, the economy was in recession only 3% of the time, whereas from 1900 to 1945 the economy was in recession 36% of the time. Silver explains that economists looked at the Great Moderation as the norm when in reality it was an outlier.

During the Great Moderation, there were only two mild recessions, therefore the economic data from that time painted a bright picture. If economists had based their forecasts on data from the last century rather than just the last quarter century, their forecast would have been much closer to the recession that actually began in 2007.

Silver examines if some economists are better forecasters than others.

He found that individuals tend to be more or less the same, but that an aggregate of forecasters was 20% more accurate than the typical individual. With regard to individual forecasters, he found that those who applied their own judgment to a statistical model were 15% more accurate than those who simply took the output from a statistical model.
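The advantage of an aggregate forecast has a simple arithmetic basis: the error of the average forecast can never exceed the average of the individual errors. The sketch below uses invented GDP growth forecasts to show the effect; the numbers are hypothetical, and the size of the improvement in real data will differ.

```python
from statistics import mean

# Hypothetical GDP growth forecasts (in percent) from five economists.
forecasts = [2.4, 1.8, 3.1, 0.9, 2.8]
actual = 2.0

# Typical miss of a single forecaster.
individual_error = mean(abs(f - actual) for f in forecasts)

# Miss of the consensus, i.e. the average of all the forecasts.
consensus_error = abs(mean(forecasts) - actual)

print(f"average individual error: {individual_error:.2f} points")
print(f"consensus error: {consensus_error:.2f} points")
```

Errors in opposite directions partly cancel in the consensus, which is why it beats the typical individual here.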

This makes sense because economic data is very noisy, and it would be very difficult to create a model that could account for that amount of noise without any human judgment. The one danger to introducing human judgment is you also introduce bias. Silver believes that human bias is a major reason why economic forecasters sometimes make inaccurate forecasts.

He interviews an economist named Robin Hanson who has an idea on how to reduce bias. He believes there should be markets where you can bet on economic forecasts. One example would be betting on whether GDP would grow by some percent or decline. This will give a financial incentive for economists to produce the most accurate forecasts possible and therefore eliminate bias.

Role Models

In the mid-1970s, a soldier at Fort Dix contracted what doctors initially thought was the common flu but turned out to be the H1N1 virus. This soldier died because of it. There was worry that the virus would spread and kill many Americans. Gerald Ford was president at the time, and his secretary of health, F. David Mathews, predicted that one million Americans would die.

President Ford had 200 million doses of the vaccine for H1N1 created and strongly urged the public to get the vaccine. After millions of dollars were spent, it turned out there was never another confirmed case of H1N1. Medical experts believed that the chance of one million Americans dying was between 2% and 35%. Ford chose to ignore these low prediction numbers, which was catastrophic for him because this fiasco played a part in Ford losing his bid for re-election to Jimmy Carter.

In 2009, the H1N1 virus resurfaced, and again certain U.S. officials predicted that half of the U.S. population would become infected and about 90,000 Americans would die. In reality, only one sixth of the forecasted number became infected and only 11,000 died because of it.

Silver explains that the problem with these predictions is that forecasters were extrapolating the data to make their predictions.

Silver states that one of the best variables for predicting disease is called the basic reproduction number. The basic reproduction number “Measures the number of uninfected people that can expect to catch a disease from a single infected individual” (214). In other words, a reproduction number of five means that the infected person will give the disease to five other people before they either get better or die.

Different diseases have estimates for this number, but you can’t make an accurate estimate until a disease has gone through a community and there are enough cases to analyze. Because of this, forecasters must extrapolate the data from only a few early data points, which can cause extreme error. Another key measure of disease prediction is the fatality rate, but it has the same problem as the basic reproduction number in that if you want to make an early prediction you have to extrapolate from a few data points.
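A small sketch shows why extrapolating from an uncertain reproduction number is so risky. It uses a deliberately naive growth model in which every case infects the same number of new people in each generation and nobody becomes immune; the model, the two candidate values, and the function name are assumptions for illustration.

```python
def cases_in_generation(r0: float, generation: int, initial_cases: int = 1) -> float:
    """New cases in a given generation if every case infects r0 others (naive model)."""
    return initial_cases * r0 ** generation


# Two estimates of the reproduction number that early data might both support.
low_estimate = cases_in_generation(1.5, 10)
high_estimate = cases_in_generation(3.0, 10)

print(f"R0 = 1.5 -> about {low_estimate:,.0f} new cases in generation 10")
print(f"R0 = 3.0 -> about {high_estimate:,.0f} new cases in generation 10")
print(f"ratio between the two projections: {high_estimate / low_estimate:,.0f}x")
```

Doubling the estimated reproduction number changes the ten-generation projection by a factor of about a thousand, so a small early misjudgment compounds into an enormous forecasting error.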

Silver introduces the notions of self-fulfilling and self-cancelling predictions. A self-fulfilling prediction occurs when the prediction causes people to act in such a way that increases the accuracy of the prediction. One example is autism.

In recent years autism in children has been in the news quite a bit, and in turn autism diagnoses have also increased at a similar rate. The idea is that the more a disease is brought to the public’s attention, the more likely people are to notice symptoms of the disease and therefore diagnose it.

This can also make your predictions falsely accurate in that people might have similar symptoms of the disease that is highly publicized and therefore be misdiagnosed. One needs to be cautious when making these types of self-fulfilling predictions. On the other hand self-cancelling predictions are the opposite. In these predictions your forecast causes the public to act in a way making it less accurate.

In the realm of disease prediction this isn’t necessarily a bad thing because if you make a prediction that many people will die from a disease, the hope is that people will become fearful and in turn practice healthier choices so fewer people will become affected by the disease.

Looking back at the failed prediction of disease outbreak at Fort Dix, Silver asserts that another reason for this failure was an over-simplistic model.

Forecasters did not take into account that a barracks holds a much higher risk of spreading infection than your typical community. Soldiers are in close proximity at all times. They are also expected to continue training even if they are sick. Forecasters modeled their prediction as if the entire US was similar to a barracks type environment when that simply wasn’t the case. Silver does suggest that disease forecasters tend to err on the high side because an inaccurately low prediction of the number of people that will die from a disease could cause many people to die.

Less And Less And Less Wrong

Haralabos Voulgaris is a man who makes millions of dollars a year betting on NBA games. Voulgaris takes a statistical approach to placing his bets. He watches almost every single game played and is continuously looking for patterns that will improve his predictions on who will win each game.

Silver explains that Voulgaris’s approach to gambling is very similar to Bayes’s theorem. Bayes’s theorem gives us the “Probability that a theory or hypothesis is true if some event has happened” (243). To explain this concept Silver uses the example of a married woman who finds a pair of women’s underwear that do not belong to her.

In this example, the woman wants to know the probability her husband is cheating on her. To use Bayes’s theorem she first needs to estimate the probability that the underwear would be there if he is cheating. Next, she needs to estimate the probability that the underwear would be there if he is not cheating.

Finally and most importantly, she needs to determine the probability of her husband cheating on her before she found the underwear. These three variables are plugged into a rather simple algebraic equation and you get what Bayes called a posterior probability. This is the probability that her husband is cheating on her, given that she found the underwear.

The key to Bayes’s theorem is that you must continually update your probability estimates as soon as new evidence arises.
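Repeated updating can be sketched using the odds form of Bayes’s theorem, where each new piece of evidence multiplies the current odds by a likelihood ratio. The helper names, the starting belief of 4%, and the likelihood ratio of 10 are all assumptions made for illustration, and the sketch assumes each piece of evidence is independent of the others.

```python
def probability_to_odds(probability):
    """Convert a probability into odds in favor."""
    return probability / (1.0 - probability)


def odds_to_probability(odds):
    """Convert odds in favor back into a probability."""
    return odds / (1.0 + odds)


# Hypothetical starting belief and evidence strength (assumptions).
belief = 0.04
likelihood_ratio = 10.0  # evidence is 10x more likely if the hypothesis is true

history = [belief]
for _ in range(5):
    # Each independent piece of evidence multiplies the odds.
    odds = probability_to_odds(belief) * likelihood_ratio
    belief = odds_to_probability(odds)
    history.append(belief)

for step, value in enumerate(history):
    print(f"after {step} pieces of evidence: {value:.4f}")
```

Under these assumptions the estimate climbs steadily toward 1 without ever quite reaching it.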

The idea is that as you continue to update your estimates, you will ultimately reach a point where you become almost 100% certain of the outcome. One man who disagreed with the Bayesian approach was Ronald Aylmer Fisher. Fisher believed Bayes’s method was too subjective. He sought to create a new method that would eliminate all personal bias. His idea later became known as frequentism.

This theory states that uncertainty in a prediction results from collecting data from a sample of the population rather than the entire population. The frequentist method is designed to determine how much error is introduced by taking a sample rather than the whole. The frequentist theory also says that the more data you collect, the closer you get to zero error.
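The shrinking of sampling error can be sketched with a small simulation. This is only an illustration under assumed numbers: the synthetic population (mean 50, standard deviation 10), the sample sizes, and the helper `standard_error` are all hypothetical and not taken from the book.

```python
import math
import random
import statistics


def standard_error(sample):
    """Estimated error of the sample mean: sample std dev / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# A synthetic "entire population" (assumed values, for illustration only).
population = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

errors = {}
for n in (100, 1_000, 10_000):
    sample = rng.sample(population, n)
    errors[n] = standard_error(sample)
    print(f"n={n:>6}  sample mean={statistics.mean(sample):.2f}  "
          f"standard error={errors[n]:.3f}")
```

The standard error falls roughly with the square root of the sample size, so it takes about 100 times as much data to cut the sampling error by a factor of 10.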

Silver explains that the one big problem with the frequentist theory is that since there is no personal bias involved, you cannot invoke human judgment to help give context to whatever situation you are dealing with, unlike Bayes’s theorem. Contrary to the frequentist mentality, Silver believes we can never be completely objective. He says we can only try to become less subjective and less wrong. Ultimately, Silver asserts that data without context is useless.

Rage Against the Machines

The game of chess can be viewed as a game of predictions. Every turn, you are trying to determine the probability that each move results in victory. Silver describes a series of games between chess grandmaster Garry Kasparov and Deep Blue, a supercomputer built by IBM. They played multiple games, with Kasparov winning at first but Deep Blue ultimately winning the series.

Silver goes on to explain the pros and cons to both man and machine.

Deep Blue could calculate every possible move and determine each one’s probability of leading to a win in a matter of seconds, a feat that no human could ever achieve. The problem with computers is that they can’t see the big picture. They tend to identify near-term objectives to accomplish.

Computers have great difficulty knowing which one of those short-term objectives is more important to the overall goal of winning the game.

This is again a problem of forecasting with little data. There are more possible chess games than there are atoms in the universe, and Deep Blue could examine only a small fraction of them when making its forecasts. Humans have the advantage in this sense. Kasparov couldn’t get close to matching the calculations Deep Blue could, but he could see the end game.

He was willing to sacrifice certain pieces, knowing that it was part of his long-term plan. Kasparov ended up losing to Deep Blue because, as a game continues, there are fewer calculations for Deep Blue to make, and so its forecasts become more accurate. When there are only six pieces left on the board, Deep Blue can calculate every single possible move and the probability of victory associated with each. In other words, it can look 20 moves ahead when a human simply cannot.

Computers are great tools to use for prediction but they aren’t perfect.

Silver explains the concept of garbage in, garbage out, meaning that a computer-simulated forecast is worthless if the inputs you give it are not accurate. This is why computers work great for chess and weather forecasting, where the systems abide by laws that are well understood.

For earthquakes, computers aren’t very helpful because we don’t understand the laws that earthquakes abide by. Ultimately, Silver states that the best predictions involve both man and machine. You need a computer to do the calculations a human can’t, but you need the human to provide the insight a computer can’t.

The Poker Bubble

Like chess, poker is also a game where prediction plays a large role. Silver explains that playing poker is purely Bayesian. In Texas Hold’em, the most popular poker game, everyone is given two cards face down. There is a round of betting, then three cards are drawn from the deck for anyone to use, followed by another round of betting. Another card comes out, followed by another round of betting. Finally, the fifth card comes out and a final round of betting occurs.

The player that can make the best five-card hand between his two cards and the five community cards wins.

This process is Bayesian because you estimate the probability that you will win given the cards you can see, and you alter those estimates as new cards (or evidence) come into play. There is an aspect of poker that involves reading your opponent for “tells,” or physical signs that they might be bluffing, but these are accounted for in Bayesian prediction because they are part of the evidence you use to alter your probability estimates.

Silver states that in addition to making good probability estimates, a good poker player is unpredictable. He explains that you should be unpredictable in your play because that makes it more difficult for the other players to make accurate probability estimates of what cards you hold.

He also introduces the Pareto Principle of Prediction, which describes the learning curve toward becoming a successful predictor. On the x-axis is effort and on the y-axis is accuracy. The curve is steep at first but then levels out once you have become pretty good at prediction.

The idea is that it is easy to learn the basics quickly, but once you reach the higher levels of knowledge it becomes more difficult to improve.

Silver states that poker players, much like anyone who makes forecasts for a living, are measured by results. With regard to poker, this can be dangerous. If you are playing a hand and you correctly predict that your opponent is bluffing, but then he catches a miracle card on the last card and wins, you most likely get angry.

Silver explains that this is the wrong reaction. You should be happy that you made the correct prediction because in the long run you will be a better player for continuing with that level of prediction accuracy.
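The difference between a good decision and a good result can be sketched with a quick simulation. Everything here is a hypothetical setup rather than an example from the book: the 80% chance of winning, the pot size, the cost of calling, and the function `simulate_calls` are assumptions chosen to illustrate the idea.

```python
import random


def simulate_calls(win_probability, pot, call_cost, hands, seed=0):
    """Average profit per hand when making the same call many times."""
    rng = random.Random(seed)
    profit = 0.0
    for _ in range(hands):
        if rng.random() < win_probability:
            profit += pot        # the read was right and the hand held up
        else:
            profit -= call_cost  # the opponent caught a miracle card
    return profit / hands


# Hypothetical situation (assumed numbers).
win_probability = 0.80
pot = 100.0
call_cost = 50.0

# Long-run value of the decision, computed directly.
expected_value = win_probability * pot - (1 - win_probability) * call_cost

average_profit = simulate_calls(win_probability, pot, call_cost, hands=100_000)
print(f"expected value per hand: {expected_value:.2f}")
print(f"simulated average profit: {average_profit:.2f}")
```

Under these assumptions the call loses one hand in five, yet the average profit over many hands stays close to the expected value, which is why a single bad outcome says little about the quality of the prediction.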

If You Can’t Beat ’Em…

A major theme from this book is that “Past performance is not indicative of future results” (339). This is especially true with regard to the stock market. Silver looked at how a few different mutual funds performed from 2002-2006 and then from 2007-2011 and found no correlation.

All of the funds he looked at outperformed the market in the initial time frame but then either performed equally to the market or underperformed in the next time frame. Eugene Fama did a similar study in the 1960s and came up with three forms of the efficient market hypothesis. The first is the weak form, which states that stock prices cannot be predicted by looking only at past prices.

The second is the semi-strong form of the efficient market hypothesis, which states that doing a complete financial analysis of a company (combing over financial statements, checking the effectiveness of the business model, etc.) would not consistently produce returns that beat the market.

Lastly, the strong form of the efficient market hypothesis states that even knowing private information will not produce returns that outperform the market. If the efficient market hypothesis is true, then the market is inherently unpredictable. The way investors are able to get above-average returns is to take on risk. The more risk you take on, the bigger the potential return, but also the bigger the potential loss.

A Yale economist named Robert J. Shiller completed a study trying to prove that the market was in fact predictable.

He determined that the average price-to-earnings ratio (P/E) for the market as a whole was 15. If the efficient market hypothesis is true then the market P/E should be pretty consistent as time goes on, but this wasn’t the case. Shiller found that market P/E fluctuated anywhere from 5 to 44.

He found that when the P/E was low (around 10), stocks historically gave a return of 9%, and when the P/E was high (around 25), returns were quite low. The only problem with his findings is that they only benefit investors in the long term. Shiller showed that there is some predictability in the market, but only over many years. It is much more difficult to find a pattern in the short term because if you notice what you think is a pattern, so will other investors, and when everyone reacts to it, the pattern disappears.

This is similar to the problem of overconfidence that Silver describes. When a trader has a certain value in mind for some stock and everyone else seems to have a completely different value, that trader is wrong almost every time. The problem is that the overconfidence of many traders has caused the market to act irrationally. Symptoms of an irrational market are below-average returns and extreme volatility, things that no trader wants.

These poor decisions made by traders led to Silver’s take on market efficiency.

He explains that the efficient market hypothesis is self-defeating because if you believe that the market is efficient and there is no way to beat it, there is no point in making any trades. If there are no trades made then there is no market.

Instead, Silver believes that the market cannot be completely efficient but that it takes novice traders with less skill to create a marketplace where skilled individuals can prevail. His theory is that you need variance in the skill set of traders within the market in order for the most skilled ones to have a chance at beating the market. These unskilled traders are called noise traders.

Silver also discusses the importance of recognizing what the consensus is. He explains that whenever he makes a prediction he looks at the consensus and the farther away from it his prediction is, the more evidence he needs to be comfortable with it. Silver states that the same philosophy applies to investing. If you stray far away from the consensus, you need to have good reason to or else you’ll end up like the many noise traders.

A Climate of Healthy Skepticism

Just as with earthquakes, Silver finds that global warming is also difficult to predict. He explains that global warming is practically an accepted phenomenon in the scientific community, with the greenhouse effect being the cause.

We as a people are adding to the greenhouse effect and therefore adding to global warming. In a survey of climate scientists, 94% believed climate change was occurring and 84% believed it was a result of human activity. Nevertheless, predicting climate change is still difficult.

Gavin Schmidt, a climate scientist and blog co-author, gives three reasons why predicting climate change is so difficult.

The first is that the large fluctuations in temperature throughout the year make it hard to identify the climate change. The overall temperature is only predicted to rise 2 degrees Celsius over the next century, which is only 0.02 degrees per year. In places where the temperature can vary as much as 15 degrees in a day, it would be difficult to notice a 0.02 degree change.
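The arithmetic behind this comparison can be sketched as follows. The figures are the ones quoted above; the variable names are made up for the example, and comparing a daily swing to a yearly trend is only a rough way to show the scale of the noise relative to the signal.

```python
# Figures quoted in the review.
total_warming_c = 2.0   # predicted rise over the next century, in degrees Celsius
years = 100
daily_swing_c = 15.0    # how much temperature can vary within a single day

# The signal: average warming per year.
trend_per_year_c = total_warming_c / years

# The noise, expressed as a multiple of the yearly signal.
swing_to_trend_ratio = daily_swing_c / trend_per_year_c

print(f"Trend per year: {trend_per_year_c:.2f} degrees C")
print(f"A daily swing is about {swing_to_trend_ratio:.0f} times the yearly trend")
```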

The second is in order to have a good climate-forecasting model you must forecast the amount of CO2 in the atmosphere. This is difficult because no one knows if in 50 years there will be more stringent pollution laws or if society will actually shift away from burning fossil fuels.

Lastly, he explains the principle of structural uncertainty. The climate system is extremely complex and creating a mathematical model that accurately represents it is quite difficult. This is the biggest problem for climate forecasters.

Another problem that forecasters of climate change face, and that forecasters in other fields don’t, is political intervention. If someone forecasts rising temperatures due to global warming, there will be plenty of lobbyists for car companies trying to refute that claim.

This can make forecasters hesitant for a reason completely unrelated to their model. Ultimately forecasting climate change is similar to forecasting earthquakes in that they are both very difficult and not a whole lot of progress has been made. Predicting climate change is also difficult because you can’t know the results of your prediction every day like you can with weather forecasting. This is one of the reasons why weather forecasting is so successful.

What You Don’t Know Can Hurt You

In the days after 9/11, many people believed that no one could have predicted a terrorist attack of that magnitude on American soil, but Silver disagrees. He quotes an economist named Thomas Schelling who states that we tend to mistake the unfamiliar for the improbable.

There were many signs of a possible attack on Pearl Harbor beforehand, but Schelling explains that people most likely realized that the United States rarely ever gets attacked so therefore Hawaii was very unlikely to be attacked. The same can be said for the attacks on 9/11. There were again many signs that an attack might be imminent, but an attack of that scale had never happened on US soil and was seen as nearly impossible.

Predicting the attack would have been very difficult but perhaps we might have been better prepared if we didn’t have the notion that an attack was a near impossibility.

A professor at the University of Colorado named Aaron Clauset used mathematics to try to predict terrorist attacks.

What he found is that predicting terrorist attacks is a lot like predicting earthquakes, which follow the power law distribution discussed earlier. If an area experiences magnitude five earthquakes at a certain rate, we know from earthquake forecasting that larger earthquakes will also strike there eventually, only less often.

It’s the same way with terrorism. When there are many smaller attacks throughout the years, it’s a sign that a much larger attack, though rare, should be expected eventually. This would suggest that the 9/11 attacks were not a statistical outlier, but rather part of the mathematical process. Silver plotted all of the terrorist attacks on NATO countries from 1979-2009 by number of attacks versus fatalities (he also included the 9/11 attacks).

What he found is that when placed on a double logarithmic scale, the data formed a straight line. This is the signature of a power law distribution. Granted, this gives no indication of when and where an attack would happen specifically, but if officials had known that a large attack of some sort was likely to happen in the next 20 years, we might have been better prepared. Ultimately, our first line of defense against terrorism is good intelligence, but we also need the insight to see that a large-scale terrorist attack is possible.
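Why a power law shows up as a straight line on a double logarithmic scale can be sketched with made-up numbers. The values of `scale` and `exponent` and the fatality counts below are hypothetical and are not Silver’s or Clauset’s data; the helper `loglog_slope` is an ordinary least-squares fit written out by hand for this example.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    variance = sum((a - mean_x) ** 2 for a in log_x)
    return covariance / variance


# Hypothetical power law: frequency = scale * fatalities ** -exponent
scale = 1000.0
exponent = 2.0
fatalities = [1, 10, 100, 1000, 10000]
frequency = [scale * f ** -exponent for f in fatalities]

slope = loglog_slope(fatalities, frequency)
print(f"slope on the log-log plot: {slope:.2f}")
```

Taking logarithms of both sides turns the power law into a linear relationship, so the fitted slope recovers the negative of the exponent. In this made-up example, an attack ten times as deadly is a hundred times as rare, but it is never impossible.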


Silver explains his three rules to follow in order to be a good forecaster. The first is to think probabilistically using the Bayesian model of thought. The second is to know your prior beliefs very well when using the Bayesian style of thought. In other words, you need to have some base knowledge of whatever you are predicting before the new evidence arrives. Finally, Silver states that in order to make good forecasts we need to make a lot of them. The more forecasts you make, the more practice you get, and the better you become.

We would like to thank the Titans of Investing for allowing us to publish this content. Titans is a student organization founded by Britt Harris.

Britt always taught us Titans that Wisdom is Cheap, and that one can find treasure troves of the good stuff in books. We hope readers will also express their thanks to the Titans if the book review brought wisdom into their lives.

This post has been slightly edited to promote search engine accessibility.
