During an odd phase of fascination with American politics (I’m Canadian), I stumbled across the political coverage of the 2016 Idaho primary on Nate Silver’s website, fivethirtyeight.com. Their cold, analytical coverage of the election appealed to me. It turns out Nate also wrote a book about prediction, which, luckily for me, is also available as an audiobook.
The core idea
The takeaway from this book is essentially this: prediction is really hard and most people (and machines) suck at it (except for weather forecasters). More concerningly (is that a word?), most people don’t even know that they suck at it. Oh, and you should use Bayesian statistics to give probabilistic estimates and update your probabilities when you get new information.
This book encouraged me to take a hard look at my own predictions. Do I suck at them? Do I actually understand Bayes’ theorem?
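That updating rule can be sketched in a few lines of Python (a minimal sketch of my own; the `bayes_update` helper and the rain/cloud numbers are made up for illustration, not from the book):

```python
# Bayesian updating: revise P(hypothesis) as new evidence arrives.
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | evidence) via Bayes' theorem."""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / denominator

# Start with a 50% prior that it will rain today, then observe dark clouds,
# which (hypothetically) appear on 80% of rainy days and 30% of dry days.
p_rain = bayes_update(0.5, 0.8, 0.3)
print(round(p_rain, 3))  # → 0.727
```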
While the common theme of this book is prediction, it spans diverse topics such as: the housing market, politics, baseball, weather forecasting, earthquakes, poker, terrorism, chess, stock market, climate change… all in the context of prediction.
Below, I summarized points that stood out to me, and supplemented them with relevant links. This is one of the disadvantages of audiobooks… while the actual book includes references, these are not accessible with the audiobook. My side notes are in italics.
If the notes below seem interesting to you, you’ll probably enjoy this book. I certainly did.
Section 2 / 00:22:44
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), 0696–0701.
[Shaking my faith in science…]
Section 2 / 00:23:14
Another study claimed they could not replicate 2/3rds of the published experiments themselves.
Prinz, F., Schlange, T., & Asadullah, K. (2011). Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews. Drug Discovery, 10(9), 712. doi:10.1038/nrd3439-c1
Section 3 / 1:05:12
There have been housing crashes elsewhere in the world before (e.g., Japan in the 1980s–90s).
Houses traditionally were not a great investment.
Section 3 / 1:06:46
Governments’ instinct is to try to re-inflate the housing bubble.
Section 3 / 1:32:08
The “out of sample” problem: when you don’t have past data with the same conditions.
Section 4 / 1:48:26
Check out this book:
Expert Political Judgment: How Good Is It? How Can We Know?
Philip E. Tetlock
Experts in his survey made political predictions barely better than random chance, and worse than simple statistical methods.
Section 4 / 1:49:26
Experts, especially those who get lots of media attention, are terrible at making predictions.
Section 6 / 4:04:28
Laplace’s demon: the idea that if we were smart enough and knew all the variables, we could compute the entire future and the past (scientific determinism) https://en.wikipedia.org/wiki/Laplace%27s_demon
Section 6 / 4:28:50
Including human judgement in weather prediction provides a measurable and significant improvement over computer-aided prediction alone.
Section 6 / 4:30:22
The probability of being struck by lightning has gone down dramatically over time: in the 1940s it was about 1 in 400,000, compared to about 1 in 11 million today.
Section 6 / 4:50:41
The public is hyper-aware of when predictions for rain are incorrect. So weather forecasters will increase the probability of rain in their forecasts, since people will be pissed if it rains when the forecast said sunny, but pleasantly surprised if it’s sunny when they thought it would rain.
[Interesting to change probabilities depending on the type of mistake.]
Section 6 / 4:53:18
Local weather stations don’t really try to make accurate predictions. They don’t care about accuracy – only whether people are watching. So they often predict worse weather to get people to watch.
Section 6 / 5:02:21
But inaccurate weather forecasts can cause problems. Even for this seemingly benign example of weather forecasts, if you skew predictions towards worse weather for better ratings, people then doubt the severity when a true emergency arises (e.g., Hurricane Katrina).
Section 7 / 5:17:58
The USGS, the U.S. government agency that studies earthquakes, states that, to date, earthquakes cannot be predicted, and likely won’t be predictable for some time. http://earthquake.usgs.gov/learn/topics/100_chance.php
Section 7 / 5:18:57
The difference between “prediction” and “forecast”: a prediction is a definitive statement, while a forecast is more of a probabilistic, statistical statement.
these are still fuzzy terms to me. More discussion here: http://stats.stackexchange.com/questions/65287/difference-between-forecast-and-prediction
Section 7 / 5:24:52
Earthquakes are a concern in Tehran… Issues with organization; estimates of 20–30% casualties among a huge, dense population.
Section 7 / 5:32:50
Scientists can and do make inaccurate predictions… A predicted earthquake in Peru in the 1980s…
Section 7 / 5:36:11
When thinking about earthquakes, it’s better to assume earthquake events are independent… so a 1-in-35 chance each year, rather than once every 35 years.
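A quick sketch of the difference (my own illustration, not from the book): under independence, the chance of at least one big quake over a 35-year window is well below certainty.

```python
# Treat a "once every 35 years" quake as an independent 1-in-35 chance
# each year, and compute the chance of at least one event in a window.
p_yearly = 1 / 35

def prob_at_least_one(years, p=p_yearly):
    """P(at least one event in `years` independent years)."""
    return 1 - (1 - p) ** years

print(round(prob_at_least_one(35), 3))  # → 0.637, not 1.0
```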
Section 7 / 5:48:22
[Nice to see a chapter on the problem of overfitting. This is a serious problem I constantly run into when building models from data.]
Section 8 / 6:59:20
Economic forecasts have traditionally been terrible. Probably best not to trust them.
Section 8 / 6:59:22
If a prediction does not give some confidence measure, it’s hard to really trust it.
Section 8 / 7:02:00
Prediction markets… Place bets on political or economic outcomes.
[Seems fun. Be good to start verifying my own predictions more]
Section 9 / 7:08:14
Didn’t realize how bad the flu actually is. It was/is quite the epidemic.
Section 9 / 7:14:20
There is some statistical precedent for vaccines correlating with Guillain-Barré syndrome in the 1970s.
Vaccine companies were absolved from legal responsibility if there were problems with the vaccine.
[vaccines clearly have done/do amazing things for the world. I’m not at all against vaccines.]
Section 9 / 7:37:45
Publicity of a disease can increase the reporting and thus make it look like the disease is spreading faster than it is.
Section 9 / 7:58:27
Agent based models for prediction. Simulate our world to make predictions.
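A toy sketch of the idea (entirely my own illustration; the agents, states, and parameters are arbitrary, not from the book): simulate many simple individual agents and let the large-scale behavior emerge.

```python
import random

# Toy agent-based model of disease spread: agents are susceptible ("S"),
# infected ("I"), or recovered ("R"); each step, infected agents meet a
# few random others, possibly infecting them, and may recover.
def simulate(n_agents=200, n_steps=30, contacts=3, p_infect=0.1,
             p_recover=0.2, seed=42):
    rng = random.Random(seed)
    states = ["S"] * n_agents
    states[0] = "I"  # one initial case
    for _ in range(n_steps):
        new_states = list(states)
        for i, state in enumerate(states):
            if state != "I":
                continue
            for j in rng.sample(range(n_agents), contacts):
                if states[j] == "S" and rng.random() < p_infect:
                    new_states[j] = "I"
            if rng.random() < p_recover:
                new_states[i] = "R"
        states = new_states
    return states.count("S"), states.count("I"), states.count("R")

print(simulate())  # (susceptible, infected, recovered) after 30 steps
```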
Section 10 / 8:44:01
[This chapter had a nice explanation of Bayes’ theorem. Re-read.]
Section 10 / 8:44:37
Another reference to: Why most published research findings are false (2005)
Section 10 / 8:56:14
Fisher, the famous statistician, argued *against* cigarettes causing lung cancer.
Coincidentally 🙁 he was a consultant for the tobacco industry. http://aje.oxfordjournals.org/content/133/5/416
Section 10 / 9:09:21
Many fields are moving toward Bayesian approaches and away from frequentist ones.
Section 11 / 9:11:25
The Mechanical Turk was a machine that appeared to play chess in the late 1700’s, but actually contained a chess master concealed inside.
[ah so this is where the name for the Amazon Mechanical Turk came from]
Section 13 / 11:51:59
Members of Congress beat stock market returns by about 5–10% a year…
Section 13 / 11:52:03
Congress can influence the fate of companies and markets through legislation
Section 13 / 12:06:28
P/E: the price-to-earnings ratio… how much higher the stock price is than the actual earnings of the company.
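As a quick illustration (hypothetical numbers, my own example), the ratio is just share price divided by earnings per share:

```python
# Price-to-earnings (P/E) ratio: how much investors pay per dollar of earnings.
def pe_ratio(share_price, earnings_per_share):
    return share_price / earnings_per_share

# A $100 stock earning $5 per share trades at a P/E of 20.
print(pe_ratio(100.0, 5.0))  # → 20.0
```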
Section 14 / 12:18:23
How brokerages may have goals different from yours (i.e., they consider their own jobs rather than making you money).
Section 14 / 12:18:55
Consider the incentives of the people in the financial markets… It’s not always to maximize profits. It’s to keep their jobs, etc.
Section 14 / 12:20:32
Suggested there exists a herd mentality with investors
Section 14 / 12:42:10
Avoid buying in the bubble; avoid selling in the decline. Do the opposite: buy in the decline, sell or hold in the bubble.
Section 14 / 12:47:47
The “bubble” may be an intrinsic property of the system
Section 14 / 12:49:42
[Interesting that Nate doesn’t mention Warren Buffett once. But he does mention Buffett’s mentor, Benjamin Graham.]
Section 15 / 13:12:28
Check out this book:
Principles of Forecasting. By: Scott Armstrong
Section 16 / 13:44:00
The predictions of global warming have held up quite well 20 years later.
Section 16 / 13:45:24
Both the statistical prediction and the model prediction did well. Model predicted hotter summers accurately
Section 16 / 13:47:39
Temperature increase is a fair bit lower than predicted
Section 16 / 13:48:48
The model assumed that there would be no changes. However, there have been some limited changes made to try and curb carbon emissions.
Section 16 / 13:57:05
A simple linear regression with obvious factors would have outperformed the accepted statistical model that made the climate predictions. Another argument for using simpler models.
Section 16 / 14:08:50
The dangers of starting with a very confident claim…
If you start with a very confident claim that is then shown to be wrong, then by Bayes’ theorem you must revise your belief about that claim by a lot.
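A small sketch of that effect (my own numbers, purely illustrative): a 95% prior, contradicted three times by evidence four times more likely if the claim is false, collapses to roughly 23%.

```python
# Repeatedly update a very confident prior against contradicting evidence.
def update(prior, p_e_given_h=0.2, p_e_given_not_h=0.8):
    """One Bayesian update after an observation that contradicts the claim."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

p = 0.95  # start very confident
for _ in range(3):  # three contradicting observations
    p = update(p)
print(round(p, 3))  # → 0.229
```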
Section 17 / 15:12:29
Man, this book covers so many different topics: baseball, earthquakes, politics, poker, terrorism, chess, stock market.
Section 18 / 15:30:36
The point of this book is to use Bayesian probability more?