Causal Inference: What If

Name: Causal Inference: What If
Rating: 4.39 (10 reviews)
ISBN: 9781420076165

Miguel Angel Hernán, James M. Robins

The application of causal inference methods is growing exponentially in fields that deal with observational data. Written by pioneers in the field, this practical book presents an authoritative yet accessible overview of the methods and applications of causal inference. With a wide range of detailed, worked examples using real epidemiologic data as well as software for replicating the analyses, the text provides a thorough introduction to the basics of the theory for non-time-varying treatments and the generalization to complex longitudinal data.

Genres: Science, Mathematics, Textbooks, Academic, Economics, Nonfiction, Technology

312 pages, Hardcover

First published April 15, 2011

44 people are currently reading

533 people want to read

About the author

Miguel Angel Hernán

3 books, 2 followers

Community Reviews

5 stars: 31 (63%)
4 stars: 11 (22%)
3 stars: 4 (8%)
2 stars: 1 (2%)
1 star: 2 (4%)

Displaying 1 - 10 of 10 reviews

Michael

115 reviews, 5 followers

January 3, 2021

If you are looking for a practical book about causal inference, this is it. The ideas and methods are explained simply, thoroughly and with enough numerical examples that you can pretty much understand how to execute them in practice. The book has very up-to-date citations (as late as 2019) and Hernan remains an active researcher and Tweeter.

I'll give a brief overview so you know what the book contains.

The first ten chapters (part 1) explain all of the fundamental ideas of causal inference based on the potential outcomes framework integrated with Pearl's DAGs. Everything from potential outcomes as counterfactuals and exchangeability/consistency/positivity to stratification, superpopulations and inverse probability weighting is explained via the same small artificial dataset of heart transplants among Olympian gods. I think it's a good choice to do things that way because it keeps the focus on the ideas, without having to ponder eccentricities of the data every time. He introduces Pearl's DAGs in chapter 6 and explains confounders and colliders in chapters 7 and 8. Chapters 9 and 10 get into measurement bias and random variability, where he doesn't provide too much specific analysis or techniques. Hernan and Pearl butt heads on Twitter about what should or should not be included as a "treatment", and it's interesting to see some of that bleed through into the book.

The next eight chapters (part 2) are very useful because they explain how to actually implement the concepts of causal inference with "big data" or at least data where there are too many covariate values to be able to stratify the population as in the first part. Hernan calls it "with models" but for practitioners the key idea is "with realistic data". We now meet inverse propensity weighting, outcome models (standardization), g-estimation, instrumental variables and even a touch of survival analysis. After reading the second part, a reader with a quantitative background should feel comfortable implementing the vanilla version of these methods and should understand papers that build off of them. In that sense, this is a resounding success, filling an important gap in the literature. Pearl tells great stories about causality, but his books are heavy on ideas and light on practical tools. Imbens and Rubin do an excellent job of walking through a lot of the methods but their work can be pretty intimidating for someone just getting started. Hernan's part 2 really lays out a terrific vista of causal methods in a way that any scientist should be able to follow and even implement.
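As a rough illustration of the inverse probability weighting idea mentioned above, here is a minimal Python sketch. The numbers are synthetic and purely illustrative (they are not the NHEFS data or anything taken from the book), and the names `CELLS`, `ROWS`, `naive_difference` and `ip_weighted_difference` are hypothetical. The sketch assumes a single binary covariate L for which exchangeability and positivity hold, so the propensity can be estimated by simple counting instead of a fitted model.

```python
from collections import defaultdict

# Synthetic cells (L, A, Y, count). Illustrative numbers only, not from the book.
# Within each stratum of L, treatment (A=1) raises the outcome Y by exactly 2.0.
CELLS = [(0, 0, 1.0, 60), (0, 1, 3.0, 20), (1, 0, 4.0, 20), (1, 1, 6.0, 60)]
ROWS = [(l, a, y) for l, a, y, n in CELLS for _ in range(n)]


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    means = {}
    for arm in (0, 1):
        ys = [y for _, a, y in rows if a == arm]
        means[arm] = sum(ys) / len(ys)
    return means[1] - means[0]


def ip_weighted_difference(rows):
    """Difference in IP-weighted mean outcomes, treated minus untreated."""
    # Nonparametric propensity estimate: P(A=1 | L) by counting.
    n_l = defaultdict(int)
    n_treated = defaultdict(int)
    for l, a, _ in rows:
        n_l[l] += 1
        n_treated[l] += a
    propensity = {l: n_treated[l] / n_l[l] for l in n_l}

    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for l, a, y in rows:
        # Weight = 1 / P(treatment actually received | L).
        p_received = propensity[l] if a == 1 else 1.0 - propensity[l]
        w = 1.0 / p_received
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]


if __name__ == "__main__":
    print("naive:", naive_difference(ROWS))
    print("IP weighted:", ip_weighted_difference(ROWS))
```

On this toy data the naive difference is 3.5, while the weighted difference recovers the 2.0 that was built into every stratum, because the weights rebalance L across the two treatment arms.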

In this part, everything is done using the NHEFS data on the impact of quitting smoking on weight gain. On the one hand this is good because the data is pretty clean and the effect is pretty consistently detectable. On the other hand, because virtually all of the methods agree to within a few percent and even the naive correlation answer with no causal inference (CI) adjustment gets the overall effect right (2.5 kg of weight gain after quitting, compared to 3.1-3.5 kg depending on the methods used), I feel like a lot of "teachable moments" were missed. You go through the whole thing but you don't have much of a sense of why one method should be preferred over another, or even how to deal with a sign change when going from naive methods to CI methods. In contrast, Imbens and Rubin, in Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction, bring in many different example datasets and provide a lot more guidance in reasoning about different methods and even sign changes.
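The kind of sign change described above can be shown with a small standardization sketch in Python. The data are synthetic and chosen only to produce the reversal (they are unrelated to NHEFS or to either book), and the names `CELLS`, `ROWS`, `naive_difference` and `standardized_difference` are hypothetical. The sketch assumes one binary covariate L that is sufficient to adjust for confounding, and that every stratum contains both treated and untreated individuals.

```python
from collections import defaultdict

# Synthetic cells (L, A, Y, count). Illustrative numbers only.
# Treatment (A=1) raises Y by 1.0 within each stratum of L, but treated
# individuals are concentrated in the stratum with the low baseline outcome.
CELLS = [(0, 1, 2.0, 80), (0, 0, 1.0, 20), (1, 1, 6.0, 20), (1, 0, 5.0, 80)]
ROWS = [(l, a, y) for l, a, y, n in CELLS for _ in range(n)]


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    means = {}
    for arm in (0, 1):
        ys = [y for _, a, y in rows if a == arm]
        means[arm] = sum(ys) / len(ys)
    return means[1] - means[0]


def standardized_difference(rows):
    """Standardize the stratum-specific means to the marginal distribution of L."""
    total = defaultdict(float)  # sum of Y per (L, A) cell
    count = defaultdict(int)    # number of rows per (L, A) cell
    n_l = defaultdict(int)      # number of rows per stratum of L
    for l, a, y in rows:
        total[(l, a)] += y
        count[(l, a)] += 1
        n_l[l] += 1
    n = len(rows)

    means = {}
    for arm in (0, 1):
        # Sum over l of P(L=l) * E[Y | A=arm, L=l].
        # An empty (L, A) cell would divide by zero: that is a positivity violation.
        means[arm] = sum(
            (n_l[l] / n) * (total[(l, arm)] / count[(l, arm)]) for l in n_l
        )
    return means[1] - means[0]


if __name__ == "__main__":
    print("naive:", naive_difference(ROWS))
    print("standardized:", standardized_difference(ROWS))
```

Here the naive difference is about -1.4 while the standardized difference is +1.0, so the unadjusted comparison gets the direction of the effect wrong on this toy data.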

The last four chapters (part 3) kind of read like a "future work" section of a paper. Not that the methods described have not yet been invented, just that the examples are much more complex and the level of depth is much lower. Unlike in the previous parts, you do not come away feeling "oh cool, that's how this is done" but rather "hmm my definition of time is problematic, my treatment strategy is not well defined, I can't measure adherence, I have unusual censoring, and there are multiple ways to define outcome." Suddenly the out-of-the-box formulas don't seem so useful, and it's clear that any serious attempt to do causal inference on longitudinal data will require not just a lot of work, but also a lot of judgments about what sort of deviations from the model assumptions are tolerable. Because of the combinatorial complexity of longitudinal counterfactuals, even sensitivity analysis may prove prohibitive. It's clear that this is still very much an area of active research and one where Hernan and Robins have worked and continue to work. Maybe someday it will come to resemble part 2.

The final chapter, "target trial emulation", is interesting because Hernan mentioned when he was interviewed on the "Casual Inference" podcast that he hopes that this will be his legacy in the field. Unfortunately, it's not clear from the chapter why this idea is so important. The way I understand it, you want to write a fictional protocol for your observational data so that the data can be treated like an RCT (minus the initial randomization, which you correct via reweighting), and then causal inference on observational and experimental data coincide. It's a useful thought experiment but I just don't get why Hernan thinks it's such a big deal.

Summing it all up, my opinion is that as of January 2021, this book is far and away the best resource for a person who is looking to implement causal inference methods in healthcare, econometrics or general data science. It pays to read other books as well, like Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction, Causal Inference in Statistics: A Primer, Observation and Experiment: An Introduction to Causal Inference, and the various lecture notes / video lectures that are coming out of the top departments (Stanford, Harvard Med, Columbia, UCLA etc.). But if you had to choose one book that takes you from zero to using causal methods in your own work, this is it.

(Note: I read the July 31, 2020 version. I compared it with the November 2020 version and saw that all the page numbers in the TOC are the same. The book is already pretty much complete. I assume he's updating the references and fixing typos now.)

Bing Wang

33 reviews, 6 followers

December 27, 2020

A practical book for statisticians, biologists and sociologists who need to do causal inference. Researchers in the big data industry can always resort to randomized experiments with huge volumes of data and analyze the counterfactual easily, since the data are usually exchangeable, consistent and also satisfy the positivity condition. However, the concepts of IP weighting, standardization and counterfactuals are also valuable when randomization fails. This is a book that you can pick up any time you need it. It is also a book that lets you start working after 10 minutes of reading. It is the first book one should read before other, more theoretical books and the last one one should recall before applying practical approaches.

ml-math

Brian

110 reviews, 6 followers

December 7, 2023

I think this is very close to being an exceptional intermediate textbook on causal inference. It is vast in scope and covers many topics that are conventionally ignored in classical causal inference texts (such as causal inference with panel data and time-varying treatments).

But I do have several personal issues with this text that make me withhold a 5-star review.

1. I don't think the language is as clear and concise as it could be.
2. I personally wish there was much less notation and much more code (I can't recall any code actually). So the expectation should be that this text is more theoretical than practical.
3. I found the applied examples to be a bit unhelpful at times for me personally, just because I don't come from an epidemiological background.
4. I don't love the aesthetics of the book... Which sounds minor but all of the words and notation cramped together on the page is a bit overwhelming and distracting...

Overall, this is a great text, whose content is much more applicable to my own work compared to other similar texts. But it's definitely one that might require several reads for people like me.

Juan Vargas

1 review

July 14, 2023

I found this book hard to read, not because the concepts are difficult, but because IMHO there are too many unnecessary explanations for simple ideas. The narrative, which at times I feel is way too verbose, often over explains simple concepts, and at the end, what should have been a rather simple idea, becomes tortuous and more complicated than necessary.

Alex Tank

27 reviews

August 26, 2021

This book is fantastic. At first I was discouraged by the unorthodox style of presenting technical material and deferring maths to technical points. However, after finishing this book I feel I’ve gained an immense amount of knowledge and intuition about causal inference. The last sections on sequential decision problems are an excellent intro to the area, and the entire book does feel like it’s building up to being able to understand the sequential setting.

Kirill

Author of 1 book, 12 followers

March 21, 2019

A very approachable book on causal inference. All technical points are separated into isolated sections. The authors did great work to make this book understandable. The code is readable and provided in every major statistical computing environment, which is also great.

I wish every scientific book was written with such care for the reader. Thanks a lot to the authors.

Oliver

51 reviews, 2 followers

February 25, 2025

Garbage. Unfortunately the authors are leading figures in the field so this book is a must read. Even more unfortunately, they don't know how to write a proper textbook so reading this is agonizing.

Prepare for pain, confusion, and frustration.

academic shit-books

Leonardo

Author of 1 book, 80 followers

to-keep-reference

November 12, 2021

Book recommended in the course Inference on Causal and Structural Parameters using ML and AI.

Shubhamkar Ayare

1 review

June 28, 2024

The book really needs to undergo a round of proof-reading. From other reviewers, I gather that the contents are pretty nice. But the language is way too hard for the contents.

Thư

9 reviews, 4 followers

July 28, 2024

couldn't ask for a better and more in-depth text on causal inference with and without models. it's like an all-in-one place for almost every important causal inference concept.

causal-inference statistics
