A nontechnical guide to the basic ideas of modern causal inference, with illustrations from health, the economy, and public policy.
Which of two antiviral drugs does the most to save people infected with Ebola virus? Does a daily glass of wine prolong or shorten life? Does winning the lottery make you more or less likely to go bankrupt? How do you identify genes that cause disease? Do unions raise wages? Do some antibiotics have lethal side effects? Does the Earned Income Tax Credit help people enter the workforce?
Causal Inference provides a brief and nontechnical introduction to randomized experiments, propensity scores, natural experiments, instrumental variables, sensitivity analysis, and quasi-experimental devices. Ideas are illustrated with examples from medicine, epidemiology, economics and business, the social sciences, and public policy.
Paul R. Rosenbaum is the Robert G. Putzel Professor Emeritus of Statistics and Data Science at the Wharton School of the University of Pennsylvania. He is the author of Observation and Experiment: An Introduction to Causal Inference, Design of Observational Studies, Observational Studies, and Replication and Evidence Factors in Observational Studies.
I didn't think I needed another introductory book on causal inference, but I was very pleasantly surprised by this one. I think it would make the ideal primary reading for a course in the topic, particularly one in a statistics, biostatistics, or public health department, due to the preponderance of health examples and the focus on methods popular in that field, though it also covers social science applications.
As for virtues, it starts with a very clear and detailed introduction to the Neyman-Rubin potential outcomes approach to causality, walking the reader through the mechanics of science tables clearly and without handwaving. This involves a little bit of tedium, but in my experience teaching the topic for many years, going through the details is the only way to actually learn it. From there it covers canonical causal methods including experiments and covariate adjustment, with a particular focus on nonparametric methods like permutation testing and matching, avoiding the reliance on linear models that characterizes most econometric treatments.
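To make the permutation-testing idea concrete, here is a minimal sketch of an exact randomization test for a difference in means. It is an illustration, not the book's own code: the function name `permutation_p_value` and the eight outcome values are made up, and the test assumes the sharp null hypothesis that treatment changes nobody's outcome.

```python
from itertools import combinations


def permutation_p_value(outcomes, treated_idx):
    """Exact one-sided randomization p-value for a difference in means.

    Under the sharp null of no effect for anyone, every outcome would be the
    same whichever units were treated, so we can recompute the statistic for
    every possible way of choosing the treated group.
    """
    n = len(outcomes)
    m = len(treated_idx)
    total = sum(outcomes)

    def diff_in_means(idx):
        treated_sum = sum(outcomes[i] for i in idx)
        return treated_sum / m - (total - treated_sum) / (n - m)

    observed = diff_in_means(treated_idx)
    # Enumerate all assignments that treat exactly m of the n units.
    assignments = list(combinations(range(n), m))
    # Count assignments whose statistic is at least as large as the observed one
    # (small tolerance guards against floating-point noise).
    as_extreme = sum(
        1 for idx in assignments if diff_in_means(idx) >= observed - 1e-12
    )
    return observed, as_extreme / len(assignments)


# Hypothetical data: the first four units were treated.
outcomes = [12, 15, 14, 16, 9, 8, 10, 11]
treated_idx = (0, 1, 2, 3)
observed, p = permutation_p_value(outcomes, treated_idx)
print(f"observed difference = {observed:.2f}, exact p-value = {p:.4f}")
```

No distributional model is assumed here; the only source of chance is the random assignment itself, which is the appeal of the approach over a linear-model treatment.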
Rosenbaum is best known for his work in sensitivity analysis, which is often omitted in intro treatments, and that gets a chapter here, along with a chapter on evidence combination that describes ways to apply multiple imperfect causal designs to triangulate and reinforce a plausible causal explanation. The discussion of multiple control groups hints at difference-in-differences, but doesn't dignify it as a method for finding point results, which would require often questionable linearity assumptions. Together these chapters form a stern rebuke to the idea of causal inference as a hierarchy of purity, with higher and higher hurdles that a study must pass to be dignified as credible, until all evidence is excluded and one must fall back to a default. The book drives the point home by highlighting how R. A. Fisher applied such defenses in the service of tobacco companies, while medicine was instead persuaded of the health risks of smoking through systematic review of evidence sources and sensitivity analyses; this perspective matters both for scientific honesty and for public health. A final substantive chapter on advanced methods including instrumental variables, based on a LATE approach, provides a glimpse into methods in more active areas of research.
The book is extremely short and to the point, omitting derivations and much of the statistical machinery one would need to apply all the methods in practice, but that gives it excellent clarity on the topics it does cover and I would recommend it to anyone, from new students to active researchers, interested in the topic.
The whole business of how we can use statistics to decide if something is caused by something else is crucially important to science, whether it's about the impact of a vaccine or deciding whether or not a spray of particles in the Large Hadron Collider has been caused by the decay of a Higgs boson. 'Correlation is not causality' is a mantra of science, because it's so easy to misinterpret a causal link from things that happen close together in space and time. As a result I was delighted with the idea of what the cover describes as a 'nontechnical guide to the basic ideas of modern causal inference'.
Paul Rosenbaum starts with a driving factor - deducing the effects of medical treatments - and goes on to bring in the significance of randomised experiments versus the problems of purely observational studies, digs into covariates and ways to bring in experiment-like features to observational studies, brings up issues of replication and finishes with the impact of uncertainty and complexity. These are mostly exactly the kinds of topics that should be covered in such a guide, and as such it hits the spot. But, unfortunately, while it is indeed an effective introductory guide for scientists who aren't mathematicians, Rosenbaum fails on making this accessible to a nontechnical audience.
Rosenbaum quotes mathematician George Pólya as saying that we need a notation that is 'unambiguous, pregnant, easy to remember…' I would have been happier with this book if Rosenbaum had explained how a mathematical notation could possibly be pregnant. (He doesn't.) But, more importantly, the notation used is simply not easy to remember for a nontechnical audience. Within one page of it starting to be used, I had to keep looking back to see what the different parts meant.
We are told that a causal effect is 'a comparison of outcomes' and in the first example given this is rTw - rCw. Bits of this are relatively clear. T and C are treatment and control. W is George Washington (as the example is about his being treated, then dying soon after). I'm guessing 'r' refers to result, though that term isn't used in the text, but most importantly it's not obvious why the 'causal effect' is those two variables, set to arbitrary values, with one subtracted from the other. I'm pretty familiar with algebra and statistics, but I rapidly found the symbolic representations used hard to follow - there has to be a better way if you are writing for a general audience: it appears the author doesn't know how to do this.
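For what it's worth, one way to read the notation is that each person has two potential outcomes, one under treatment and one under control, and the causal effect for that person is the first minus the second. The sketch below is only an illustration of that reading: the `Person` class, the three people, and every number in it are made up, and it takes r to mean the outcome, which is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    r_T: float     # outcome this person would have if treated
    r_C: float     # outcome this same person would have under control
    treated: bool  # which of the two conditions actually happened

    def causal_effect(self) -> float:
        # The comparison of outcomes: treated outcome minus control outcome.
        return self.r_T - self.r_C

    def observed(self) -> float:
        # In real data only one of the two potential outcomes is ever seen.
        return self.r_T if self.treated else self.r_C


# Hypothetical people with made-up potential outcomes.
people = [
    Person("a", r_T=7.0, r_C=5.0, treated=True),
    Person("b", r_T=4.0, r_C=4.0, treated=False),
    Person("c", r_T=6.0, r_C=9.0, treated=True),
]

effects = {p.name: p.causal_effect() for p in people}
observed = {p.name: p.observed() for p in people}
print(effects)   # per-person effects, knowable only in this toy table
print(observed)  # what a real study would actually record
```

The point of the subtraction is that the effect is defined for one person by comparing two outcomes, only one of which can ever be observed, which is why the individual effect cannot simply be read off the data.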
The irritating thing is that Rosenbaum doesn't then make use of this representation - he's lost half the readership for no reason. The rest of the book is more descriptive, but time after time the examples are described in a way that is going to put people off, bringing in unnecessary jargon and simply writing more like a textbook without detail. Take the opening of the jauntily headed section 'Matching for Covariates as a Method of Adjustment': 'In figure 4 [which is several pages back in a different chapter], we saw more extensive periodontal disease amongst smokers, but we were not convinced that we were witnessing an effect caused by smoking. The figure compared the periodontal disease outcomes of treated individuals and controls who were not comparable. In figures 2-3 we saw that the smokers and nonsmokers were not comparable. The simplest solution is to compare individuals who are comparable, or at least comparable in ways we can see.'
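The idea in that quoted passage, comparing individuals who are comparable in ways we can see, can be sketched in a few lines. This is a simple greedy nearest-neighbour match on a single covariate, chosen for brevity and not claimed to be the matching method the book uses; the function name `greedy_match`, the labels, and the ages are all hypothetical.

```python
def greedy_match(treated, controls):
    """Pair each treated unit with the closest not-yet-used control.

    `treated` and `controls` are lists of (name, covariate) tuples.
    Matching is without replacement, so there must be at least as many
    controls as treated units.
    """
    available = dict(controls)  # control name -> covariate value
    pairs = []
    for t_name, t_x in treated:
        # Pick the remaining control whose covariate is nearest to this unit's.
        c_name = min(available, key=lambda c: abs(available[c] - t_x))
        pairs.append((t_name, c_name))
        del available[c_name]  # each control is used at most once
    return pairs


# Hypothetical smokers and nonsmokers, matched on age alone.
smokers = [("s1", 34), ("s2", 51), ("s3", 62)]
nonsmokers = [("n1", 30), ("n2", 36), ("n3", 49), ("n4", 60), ("n5", 75)]

pairs = greedy_match(smokers, nonsmokers)
print(pairs)
```

Outcomes would then be compared within pairs. Matching can only balance the covariates that were measured, which is the 'comparable in ways we can see' caveat in the quotation.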
This is a classic example of the importance of being aware of who the audience is and what the book is supposed to do. To reach that target nontechnical audience, the book would have to have been far less of a textbook light, rethinking the way the material is put across. The content is fine for a technical audience who aren't mathematicians - so this is still a useful book - but the content certainly isn't well-presented for the general public.
Another solid introduction to applied statistics and the crucial role of understanding the data underlying statistical tests. Rosenbaum skillfully presents real-world cases where causal inferences and statistical testing have been developed or utilized to make tangible impacts on our daily lives. The author's approach shines in its accessibility, breaking down complex concepts into digestible chunks without sacrificing depth. Throughout the book, readers are treated to a diverse array of examples that illustrate how statistical methods have been important from George Washington to health policy.
This book emphasizes the intuitive foundations of statistical tests rather than mathematical rigour. This approach makes it particularly appealing to those who want to grasp the core principles and logic behind statistics without diving into heavy calculations.
While it may not satisfy those seeking a rigorous mathematical treatment of statistics, it excels as an "armchair approach" to the subject. I recommend this book for anyone looking to gain a practical understanding of statistical tests, their intuitive derivations, and real-life applications.
This is a rigorous yet practical introduction to understanding cause‑and‑effect in observational studies. Emphasizing design over complex modeling, Rosenbaum explains methods such as matching, randomization inference, and sensitivity analysis to assess causal claims when experiments are not feasible. I struggled with referring to this as an introduction, as I have two science degrees, and was impressed with its depth. The concepts are explained clearly, though; it will just take some intense concentration and even repeated study to assimilate everything. This will be going in my permanent library as a reference.
Short, enjoyable, and mostly accessible read about the challenges of drawing conclusions about causality, substantiated by practical examples. Rightly, the book stresses the importance of randomization in causal inference. Where it is not practical to achieve complete randomization in observational and interventional studies, the author presents some techniques to mitigate the effect of potential covariates.
An excellent introduction to the topic that doesn't get very mathematical at all. This book is definitely not sufficient to start doing research, but it is a great start.