Epidemiology is a scientific discipline whose essence is causal inference. Without a solid understanding of causality and causal inference methods, epidemiology could not fulfill its role in disease prevention and control, nor in evaluating the efficacy, effectiveness, and safety of preventive and therapeutic interventions. In some of its branches, such as clinical epidemiology or pharmacoepidemiology, estimating causal effects is at the core of research.
Causal Inference: What If, by Miguel Hernán and James Robins, is a “living” book that has been continuously updated since its early versions. It has become a key reference for understanding the methods used to estimate causal effects from observational (and experimental) data and, by extension, for the methodology of epidemiological research and the broader field of “data science”. The book is freely accessible at https://miguelhernan.org/whatifbook, where additional resources such as datasets and code for various statistical software (R, Stata, SAS, Python, and Julia) can be downloaded.
Causal Inference: What If is structured into three progressively complex sections. The first part, Causal Inference Without Models, introduces fundamental concepts of causal inference. It includes chapters on the definition of causal effects, randomized trials and observational studies, counterfactuals, target trial emulation, stratification, interaction, confounding, selection bias, and causal diagrams. A key takeaway from this section is the importance of explicitly defining the causal question before conducting data analysis. This part also introduces the mathematical notation that formalizes causal concepts and serves as the foundation for subsequent chapters.
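As a brief illustration of the kind of notation this first part introduces, the average causal effect of a dichotomous treatment A on an outcome Y is commonly written as a contrast of counterfactual means, and exchangeability as independence between the counterfactual outcome and the treatment actually received. The rendering below is a sketch in standard counterfactual notation rather than a quotation from the book, whose typography may differ.

```latex
% Average causal effect: contrast of counterfactual outcome means
\[
  \mathrm{E}\left[Y^{a=1}\right] - \mathrm{E}\left[Y^{a=0}\right]
\]
% Exchangeability: counterfactual outcome independent of received treatment
\[
  Y^{a} \perp\!\!\!\perp A \quad \text{for all } a
\]
% Conditional exchangeability given measured covariates L
\[
  Y^{a} \perp\!\!\!\perp A \mid L \quad \text{for all } a
\]
```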
The second part, Causal Inference With Models, explores the use of parametric models for estimating causal effects and specific techniques such as inverse probability weighting, marginal structural models, the parametric g-formula, propensity scores, instrumental variables, and survival analysis. This section highlights the balance between bias and variance and how different approaches can help reduce uncertainty in causal inference. In the third part, the authors focus on causal inference for time-varying treatments. Methods such as target trial emulation and structural nested models are examined. This section is particularly useful for researchers working with longitudinal data or analyzing long-term interventions (clinical, policy-related, etc.).
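To make one of the techniques mentioned above more concrete, the following is a minimal sketch of inverse probability weighting on simulated data. It is not taken from the book or its companion code: the data-generating process, the function names (`simulate`, `naive_difference`, `ipw_difference`), and the true effect of 1.0 are all assumptions chosen for illustration, with a single binary confounder L and a nonparametric estimate of the treatment probability.

```python
import random


def simulate(n, seed=0):
    """Simulate (L, A, Y) rows; the true causal effect of A on Y is 1.0 (assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0          # binary confounder
        p_treat = 0.8 if l == 1 else 0.2            # L affects treatment
        a = 1 if rng.random() < p_treat else 0
        y = 1.0 * a + 2.0 * l + rng.gauss(0.0, 1.0)  # L also affects outcome
        rows.append((l, a, y))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, a, y in rows if a == 1]
    untreated = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def ipw_difference(rows):
    """IP-weighted difference in means, normalizing the weights within each arm."""
    # Estimate Pr[A=1 | L=l] as the observed treated proportion in each stratum.
    prob_treated = {}
    for level in (0, 1):
        stratum = [a for l, a, _ in rows if l == level]
        prob_treated[level] = sum(stratum) / len(stratum)

    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for l, a, y in rows:
        # Weight = 1 / probability of the treatment actually received given L.
        p_received = prob_treated[l] if a == 1 else 1.0 - prob_treated[l]
        w = 1.0 / p_received
        weighted_sum[a] += w * y
        weight_total[a] += w
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]


rows = simulate(20000)
naive = naive_difference(rows)
ipw = ipw_difference(rows)
print(f"naive difference: {naive:.2f}")
print(f"IP-weighted difference: {ipw:.2f}")
```

In this simulated setting the unadjusted contrast is confounded by L, whereas the weighted contrast should land close to the assumed true effect. In real analyses the treatment probability is usually estimated with a model, and the result depends on the identifiability conditions the book discusses.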
Causal Inference: What If is a well-structured and didactic book and, for a text in this field, a notably accessible one. Its use of causal diagrams and its explanations of bias structures (selection bias, information bias, and confounding) help readers understand how different factors can influence causal effect estimation.
This book is an essential resource for scientists across multiple disciplines, but its relevance in epidemiology, clinical epidemiology, and pharmacoepidemiology is particularly noteworthy. It provides a methodological framework for evaluating the effectiveness of interventions (public health, health promotion, policy, education, clinical, pharmacological) and understanding causality in observational studies. In summary, it enables evidence-based decision-making, which ultimately relies on demonstrating a causal relationship between a specific course of action and an outcome.
Authorship contributions: Both authors have contributed equally to the review.
Funding: None.
Conflicts of interest: None.