Introduction
In digital health, understanding causal relationships is crucial for interpreting data from electronic health records (EHRs), wearable devices, and telemedicine platforms. This literature review examines causal inference and causal analysis in biomedical studies, focusing on their application to digital health. As a student of digital health, I am particularly interested in how these methods help derive actionable insights from observational data, which are abundant in digital systems but prone to bias. The review outlines key concepts, methodologies, and limitations, drawing on peer-reviewed sources, and touches on developments at the forefront of the field, such as the integration of machine learning.
Fundamentals of Causal Inference
Causal inference is the process of determining whether a relationship between variables is causal rather than merely associative, which is essential in biomedical research to avoid spurious conclusions (Pearl, 2009). At its core, it relies on counterfactual reasoning: asking what would have happened under different conditions. In randomised controlled trials (RCTs), for instance, randomisation isolates causal effects by balancing both measured and unmeasured confounders across treatment groups on average. In digital health, however, RCTs are often infeasible for ethical or practical reasons, so observational data dominate.
A foundational text by Hernán and Robins (2020) emphasises the potential outcomes framework, formalised by Rubin (1974), which defines a causal effect as a contrast between the outcomes a unit would have under different treatments, only one of which can ever be observed. This approach is particularly useful in biomedical studies for estimating treatment effects. For example, propensity score matching can adjust for measured confounding in observational data, allowing researchers to mimic some of the characteristics of an RCT (Austin, 2011). Despite these strengths, limitations exist; as Hernán (2018) notes, describing a finding as an “association” when the research question is causal obscures the aim of the study rather than strengthening its inference, which can lead to misinterpretation in policy-making.
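To make the idea of adjusting for confounding concrete, the following is a minimal Python sketch on simulated data; it is not taken from any of the cited sources. It assumes a single binary confounder (a hypothetical "frailty" flag) that raises both the chance of treatment and the outcome, and a true treatment effect of 2.0. All names, probabilities, and effect sizes are illustrative assumptions. Inverse probability weighting is used here instead of matching because it needs only the standard library; both approaches rely on the propensity score.

```python
import random


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate (confounder, treatment, outcome) rows. All values are assumed."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        frail = 1 if rng.random() < 0.5 else 0      # hypothetical binary confounder
        p_treat = 0.8 if frail else 0.2             # confounder drives treatment
        treated = 1 if rng.random() < p_treat else 0
        # Outcome depends on treatment AND on the confounder.
        outcome = true_effect * treated + 3.0 * frail + rng.gauss(0, 1)
        rows.append((frail, treated, outcome))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_estimate(rows):
    """Inverse probability weighted estimate of the average treatment effect."""
    # Estimate the propensity score P(treated | confounder) within each stratum.
    propensity = {}
    for level in (0, 1):
        stratum = [a for z, a, _ in rows if z == level]
        propensity[level] = sum(stratum) / len(stratum)
    n = len(rows)
    mean_if_treated = sum(a * y / propensity[z] for z, a, y in rows) / n
    mean_if_control = sum((1 - a) * y / (1 - propensity[z]) for z, a, y in rows) / n
    return mean_if_treated - mean_if_control


if __name__ == "__main__":
    data = simulate()
    print("naive:", round(naive_difference(data), 2))
    print("IPW:  ", round(ipw_estimate(data), 2))
```

Under these assumptions the naive contrast should land well above the true effect, while the weighted estimate should sit close to 2.0. The adjustment only works because the confounder is measured, which is rarely guaranteed in real EHR or wearable data.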
Applications in Biomedical Studies
In biomedical contexts, causal analysis has been applied in areas such as epidemiology and pharmacology. VanderWeele (2015) explores mediation and interaction effects, which are vital for dissecting pathways in disease causation. For instance, when studying the impact of lifestyle interventions on chronic disease, causal diagrams, typically drawn as directed acyclic graphs (DAGs), help distinguish confounders, which should be adjusted for, from mediators, which lie on the causal pathway (Pearl, 2009). These tools are increasingly relevant in digital health, where apps and sensors generate large, rich datasets.
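The role of a DAG in separating confounders from mediators can be sketched in a few lines of Python. The graph below is a hypothetical example (age, physical activity, weight, blood pressure) invented for illustration, and the classification rule is a deliberate simplification: it flags common causes and variables on a directed path from treatment to outcome, and it does not implement the full back-door criterion described by Pearl (2009).

```python
def descendants(dag, node):
    """Return every node reachable from `node` by following directed edges."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in dag.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def classify(dag, treatment, outcome):
    """Simplified classification of variables as confounders or mediators."""
    nodes = set(dag) | {c for children in dag.values() for c in children}

    # Mediator: caused by the treatment and itself a cause of the outcome.
    mediators = {
        v for v in descendants(dag, treatment)
        if outcome in descendants(dag, v)
    }

    # Confounder (simplified): a cause of the treatment that also reaches
    # the outcome by a directed path that does not pass through the treatment.
    dag_without_treatment = {
        parent: [c for c in children if c != treatment]
        for parent, children in dag.items()
        if parent != treatment
    }
    confounders = {
        v for v in nodes - {treatment, outcome}
        if treatment in descendants(dag, v)
        and outcome in descendants(dag_without_treatment, v)
    }
    return confounders, mediators


# Hypothetical DAG: keys are causes, values are their direct effects.
example_dag = {
    "age": ["activity", "blood_pressure"],
    "activity": ["weight", "blood_pressure"],
    "weight": ["blood_pressure"],
}

if __name__ == "__main__":
    print(classify(example_dag, "activity", "blood_pressure"))
```

In this assumed graph, age would be adjusted for when estimating the total effect of activity on blood pressure, whereas adjusting for weight would block part of the very effect being estimated.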
A key application is in pharmacoepidemiology, where instrumental variable analysis addresses unmeasured confounding in drug safety studies (Hernán and Robins, 2020). However, challenges arise. The stable unit treatment value assumption (SUTVA) set out by Imbens and Rubin (2015), which requires that one individual's treatment does not affect another individual's outcome, may not hold in interconnected digital health networks, for example where behaviours are influenced through social media. Furthermore, recent advances incorporate machine learning into causal estimation, as seen in Prosperi et al. (2020), who discuss counterfactual prediction for actionable healthcare, with the aim of enabling personalised interventions from EHR data.
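The logic of instrumental variable analysis can be illustrated with a small simulation, again a sketch under stated assumptions rather than a method drawn from the cited sources. The instrument here is a hypothetical binary variable (for instance, a prescriber's preference) that shifts drug exposure but affects the outcome only through that exposure. The unmeasured confounder, the effect sizes, and the linear model are all assumed. The estimator is the simple ratio (Wald) form, covariance of instrument and outcome divided by covariance of instrument and exposure.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate_iv(n=50000, true_effect=2.0, seed=1):
    """Simulate instrument, exposure, and outcome with a hidden confounder."""
    rng = random.Random(seed)
    instrument, exposure, outcome = [], [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)                    # unmeasured confounder
        z = 1.0 if rng.random() < 0.5 else 0.0      # hypothetical instrument
        a = 1.0 * z + hidden + rng.gauss(0, 1)      # exposure depends on both
        y = true_effect * a + 3.0 * hidden + rng.gauss(0, 1)
        instrument.append(z)
        exposure.append(a)
        outcome.append(y)
    return instrument, exposure, outcome


def naive_slope(exposure, outcome):
    """Regression slope of outcome on exposure, ignoring the hidden confounder."""
    return covariance(exposure, outcome) / covariance(exposure, exposure)


def iv_estimate(instrument, exposure, outcome):
    """Ratio (Wald) instrumental variable estimate of the exposure effect."""
    return covariance(instrument, outcome) / covariance(instrument, exposure)


if __name__ == "__main__":
    z, a, y = simulate_iv()
    print("naive slope:", round(naive_slope(a, y), 2))
    print("IV estimate:", round(iv_estimate(z, a, y), 2))
```

In this set-up the naive slope should be biased upwards by the hidden confounder, while the instrumental variable estimate should fall near the assumed effect of 2.0. The result depends entirely on the instrument being valid, which cannot be fully verified from data.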
Relevance to Digital Health
From a digital health perspective, causal inference addresses the complexity of real-world data. For example, analysing wearable device data for causal links between physical activity and health outcomes requires methods that can handle selection bias, since people who use such devices may differ systematically from those who do not (Hernán and Robins, 2020). The integration of causal machine learning, as discussed by Prosperi et al. (2020), is at the forefront of the field and promises scalable analysis in telemedicine. However, limitations include data privacy concerns and algorithmic biases, which can undermine causal validity (Hernán, 2018). Overall, these methods enhance evidence-based decision-making, though they demand interdisciplinary skills to apply effectively.
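Selection bias in wearable data can be demonstrated with a short simulation. In this sketch, which rests entirely on assumed values, physical activity and health are generated independently, so there is no causal link between them. People are then more likely to wear a device if they are active or healthy. The function and variable names and the selection rule are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_wearable_cohort(n=50000, seed=2):
    """Return (activity, health) pairs for everyone and for device wearers only."""
    rng = random.Random(seed)
    population, wearers = [], []
    for _ in range(n):
        activity = rng.gauss(0, 1)
        health = rng.gauss(0, 1)        # independent of activity by construction
        # Assumed selection rule: active or healthy people tend to wear devices.
        wears_device = activity + health + rng.gauss(0, 1) > 1.0
        population.append((activity, health))
        if wears_device:
            wearers.append((activity, health))
    return population, wearers


def correlation_of(pairs):
    """Correlation between the two coordinates of a list of pairs."""
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


if __name__ == "__main__":
    everyone, device_users = simulate_wearable_cohort()
    print("whole population:", round(correlation_of(everyone), 3))
    print("wearers only:    ", round(correlation_of(device_users), 3))
```

Across the whole simulated population the correlation should be close to zero, whereas among wearers alone a negative correlation appears purely because of how they were selected. An analysis restricted to device users could therefore report a relationship that does not exist in the wider population.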
Conclusion
This review has outlined the fundamentals, applications, and digital health relevance of causal inference in biomedical studies, supported by key sources such as Pearl (2009) and Hernán and Robins (2020). It has weighed the strengths of these methods in addressing complex problems against their limitations, notably residual confounding and assumptions that may not hold in connected digital settings. Implications for digital health include improved predictive models, but further research is needed to refine techniques for emerging data sources. Ultimately, these methods can support more reliable insights and so aid the development of effective health interventions.
References
- Austin, P.C. (2011) An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3), pp.399-424.
- Hernán, M.A. (2018) The C-word: Scientific euphemisms do not improve causal inference from observational data. American Journal of Public Health, 108(5), pp.616-619.
- Hernán, M.A. and Robins, J.M. (2020) Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.
- Imbens, G.W. and Rubin, D.B. (2015) Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge: Cambridge University Press.
- Pearl, J. (2009) Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge: Cambridge University Press.
- Prosperi, M., Guo, Y., Sperrin, M., Koopman, J.S., Min, J.S., He, X., Rich, S., Wang, M., Buchan, I.E. and Bian, J. (2020) Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nature Machine Intelligence, 2(7), pp.369-375.
- Rubin, D.B. (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), pp.688-701.
- VanderWeele, T.J. (2015) Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford: Oxford University Press.

