Regional Statistics Conference 2026

Causal Inference and External Validity in Observational Studies: Methodological Challenges for Modern Health Research

Format: IPS Abstract - Malta 2026

Keywords: biostatistics

Session: IPS 1280 - Innovative survey methods for the public good

Friday 5 June 8:30 a.m. - 10:10 a.m. (Europe/Malta)

Abstract

Observational studies have become increasingly central to health research, particularly in settings where randomized controlled trials are not feasible. However, their external validity, the capacity to extend findings to broader populations, is frequently undermined by confounding, selection bias, and the complexity of real-world data. In this landscape, synthetic data offer a methodological sandbox: a controlled space to explore causal scenarios, refine hypotheses, and simplify high-dimensional relationships in observational contexts where causal pathways are often obscured.
This work contributes to the development of methodological approaches that enhance the reliability of causal inference in observational research. Central to this effort is the Bayesian Network Propensity Score (BNPS), which uses Bayesian Networks to model complex covariate dependencies without imposing rigid parametric constraints. BNPS provides a flexible yet statistically robust framework for estimating treatment effects, as evidenced by simulations and practical applications that highlight its ability to reduce bias and improve the precision of causal estimates.
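To make the idea of a network-based propensity score concrete, the following is a minimal sketch and not the authors' BNPS implementation. It assumes a small, hand-specified discrete network (X1 -> X2, both pointing to treatment T and outcome Y), binary covariates, and a known structure, none of which come from the abstract. Under those assumptions the propensity score is simply the conditional probability table of T given its parents, which is then used for inverse probability weighting. All function names and numeric settings are hypothetical.

```python
import random
from collections import defaultdict


def simulate(n, seed=0, effect=2.0):
    """Draw synthetic records from an assumed network:
    X1 -> X2, {X1, X2} -> T, {X1, X2, T} -> Y (illustrative values only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = int(rng.random() < 0.5)
        x2 = int(rng.random() < (0.7 if x1 else 0.3))
        p_t = 0.2 + 0.3 * x1 + 0.3 * x2  # true propensity P(T=1 | X1, X2)
        t = int(rng.random() < p_t)
        y = effect * t + 1.5 * x1 + 1.0 * x2 + rng.gauss(0, 1)
        rows.append((x1, x2, t, y))
    return rows


def fit_treatment_cpt(rows, alpha=1.0):
    """Estimate the table P(T=1 | X1, X2) from counts, with Laplace smoothing."""
    counts = defaultdict(lambda: [0, 0])  # parent configuration -> [n_T0, n_T1]
    for x1, x2, t, _ in rows:
        counts[(x1, x2)][t] += 1
    return {k: (c[1] + alpha) / (c[0] + c[1] + 2 * alpha) for k, c in counts.items()}


def ipw_ate(rows, cpt):
    """Normalized inverse-probability-weighted estimate of the average treatment effect."""
    num1 = den1 = num0 = den0 = 0.0
    for x1, x2, t, y in rows:
        e = cpt[(x1, x2)]  # estimated propensity score for this record
        if t:
            num1 += y / e
            den1 += 1 / e
        else:
            num0 += y / (1 - e)
            den0 += 1 / (1 - e)
    return num1 / den1 - num0 / den0


def naive_difference(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _, _, t, y in rows if t]
    control = [y for _, _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    data = simulate(20000, seed=1)
    table = fit_treatment_cpt(data)
    print("naive:", round(naive_difference(data), 3))
    print("IPW with network-based propensity:", round(ipw_ate(data, table), 3))
```

In this toy setting the unadjusted difference is pulled away from the simulated effect by the two confounders, while the weighted estimate should land close to it. A realistic application would also need to learn the network structure and handle continuous or high-dimensional covariates, which this sketch does not attempt.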
While synthetic data do not replace real-world evidence, they serve as a strategic tool for:
• Assessing the robustness of causal models under different conditions,
• Identifying key covariates in high-dimensional datasets,
• Simulating rare or complex scenarios where observational data may be limited.
Their use, however, requires critical consideration of potential pitfalls, such as the risk of oversimplifying dependencies or overlooking rare but critical events. The primary role of synthetic data is to support and refine, not replace, rigorous causal inference.
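The first use listed above, assessing robustness under different conditions, can be illustrated with a small simulation. This is a hedged sketch under assumptions that are not in the abstract: a single binary confounder X, a hypothetical `confounding` parameter that controls how strongly X drives treatment, and a simple stratified estimator standing in for any adjustment method. It sweeps the confounding strength and records the average bias of an unadjusted and an adjusted estimate against the known simulated effect.

```python
import random


def simulate(n, confounding, rng, effect=1.0):
    """Synthetic data with one binary confounder X.
    confounding in [0, 1): 0 means treatment is independent of X."""
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        p_t = 0.5 + confounding * (x - 0.5)
        t = int(rng.random() < p_t)
        y = effect * t + 2.0 * x + rng.gauss(0, 1)
        rows.append((x, t, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive(rows):
    """Unadjusted difference in means (assumes both arms are non-empty)."""
    return mean([y for _, t, y in rows if t]) - mean([y for _, t, y in rows if not t])


def stratified(rows):
    """Difference in means within each level of X, weighted by stratum size."""
    total = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        total += len(stratum) / len(rows) * naive(stratum)
    return total


def bias_sweep(strengths, n=4000, reps=10, seed=0, effect=1.0):
    """Average bias of each estimator for every confounding strength."""
    rng = random.Random(seed)
    results = {}
    for s in strengths:
        naive_bias, adjusted_bias = [], []
        for _ in range(reps):
            rows = simulate(n, s, rng, effect)
            naive_bias.append(naive(rows) - effect)
            adjusted_bias.append(stratified(rows) - effect)
        results[s] = (mean(naive_bias), mean(adjusted_bias))
    return results


if __name__ == "__main__":
    for strength, (nb, ab) in bias_sweep([0.0, 0.4, 0.8]).items():
        print(f"confounding={strength}: naive bias={nb:.3f}, adjusted bias={ab:.3f}")
```

Because the true effect is fixed by the simulation, the bias of each estimator can be read off directly, which is not possible with real observational data. The same caveat raised above applies: a simulation this simple can hide dependencies and rare events that matter in practice.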
By combining BNPS with transparent validation, clear assumptions, and reproducibility, this work seeks to strengthen the external validity of observational studies. The objective is to produce findings that are not only statistically valid but also generalizable and applicable in real-world health research, where causal ambiguity remains a defining challenge.