Statistical Innovation for Complex Data and Inference
Conference
Abstract
Modern statistical analysis increasingly involves complex and dependent data structures for which classical likelihood-based inference or fully specified probabilistic models are impractical, unavailable, or computationally prohibitive. Such challenges arise in a wide range of contemporary applications, including environmental extremes, longitudinal studies, and large-scale industrial systems, where dependence, high dimensionality, and model misspecification are inherent features of the data. Addressing these challenges requires innovative inferential methodologies that move beyond traditional assumptions while remaining statistically principled and computationally feasible. This Special Invited Paper Session of the ISI Young Statisticians Committee brings together recent methodological advances that respond to these demands, highlighting new approaches to inference, testing, and monitoring for complex data.
The first talk addresses inference based on penalized estimating equations in scenarios where specifying the full marginal distribution of the response is challenging, such as in longitudinal or overdispersed data. Under correct specification of the conditional mean, the proposed methodology yields √n-consistent estimators even when the working covariance structure is misspecified. The talk further develops hypothesis testing procedures based on partial penalization, demonstrating how inferential validity and power depend on the choice of variance and correlation structures.
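The robustness claim in the first talk can be illustrated with a minimal sketch; it is not the speakers' method. It assumes a log-linear conditional mean, an independence working correlation with Poisson working variance, and a simple ridge-type penalty as a stand-in for whatever penalty the talk uses. The names `simulate_clusters`, `solve_gee`, and `lam`, and the gamma-frailty data-generating process, are hypothetical choices made for illustration only.

```python
import numpy as np

def simulate_clusters(n, m, beta, rng):
    """Simulate n clusters of m overdispersed, correlated counts.

    Assumption for illustration: a shared gamma frailty with mean 1, so that
    E[y | x] = exp(x @ beta) holds while the variance and within-cluster
    correlation differ from the working Poisson/independence model.
    """
    x = np.stack([np.ones((n, m)), rng.normal(size=(n, m))], axis=-1)  # (n, m, 2)
    mu = np.exp(x @ beta)                                              # (n, m)
    frailty = rng.gamma(shape=2.0, scale=0.5, size=(n, 1))             # mean 1
    return x, rng.poisson(mu * frailty)

def solve_gee(x, y, lam=0.0, n_iter=25):
    """Solve sum_i X_i^T (y_i - mu_i) - n * lam * P @ beta = 0 by Newton steps.

    P penalizes every coefficient except the intercept (ridge-type stand-in).
    Returns the estimate and sandwich (robust) standard errors.
    """
    n, m, p = x.shape
    xf, yf = x.reshape(-1, p), y.reshape(-1)
    pen = np.diag([0.0] + [1.0] * (p - 1))
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(xf @ beta)
        score = xf.T @ (yf - mu) - n * lam * pen @ beta
        hess = xf.T @ (xf * mu[:, None]) + n * lam * pen
        beta = beta + np.linalg.solve(hess, score)
    # Sandwich covariance: A^{-1} B A^{-1}, with B built from per-cluster scores,
    # so it does not rely on the working covariance being correct.
    mu = np.exp(x @ beta)
    a = xf.T @ (xf * mu.reshape(-1)[:, None]) + n * lam * pen
    u = np.einsum("imp,im->ip", x, y - mu)
    a_inv = np.linalg.inv(a)
    cov = a_inv @ (u.T @ u) @ a_inv
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
beta_true = np.array([0.5, -0.3])
x, y = simulate_clusters(n=2000, m=4, beta=beta_true, rng=rng)
beta_hat, robust_se = solve_gee(x, y)
print(beta_hat, robust_se)
```

In this sketch the working covariance is deliberately wrong, yet the estimate should land close to `beta_true` because only the conditional mean enters the estimating equation; the sandwich standard errors are what keep the inference honest about the ignored dependence.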
The second talk presents likelihood-free inference methods for flexible bivariate extremal dependence models using neural Bayes estimators and classifiers. Because these models can be simulated efficiently even when their likelihoods cannot be evaluated, neural networks are trained on simulated data to perform parameter estimation and model selection. The approach provides an amortized inference framework capable of distinguishing between different extremal dependence regimes and delivering fast, reliable inference in applications involving environmental and geophysical extremes.
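The simulate-then-train idea behind the second talk can be sketched in a toy setting; this is an assumption-laden illustration, not the speakers' architecture or models. It assumes the bivariate logistic extreme-value model with unit Fréchet margins (whose likelihood is in fact tractable, so it serves only as a stand-in), a uniform prior on the dependence parameter, two hand-picked summary statistics in place of a learned summary network, and a small NumPy multilayer perceptron trained under squared-error loss. All function names are hypothetical.

```python
import numpy as np

def simulate_logistic(alpha, n, rng):
    """Draw n pairs from the bivariate logistic model with unit Frechet margins."""
    u = rng.uniform(0.0, np.pi, size=n)
    w = rng.exponential(size=n)
    # Kanter's representation of a positive stable variable with index alpha.
    s = (np.sin(alpha * u) / np.sin(u) ** (1.0 / alpha)
         * (np.sin((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    e = rng.exponential(size=(n, 2))
    return (s[:, None] / e) ** alpha

def summaries(x):
    """Hand-picked summaries on the log scale (a stand-in for a learned network)."""
    z = np.log(x)
    return np.array([np.corrcoef(z[:, 0], z[:, 1])[0, 1],
                     np.mean(np.abs(z[:, 0] - z[:, 1]))])

def make_training_set(n_sets, n_obs, rng):
    """Sample parameters from a Uniform(0.2, 0.95) prior and simulate data sets."""
    alphas = rng.uniform(0.2, 0.95, size=n_sets)
    feats = np.array([summaries(simulate_logistic(a, n_obs, rng)) for a in alphas])
    return feats, alphas

def train_mlp(feats, targets, rng, hidden=16, lr=0.02, steps=5000):
    """Fit a one-hidden-layer tanh network by full-batch gradient descent.

    Minimizing squared error over simulated (parameter, data) pairs targets
    the posterior mean, i.e. the Bayes estimator under squared-error loss.
    """
    w1 = rng.normal(scale=0.5, size=(feats.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(steps):
        h = np.tanh(feats @ w1 + b1)
        g_out = (h @ w2 + b2 - targets) / len(targets)  # d(0.5 * MSE) / d(prediction)
        g_h = np.outer(g_out, w2) * (1.0 - h ** 2)      # backpropagate through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum()
        w1 -= lr * (feats.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return lambda f: np.tanh(f @ w1 + b1) @ w2 + b2

rng = np.random.default_rng(1)
train_f, train_a = make_training_set(2000, 200, rng)
estimator = train_mlp(train_f, train_a, rng)

# Amortization: once trained, the estimator is applied to new data sets at
# negligible cost, with no further simulation or optimization.
test_f, test_a = make_training_set(500, 200, rng)
mse = np.mean((estimator(test_f) - test_a) ** 2)
print(mse)
```

The cost of training is paid once up front; estimation on each new data set afterwards is a single forward pass, which is what makes the approach attractive when likelihood evaluation is infeasible. A classifier for model selection could be trained the same way by simulating from competing models and predicting the model label instead of the parameter.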
The third talk revisits statistical process control (SPC) from a modern inferential perspective in the context of Big Data and Industry 4.0. Rather than being supplanted by real-time analytics or artificial intelligence tools, SPC is reframed as a flexible framework for inference, monitoring, and decision-making in high-dimensional and dependent industrial data. Through a large-scale retail case study, the talk illustrates how contemporary SPC methods can be embedded within complex data environments to support operational decision-making.
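For readers unfamiliar with SPC, the basic monitoring logic that the third talk builds on can be sketched with a classical Shewhart-type chart. This is a textbook univariate baseline, not the high-dimensional methodology of the talk. The function names and the data values are hypothetical, and the sample standard deviation is used for the limits for simplicity (a moving-range estimate is the more common choice for individual observations).

```python
import statistics

def shewhart_limits(baseline, k=3.0):
    """Estimate control limits (lower, center, upper) from in-control data."""
    center = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    return center - k * sigma, center, center + k * sigma

def out_of_control(values, limits):
    """Return the indices of observations outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical Phase I (baseline) and Phase II (monitoring) observations.
baseline = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0, 10.1, 9.9]
new_obs = [10.0, 10.2, 11.5, 9.9, 8.4]

limits = shewhart_limits(baseline)
signals = out_of_control(new_obs, limits)
print(signals)  # -> [2, 4]
```

Each signal is in effect a repeated hypothesis test of "the process is still in control", which is the inferential reading of SPC that the talk extends to dependent and high-dimensional data.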