Regional Statistics Conference 2026

Advances in Robust Statistical Inference for High-Dimensional Data

Organiser

Angela Andreella

Participants

  • Chiara Magnani (Presenter/Speaker)
    Collective Outlier Detection and Enumeration with Conformalized Closed Testing

  • Anna Vesely (Presenter/Speaker)
    Post-Selection Inference for Multiverse Analysis in Mixed-Effects Model

  • Samuel Davenport (Presenter/Speaker)
    Localized Cluster Enhancement: TFCE Revisited with Valid Error Control

  • Angela Andreella (Discussant)

Proposal Description

    Robust statistical inference aims to develop methods that retain validity under deviations from idealized model assumptions, such as distributional misspecification, outliers, or data-driven procedures. In high-dimensional data, such deviations are the norm, and classical inferential procedures often lose their reliability. This invited session brings together recent methodological and theoretical advances in robust inference designed to address these challenges and to ensure the validity of inferential conclusions.
    The session gathers three complementary contributions that together provide a coherent overview of current developments in robust and selective inference.

    Chiara Magnani will discuss a flexible distribution-free method for collective outlier detection and enumeration, designed for situations in which the presence of outliers can be detected powerfully even though their precise identification may be challenging due to the sparsity, weakness, or elusiveness of their signals. The method builds upon recent developments in conformal inference and integrates classical ideas from other areas, including multiple testing, rank tests, and non-parametric large-sample asymptotics. The key innovation is a principled and effective approach for automatically choosing the most appropriate machine learning classifier and two-sample testing procedure for a given data set. The performance of the proposed method is investigated through extensive empirical demonstrations, including an analysis of the LHCO high-energy particle collision data set.

    Samuel Davenport will present a new development of the Threshold-Free Cluster Enhancement (TFCE) framework, a nonparametric method widely used for signal detection in spatially structured data, such as brain activity maps. In this work, he demonstrates that the high sensitivity of TFCE comes at the expense of spatial specificity, resulting in inflated individual- and cluster-level error rates. To mitigate these issues, he introduces the Localized Cluster Enhancement approach, which enables inference within a priori or data-driven clusters while maintaining rigorous control over false positive rates.

    Anna Vesely will present a generalization of the Post-Selection Inference approach to Multiverse Analysis (PIMA), the first inferential framework for multiverse analysis valid for any generalized linear model. The framework is extended to generalized mixed models, enabling valid selective inference for hierarchical and correlated data. It allows testing predictors across the full multiverse of models while controlling the family-wise error rate, and it supports flexible specifications, including random effects and clustered data. This work broadens PIMA’s applicability to complex data structures common in applied research.

    By integrating perspectives ranging from conformal inference to multiverse analysis, this session invites reflection on a central challenge in modern statistics: how to design procedures that remain valid under model misspecification, data contamination, and the broader spectrum of selective analysis. The discussant will highlight the unifying principles underlying the three contributions, clarify their inferential guarantees, and situate them within the emerging domain of robust high-dimensional inferential methods.