Regional Statistics Conference 2026

Beyond Traditional Field Controls: Leveraging Artificial Intelligence to Monitor and Enhance Survey Data Quality in Large-Scale Field Operations

Conference: Regional Statistics Conference 2026

Format: CPS Abstract - Malta 2026

Keywords: artificial-intelligence, data-quality, machine learning

Session: CPS 08 Quality

Thursday 4 June 11 a.m. - noon (Europe/Malta)

Abstract

Ensuring high-quality survey data remains a core challenge in official statistics and large-scale social research, particularly in settings characterized by complex questionnaires, decentralized field operations, and constrained supervisory capacity. Traditional quality-control mechanisms, such as back-checks, logical validation rules, and post-collection consistency checks, are essential but increasingly insufficient to detect adaptive fabrication, enumerator effects, and context-dependent anomalies in real time.
This paper presents an artificial intelligence-enabled framework for monitoring survey data quality that extends classical survey-statistical practices through the integration of machine learning and Retrieval-Augmented Generation (RAG) models. The proposed approach combines anomaly-detection algorithms with RAG architectures that dynamically retrieve enumerator profiles, fieldwork protocols, and related survey documentation, and ground AI outputs in them. By anchoring model outputs to authoritative survey documentation and empirical reference data, the framework enhances interpretability, transparency, and statistical coherence while mitigating risks associated with ungrounded AI inference.
Using evidence from large-scale household and perception surveys conducted under diverse field conditions, the paper demonstrates how RAG-enhanced quality indicators improve the early detection of inconsistent response patterns, implausible variable combinations, abnormal interview durations, and latent enumerator-specific behaviors that often evade rule-based systems. Importantly, the framework supports real-time monitoring, enabling targeted supervisory interventions and adaptive field management during data collection rather than relying solely on post-hoc data cleaning.

The paper concludes by discussing methodological implications for survey statisticians, ethical considerations related to accountability and transparency, and the potential role of RAG-based quality systems within national statistical offices and international survey programs.
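
As a purely illustrative aside, one of the indicators named in the abstract (abnormal interview durations, attributed to enumerators) can be sketched with a simple robust rule. This is not the paper's method: the median/MAD z-score, the 3.5 threshold, the function names, and the sample data are all assumptions chosen for the example, and the RAG component is not shown.

```python
from statistics import median


def robust_z_scores(values):
    """Return modified z-scores based on the median and the MAD.

    Assumption: a median/MAD rule stands in for whatever anomaly
    detector the paper actually uses.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values (or at least half) are identical: nothing to flag.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_abnormal_durations(interviews, threshold=3.5):
    """Group interviews with extreme durations by enumerator.

    `interviews` is a list of (enumerator_id, duration_minutes) pairs.
    The threshold of 3.5 is a conventional choice, assumed here.
    """
    durations = [minutes for _, minutes in interviews]
    scores = robust_z_scores(durations)
    flagged = {}
    for (enumerator_id, minutes), z in zip(interviews, scores):
        if abs(z) > threshold:
            flagged.setdefault(enumerator_id, []).append(minutes)
    return flagged


# Hypothetical fieldwork data: enumerator E03 has two very short interviews.
interviews = [
    ("E01", 42), ("E01", 45), ("E02", 40), ("E02", 47),
    ("E03", 44), ("E03", 6), ("E03", 5), ("E01", 43),
]

print(flag_abnormal_durations(interviews))
```

In a real-time setting, a supervisor-facing system might recompute such flags as interviews arrive and pair each flag with the relevant fieldwork protocol text; how the framework described here does this is not specified in the abstract.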