2026 IAOS Conference

Responsible AI in Official Statistics with Cantabular

Format: CPS Abstract - IAOS 2026

Keywords: #officialstatistics, dissemination

Session: AI & ML in official statistics (3)

Thursday 14 May, 9:00 a.m. - 10:30 a.m. (Europe/Vilnius)

Abstract

The rapid emergence of large language models (LLMs) has transformed how users access and interact with data, including official statistics. These models lower barriers to entry by enabling natural language interaction, but they also introduce significant risks: reliance on unverified sources, loss of statistical context, and well-documented limitations in numerical reasoning and reproducibility. In official statistics, where accuracy, transparency, and trust are paramount, these challenges are particularly acute.

Our paper presents a practical approach to integrating LLMs into statistical production systems while preserving the core principles of official statistics. We describe a project undertaken by Cantabular Ltd in which an LLM is used solely as an interface layer, translating human language prompts into formal statistical queries against sensitive microdata. Crucially, the LLM does not generate results, perform computations, or interpret outputs. All calculations, disclosure controls, and tabulations are executed by established statistical systems, and results are presented using traditional, auditable methods. We discuss the design choices required to constrain LLM behaviour, ensure reproducibility, and maintain clear separation between user intent, statistical logic, and computation. The approach demonstrates how AI can improve accessibility and user engagement without compromising statistical integrity or transparency.
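
The separation described above can be illustrated with a minimal sketch. It is not the project's implementation and does not use any Cantabular API: the names `TableQuery`, `parse_llm_output`, `tabulate`, `ALLOWED_VARIABLES`, and `MAX_VARIABLES`, and the JSON query shape, are all hypothetical. The sketch assumes the LLM is asked to emit a small JSON object naming a dataset and variables, that anything else is rejected, and that counting is done by deterministic code. The `tabulate` function is only a stand-in for an established statistical system and applies no disclosure control.

```python
import json
from collections import Counter
from dataclasses import dataclass

# Hypothetical allow-list of variables the statistical system exposes.
ALLOWED_VARIABLES = {"region", "age_band", "sex"}
# Hypothetical cap on the number of table dimensions.
MAX_VARIABLES = 2


@dataclass(frozen=True)
class TableQuery:
    """Formal query: the only artefact the LLM is allowed to produce."""
    dataset: str
    variables: tuple


class QueryRejected(ValueError):
    """Raised when LLM output is not a well-formed, permitted query."""


def parse_llm_output(raw: str, known_datasets: set) -> TableQuery:
    """Validate raw LLM text as a formal query; reject everything else."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise QueryRejected("output is not valid JSON") from exc

    # Only the two expected fields are accepted, so the model cannot
    # smuggle in numbers, commentary, or extra instructions.
    if not isinstance(payload, dict) or set(payload) != {"dataset", "variables"}:
        raise QueryRejected("unexpected structure or fields")

    dataset, variables = payload["dataset"], payload["variables"]
    if not isinstance(dataset, str) or dataset not in known_datasets:
        raise QueryRejected("unknown dataset")
    if not isinstance(variables, list) or not all(isinstance(v, str) for v in variables):
        raise QueryRejected("variables must be a list of strings")
    if not 1 <= len(variables) <= MAX_VARIABLES:
        raise QueryRejected("unsupported number of variables")
    if len(set(variables)) != len(variables):
        raise QueryRejected("duplicate variables")
    if not set(variables) <= ALLOWED_VARIABLES:
        raise QueryRejected("variable not permitted")

    return TableQuery(dataset=dataset, variables=tuple(variables))


def tabulate(query: TableQuery, records: list) -> dict:
    """Stand-in for the statistical engine: a deterministic cross-tabulation.

    A real system would also apply disclosure control here; this sketch does not.
    """
    counts = Counter(tuple(row[v] for v in query.variables) for row in records)
    return dict(counts)
```

In this arrangement the validated `TableQuery` can be logged alongside the user's prompt, so the same query can be re-run and audited independently of the model that produced it.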

Our paper argues that producers of official statistics should actively shape how AI is used in their domain. By embedding LLMs within well-governed statistical pipelines rather than allowing them to operate as autonomous analytical agents, statistical organisations can harness their strengths while regaining control of the narrative around data, trust, and evidence-based decision-making.