2026 IAOS Conference

Exploring machine learning (ML) algorithms to predict high response burden and break-off in the Labor Force Survey (LFS)

Format: CPS Abstract - IAOS 2026

Keywords: data quality, labor force survey, machine learning, respondent burden

Session: AI & ML in official statistics (2)

Wednesday 13 May 4:30 p.m. - 6 p.m. (Europe/Vilnius)

Abstract

Recent advances in artificial intelligence offer new opportunities for survey design and data collection. Building on prior exploratory work with break-off prediction in the Adult Education Survey (AES), this study examines whether machine learning can predict high respondent burden and the risk of break-off in Statistics Norway’s LFS, with the aim of improving the respondent experience and data quality. Unlike the AES, which is fielded every five to six years and thus limits model transfer over time, the LFS is conducted quarterly. Together with the introduction of web data collection in Q2 2025, this provides sufficient data volume and frequency for model training.

The study will use paradata and aggregated demographic information to train and validate models for predicting break-off and high respondent burden. Additionally, expert-based indicators of potential response burden, such as question sensitivity and question length, will be included as features to assess their predictive value for both observed response burden and break-off. These indicators will be evaluated across data collection modes (CATI and CAWI) to explore whether their association with response behaviour differs by survey mode. Lessons learned from the exploratory AES setup will be used to further develop the LFS setup, including the testing of additional features.
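As an illustration of the planned mode comparison, the following minimal sketch computes, separately for CATI and CAWI, how well a single expert-based indicator ranks break-off cases above completed cases (a rank-based AUC). It is only a sketch under stated assumptions: the record layout, the field names `mode`, `question_length` and `break_off`, and all values are hypothetical and not taken from the LFS, and AUC is one possible measure of association, not necessarily the one the study will use.

```python
from collections import defaultdict


def auc(scores, labels):
    """Rank-based AUC: share of (break-off, completed) pairs in which the
    break-off case has the higher score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined unless both outcomes are present
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_mode(records, feature):
    """Compute the AUC of one indicator separately for each collection mode."""
    grouped = defaultdict(lambda: ([], []))
    for record in records:
        scores, labels = grouped[record["mode"]]
        scores.append(record[feature])
        labels.append(record["break_off"])
    return {mode: auc(s, y) for mode, (s, y) in grouped.items()}


# Hypothetical toy records; field names and values are assumptions.
records = [
    {"mode": "CAWI", "question_length": 40, "break_off": 1},
    {"mode": "CAWI", "question_length": 35, "break_off": 1},
    {"mode": "CAWI", "question_length": 12, "break_off": 0},
    {"mode": "CAWI", "question_length": 20, "break_off": 0},
    {"mode": "CATI", "question_length": 40, "break_off": 0},
    {"mode": "CATI", "question_length": 15, "break_off": 1},
    {"mode": "CATI", "question_length": 30, "break_off": 1},
    {"mode": "CATI", "question_length": 22, "break_off": 0},
]

result = auc_by_mode(records, "question_length")
print(result)
```

A value near 0.5 would suggest that the indicator carries little ranking information in that mode, while values that differ clearly between CATI and CAWI would point to a mode-specific association. With real data, such a comparison would also need uncertainty estimates before any conclusion is drawn.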

The machine learning implementation will follow the total machine learning error framework described by Putz et al. (2025, in Foundations and Advances of Machine Learning in Official Statistics), with adjustments.

This work aims to contribute to the development and implementation of new techniques for respondent-centred surveys in official statistics. The approach is strictly exploratory and assesses the feasibility of using machine learning for early detection of response behaviour patterns, with potential transferability to the design and monitoring of future panel surveys.