Sampling Population Networks at Scale: An Adaptive Framework for Demographic and Socio-Economic Analysis
Conference
Regional Statistics Conference 2026
Format: CPS Poster - Malta 2026
Session: CPS Poster Session 01
Wednesday 3 June 10 a.m. - 11 a.m. (Europe/Malta)
Abstract
Manuela Maia (1) and Pedro Campos (2)
(1) School of Economics and Management, University of Porto, and ISPGAIA
(2) School of Economics and Management, University of Porto, and Statistics Portugal
The increasing availability of large-scale population data, derived from administrative registers, digital traces, and survey–register integration, has created new opportunities for studying complex social dynamics at unprecedented levels of granularity. Population networks—linking individuals to households, workplaces, services, or neighborhoods—often comprise millions of interconnected units, making full enumeration or exhaustive analysis computationally impractical. Importantly, however, understanding key population patterns does not require observing the entire network.
In this paper, we propose an adaptive network sampling framework to analyze population dynamics through a hybrid approach that combines weighted random walks with a novel neighborhood expansion mechanism. The methodology is applied to bipartite population networks (e.g., individuals–areas, individuals–services, or individuals–labor market states), which are first projected onto similarity networks capturing shared characteristics or behaviors. These projections enable the identification of latent population clusters, such as groups with similar mobility patterns, service usage, or socio-economic profiles.
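As an illustration only, the projection and sampling steps described above could be sketched as follows. This is not the authors' implementation: the function names (`project`, `adaptive_sample`), the shared-membership edge weight, the degree-based down-weighting of well-connected nodes, and the `expand` parameter are all assumptions made for this example, since the abstract does not specify them.

```python
import random
from collections import defaultdict
from itertools import combinations


def project(memberships):
    """Project a bipartite individual -> groups mapping onto a similarity graph.

    Assumption: the similarity between two individuals is the number of
    groups (areas, services, ...) they share.
    """
    members = defaultdict(set)
    for person, groups in memberships.items():
        for group in groups:
            members[group].add(person)
    graph = {person: {} for person in memberships}
    for people in members.values():
        for a, b in combinations(sorted(people), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def adaptive_sample(graph, budget, expand=1, seed=0):
    """Weighted random walk with a simple neighborhood expansion step.

    Assumptions (hypothetical, for illustration): the walk moves only to
    not-yet-sampled neighbors, with probability proportional to edge weight
    divided by (1 + neighbor degree) so that hubs do not dominate; it jumps
    to a random unsampled node when it has nowhere new to go; after each
    move it also adds up to `expand` of the strongest-tied neighbors.
    """
    rng = random.Random(seed)
    budget = min(budget, len(graph))
    unsampled = set(graph)
    sampled = set()
    current = None
    while len(sampled) < budget:
        frontier = []
        if current is not None:
            frontier = sorted(n for n in graph[current] if n in unsampled)
        if frontier:
            weights = [graph[current][n] / (1 + len(graph[n])) for n in frontier]
            current = rng.choices(frontier, weights=weights, k=1)[0]
        else:
            # Dead end or first step: restart from a random unsampled node.
            current = rng.choice(sorted(unsampled))
        sampled.add(current)
        unsampled.discard(current)
        # Neighborhood expansion: keep local context around the visited node.
        strongest = sorted(
            (n for n in graph[current] if n in unsampled),
            key=lambda n: (-graph[current][n], n),
        )
        for neighbor in strongest[:expand]:
            if len(sampled) < budget:
                sampled.add(neighbor)
                unsampled.discard(neighbor)
    return sampled


# Toy individuals-areas/services data (invented for the example).
memberships = {
    "ana": {"area1", "clinic"},
    "bea": {"area1", "clinic"},
    "caio": {"area1"},
    "dina": {"area2"},
    "edu": {"area2", "clinic"},
}
graph = project(memberships)
sample = adaptive_sample(graph, budget=3)
```

In this sketch every iteration adds at least one new individual, so the loop always terminates once the budget is reached; a production design would also need to decide how restarts and expansion interact with sampling weights.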
The adaptive sampling process dynamically balances the representation of highly connected population groups (e.g., urban residents or frequent service users) and less visible or hard-to-reach subpopulations, while neighborhood expansion preserves local structural context and social embeddedness. We evaluate the proposed framework using demographic and socio-economic attributes such as age structure, employment status, and income proxies, comparing sampled subgraphs to full-population benchmarks in terms of distributional accuracy and structural fidelity. Results show that the adaptive approach outperforms uniform and purely random-walk sampling, achieving higher representativeness and lower distortion of population heterogeneity. The framework demonstrates strong potential for supporting population-based analyses, small-area statistics, and evidence-informed public policy design under data and computational constraints.
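One way to make "distributional accuracy" concrete is to compare the category shares of an attribute in the sample with those in the full population. The sketch below uses total variation distance for this purpose; the abstract does not state which metric the authors used, so the choice of metric, the function names, and the toy employment data are assumptions for illustration.

```python
from collections import Counter


def shares(values):
    """Return the relative frequency of each category (values must be non-empty)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def total_variation(population_values, sample_values):
    """Total variation distance between two categorical distributions.

    0.0 means identical shares; 1.0 means no overlap at all.
    """
    p = shares(population_values)
    q = shares(sample_values)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Invented employment-status data for the example.
population = ["employed"] * 6 + ["unemployed"] * 2 + ["inactive"] * 2
sample = ["employed"] * 3 + ["unemployed"] * 1
distance = total_variation(population, sample)
```

A lower distance indicates a sample whose attribute distribution is closer to the full-population benchmark; structural fidelity would need separate graph-level measures not shown here.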