New population estimation method in Latvia
Conference
Format: CPS Abstract - IAOS 2026
Keywords: population, estimate, machine learning
Session: Topics in health & demography
Tuesday 12 May 4:30 p.m. - 6 p.m. (Europe/Vilnius)
Abstract
The Population and Housing Census 2011 revealed that, at the beginning of 2011, Latvia had a usually resident population of 2 074.6 thousand – 7% fewer than recorded in the Register of Natural Persons maintained by the Office of Citizenship and Migration Affairs (2 228.0 thousand). To address this discrepancy, the Central Statistical Bureau of Latvia developed a new methodology for estimating the population of Latvia more accurately.
The method relies on statistical classification and migration mirror statistics. It divides the population recorded in the Register of Natural Persons into two groups: persons actually residing in Latvia and persons living abroad. A logistic regression model, developed in 2013 and trained on data from the Population and Housing Census 2011, forms the core of the method. The approach was applied in the production of population statistics from 2012 until 2024.
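To make the classification step concrete, the following is a minimal sketch of a logistic regression that splits register records into "resident" and "abroad". It is not the Central Statistical Bureau's model: the three binary features (for example, a recent employment record, a recent health-care contact, a recent school enrolment), the synthetic data, the function names `fit_logistic` and `predict_resident`, and the 0.5 threshold are all assumptions made for illustration. In the method described above, the labels would come from the Population and Housing Census 2011.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_resident(w, x, threshold=0.5):
    """Return True if the estimated probability of residence reaches the threshold."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return p >= threshold


# Synthetic training data (hypothetical): each person has three binary
# indicators of administrative activity; residents show each one with
# probability 0.9, persons living abroad with probability 0.1.
random.seed(0)


def make_person(resident):
    p = 0.9 if resident else 0.1
    return [1.0 if random.random() < p else 0.0 for _ in range(3)]


labels = [1] * 200 + [0] * 200          # 1 = resident, 0 = living abroad
features = [make_person(lbl) for lbl in labels]
weights = fit_logistic(features, labels)

active = predict_resident(weights, [1.0, 1.0, 1.0])    # all indicators present
inactive = predict_resident(weights, [0.0, 0.0, 0.0])  # no indicators present
```

In this toy setting a register record with all three activity indicators is classified as resident and a record with none is classified as living abroad. A production model would use many more covariates and would be combined with migration mirror statistics, which this sketch does not attempt to reproduce.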
As the Population and Housing Census 2021 was based entirely on administrative data, it was not possible to update the training dataset. Therefore, in 2021, work began on developing a new method using the SoL-logit model, which belongs to the class of unsupervised machine learning models – that is, the model is trained without labelled historical data on a person’s status as a usual resident at the beginning of the year. Since 2025, the SoL-logit model has been used to determine whether a person is a usual resident of Latvia, and the data for 2023 and 2024 have been recalculated accordingly.