Impact of the non-probability sample on the accuracy of an estimator of a total when integrating non-probability and probability samples
Conference
65th ISI World Statistics Congress
Format: IPS paper - WSC 2025
Keywords: composite estimator, multivariate normal distribution, Poisson pseudo-sampling design, propensity score, variance
Session: IPS 700 - Non-probability and Probability Sample Integrated Estimators for the Population Parameters
Monday 6 October 2 p.m. - 3:40 p.m. (Europe/Amsterdam)
Abstract
Data from non-probability and probability samples are combined to estimate a finite population total. Assuming that the values of the study variable are available in both samples and that the pseudo-inclusion indicators for the non-probability sample are independent, we study the integration of the two samples through a composite estimator of the population total. The composite estimator is a linear combination of the inverse probability weighted (IPW) estimator and a traditional design-based estimator. When evaluating the variance of the IPW estimator, the randomness of the underlying non-probability sample is taken into account through the distribution of the estimated propensity scores. Non-sampling errors are not considered. This approach is compared with an estimator proposed by other authors and with a bootstrap variance estimator. The proposed linear combination is robust to misspecification of the propensity score model because it incorporates an estimator of the bias of the IPW estimator. A simulation study that estimates the number of Lithuanian companies possessing websites, combining survey data with data from a large voluntary sample, demonstrates the properties of the proposed estimators. The methodological framework and empirical results are described in detail in Čiginas, A., Krapavickaitė, D., & Nekrašaitė-Liegė, V. (2025). Evaluating the Impact of a Non-Probability Sample-Based Estimator in a Linear Combination with an Estimator from a Probability Sample. Journal of Official Statistics, 41(2), 649–674. https://doi.org/10.1177/0282423X251331346
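The idea of a linear combination of an IPW estimator and a design-based estimator can be illustrated with a minimal Python sketch. It is not the estimator from the cited paper: the function names, the choice of the Horvitz-Thompson form for the design-based estimator, and the weight formula are illustrative assumptions. The weight used here minimizes the mean squared error of the combination under the additional assumptions that the two estimators are independent and that the design-based estimator is unbiased. The variances and the bias of the IPW estimator are taken as given inputs, whereas in practice they would have to be estimated.

```python
import numpy as np


def horvitz_thompson(y, pi):
    """Design-based total: sum of y_k / pi_k over the probability sample.

    pi holds the (known) inclusion probabilities of the sampled units.
    """
    return float(np.sum(np.asarray(y, dtype=float) / np.asarray(pi, dtype=float)))


def ipw_total(y, propensity):
    """IPW total: sum of y_k / p_k over the non-probability sample.

    propensity holds the estimated propensity scores (pseudo-inclusion
    probabilities) of the units in the non-probability sample.
    """
    return float(np.sum(np.asarray(y, dtype=float) / np.asarray(propensity, dtype=float)))


def composite_total(t_ipw, t_ht, var_ipw, var_ht, bias_ipw):
    """Return (lam * t_ipw + (1 - lam) * t_ht, lam).

    Assumption (not from the source): lam = var_ht / (var_ht + mse_ipw),
    which minimizes the MSE of the combination when the two estimators are
    independent and t_ht is unbiased.
    """
    mse_ipw = var_ipw + bias_ipw ** 2
    lam = var_ht / (var_ht + mse_ipw)
    return lam * t_ipw + (1.0 - lam) * t_ht, lam


# Hypothetical inputs, chosen only to show the behaviour of the weight.
estimate, lam = composite_total(t_ipw=100.0, t_ht=120.0,
                                var_ipw=1.0, var_ht=4.0, bias_ipw=0.0)
print(estimate, lam)

# A large bias in the IPW estimator pushes the weight towards the
# design-based estimator.
estimate_biased, lam_biased = composite_total(t_ipw=100.0, t_ht=120.0,
                                              var_ipw=1.0, var_ht=4.0,
                                              bias_ipw=1000.0)
print(estimate_biased, lam_biased)
```

In this sketch the bias term is what protects the combination against a poorly specified propensity model: as the squared bias grows, the weight on the IPW estimator shrinks towards zero and the result approaches the design-based estimate.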