Regional Statistics Conference 2026

Outlier Robust XGBoost

Format: CPS Abstract - Malta 2026

Keywords: machine learning, outliers, robustness

Session: CPS 27 Outliers

Wednesday 3 June 4:30 p.m. - 5:30 p.m. (Europe/Malta)

Abstract

XGBoost is a popular and powerful prediction method. It is an iterative algorithm that fits simple decision trees to the residuals of the previous step, and an efficient, scalable implementation is available. The standard loss function for XGBoost is the quadratic loss, but a Huber loss is also available. In this paper we study the robustness of XGBoost and show that it is fairly robust, except in the presence of vertical outliers. To address this issue, we study other loss functions, corresponding to the S- and tau-estimators from robust regression. It turns out that a two-step procedure, called MM-XGBoost, provides the best trade-off between robustness and prediction accuracy.
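To illustrate why the choice of loss matters for vertical outliers, the following minimal sketch compares the per-observation gradient and Hessian of the quadratic loss with those of a pseudo-Huber loss, the quantities a boosting step works with. It is not the S-, tau- or MM-loss studied in the paper. The pseudo-Huber form, the function names, the parameter `delta` and the toy data are assumptions made for illustration only, and the code uses NumPy alone rather than the XGBoost library.

```python
import numpy as np


def squared_error_grad_hess(preds, labels):
    """Gradient and Hessian of 0.5 * (pred - label)^2 with respect to pred."""
    residual = preds - labels
    return residual, np.ones_like(residual)


def pseudo_huber_grad_hess(preds, labels, delta=1.0):
    """Gradient and Hessian of delta^2 * (sqrt(1 + (r / delta)^2) - 1),
    where r = pred - label. `delta` is an assumed tuning constant."""
    residual = preds - labels
    scale = 1.0 + (residual / delta) ** 2
    grad = residual / np.sqrt(scale)       # bounded in absolute value by delta
    hess = 1.0 / (scale * np.sqrt(scale))  # strictly positive, shrinks for large |r|
    return grad, hess


# Toy data (hypothetical): the last response is a vertical outlier.
labels = np.array([1.0, 2.0, 3.0, 100.0])
preds = np.array([1.5, 2.5, 2.5, 3.0])

g_sq, h_sq = squared_error_grad_hess(preds, labels)
g_ph, h_ph = pseudo_huber_grad_hess(preds, labels, delta=1.0)

print("squared-error gradients:", g_sq)
print("pseudo-Huber gradients: ", g_ph)
```

Under the quadratic loss the outlier's gradient grows linearly with its residual, so it dominates the next tree fit. Under the pseudo-Huber loss the gradient is capped near `delta`, which limits the influence of that observation but does not remove it.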