Outlier Robust XGBoost
Conference
Regional Statistics Conference 2026
Format: CPS Abstract - Malta 2026
Keywords: machine learning, outliers, robustness
Session: CPS 27 Outliers
Wednesday 3 June 4:30 p.m. - 5:30 p.m. (Europe/Malta)
Abstract
XGBoost is a popular and powerful prediction method. It is an iterative algorithm that fits simple decision trees to the residuals of the previous step, and an efficient, scalable implementation is available. The standard loss function for XGBoost is the quadratic loss, but a Huber loss is also available. In this paper we study the robustness of XGBoost and show that it is fairly robust, except in the presence of vertical outliers. To address this issue, we study other loss functions, corresponding to the S- and tau-estimators from robust regression. It turns out that a two-step procedure, which we call MM-XGBoost, provides the best trade-off between robustness and prediction accuracy.
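The role of the loss function can be illustrated with a minimal gradient-boosting sketch in plain Python (our own toy illustration with decision stumps, not the paper's MM-XGBoost procedure or the xgboost library): with the quadratic loss the pseudo-residual equals the raw residual, so a single vertical outlier dominates every boosting round, whereas a Huber-type clipped gradient bounds its influence.

```python
# Illustrative sketch: gradient boosting on residuals with decision stumps,
# comparing the quadratic loss (identity pseudo-residual) with a Huber-type
# clipped gradient. Toy code under our own assumptions, not the paper's method.

def fit_stump(x, g):
    """Least-squares regression stump (one split) fitted to pseudo-residuals g."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(x)):
        left = [g[order[i]] for i in range(k)]
        right = [g[order[i]] for i in range(k, len(x))]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            thr = (x[order[k - 1]] + x[order[k]]) / 2
            best = (sse, thr, ml, mr)
    _, thr, ml, mr = best
    return lambda t: ml if t <= thr else mr

def boost(x, y, n_rounds=50, lr=0.3, grad=lambda r: r):
    """Boost stumps on pseudo-residuals; the default grad reproduces squared loss."""
    pred, stumps = [0.0] * len(x), []
    for _ in range(n_rounds):
        g = [grad(y[i] - pred[i]) for i in range(len(x))]
        stump = fit_stump(x, g)
        stumps.append(stump)
        pred = [p + lr * stump(xi) for p, xi in zip(pred, x)]
    return lambda t: sum(lr * s(t) for s in stumps)

# Toy data: y = x, with one vertical outlier at x = 5.
x = list(range(10))
y = [float(v) for v in x]
y[5] = 100.0

delta = 1.0
huber_grad = lambda r: max(-delta, min(delta, r))  # clipped pseudo-residual

f_squared = boost(x, y)                  # quadratic loss chases the outlier
f_huber = boost(x, y, grad=huber_grad)   # clipped gradient resists it
```

Because the clipped gradient is bounded by `delta`, the Huber fit at the outlying point can move by at most `lr * n_rounds` over the 50 rounds, while the quadratic-loss fit is pulled far toward the outlying value 100; the S-, tau-, and MM-losses studied in the paper pursue the same idea with bounded loss functions.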