64th ISI World Statistics Congress - Ottawa, Canada

Methods for estimating industrial water use in Canada


Rezvan Taki


  • Beni Ngabo Nsengiyaremye
  • Ibrahim Ousmane Ida
  • Michael Schimpf
  • Martin Hamel


Format: CPS Abstract

Session: CPS 19 - Statistical estimation III

Monday 17 July 4 p.m. - 5:25 p.m. (Canada/Eastern)


Industrial facilities rely on water for their processes and production activities. Industrial water use refers to water withdrawals in three economic sectors: manufacturing, mining, and thermoelectric power generation. Estimating industrial water use is crucial for a variety of related environmental policies, such as establishing realistic water conservation goals for industry. Statistics Canada conducts the biennial Industrial Water Survey (IWS), which publishes annual water use data for every second year. These data fulfill the requirements for water-related indicators as part of the Canadian Environmental Sustainability Indicators (CESI) published by Environment and Climate Change Canada. However, the lack of data for the non-surveyed years represents a major gap in our understanding of water consumption. Hence, the predictive accuracy of several statistical models for estimating industrial water use at the national level for the non-surveyed years was investigated. The model inputs include the surveyed estimates from the IWS as well as auxiliary data from other survey programs. Extreme Gradient Boosting (XGBoost), linear regression, lasso regression, partial least squares (PLS) regression, and multiple imputation by chained equations (MICE) were compared using cross-validation to determine error and bias. XGBoost provided the most robust option for predicting water use in the manufacturing industry. PLS regression showed lower error in predicting thermoelectric power water use. For the mining industry, a different technique performed best in each subsector: PLS regression for coal mining, linear regression for metal mining, and lasso regression for non-metal mining. Based on the proposed statistical models for each industry, water intake was predicted for 2019, a non-surveyed year due to the COVID-19 pandemic.
Given the successful calculation of robust data for 2019, historic data for the non-surveyed years, back to 2007, were also estimated, creating a continuous time series. Overall, our study highlights the utility of industrial water use modeling to enhance the quality and consistency of national industrial water use data and provides valuable insights for water management planning.
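
The cross-validated comparison of error and bias described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The models compared in the study are replaced here by two simple stand-ins (a one-predictor least-squares line and a mean baseline), and the numbers are synthetic placeholders, not IWS or auxiliary survey data. All names (`fit_linear`, `fit_mean`, `leave_one_out`, `aux`, `water`) are hypothetical.

```python
import math
from statistics import mean


def fit_linear(xs, ys):
    """Least-squares fit of y = a + b * x with a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def fit_mean(xs, ys):
    """Baseline that always predicts the mean of the training targets."""
    y_bar = mean(ys)
    return lambda x: y_bar


def leave_one_out(fit, xs, ys):
    """Hold out each observation once; return (RMSE, bias) of the predictions."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = fit(train_x, train_y)
        errors.append(predict(xs[i]) - ys[i])
    rmse = math.sqrt(mean(e * e for e in errors))
    bias = mean(errors)  # mean signed error: positive means over-prediction
    return rmse, bias


# Synthetic placeholder values (assumption): one auxiliary indicator and
# a water-use figure for seven hypothetical surveyed years.
aux = [10.0, 12.0, 11.0, 14.0, 15.0, 17.0, 16.0]
water = [25.3, 28.9, 27.2, 33.1, 34.8, 39.2, 37.0]

models = {"linear": fit_linear, "mean": fit_mean}
results = {name: leave_one_out(fit, aux, water) for name, fit in models.items()}

# Select the model with the smallest cross-validated RMSE.
best = min(results, key=lambda name: results[name][0])

for name, (rmse, bias) in results.items():
    print(f"{name}: RMSE={rmse:.3f}, bias={bias:.3f}")
print("lowest cross-validated error:", best)
```

With only a handful of surveyed years, leave-one-out is a natural choice of cross-validation scheme. The abstract does not state which scheme the study used, so that choice is also an assumption of this sketch.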