
Comparison of traditional mathematical-statistical methods and machine learning-based systems for predicting corporate bankruptcy

Author

Tünde Katalin Szántó

Conference

Regional Statistics Conference 2026

Format: CPS Abstract - Malta 2026

Keywords: prediction

Session: CPS 19 Finance

Friday 5 June 10:30 a.m. - 11:30 a.m. (Europe/Malta)

Abstract

In legal terms, bankruptcy means insolvency: a company is unable to meet its payment obligations by the deadline. Bankruptcy is not a sudden event, however, but a lengthy process and one possible outcome of a period of financial difficulty. This process-like nature makes it possible to predict corporate insolvency in most cases. In recent decades, corporate bankruptcy prediction has become increasingly important. The main cause of the banking crises in Japan and Scandinavia was the bankruptcy of companies to which loans had been granted, which highlighted the importance of assessing the viability of customers when granting loans. The most important users of bankruptcy prediction models are therefore banks, but the models can also be useful for accounting firms and bond rating agencies.
There are two basic types of bankruptcy prediction models: mathematical-statistical models on the one hand, and methods based on simulation experiments and machine learning on the other. Modern corporate bankruptcy prediction can be traced back to 1966, when Beaver published his bankruptcy prediction model based on univariate discriminant analysis. In the decades since, the methods, techniques, and indicators used to predict bankruptcy have expanded widely, but there is still no consensus on which methods and indicators should be used to assess the viability of businesses.
Among traditional mathematical and statistical methods, logistic regression is still widely used today to predict corporate bankruptcy: 95 percent of Hungarian banks still rely on this method. The reasons are its relatively low computational requirements and the ease with which its results can be interpreted.
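For reference, the standard logistic regression model can be written as below. The notation is assumed here for illustration and is not taken from the study: p is the probability that a company goes bankrupt, x_1, ..., x_k are the financial indicators, and \beta_0, ..., \beta_k are the estimated coefficients.

```latex
p = P(y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j=1}^{k} \beta_j x_j)\bigr)},
\qquad
\ln\frac{p}{1 - p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j
```

In this form each coefficient \beta_j is the change in the log-odds of bankruptcy for a one-unit increase in x_j, which is one reason the results are comparatively easy to interpret.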
Advances in computing and data analysis methods now enable the widespread use of machine learning in corporate bankruptcy prediction. Machine learning lies at the intersection of computer science and statistics and is suitable for examining more complex relationships than general econometric methods. A fundamental feature of machine learning-based methods is that they make no prior assumptions about the distribution of the variables describing the phenomenon to be modeled, or about the nature of the relationship between the dependent and independent variables, but instead attempt to discover these from the data. Among machine learning-based systems, neural networks and support vector machines have become the most widely used methods in corporate bankruptcy prediction. These systems are often criticized as black boxes, and there is also a risk of overfitting, i.e., that the model learns the characteristics of the sample used rather than general patterns. However, this risk can be detected and reduced by dividing the sample into training and testing parts.
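How dividing the sample exposes overfitting can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the study's data or method: the firms are synthetic, the single indicator x is hypothetical, and a 1-nearest-neighbour classifier stands in for a model that memorises its sample.

```python
import random


def make_sample(n, rng):
    """Synthetic firms: one hypothetical ratio x and a noisy label (1 = bankrupt)."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        p_bankrupt = 0.8 if x < 0 else 0.2  # assumed noisy relationship
        y = 1 if rng.random() < p_bankrupt else 0
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the closest training observation."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]


def accuracy(train, evaluate):
    """Share of observations in `evaluate` classified correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in evaluate)
    return hits / len(evaluate)


rng = random.Random(0)
sample = make_sample(400, rng)
rng.shuffle(sample)

# Hold out 30 percent of the sample; the model never sees it during fitting.
cut = int(0.7 * len(sample))
train, test = sample[:cut], sample[cut:]

train_acc = accuracy(train, train)  # evaluated on the data the model memorised
test_acc = accuracy(train, test)    # evaluated on unseen firms

print(f"training accuracy: {train_acc:.2f}")
print(f"test accuracy:     {test_acc:.2f}")
```

Each training observation is its own nearest neighbour, so the training accuracy is perfect by construction, while the held-out accuracy is lower because the labels are noisy. The gap between the two figures is the signal of overfitting that only a held-out test part makes visible.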
This study examines whether machine learning-based models or mathematical-statistical methods achieve higher classification accuracy. According to previous research, machine learning-based systems are able to classify businesses with greater accuracy. The study examines the effectiveness of the two approaches using a sample of companies engaged in significant research and development activities.