Comparing the Estimation Power of Machine Learning Models and Statistical Models in Predicting Profit Component Changes and Selecting the Optimal Model

Document Type : Research Paper

Authors

1 MSc, Department of Accounting Management, Faculty of Economics and Management, Urmia University, Urmia, Iran.

2 Assistant Prof., Department of Accounting, Faculty of Economics and Management, Urmia University, Urmia, Iran.

DOI: 10.22059/frj.2024.373472.1007580

Abstract

Objective
Predicting changes in profit informs investors, financial analysts, managers, stock market officials, creditors, and other users as they assess a business unit, decide whether to buy or sell its shares, and decide whether to grant or deny loans and credit. This research evaluates and compares the accuracy of machine learning models and statistical models in predicting the direction of change in three profit components: net profit (loss), gross profit (loss), and operating profit (loss).
 
Methods
This study uses the financial information of 139 manufacturing companies listed on the Tehran Stock Exchange over the 15-year period from 2008 to 2022. It applies 25 machine learning models and 10 statistical models and compares their efficiency in predicting the direction of change in three profit components: net profit (loss), gross profit (loss), and operating profit (loss). Excel was used for data sorting, EViews for extracting descriptive statistics, and the data mining packages SPSS Modeler and RapidMiner for predicting profit changes. The machine learning models were evaluated on two criteria, accuracy (the predictive accuracy of the model) and AUC (area under the curve), whereas the statistical models were evaluated on accuracy alone. Finally, the best-performing machine learning model for each profit component was selected using the ROC curve.
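The two evaluation criteria named above can be illustrated with a minimal sketch. It assumes that AUC refers to the area under the ROC curve, that the direction of change is coded as 1 (increase) or 0 (decrease), and that a model outputs a score for each firm-year. The function names, the 0.5 cut-off, and the toy data are hypothetical and are not taken from the study, which used SPSS Modeler and RapidMiner rather than custom code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of observations whose predicted direction matches the actual one."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen increase (1) receives a higher score than a randomly chosen
    decrease (0). Tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: actual directions and model scores for eight firm-years.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed cut-off of 0.5 to turn scores into predicted directions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Accuracy depends on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is one reason the two criteria can rank models differently.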
 
Results
The average predictive accuracy of the machine learning models for the three dependent variables, namely the percentage changes in net profit (loss), gross profit (loss), and operating profit (loss), ranges from 83% to 93%. The average predictive accuracy of the statistical models for the same three profit components ranges from 76% to 83%. Because the Kolmogorov-Smirnov test showed that the average accuracies of the machine learning and statistical models are not normally distributed, the non-parametric Mann-Whitney U test was used to compare the two groups' accuracy in predicting the direction of change in the profit components.
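The group comparison described above can be sketched as follows. This is a minimal illustration of the Mann-Whitney U test using a normal approximation without tie correction, which is an assumption of this sketch; the study does not state which variant was used. The function name and the accuracy values are hypothetical and are not the study's data.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U statistic for sample a, z-score, two-sided p-value).

    U counts the pairs in which a value from `a` exceeds a value from `b`,
    with ties counted as one half. The p-value uses the normal approximation
    and ignores tie correction, so it is only rough for small samples.
    """
    n1, n2 = len(a), len(b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Hypothetical accuracies for a few models in each group (not the study's data).
ml_accuracy = [0.91, 0.88, 0.93, 0.85, 0.90]
stat_accuracy = [0.78, 0.81, 0.76, 0.83]

u_stat, z_score, p_value = mann_whitney_u(ml_accuracy, stat_accuracy)
print("U:", u_stat, "z:", round(z_score, 3), "p:", round(p_value, 4))
```

For samples this small an exact test is preferable to the normal approximation. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead of hand-written code.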
 
Conclusion
The hypothesis tests indicate that machine learning models are more efficient than statistical models at predicting the direction of change in net profit (loss), gross profit (loss), and operating profit (loss). According to the ROC curve results, the decision tree model achieved a predictive accuracy of 100% for the direction of change in net profit (loss) and 99.38% for gross profit (loss), while the rule-based inference model achieved 86.76% for operating profit (loss). These models performed best for their respective profit components and were selected as the optimal models.



 