Evaluating the Corporate Tax Performance and Analyzing the Tax Trends through the Utilization of Data Mining Algorithms

Document Type: Research Paper



A considerable gap between corporate performance and the tax levy assessed by the taxation authorities has become commonplace. This gap creates unfairness among taxpayers and affects both horizontal and vertical equity. Horizontal equity is achieved when taxpayers perceive the benefits they gain from taxation as proportional to the benefits they give up; vertical equity means that those with greater financial means pay more tax. One reason both forms of equity are hard to attain is the difficulty of identifying taxpayers by their previous taxation behavior and dealing with them effectively. This study aims to design a predictive system that evaluates corporate taxation behavior based on previous payments. The system uses key performance variables identified during the research to classify companies by taxation behavior into three groups: high risk, medium risk, and low risk. It is intended for taxation authorities seeking to assess the risk of corporate tax collection effectively. Using clustering and classification algorithms together with validation methods, the study identifies taxation clusters of taxpayers and builds a decision tree with 80% accuracy. The resulting models examine the taxation behavior of each taxpayer and can predict the future tax payment risk of taxpayers, including corporations newly added to the list.
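The two-stage idea in the abstract (cluster taxpayers first, then fit a decision tree that reproduces the risk groups) can be illustrated with a minimal, self-contained sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the two features (`late_days`, `paid_ratio`), the synthetic data, the use of k-means as the clustering step, the rule that ranks clusters by mean payment delay, and the simple Gini-based tree. Because the synthetic groups are cleanly separated, the hold-out accuracy it prints is not comparable to the 80% reported above.

```python
import random
from collections import Counter

def dist2(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        new = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old centroid if a cluster happens to be empty.
            new.append(tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels, centroids

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=3):
    """Greedy binary tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    if depth == max_depth or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]
    best = None
    for f in range(len(X[0])):
        vals = sorted({x[f] for x in X})
        for a, b in zip(vals, vals[1:]):
            t = (a + b) / 2  # candidate threshold midway between observed values
            left = [i for i, x in enumerate(X) if x[f] <= t]
            right = [i for i, x in enumerate(X) if x[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, x):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, lo, hi = tree
        tree = lo if x[f] <= t else hi
    return tree

# Hypothetical synthetic companies: (late_days, paid_ratio), three behaviour profiles.
rng = random.Random(0)
profiles = [((0, 10), (0.9, 1.0)), ((35, 45), (0.6, 0.8)), ((85, 95), (0.2, 0.5))]
companies = [(rng.uniform(*days), rng.uniform(*ratio))
             for _ in range(30) for days, ratio in profiles]

# Stage 1: cluster, then name clusters by mean payment delay (an assumed ranking rule).
labels, centroids = kmeans(companies, k=3)
order = sorted(range(3), key=lambda c: centroids[c][0])
risk_name = dict(zip(order, ["low", "medium", "high"]))
risk = [risk_name[l] for l in labels]

# Stage 2: fit the tree on three quarters of the data, validate on the held-out quarter.
train = [i for i in range(len(companies)) if i % 4 != 0]
test = [i for i in range(len(companies)) if i % 4 == 0]
tree = build_tree([companies[i] for i in train], [risk[i] for i in train])
accuracy = sum(predict(tree, companies[i]) == risk[i] for i in test) / len(test)
print(f"hold-out accuracy: {accuracy:.2f}")
print("new company risk:", predict(tree, (90.0, 0.3)))
```

Once fitted, the tree can score a company that was not in the original list by calling `predict(tree, features)`, which mirrors the abstract's point about predicting risk as new corporations are added.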

