Artificial Intelligence and Data Mining in Detecting Financial Statement Fraud: A Systematic Literature Review
Abstract
General Background: Fraud in financial reporting significantly undermines stakeholder confidence and destabilises financial markets. Specific Background: The increasing complexity of financial data makes traditional fraud detection techniques inadequate, necessitating more sophisticated methods such as data mining and artificial intelligence (AI). Knowledge Gap: Despite the growing adoption of AI in fraud detection, previous systematic literature reviews (SLRs) have generally focused narrowly on specific algorithms or data types and so have not provided a comprehensive assessment across multiple contexts. Objective: This study aims to critically evaluate the application of AI and data mining techniques in detecting financial statement fraud through a systematic literature review. Methods: A total of 30 peer-reviewed articles published between 2014 and 2024 were selected from the Scopus, ScienceDirect, and Emerald databases using predefined inclusion and exclusion criteria and analysed narratively. Results: The review found that supervised learning algorithms, specifically Support Vector Machine (SVM), Logistic Regression (LR), and XGBoost, were the most frequently used, with XGBoost (96.94%) and LSTM (94.98%) achieving the highest reported accuracy. Integrating financial and non-financial data improved detection stability. Novelty: In contrast to previous systematic reviews, this study offers a holistic synthesis covering algorithm types, structured and unstructured data, and diverse regional contexts. Implications: The findings highlight the transformative potential of AI in fraud detection and encourage further research on unsupervised learning and deeper use of unstructured data.
Copyright (c) 2025 Anggi Putri, Dian Anita Nuswantara

This work is licensed under a Creative Commons Attribution 4.0 International License.