Performance Evaluation Of SVM With Parameter Optimization On Credit Card Fraud Data Subset Using SMOTE
(1) Politeknik Negeri Lampung, Indonesia
(2) Politeknik Negeri Lampung, Indonesia
(3) Politeknik Negeri Lampung, Indonesia
(4) Politeknik Negeri Lampung, Indonesia
(5) Politeknik Negeri Lampung, Indonesia
(*) Corresponding Author
Abstract
This study evaluates the performance of the Support Vector Machine (SVM) algorithm in detecting credit card fraud, addressing the class imbalance problem with the Synthetic Minority Oversampling Technique (SMOTE) and tuning parameters through Grid Search. The dataset, sourced from Kaggle, consists of 10,001 transactions and has been balanced. SMOTE is applied exclusively to the training data to prevent data leakage. The optimization process yields the best parameters C = 10 and gamma = 0.1. The model is evaluated using the recall, precision, F1-score, and AUC-ROC metrics. The results show a significant improvement in recognizing fraudulent transactions: the final model achieves a recall of 0.68, a precision of 0.90, an F1-score of 0.77, and an AUC-ROC of 0.98. These findings indicate that combining SMOTE with parameter optimization can improve the effectiveness of SVM in classifying the minority class. The approach is considered to have strong potential for application in automated fraud detection systems in the financial sector.
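To make the leakage-free resampling step and the reported metrics concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes SMOTE's usual interpolation between a minority sample and one of its k nearest minority neighbours. The toy coordinates and the names `smote_oversample` and `f1_from` are hypothetical and serve only as illustration. Synthetic samples are generated from the training split only, so the held-out test split never influences them.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy training split (the test split is held out and untouched).
train_major = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4),
               (0.5, 0.3), (0.4, 0.5), (0.6, 0.2)]   # label 0: legitimate
train_minor = [(2.0, 2.1), (2.3, 1.9), (2.1, 2.4)]   # label 1: fraud

# Oversample the minority class until both classes have equal size.
synthetic = smote_oversample(train_minor, len(train_major) - len(train_minor))
balanced_X = train_major + train_minor + synthetic
balanced_y = [0] * len(train_major) + [1] * (len(train_minor) + len(synthetic))

# Consistency check on the reported metrics: precision 0.90, recall 0.68.
reported_f1 = round(f1_from(0.90, 0.68), 2)
print(len(balanced_X), sum(balanced_y), reported_f1)
```

The last lines check that the reported F1-score follows from the reported precision and recall: 2 x 0.90 x 0.68 / (0.90 + 0.68) is approximately 0.775, which agrees with the stated 0.77. In practice the resampling and the Grid Search over C and gamma would more likely be done with library implementations. This sketch only illustrates the order of operations: split first, then oversample the training data.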
DOI: https://doi.org/10.30645/ijistech.v9i1.398