Development and Evaluation of Stroke Disease Classification Models: Classical Machine Learning, Deep Learning, and Explainable AI Approaches

Lianny Wydiastuty Kusuma; Andri Wijaya; Asahiro Nathanael Star Sitohang; Ceng Giap Yo

doi:10.30595/juita.v13i3.27868

Authors

Lianny Wydiastuty Kusuma, Buddhi Dharma University
Andri Wijaya, Buddhi Dharma University
Asahiro Nathanael Star Sitohang, Buddhi Dharma University
Ceng Giap Yo, Buddhi Dharma University

Keywords:

deep learning, machine learning, SMOTE, stroke, XAI

Abstract

This study evaluates the impact of the Synthetic Minority Oversampling Technique (SMOTE) on the performance of machine learning and deep learning models for stroke risk classification. It uses secondary, publicly available data from Kaggle’s Stroke Prediction Dataset (n = 5,110; 249 stroke cases, 4,861 non-stroke cases). Performance was measured using accuracy, precision, recall, and F1-score, and Explainable AI (XAI) methods (SHAP, LIME) were used for interpretability. The results show that applying SMOTE improves model sensitivity to the minority "Stroke" class, with Random Forest after SMOTE achieving 97% accuracy with balanced precision and recall. These findings highlight the methodological potential of combining SMOTE with machine learning, deep learning, and XAI; however, they should not be interpreted as direct clinical validation. Future work with clinical and population-based datasets is necessary to assess applicability in real-world healthcare settings.
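To illustrate why the abstract reports recall and F1-score alongside accuracy, the sketch below uses only the class counts stated above (249 stroke, 4,861 non-stroke). It computes the four metrics for a trivial classifier that always predicts "non-stroke", then shows the interpolation idea behind SMOTE on a few made-up points. The function names `classification_metrics` and `smote_sketch`, the choice of `k`, and the toy coordinates are assumptions for illustration; this is not the authors' pipeline and not a replacement for a library implementation.

```python
import random

N_STROKE, N_NON_STROKE = 249, 4861  # class counts reported in the abstract


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def smote_sketch(minority, n_new, k=5, seed=0):
    """Hypothetical helper: build n_new synthetic minority points by
    interpolating between a sample and one of its k nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority samples, sorted by squared distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(base, neighbour)))
    return synthetic


# A classifier that never predicts "stroke" (label 1) on the full dataset.
y_true = [1] * N_STROKE + [0] * N_NON_STROKE
baseline = classification_metrics(y_true, [0] * len(y_true))
print(baseline)  # accuracy is about 0.951, recall and F1 are 0.0

# Synthetic samples needed to match the majority class size.
n_to_balance = N_NON_STROKE - N_STROKE  # 4612

# Toy minority points (made up, not taken from the dataset).
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_sketch(toy_minority, n_new=6, k=2)
print(synthetic)
```

The baseline reaches roughly 95% accuracy while detecting no stroke cases at all, which is why a high accuracy figure on this dataset is only informative when read together with recall and precision for the minority class. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into evaluation.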
