A Hybrid Case-Based Reasoning Framework Using KNN, Word2Vec, and Cosine Similarity for Employee Attrition Analysis
DOI:
https://doi.org/10.30595/juita.v14i1.28523
Keywords:
Case-Based Reasoning, employee attrition, K-Nearest Neighbors, Word2Vec, human resource analytics
Abstract
Employee attrition prediction remains a longstanding challenge in human resource analytics, as organizations increasingly depend on computational decision-support systems that are transparent, consistent, and operationally accountable. Conventional methods that rely solely on numerical attributes struggle to capture the structural and contextual relationships inherent in categorical and text-based employee descriptors. To address this limitation, this study investigates a hybrid Case-Based Reasoning (CBR) retrieval framework that combines K-Nearest Neighbors (KNN) with Word2Vec embeddings derived from the dataset's few textual attributes, specifically Department, Gender, EducationField, MaritalStatus, and OverTime. Eight experimental configurations were assessed to examine the impact of alternative similarity metrics and feature representations. The best configuration, KNN with Word2Vec embeddings and cosine similarity, attained an accuracy of 0.8526 and a weighted F1-score of 0.8000, exceeding baseline models that used only numerical features and those that used Manhattan distance. Nonetheless, the gains remained modest owing to dataset-specific constraints, such as class imbalance and the inherently shallow textual descriptors, which restrict the semantic richness of the Word2Vec embeddings. Furthermore, the IBM attrition dataset does not cover downsizing or termination scenarios, which highlights conceptual and ethical constraints on using similarity-based predictions for high-stakes HR decisions. Overall, the findings indicate that hybrid similarity representations, particularly Word2Vec embeddings combined with cosine similarity, can improve the structural expressiveness of CBR, although their predictive effectiveness is still limited by data sparsity and fairness considerations.
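The retrieval idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors in `TOY_EMBEDDINGS` are hypothetical stand-ins for trained Word2Vec embeddings, the numeric features are assumed to be already scaled to [0, 1], and the function names, the token names, and the choice of averaging token embeddings before concatenation are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical 2-d vectors standing in for trained Word2Vec embeddings
# of categorical tokens (assumption: real embeddings would come from a model).
TOY_EMBEDDINGS = {
    "Sales": [0.9, 0.1],
    "Research": [0.1, 0.9],
    "OverTime_Yes": [0.8, 0.3],
    "OverTime_No": [0.2, 0.7],
}

def embed_tokens(tokens):
    """Average the embedding vectors of a case's categorical tokens."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def hybrid_vector(numeric, tokens):
    """Concatenate scaled numeric features with the averaged token embedding."""
    return list(numeric) + embed_tokens(tokens)

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(case_base, query_vec, k=3):
    """CBR retrieve step: the k most similar cases as (similarity, label)."""
    scored = [(cosine_similarity(vec, query_vec), label) for vec, label in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def predict(case_base, query_vec, k=3):
    """CBR reuse step: majority vote over the retrieved cases' labels."""
    votes = Counter(label for _, label in retrieve(case_base, query_vec, k))
    return votes.most_common(1)[0][0]

# Toy case base: (scaled numeric features, categorical tokens, attrition label).
RAW_CASES = [
    ([0.2, 0.1], ["Sales", "OverTime_Yes"], "Yes"),
    ([0.3, 0.2], ["Sales", "OverTime_Yes"], "Yes"),
    ([0.8, 0.9], ["Research", "OverTime_No"], "No"),
    ([0.7, 0.8], ["Research", "OverTime_No"], "No"),
    ([0.9, 0.7], ["Research", "OverTime_No"], "No"),
]
CASE_BASE = [(hybrid_vector(num, toks), label) for num, toks, label in RAW_CASES]

query = hybrid_vector([0.25, 0.15], ["Sales", "OverTime_Yes"])
prediction = predict(CASE_BASE, query, k=3)
print(prediction)
```

With these toy values the two closest stored cases share the query's department and overtime tokens, so the majority vote for `k=3` resolves to "Yes". Because cosine similarity compares direction rather than magnitude, the embedding dimensions can influence retrieval even when the numeric features differ in scale, which is one plausible reason a study might compare it against Manhattan distance.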
License
Copyright (c) 2026 Akhmad Arif Faisal Siregar, Ema Utami, Tika Novita Sari

This work is licensed under a Creative Commons Attribution 4.0 International License.

JUITA: Jurnal Informatika is licensed under a Creative Commons Attribution 4.0 International License.








