Optimization of Hyperparameter K in K-Nearest Neighbor Using Particle Swarm Optimization
Abstract
This study aims to enhance the performance of the K-Nearest Neighbors (KNN) algorithm by optimizing its hyperparameter K using the Particle Swarm Optimization (PSO) algorithm. In contrast to prior research, which typically focuses on a single dataset, this study seeks to demonstrate that PSO can effectively optimize KNN hyperparameters across diverse datasets. Three datasets from different domains are used: Iris, Wine, and Breast Cancer, each with distinct classification types and class structures. The study further examines whether PSO performs well with both the Manhattan and Euclidean distance metrics. Prior to optimization, experiments with default K values (3, 5, and 7) were conducted to observe KNN behavior on each dataset. These baseline results show stable accuracy on the Iris dataset, while the Wine and Breast Cancer datasets exhibit a drop in accuracy at K = 3, attributed to the greater complexity of their attributes. Optimizing the hyperparameter K with PSO yields a significant increase in accuracy, most notably on the Wine dataset, where accuracy improves by 6.28% with the Manhattan metric. The improved accuracy of the optimized KNN algorithm demonstrates the effectiveness of PSO in overcoming the limitations of manually chosen K values. Although the accuracy gain on the Iris dataset is less pronounced, the results indicate that optimizing the hyperparameter K can yield improvements even for datasets that already perform well. A recommendation for future research is to conduct similar experiments with other algorithms, such as Support Vector Machine or Random Forest, to further evaluate PSO's ability to optimize hyperparameters on the Iris, Wine, and Breast Cancer datasets.
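For readers who want to reproduce the general idea, the sketch below shows one way to set up the experiment described above, using scikit-learn's bundled copies of the Iris, Wine, and Breast Cancer datasets. It is a minimal illustration, not the authors' implementation: the PSO settings (swarm size, inertia weight w, acceleration coefficients c1 and c2, iteration count, and the search range for K) and the use of 5-fold cross-validated accuracy as the fitness function are assumptions made for demonstration.

```python
# Minimal sketch (not the paper's code): baseline KNN with K = 3, 5, 7,
# then a simple global-best PSO search over K, for both Manhattan and
# Euclidean distances on the Iris, Wine, and Breast Cancer datasets.
import numpy as np
from sklearn.datasets import load_iris, load_wine, load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def knn_accuracy(X, y, k, metric):
    """Mean 5-fold cross-validated accuracy of KNN for a given K and metric."""
    model = KNeighborsClassifier(n_neighbors=int(k), metric=metric)
    return cross_val_score(model, X, y, cv=5).mean()

def pso_best_k(X, y, metric, n_particles=10, n_iters=20,
               k_min=1, k_max=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over one dimension (K); positions rounded to integers."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(k_min, k_max, n_particles)   # candidate K values
    vel = np.zeros(n_particles)                    # particle velocities
    pbest = pos.copy()                             # personal best positions
    pbest_fit = np.array([knn_accuracy(X, y, p, metric) for p in pos])
    gbest = pbest[pbest_fit.argmax()]              # global best position
    for _ in range(n_iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, k_min, k_max)     # keep K inside the bounds
        fit = np.array([knn_accuracy(X, y, p, metric) for p in pos])
        improved = fit > pbest_fit                 # update personal bests
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[pbest_fit.argmax()]          # update global best
    return int(round(float(gbest))), pbest_fit.max()

for name, loader in [("Iris", load_iris), ("Wine", load_wine),
                     ("Breast Cancer", load_breast_cancer)]:
    X, y = loader(return_X_y=True)
    for metric in ("manhattan", "euclidean"):
        baseline = {k: knn_accuracy(X, y, k, metric) for k in (3, 5, 7)}
        best_k, best_acc = pso_best_k(X, y, metric)
        print(name, metric, "baseline:", baseline,
              "| PSO best K:", best_k, "acc: %.4f" % best_acc)
```

Rounding each particle's continuous position to the nearest integer is one common way to apply PSO to a discrete hyperparameter such as K; the exact encoding and fitness evaluation used in the paper may differ.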
DOI: 10.30595/juita.v12i1.20688