Handling Noise Data with PCA Method and Optimization Using Hybrid Fuzzy C-Means and Genetic Algorithm
DOI:
https://doi.org/10.30595/juita.v12i2.21765
Keywords:
noise data, Principal Component Analysis, Fuzzy C-Means, Genetic Algorithm, customer segmentation
Abstract
This research examines the role of machine learning (ML) and data mining techniques, particularly clustering, in managing large data sets for customer segmentation in the retail sector. It highlights the challenges posed by noisy data and proposes Principal Component Analysis (PCA) as a way to improve accuracy. The study introduces a hybrid approach that combines Fuzzy C-Means (FCM) with a genetic algorithm for optimization in customer segmentation, and it recommends further research on determining the optimal number of clusters and on eliminating data noise. With data noise addressed, the proposed PCA-based method achieved an accuracy of 98%, compared with 93% without PCA. This finding indicates that PCA is effective at reducing noise and thereby improves clustering accuracy. The research contributes to the advancement of customer-focused business strategies through better data analysis and interpretation. The proposed approach has potential applications in areas including data analysis, pattern recognition, and image processing, which makes it relevant to the contemporary business environment.
License
Copyright (c) 2024 JUITA: Jurnal Informatika

This work is licensed under a Creative Commons Attribution 4.0 International License.

JUITA: Jurnal Informatika is licensed under a Creative Commons Attribution 4.0 International License.