Optimizing Random Forest for East Java Election Sentiment with Chi-Square and Mutual Information

Authors

  • Rahma Putri Widyaiswari, Telkom University
  • Anisa Dzulkarnain, Telkom University
  • Alqis Rausanfita, Telkom University

DOI:

https://doi.org/10.30595/juita.v13i3.26778

Keywords:

sentiment analysis, East Java governor election, social media X, random forest, feature selection

Abstract

The rise of social media has transformed how people express opinions, including political ones. During the 2024 East Java gubernatorial election, the social media platform X became a major outlet for public sentiment toward the governor and deputy governor candidates. This study analyzes public sentiment toward the three candidate pairs by categorizing the data into three sentiment classes: positive, negative, and neutral. Features were weighted with Term Frequency-Inverse Document Frequency (TF-IDF) and then filtered with Chi-Square and Mutual Information (MI) feature selection to improve feature quality. Random Forest served as the primary classification model, and several other algorithms were tested for comparison. The combination of TF-IDF, Chi-Square, and Random Forest achieved the highest accuracy, 82.07%. These findings highlight the importance of feature selection for sentiment classification performance. The study also offers insight into public opinion that can serve as a reference for strategic decision-making in the political and public sectors.
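
The two feature-selection criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy posts, the function names, and the choice of a presence-based 2x2 contingency table (rather than TF-IDF weights) are assumptions made only for illustration. Each term is scored by how strongly its presence is associated with a sentiment class, and the top-scoring terms are kept as features for a downstream classifier such as Random Forest.

```python
import math

# Toy labelled posts, invented for illustration (not the study's data).
DOCS = [
    ("great program strong leader", "positive"),
    ("great vision good leader", "positive"),
    ("bad promise weak program", "negative"),
    ("bad debate weak answer", "negative"),
    ("debate schedule announced today", "neutral"),
    ("candidate schedule announced today", "neutral"),
]
LABELS = sorted({label for _, label in DOCS})


def contingency(term, label):
    """Count documents by (term present?, class == label) as a 2x2 table."""
    a = b = c = d = 0
    for text, y in DOCS:
        present = term in text.split()
        if present and y == label:
            a += 1  # term present, target class
        elif present:
            b += 1  # term present, other class
        elif y == label:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    return a, b, c, d


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 term/class contingency table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def mutual_information(a, b, c, d):
    """Mutual information (in bits) between term presence and class."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    # Empty cells contribute nothing (0 * log 0 is taken as 0).
    return sum(j / n * math.log2(j * n / (r * k)) for j, r, k in cells if j)


def term_score(term, score_fn):
    """Score a term by its strongest association with any single class."""
    return max(score_fn(*contingency(term, label)) for label in LABELS)


def select_top_k(k, score_fn):
    """Return the k highest-scoring vocabulary terms."""
    vocab = sorted({w for text, _ in DOCS for w in text.split()})
    ranked = sorted(vocab, key=lambda t: term_score(t, score_fn), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    print(select_top_k(7, chi_square))
    print(select_top_k(7, mutual_information))
```

On this toy data, terms that appear in every post of exactly one class (for example "great" or "bad") rank highest under both criteria, while terms spread across classes (for example "debate") are dropped. In practice a library routine would normally replace the hand-written scoring.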

Published

2025-11-08

How to Cite

Widyaiswari, R. P., Dzulkarnain, A., & Rausanfita, A. (2025). Optimizing Random Forest for East Java Election Sentiment with Chi-Square and Mutual Information. JUITA: Jurnal Informatika, 13(3), 297–305. https://doi.org/10.30595/juita.v13i3.26778
