Clustering Analysis and Heavy Thunderstorm Prediction Using K-Means, Probit, and Logit
Analisis Klasterisasi dan Prediksi Hujan Lebat Petir Menggunakan Model K-Means, Probit, Logit
DOI: https://doi.org/10.30595/jrst.v10i1.27174
Keywords: Clustering, K-means, logit, probit, thunderstorm
Abstract
Weather in Indonesia is often influenced by seasonal phenomena such as the peak of the rainy season, when conditions are highly dynamic. These conditions frequently lead to extreme weather events, such as heavy rain accompanied by thunderstorms, that disrupt daily community activities. Identifying weather patterns and accurately predicting the potential for thunderstorms is therefore essential for both risk mitigation and activity planning. This study aims to predict thunderstorms for the 8th day and for the following seven days, and to identify which provinces are likely to experience thunderstorms during this period. The data are grouped with K-Means clustering, using the Elbow technique to determine the optimal number of clusters. Weather prediction is then performed using logit and probit models with a probability threshold of 0.2. The results indicate that the optimal number of clusters is five. Predictions for the 8th day show that two clusters have the potential to experience thunderstorms, with probabilities of 0.2 and 0.3, respectively. Forecasts for the next seven days indicate that 14 provinces are likely to experience thunderstorms, with probabilities ranging from 0.01 to 0.11. The study gives an overview of thunderstorm potential across regions of Indonesia. With a better understanding of these weather patterns, communities are expected to be better prepared and to reduce the risks associated with extreme weather.
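The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, standardizes the features with StandardScaler, and records the K-Means inertia (within-cluster sum of squares) for a range of k so that the elbow can be read from the curve. The data are synthetic blobs with made-up centers, not the study's weather data, and the range of k and all variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for weather features (NOT the study's data):
# five well-separated groups in two dimensions.
centers = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 20)]
X, _ = make_blobs(n_samples=300, centers=centers,
                  cluster_std=1.0, random_state=0)

# Standardize so that every feature contributes on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means for each candidate k and record the inertia.
k_values = list(range(1, 9))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias.append(model.inertia_)

# Drop in inertia gained by moving from k to k + 1 clusters.
drops = -np.diff(inertias)

for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.2f}")
for k, drop in zip(k_values[:-1], drops):
    print(f"k={k} -> k={k + 1}: inertia drop={drop:.2f}")
```

The elbow is the value of k after which the drop in inertia becomes small compared with the earlier drops. In practice the inertia values are usually plotted against k and the bend is read by eye, so the printed table here is only a stand-in for that plot.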
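To make the role of the 0.2 threshold concrete, the sketch below shows how a logit and a probit model turn a linear predictor z into a probability and then into a yes/no thunderstorm flag. It uses only the Python standard library. The z values are hypothetical, no model is fitted, and treating a probability equal to the threshold as a positive flag is an assumption, chosen because the abstract lists a cluster with probability 0.2 among those with thunderstorm potential.

```python
import math
from statistics import NormalDist

THRESHOLD = 0.2  # probability cutoff stated in the abstract


def logit_probability(z: float) -> float:
    """Logistic CDF: P(event) under a logit model with linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_probability(z: float) -> float:
    """Standard normal CDF: P(event) under a probit model with linear predictor z."""
    return NormalDist().cdf(z)


def flags_thunderstorm(probability: float, threshold: float = THRESHOLD) -> bool:
    """Assumption: a probability at or above the threshold counts as a flag."""
    return probability >= threshold


# Linear-predictor value at which each model reaches the threshold.
logit_cutoff = math.log(THRESHOLD / (1.0 - THRESHOLD))
probit_cutoff = NormalDist().inv_cdf(THRESHOLD)
print(f"logit cutoff z = {logit_cutoff:.4f}")
print(f"probit cutoff z = {probit_cutoff:.4f}")

# Hypothetical linear predictors, for illustration only.
for z in (-2.0, -1.0, 0.0):
    p_logit = logit_probability(z)
    p_probit = probit_probability(z)
    print(f"z={z:+.1f}  logit p={p_logit:.3f} flag={flags_thunderstorm(p_logit)}  "
          f"probit p={p_probit:.3f} flag={flags_thunderstorm(p_probit)}")
```

The two link functions give different probabilities for the same z, but fitted logit and probit coefficients are on different scales, so the fitted probabilities from the two models are usually much closer to each other than this side-by-side comparison suggests.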
License
Copyright (c) 2026 Yussi Dwi Hastuti, Arief Wibowo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
JRST (Jurnal Riset Sains dan Teknologi) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

