What do Indonesians talk when they talk about COVID-19 Vaccine: A Topic Modeling Approach with LDA

Authors

  • Theresia Ratih Dewi Saputri, Universitas Ciputra Surabaya
  • Caecilia Citra Lestari, Universitas Ciputra Indonesia
  • Salmon Charles Siahaan, Universitas Ciputra Surabaya

DOI:

https://doi.org/10.30595/juita.v10i2.13500

Keywords:

COVID-19, vaccine, topic modeling, LDA

Abstract

To end the COVID-19 pandemic, the government has attempted to accelerate vaccination through various programs and collaborations. Unfortunately, the number of people vaccinated is still relatively small compared to the population of Indonesia. Several reasons contribute to this challenge, one of them being the reluctance of citizens to accept the COVID-19 vaccine due to various factors. Knowing these factors can help increase public compliance, so that the vaccination program can be sped up. However, acquiring knowledge about COVID-19 vaccine rejection through traditional means can be challenging. One way to capture this knowledge is to conduct a survey or interview about COVID-19 vaccine acceptance, but this method can be inefficient in terms of cost and resources. To address this problem, we propose a novel method for analyzing the topics in Indonesians' COVID-19-related opinions on Twitter by implementing a topic modeling algorithm called Latent Dirichlet Allocation (LDA). We gathered more than 22,000 tweets related to the COVID-19 vaccine. By applying the algorithm to the collected dataset, we can capture the general opinions and topics that arise when people discuss the COVID-19 vaccine. The result was validated using a labeled dataset gathered in previous research. Once the important terms are identified, a strategy based on them can be determined by the medical professionals who are responsible for administering the COVID-19 vaccine.
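
As a rough illustration of the approach described in the abstract, the sketch below fits LDA to a handful of made-up, pre-tokenized tweets using a small collapsed Gibbs sampler written in plain Python. It is not the authors' pipeline: the toy tweets, the number of topics, the hyperparameters `alpha` and `beta`, and the function names `fit_lda` and `top_terms` are all assumptions chosen for illustration. A real study would more likely rely on an established library and on proper text preprocessing.

```python
import random


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns (vocab, topic_word_probs), where
    topic_word_probs[k][w] estimates P(word w | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({token for doc in docs for token in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]                # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                              # n(k)

    # Give every token a random initial topic.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assign = []
        for token in doc:
            k = rng.randrange(num_topics)
            w = word_id[token]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
            doc_assign.append(k)
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, token in enumerate(doc):
                w = word_id[token]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    topic_word_probs = [
        [
            (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
            for w in range(vocab_size)
        ]
        for k in range(num_topics)
    ]
    return vocab, topic_word_probs


def top_terms(vocab, topic_word_probs, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[w] for w in sorted(range(len(vocab)), key=lambda w: -row[w])[:n]]
        for row in topic_word_probs
    ]


# Hypothetical tokenized tweets, invented for this example only.
tweets = [
    ["vaccine", "side", "effect", "fever"],
    ["fever", "after", "vaccine", "dose"],
    ["vaccine", "schedule", "clinic", "queue"],
    ["clinic", "queue", "long", "schedule"],
    ["side", "effect", "worry", "dose"],
]

vocab, probs = fit_lda(tweets)
for k, terms in enumerate(top_terms(vocab, probs)):
    print(f"topic {k}: {terms}")
```

The highest-probability words per topic play the role of the "important terms" mentioned in the abstract. With a corpus this small the topics are not meaningful; the point is only to show the mechanics of assigning words to latent topics.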

Author Biographies

Theresia Ratih Dewi Saputri, Universitas Ciputra Surabaya

Informatics, School of Information Technology

Caecilia Citra Lestari, Universitas Ciputra Indonesia

Informatics, School of Information Technology

Salmon Charles Siahaan, Universitas Ciputra Surabaya

School of Medicine

Published

2022-11-14

How to Cite

Saputri, T. R. D., Lestari, C. C., & Siahaan, S. C. (2022). What do Indonesians talk when they talk about COVID-19 Vaccine: A Topic Modeling Approach with LDA. JUITA: Jurnal Informatika, 10(2), 233–242. https://doi.org/10.30595/juita.v10i2.13500