Cyberbullying Detection Modelling at Twitter Social Networking

Ika Yunida Anggraini, Sucipto Sucipto, Rini Indriati


Cybercrime is common on social networking sites. Cyberbullying is a form of cybercrime that has recently trended on Twitter, one of the most popular of these sites. Cyberbullying of teenagers can cause depression and homicidal or suicidal thoughts, so preventive action is needed to protect victims. To help prevent cyberbullying, text mining can be used to build a model that classifies tweets into two classes: bullying and not bullying. This research uses a Naïve Bayes classifier with five pre-processing stages: replace tokens, transform case, tokenization, stopword filtering, and n-gram generation. The model was validated with 10-fold cross-validation, and its performance was evaluated with a confusion matrix. Under 10-fold cross-validation the model performed well, with 77.88% precision, 94.75% recall, and 82.50% accuracy (standard deviation ±5.12%).
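The five pre-processing stages and the confusion-matrix measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the replacement dictionary, the stopword list, the underscore-joined n-gram format, the function names `preprocess` and `confusion_metrics`, and the example counts are all assumptions made for illustration only.

```python
import re

# Hypothetical replacement dictionary and stopword list (illustration only).
REPLACEMENTS = {"u": "you", "r": "are"}
STOPWORDS = {"the", "a", "is", "are", "so"}

# Matches any dictionary key as a whole word, regardless of case.
_REPLACE_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b", re.IGNORECASE
)


def preprocess(tweet, n=2):
    """Apply the five stages to one tweet and return its list of terms."""
    # 1. Replace tokens: expand abbreviations using the dictionary.
    text = _REPLACE_RE.sub(lambda m: REPLACEMENTS[m.group(0).lower()], tweet)
    # 2. Transform case: lowercase everything.
    text = text.lower()
    # 3. Tokenization: keep runs of letters as tokens.
    tokens = re.findall(r"[a-z]+", text)
    # 4. Filter stopwords.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # 5. n-grams: keep the unigrams and append joined n-grams up to length n.
    grams = list(tokens)
    for size in range(2, n + 1):
        for i in range(len(tokens) - size + 1):
            grams.append("_".join(tokens[i:i + size]))
    return grams


def confusion_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


if __name__ == "__main__":
    print(preprocess("U r SO stupid"))
    # Made-up counts, not the results reported in the paper.
    print(confusion_metrics(tp=8, fp=2, fn=2, tn=8))
```

The terms returned by `preprocess` would then serve as the features for a Naïve Bayes classifier, and `confusion_metrics` would be applied to the predictions of each cross-validation fold.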


Modelling, Cyberbullying, Twitter.


DOI: 10.30595/juita.v6i2.3350

This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2579-8901