Sentiment Analysis of Hate Speech Against Presidential Candidates of the Republic of Indonesia in the 2024 Election Using BERT
Abstract
Hate speech on social media is a growing concern, particularly in political discourse, as the 2024 elections in Indonesia showed. Online platforms such as YouTube are a primary medium for political discussion, which is frequently accompanied by negative or hateful comments directed at presidential candidates. This study analyzes the sentiment of YouTube comments about Indonesian presidential candidates in the 2024 General Election using the BERT algorithm. The data were collected through the YouTube API and labeled with three categories of hate speech: OFP (offensive personal), OFG (offensive group), and OFO (offensive others). The research followed the CRISP-DM method, which comprises six stages: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The results show that the BERT algorithm can classify the comments with a satisfactory level of accuracy. The algorithm can therefore be used to build predictive applications that help identify and manage hate speech on social media.
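The evaluation stage can be illustrated with a minimal sketch of how accuracy over the three categories might be computed. The category codes OFP, OFG, and OFO come from the abstract; the integer ids, the helper names `accuracy` and `per_class_recall`, and the sample annotations are assumptions made for illustration and are not the study's data or results.

```python
from collections import Counter

# Category codes are taken from the abstract; the integer ids are an assumption.
LABELS = ["OFP", "OFG", "OFO"]
LABEL2ID = {name: i for i, name in enumerate(LABELS)}
ID2LABEL = {i: name for name, i in LABEL2ID.items()}


def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted category matches the annotation."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each category present in y_true, keyed by category code."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {ID2LABEL[c]: hits[c] / support[c] for c in sorted(support)}


# Hypothetical annotations and model predictions, for illustration only.
gold = [LABEL2ID[x] for x in ["OFP", "OFP", "OFG", "OFO", "OFG", "OFP"]]
pred = [LABEL2ID[x] for x in ["OFP", "OFG", "OFG", "OFO", "OFG", "OFP"]]

print(accuracy(gold, pred))
print(per_class_recall(gold, pred))
```

Reporting per-class recall next to overall accuracy is a common choice when categories are imbalanced, since a single accuracy figure can hide weak performance on a rare category.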