Analyzing Bias Trade-Offs in Movie Review Sentiment Analysis using a BERT-SVM Enhanced Model

  • Vany Eka, Universitas Amikom Yogyakarta
  • Hastari Utama, Universitas Amikom Yogyakarta
Keywords: Sentiment Analysis, BERT, Bias Mitigation, Algorithmic Fairness, SVM

Abstract

Sentiment analysis of movie reviews often exhibits genre-based bias: model performance varies significantly across genre subgroups, an issue that aggregate accuracy metrics can mask. To address this, we propose a novel fairness-aware hybrid model, BERT-SVM (Fairness-Tuned), which integrates sample re-weighting focused on the lowest-performing genre into the BERT-SVM pipeline. Using a public IMDb movie review dataset from Kaggle, we first train a standard BERT-SVM model and identify Horror as the weakest-performing genre (accuracy: 72.3%, vs. 89.6% overall). We then apply targeted re-weighting that emphasizes underrepresented or misclassified Horror samples during training. The Fairness-Tuned model reduces the accuracy gap by 62%, raising Horror accuracy to 83.1% while maintaining strong overall performance (87.4%). This work quantifies the fairness–accuracy trade-off and demonstrates that lightweight, genre-specific bias mitigation within a hybrid architecture can improve equity without drastic model redesign, highlighting the value of explicit fairness evaluation in NLP applications.
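
The per-genre evaluation and targeted re-weighting described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `accuracy_by_group` and `group_sample_weights`, the `boost` factor of 3.0, and the toy labels are all hypothetical. Only the four accuracy figures at the end come from the abstract.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return ({group: accuracy}, overall_accuracy) for parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    per_group = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_group, overall


def group_sample_weights(groups, target, boost=3.0):
    """Hypothetical rule: weight `boost` for the target group, 1.0 elsewhere."""
    return [boost if g == target else 1.0 for g in groups]


# Toy labels (1 = positive, 0 = negative); illustrative, not from the paper.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0]
genres = ["Drama", "Drama", "Horror", "Horror", "Comedy", "Horror"]

per_genre, overall = accuracy_by_group(y_true, y_pred, genres)
weakest = min(per_genre, key=per_genre.get)      # lowest-accuracy genre
weights = group_sample_weights(genres, weakest)  # per-sample training weights

# Horror-vs-overall gap in percentage points, from the abstract's figures.
gap_before = 89.6 - 72.3
gap_after = 87.4 - 83.1
print(weakest, weights, round(gap_before, 1), round(gap_after, 1))
```

Weights of this kind could then be passed to any classifier that supports per-sample weights; scikit-learn's `SVC.fit`, for instance, accepts a `sample_weight` argument. The abstract does not state which gap definition underlies its 62% figure, so the sketch only computes the Horror-versus-overall gap in percentage points.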

Published
2026-04-19
How to Cite
Vany Eka, & Hastari Utama. (2026). Analyzing Bias Trade-Offs in Movie Review Sentiment Analysis using a BERT-SVM Enhanced Model. Jurnal Sistem Cerdas, 9(1), 1-13. https://doi.org/10.37396/jsc.v9i1.570