Use of Cosine Similarity, Manhattan Distance, and Jaccard Similarity Methods to Improve the Accuracy of Manual Payment Evidence Validation in ERP Applications
Abstract
Manual validation of payment receipts in Enterprise Resource Planning (ERP) applications is often inaccurate, especially when payment data must be matched against existing transactions. Data mismatches can lead to recording errors and increase the burden of manual verification. This study aims to improve the accuracy of payment receipt validation by comparing three methods: Cosine Similarity, Jaccard Similarity, and Manhattan Distance. Optical Character Recognition (OCR) is used to extract the data from scanned receipt images automatically into text for further processing. The experimental results show that Cosine Similarity performs best, with a precision of 100%, a recall of 90%, and an accuracy of 90%. Jaccard Similarity failed to identify any valid data, scoring 0% on all evaluation metrics. Manhattan Distance achieved high precision (100%) but performed poorly on recall and accuracy, both at 10%. Based on these findings, Cosine Similarity is recommended as the most effective of the three methods for OCR-based payment validation in ERP systems. The results also suggest an opportunity for hybrid approaches that combine Cosine Similarity and Manhattan Distance to further improve overall system performance.
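To make the three measures concrete, the following is a minimal sketch of how each could be computed between an OCR-extracted receipt string and an ERP transaction string. It is not the implementation used in the study: the whitespace tokenization, the term-count representation, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the study may use another scheme.
    return text.lower().split()


def cosine_similarity(a, b):
    # Cosine of the angle between the two term-count vectors (1.0 = same direction).
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    # Size of the shared token set divided by the size of the combined token set.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def manhattan_distance(a, b):
    # Sum of absolute differences in term counts (0 = identical counts).
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    return sum(abs(ca[t] - cb[t]) for t in ca.keys() | cb.keys())


# Hypothetical strings, not data from the study.
ocr_text = "transfer 150000 to acme ref inv001"
erp_text = "transfer 150000 to acme ref inv001 paid"
print(cosine_similarity(ocr_text, erp_text))
print(jaccard_similarity(ocr_text, erp_text))
print(manhattan_distance(ocr_text, erp_text))
```

Note that Manhattan Distance is a distance rather than a similarity, so a smaller value means a closer match, and any acceptance threshold has to be chosen on a different scale from the two similarity scores.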