Sentiment Analysis of Twitter Users Regarding Taxation Topics in Indonesia Utilizing Multinomial Naive Bayes

Dewan Dinata Tarigan, Said Iskandar Al Idrus

Abstract


State revenue depends heavily on taxes, which help fund improvements in public welfare. Public confidence in the tax authorities plays a key role in increasing tax receipts, so it is important to measure that confidence. One way to do so is sentiment analysis, which helps reveal public views on tax regulations, services, performance, and policies. This study measures the sentiment of Twitter users towards taxation in Indonesia. The analysis consists of data collection, preprocessing, dataset splitting, feature extraction, classification, and evaluation. The classification model is Multinomial Naive Bayes, with the data split into 80% training data and 20% test data. The results show that 89.65% of tweets about taxation in Indonesia carry negative sentiment. The model was evaluated in two test scenarios: the initial data and randomly undersampled data. Classification on the initial data achieved an accuracy of 89.97%, a precision of 46.68%, and a sensitivity of 33.61%. On the undersampled data, accuracy reached 53.28%, precision 52.66%, and sensitivity 52.52%. The analysis shows significant differences between the two scenarios, with undersampling producing a more balanced distribution of data. Even so, the model still has difficulty classifying positive and neutral tweets because of the dominance of negative sentiment.
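The gap between high accuracy and low sensitivity on the initial data can be illustrated with a minimal sketch. It assumes that precision and sensitivity are macro-averaged over the three sentiment classes, which the abstract does not state. The toy labels, the always-negative predictor, and the helper names `random_undersample` and `macro_scores` are hypothetical and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def random_undersample(labels, seed=0):
    """Return sorted indices that keep an equal number of items per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    # Every class is cut down to the size of the smallest class.
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision and recall (sensitivity)."""
    classes = sorted(set(y_true))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)
        actual = sum(t == c for t in y_true)
        # A class that is never predicted contributes a precision of 0.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual)
    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n


# Toy labels (assumption): 90% negative, loosely mirroring the reported imbalance.
labels = ["negative"] * 90 + ["neutral"] * 6 + ["positive"] * 4

# A degenerate classifier that always predicts the majority class.
acc, prec, rec = macro_scores(labels, ["negative"] * len(labels))

# After random undersampling, every class has as many items as the smallest one.
kept = random_undersample(labels)
balanced = [labels[i] for i in kept]
counts = Counter(balanced)
acc_b, prec_b, rec_b = macro_scores(balanced, ["negative"] * len(balanced))

print(f"imbalanced: acc={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
print(f"balanced:   acc={acc_b:.2f} precision={prec_b:.2f} recall={rec_b:.2f}")
```

On these toy labels, always predicting negative gives 0.90 accuracy but a macro recall of only 1/3, and accuracy falls to 1/3 once the classes are balanced. Under the macro-averaging assumption, a reported sensitivity near one third is consistent with a model that rarely predicts the positive or neutral class, which is why the undersampled scores can be the more informative of the two.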




DOI: https://doi.org/10.24114/j-ids.v3i1.52465


Journal of Informatics and Data Science (J-IDS)

ISSN (Online) : 2964-0415

Published By Computer Science Study Program, Faculty of Mathematics and Natural Sciences, Universitas Negeri Medan.

Website: https://jurnal.unimed.ac.id/2012/index.php/jids/index

Email : jids@unimed.ac.id

This work is licensed under a Creative Commons Attribution 4.0 International License.