Performance Comparison of the Gaussian Naive Bayes and Decision Tree Classifier Algorithms in Classifying AI-Generated Image Prompts

Authors

  • Agung Riyadi Politeknik Negeri Batam
  • Putri Paramitha Politeknik Negeri Batam

DOI:

https://doi.org/10.24114/cess.v10i2.66521

Keywords:

artificial intelligence, text classification, Naive Bayes Classifier, Decision Tree Classifier, Machine Learning

Abstract

A website developed by the researchers hosts millions of prompts and their AI-generated images, and it serves content to users slowly and inefficiently. Without a proper categorization system, filtering and searching content is slow, so an automated classification system is needed to improve access speed and user experience. This study compares the performance of the Gaussian Naive Bayes and Decision Tree Classifier algorithms in classifying text-to-image prompts into three categories: Background/Texture, Landscape, and Arts. The dataset consists of 7,040 manually categorized prompts. The methodology covers data preprocessing, text representation using Bag of Words, application of both classification algorithms, and evaluation using accuracy, precision, recall, and F1-score. The results show that Decision Tree achieved the highest accuracy, 99.16%, outperforming Gaussian Naive Bayes, which reached only 61.74%. These findings indicate that Decision Tree is better able to handle the complexity of prompt characteristics and can be implemented to improve the efficiency of content search and filtering on generative AI platforms.
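The Bag of Words representation and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenization, the macro averaging of precision, recall, and F1, the function names, and the toy prompts and labels are all assumptions made for illustration, and the classifiers themselves are left out.

```python
from collections import Counter


def build_vocabulary(prompts):
    """Return the sorted list of distinct lowercase tokens in the prompts."""
    # Assumption: tokens are obtained by lowercasing and splitting on whitespace.
    return sorted({token for prompt in prompts for token in prompt.lower().split()})


def bag_of_words(prompt, vocabulary):
    """Map a prompt to a count vector with one entry per vocabulary word."""
    counts = Counter(prompt.lower().split())
    return [counts[word] for word in vocabulary]


def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the abstract does not state which
    averaging the study used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Hypothetical toy prompts, not taken from the study's dataset.
prompts = [
    "misty mountain lake at sunrise",
    "seamless marble texture",
    "oil painting of a cat",
]
vocabulary = build_vocabulary(prompts)
vector = bag_of_words("mountain lake lake", vocabulary)

# Hypothetical true labels and classifier predictions for four prompts.
y_true = ["Landscape", "Background/Texture", "Arts", "Arts"]
y_pred = ["Landscape", "Background/Texture", "Arts", "Landscape"]
scores = evaluate(y_true, y_pred)
```

In practice, libraries such as scikit-learn provide `CountVectorizer`, `GaussianNB`, and `DecisionTreeClassifier` for these steps; the abstract does not say which tooling the study used.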

Author Biographies

Agung Riyadi, Politeknik Negeri Batam

Informatics Engineering Study Program, Politeknik Negeri Batam

Putri Paramitha, Politeknik Negeri Batam

Informatics Engineering Study Program, Politeknik Negeri Batam

Published

2025-07-16

How to Cite

Riyadi, A., & Paramitha, P. (2025). Perbandingan Performa Algoritma Gaussian Naive Bayes dan Decision Tree Classifier dalam Klasifikasi Prompt AI-Generated Image. CESS (Journal of Computer Engineering, System and Science), 10(2), 389–398. https://doi.org/10.24114/cess.v10i2.66521

Section

Articles
