Collaboration of Nazief & Adriani Stemming Algorithm with PostgreSQL Queries Parsing Method to Search for New Study Program Names

Indra Chaidir

Abstract


Proposals for new vocational study program names submitted through the Silemkerma application at the Directorate General of Vocational Higher Education, Ministry of Education, Culture, Research, and Technology, are often rejected because the proposed name resembles a study program name that already exists in the database. Many existing records are not found because the data filter relies on a conventional method, in this case the ILIKE operator with the % (percent) wildcard pattern, even though the data being searched for is present in the database. This happens because ILIKE cannot recognize inflected forms of a lexeme (root word): for example, "pengelolaan" carries a prefix and a suffix around the root word "kelola". To address this problem, the author uses the Nazief & Adriani algorithm for stemming to obtain the lexemes of the entered sentence. The output of the algorithm is then processed with the Parsing Queries method, one of the Full Text Search methods available in PostgreSQL. The results of this research can be implemented in the application.
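The mismatch described in the abstract can be illustrated with a minimal Python sketch. It is not the full Nazief & Adriani algorithm and not the author's implementation: it is a toy dictionary-backed affix stripper, and the root-word list, the affix table, the table name `program_studi`, and the column `nama_stem` are all assumptions made for illustration. It shows why a substring filter misses "pengelolaan" when searching for "kelola", and how stemmed terms could be joined into a string for PostgreSQL's `to_tsquery`.

```python
import re

# Hypothetical mini dictionary of root words; a real stemmer needs a full one.
ROOT_WORDS = {"kelola", "informasi", "teknologi", "data"}

# Simplified affix tables (assumption): each prefix lists the letters that
# may have been dropped from the start of the root when it was attached.
SUFFIXES = ("kan", "an", "i")
PREFIXES = (
    ("peng", ("", "k")), ("meng", ("", "k")),
    ("pem", ("", "p")), ("mem", ("", "p")),
    ("pen", ("", "t")), ("men", ("", "t")),
    ("pe", ("",)), ("me", ("",)), ("di", ("",)), ("ber", ("",)),
)


def stem(word: str) -> str:
    """Return the dictionary root of `word`, or `word` itself if none is found."""
    word = word.lower()
    if word in ROOT_WORDS:
        return word
    # Try the word as-is, then with each matching suffix removed.
    candidates = [word] + [word[: -len(s)] for s in SUFFIXES if word.endswith(s)]
    for cand in candidates:
        if cand in ROOT_WORDS:
            return cand
        for prefix, restored_letters in PREFIXES:
            if cand.startswith(prefix):
                rest = cand[len(prefix):]
                for first in restored_letters:
                    if first + rest in ROOT_WORDS:
                        return first + rest
    return word


def build_tsquery(text: str) -> str:
    """Stem every alphabetic token and join the stems with the AND operator."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " & ".join(stem(t) for t in tokens)


# A substring filter (what ILIKE '%kelola%' does) misses the affixed form,
# because the leading "k" of the root is dropped after the prefix "peng".
substring_match = "kelola" in "pengelolaan"

# Hypothetical query; `nama_stem` is assumed to hold pre-stemmed names.
SQL = (
    "SELECT nama FROM program_studi "
    "WHERE to_tsvector('simple', nama_stem) @@ to_tsquery('simple', %s)"
)

if __name__ == "__main__":
    print(substring_match)                         # False
    print(stem("pengelolaan"))                     # kelola
    print(build_tsquery("Pengelolaan Informasi"))  # kelola & informasi
```

The string returned by `build_tsquery` would be passed as the single parameter of `SQL` through a database driver. The `simple` text search configuration is used here so that PostgreSQL does not apply its own stemming on top of the stems produced in Python.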


Keywords


Natural Language Processing; Stemming; Parsing Queries; Full Text Search; PostgreSQL; New Study Program; Nomenclature

DOI: https://doi.org/10.24114/cess.v8i2.48212

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

CESS (Journal of Computer Engineering, System and Science)
