Melody Transcription from Monophony Audio with Fast Fourier Transform
Abstract
Music has been an inseparable part of human life since ancient times. One widely studied form is monophonic music, in which only a single note sounds at a time. In the digital era, melody transcription, the conversion of recorded sound into musical notation, has become an important task in music processing. This study addresses melody transcription from monophonic recordings using the Fast Fourier Transform (FFT). It aims to assess how accurately FFT extracts frequency components from monophonic signals so that they can be converted into musical notation. The methodology comprises four steps: collecting monophonic recordings of piano and guitar, preprocessing the audio to remove noise and normalize volume, applying FFT to extract frequency features, and mapping those frequencies to musical notation. Evaluation uses Dynamic Time Warping (DTW) and a confusion matrix to measure accuracy, precision, recall, and F1-score. The FFT-based transcription system achieves an accuracy of 99.24% for piano and 98.86% for guitar. The study also highlights how noise and audio quality affect transcription accuracy, and it notes the limitations of FFT in resolving closely spaced frequencies. Despite these limitations, FFT proves to be an efficient method for transcribing simple monophonic melodies. Future research could explore hybrid approaches that combine FFT with other pitch detection algorithms to improve transcription accuracy.
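The two central steps described in the abstract, extracting a frequency with FFT and mapping it to a note, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fft`, `dominant_frequency`, `frequency_to_note`), the Hann window, the 8192 Hz sample rate, the 2048-sample frame, and the A4 = 440 Hz equal-temperament reference are all assumptions chosen for illustration. The sketch uses only the Python standard library and picks the single strongest spectral bin, which may be an overtone rather than the fundamental on real instrument recordings.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(frame, sample_rate):
    """Return the frequency in Hz of the strongest non-DC bin of one frame."""
    n = len(frame)
    # Hann window (an assumed choice) to reduce spectral leakage.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    spectrum = fft(windowed)
    # Only bins 1 .. n/2 - 1 are inspected: DC is skipped, and the upper
    # half of the spectrum mirrors the lower half for real-valued input.
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = magnitudes.index(max(magnitudes)) + 1
    return peak_bin * sample_rate / n


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Map a frequency to the nearest equal-tempered note name, e.g. 'A4'."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Usage: a synthetic 440 Hz sine stands in for a recorded note.
SAMPLE_RATE = 8192  # Hz, assumed so that bins fall on multiples of 4 Hz
FRAME_SIZE = 2048   # samples, must be a power of two
frame = [math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(FRAME_SIZE)]
detected = dominant_frequency(frame, SAMPLE_RATE)
print(detected, frequency_to_note(detected))  # 440.0 A4
```

With these assumed settings each FFT bin spans 4 Hz (sample rate divided by frame size). That is coarser than the gap between neighbouring low notes, which illustrates the abstract's remark about FFT struggling with closely spaced frequencies. A longer frame narrows the bins at the cost of time resolution.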
Published
2024-11-30 — Updated on 2024-11-30
Versions
- 2024-11-30 (2)
- 2024-11-30 (1)
Section
Articles