Application of Random Forest for Heart Disease Classification with SMOTE Approach to Balance Data
DOI:
https://doi.org/10.24114/j-ids.v3i2.66481
Abstract
To improve the accuracy and efficiency of heart disease detection, this work builds a heart disease prediction model using the Random Forest machine learning algorithm. The dataset contains 255 samples with 17 independent variables covering lifestyle and health factors. Because the "Yes" (heart disease) and "No" (no heart disease) classes are imbalanced, the work applies SMOTE (Synthetic Minority Over-sampling Technique), which balances the class distribution by adding synthetic samples to the minority class. The resulting Random Forest model achieved an accuracy of 94.7% and an AUC of 0.983, indicating that it can effectively separate persons with and without heart disease. These results show that applying SMOTE considerably improves model performance under class imbalance and supports the development of machine learning-based predictive systems for heart disease classification. The novelty of this work lies in its use of SMOTE to overcome data imbalance in heart disease prediction, providing an efficient solution for data-driven medical decision making.
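The oversampling step described in the abstract can be illustrated with a minimal sketch. It assumes purely numeric features and Euclidean distance, and it is not the authors' implementation: the function name `smote`, the parameter `k`, and the toy data are hypothetical and chosen only for illustration. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Illustrative sketch only; assumes every feature is numeric.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 18 majority ("No") and 4 minority ("Yes") samples.
majority = [[float(x), float(y)] for x in range(6) for y in range(3)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5], [12.0, 11.0]]

# Add just enough synthetic minority samples to equalize the class counts.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

In a typical workflow the balanced data would then be passed to a Random Forest classifier, with oversampling applied only to the training split so that evaluation metrics such as accuracy and AUC are measured on untouched data. Whether the study followed that ordering is not stated in the abstract.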
Published
2025-10-12 — Updated on 2024-11-30
Issue
Section
Articles