Badiaa Rahman Khalil

University of Suleimani


Dr. Sozan Sabir Haider

University of Suleimani


Dr. Mohammad Mahmood Faqe

University of Suleimani


Heart failure is the heart’s inability to pump blood efficiently, which gives rise to clinical symptoms. The condition affects millions of people worldwide and causes hundreds of thousands of deaths each year. This study applies two machine learning models, Random Forest and Naïve Bayes, to classify a dataset of 299 heart failure patients, obtained from the UCI repository in 2015, according to their survival outcomes during follow-up. We assess the performance of both classifiers using several evaluation metrics. With the Random Forest technique, multiple decision trees are trained on diverse subsets of the data (at a ratio of 80%), and their predictions are aggregated to assign each patient to a category. In parallel, the Naïve Bayes algorithm computes a probability for each category from the patient attributes, assigning probabilities such as 0.68 for patients likely to pass away and 0.32 for those likely to survive. Training-test ratios of 60-40%, 70-30% and 80-20% are explored for Random Forest alongside Naïve Bayes. The Random Forest classifier shows higher accuracy and predictive capability than the Naïve Bayes classifier. With 80% of the dataset used for training and 20% for testing, the Random Forest model achieves an accuracy of 85%. The 80-20% ratio consistently yields the highest accuracy, which underlines the importance of data partitioning for accurate patient classification. The study also demonstrates the potential of machine learning techniques in medical research.

Keywords: Machine Learning, Random Forest, Naïve Bayes, Heart Failure, Survival Outcomes, Classification.
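
The three mechanisms named in the abstract (splitting by training-test ratio, aggregating tree predictions, and computing Naïve Bayes class probabilities) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the random seed, the use of majority voting as the aggregation rule, and all numeric inputs are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random
from collections import Counter


def split_indices(n, train_ratio, seed=0):
    """Shuffle record indices 0..n-1 and split them into train and test lists.

    The seed is an arbitrary, hypothetical choice made for reproducibility.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_ratio)
    return indices[:cut], indices[cut:]


def majority_vote(tree_predictions):
    """Aggregate the class labels predicted by individual trees.

    Majority voting is assumed here as the aggregation rule.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


def naive_bayes_posterior(prior, feature_likelihoods):
    """Return normalized class probabilities under the independence assumption.

    prior[c] is P(c); feature_likelihoods[c] lists P(x_j | c) for each attribute.
    """
    joint = {c: prior[c] * math.prod(feature_likelihoods[c]) for c in prior}
    total = sum(joint.values())
    return {c: value / total for c, value in joint.items()}


if __name__ == "__main__":
    # Partition sizes for the three ratios explored, with 299 records.
    for ratio in (0.6, 0.7, 0.8):
        train, test = split_indices(299, ratio)
        print(f"{ratio:.0%} training: {len(train)} train, {len(test)} test")

    # Made-up votes from three hypothetical trees for one patient.
    print(majority_vote(["died", "survived", "died"]))

    # Made-up priors and per-attribute likelihoods, not values from the study.
    print(naive_bayes_posterior(
        {"died": 0.5, "survived": 0.5},
        {"died": [0.4, 0.5], "survived": [0.2, 0.5]},
    ))
```

With 299 records, the 80-20% split in this sketch gives 239 training and 60 test records. The Naïve Bayes output always sums to 1 across the two categories, which is the form in which a pair such as 0.68 and 0.32 arises.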