ISSN:1005-3026

THE RULE-BASED MULTI-CLASS CLASSIFICATION MODEL PREDICTS EARLY DIABETES USING SUPERVISED MACHINE LEARNING TECHNIQUES

R. Karthikeyan1a, b*, P. Geetha2, E. Ramaraj3

1aPh.D. Research Scholar, Department of Computer Science, Alagappa University, Karaikudi, India-630003

1bHead i/c, Department of Computer Science, SRM Arts and Science College, Kattankulathur, Chennai, India-603203

2Associate Professor & Head, PG Department of Computer Science, Dr. Umayal Ramanathan College for Women, Karaikudi, India-630003

3Professor, Department of Computer Science, Alagappa University, Karaikudi, India-630003

*Corresponding Author Email: karthikeyan.r@srmasc.ac.in

Abstract

Diabetes is a metabolic disorder characterized by high blood sugar levels, in which the body either fails to produce enough insulin or fails to use the insulin it produces adequately. When pre-diabetes is not detected early, it can progress to diabetes. Previous diabetes research considered only two possible outcomes: Tested Negative or Tested Positive. The primary goal of this study is to identify pre-diabetes, in addition to the Tested Negative and Tested Positive outcomes, using a Rule-Based Multi-Class Classification Algorithm, so that progression to Type II Diabetes can be avoided. This research uses the PIMA dataset. Variable importance analysis identifies the most important factors in the dataset, such as BMI, plasma glucose, and blood pressure, and the rules are developed from these important variables. Using Supervised Machine Learning methods such as Decision Tree, RepTree, and Logistic Regression, the Rule-Based Multi-Class Classification model classifies individuals as Non-Diabetic, Pre-Diabetic, or Diabetic. Previous research has reported limitations of Machine Learning classifiers for diabetes prediction in terms of data size, accuracy, and multi-class predictor variables. On the experimental data, the proposed system outperforms these classifiers in predicting Diabetes Mellitus.

Keywords: Supervised Machine Learning, Diabetes, Decision Tree, Logistic Regression, RepTree, Rule-Based Multi-Class.
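
The three-class labelling described in the abstract can be illustrated with a minimal sketch. The rules below are not the rules developed in this study: they use only plasma glucose, and the cut-offs (140 and 200 mg/dL, the commonly cited 2-hour oral glucose tolerance test thresholds) and the function name `classify_glucose` are assumptions made purely for illustration.

```python
# Illustrative sketch only: a hypothetical single-variable rule set.
# The thresholds are assumed (commonly cited 2-hour OGTT cut-offs in mg/dL),
# not the rules derived in the study.
PRE_DIABETIC_MIN = 140.0  # assumed lower bound of the pre-diabetic range
DIABETIC_MIN = 200.0      # assumed lower bound of the diabetic range


def classify_glucose(glucose_mg_dl: float) -> str:
    """Map a plasma glucose reading to one of the three class labels."""
    if glucose_mg_dl >= DIABETIC_MIN:
        return "Diabetic"
    if glucose_mg_dl >= PRE_DIABETIC_MIN:
        return "Pre-Diabetic"
    return "Non-Diabetic"


if __name__ == "__main__":
    # Label a few hypothetical readings.
    for reading in (110.0, 155.0, 210.0):
        print(reading, classify_glucose(reading))
```

Labels produced this way could then serve as the target variable for a supervised classifier such as a decision tree; a fuller rule set would also draw on BMI and blood pressure, which the abstract names as important variables.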