AN EFFECTIVE PRE-PROCESSING TECHNIQUE FOR DIABETES MELLITUS PREDICTION IN HEALTHCARE SYSTEMS
Diabetes Mellitus (DM) is a chronic disease characterized by elevated blood sugar levels; if left untreated, it can result in severe health complications such as cardiac disorders, kidney damage, and stroke. However, current Machine Learning (ML) and Deep Learning (DL) approaches face challenges in accurately predicting diabetes in patients. To address this, this research proposes a pre-processing technique that incorporates outlier identification and removal, missing value imputation, and standardization to enhance the accuracy of diabetes prediction, and evaluates it on large datasets. Data is collected from the PIMA Dataset and the North California State University (NCSU) dataset. Next, a bivariate filter-based feature selection is performed to identify relevant features. The chosen features are then refined using Pearson correlation with a threshold value, which selects the most effective features. For diabetes classification, an Artificial Neural Network (ANN) with initialized weights is used. The experimental results show that the proposed approach significantly outperforms existing methods, achieving a classification accuracy of 98.99%.
Artificial Neural Network, Bivariate filter, Diabetes mellitus, Pearson correlation, Standardization.
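The pre-processing and feature-refinement steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the imputation rule, the outlier criterion, or the correlation threshold, so the choices below (median imputation, the 1.5 x IQR rule, z-score standardization, and a threshold of 0.2 on the absolute Pearson correlation with the class label) are assumptions made only for illustration, and all function names are hypothetical. The bivariate filter stage and the ANN classifier are not reproduced here.

```python
import math
import statistics


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values.
    Median imputation is an assumed choice, not taken from the paper."""
    observed = [v for v in column if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in column]


def iqr_keep_mask(column, k=1.5):
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR].
    The IQR rule with k=1.5 is an assumed outlier criterion."""
    q1, _, q3 = statistics.quantiles(column, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= v <= high for v in column]


def standardize(column):
    """Z-score standardization: zero mean and unit (population) variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std if std else 0.0 for v in column]


def preprocess(features):
    """Impute, drop rows flagged as outliers in any feature, then standardize.
    `features` maps a feature name to a list of floats or None.
    Returns the cleaned columns and the indices of the rows that were kept."""
    imputed = {name: impute_median(col) for name, col in features.items()}
    masks = [iqr_keep_mask(col) for col in imputed.values()]
    kept = [i for i in range(len(masks[0])) if all(m[i] for m in masks)]
    cleaned = {name: standardize([col[i] for i in kept])
               for name, col in imputed.items()}
    return cleaned, kept


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(features, target, threshold=0.2):
    """Keep features whose absolute correlation with the label reaches the
    threshold. The value 0.2 is a placeholder, not the paper's setting."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) >= threshold]
```

In use, `preprocess` would be applied to the raw feature columns first, the class labels would be restricted to the returned row indices, and `select_features` would then be called on the cleaned columns. Fitting the imputation and scaling statistics on the training split only would avoid leaking information from the test data.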