Data Mining Analysis Of Chronic Kidney Disease (CKD) Level
Date
2022-08-05
Authors
Mohd Harizi, Muhammad Hafizam Afiq
Publisher
Universiti Sains Malaysia
Abstract
Chronic Kidney Disease (CKD) is a condition in which kidney function declines
progressively over time. The rise in the number of patients with CKD and the fatal
consequences of end-stage CKD pose a significant challenge globally. Studies have
widely focused on the clinical progression and contributing factors of the disease.
However, data mining analysis of CKD remains limited. Few studies have
considered the factors that accurately classify CKD levels. The main
goal of this study is to identify the patient demographics, clinical indicators, and
risk factors that contribute to CKD stages, and to develop a data mining model of the
risk factors for CKD progression stages. The CKD case study consists of clinical data
records extracted from the UCI Machine Learning Repository. Data mining
analysis takes place in four stages: data pre-processing, data classification,
classification of attributes by stages of CKD, and classification model verification,
using Microsoft Excel and Waikato Environment for Knowledge Analysis (WEKA)
version 3.8.5 software. Data classifications are performed with 10-fold cross-validation
using Naïve Bayes (NB), Support Vector Machine (SVM, implemented in WEKA as
Sequential Minimal Optimization, SMO), and J48 decision trees. The
ZeroR algorithm was set as the baseline. There are three levels of classification
analyses: before and after handling the missing values, before and after the outliers’
treatment, and adding uncertain classes. Findings show no difference in classification
accuracies before and after treating the missing values using NB (96.0%) and SMO
(98.5%), except for J48, which showed a slight improvement of 1% to 97.8%. The
classification accuracies after the outliers’ treatment are 97.4%, 98.4%, and 97.1% for NB,
SMO, and J48 respectively. After adding the uncertain class, the best accuracy obtained was
98.5% using the SMO algorithm. A predictive model that classifies cases into the
three classes was developed accordingly using the SMO algorithm.
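
The evaluation setup described above (a ZeroR baseline scored under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the study's WEKA workflow: the function names, the contiguous fold split, and the toy labels are all assumptions made for illustration, and WEKA's own evaluation may differ, for example in how folds are drawn.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def zero_r_fit(labels):
    """ZeroR: ignore all attributes and return the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate_zero_r(labels, k=10):
    """Return the pooled accuracy of ZeroR over k cross-validation folds."""
    correct = 0
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        majority = zero_r_fit(train_labels)
        correct += sum(1 for i in test_idx if labels[i] == majority)
    return correct / len(labels)


# Hypothetical toy labels (NOT the study's data): 12 "ckd" and 8 "notckd".
toy_labels = ["ckd", "ckd", "notckd", "ckd", "notckd"] * 4
baseline_accuracy = cross_validate_zero_r(toy_labels, k=10)
print(f"ZeroR baseline accuracy: {baseline_accuracy:.3f}")
```

A classifier such as NB, SMO, or J48 is only informative to the extent that its cross-validated accuracy exceeds this majority-class baseline computed on the same folds.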