Data Mining Analysis Of Chronic Kidney Disease (CKD) Level

Date

2022-08-05

Authors

Mohd Harizi, Muhammad Hafizam Afiq

Publisher

Universiti Sains Malaysia

Abstract

Chronic Kidney Disease (CKD) is a condition in which the kidneys' functions fail and worsen over time. The rise in the number of patients with CKD and the fatal consequences of end-stage CKD pose a significant challenge globally. Studies have widely focused on the clinical progression and attributing factors of the disease. However, data mining analyses addressing CKD remain scarce, and few studies have considered the attributing factors that accurately classify the CKD levels. The main goal of this study is to discover the CKD patient demographics, clinical indicators, and risk factors attributing to CKD stages, as well as to develop a data mining model of the risk factors for CKD progression stages. The CKD case study consists of clinical data records extracted from the UCI Machine Learning Repository. Data mining analysis takes place in four stages: data pre-processing, data classification, classification of attributes by stages of CKD, and classification model verification, using Microsoft Excel and Waikato Environment for Knowledge Analysis (WEKA) version 3.8.5 software. Data classifications are performed with 10-fold cross-validation using Naïve Bayes (NB), Support Vector Machine (SVM, implemented in WEKA as SMO), and J48 decision trees. The ZeroR algorithm was set as the baseline. There are three levels of classification analyses: before and after handling the missing values, before and after the outliers' treatment, and adding uncertain classes. Findings show no difference in classification accuracies before and after treating the missing values using NB (96.0%) and SMO (98.5%), except for J48, which shows a slight improvement of 1% to 97.8%. The classification accuracies after the outliers' treatment are 97.4%, 98.4%, and 97.1% for NB, SMO, and J48 respectively. After adding the uncertain class, the best accuracy obtained was 98.5%, using the SMO algorithm. A predictive classification model that determines the accuracy for three classification classes was developed accordingly using the SMO algorithm.
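The role of the ZeroR baseline under 10-fold cross-validation can be illustrated with a minimal sketch. This is not the study's WEKA workflow: the function names, the unstratified fold split, and the toy labels below are assumptions made purely for illustration, and the labels are not the UCI data.

```python
import random
from collections import Counter


def zero_r_fit(train_labels):
    """ZeroR: ignore all attributes and return the most frequent class."""
    return Counter(train_labels).most_common(1)[0][0]


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split into k nearly equal folds.

    Simplification: the folds are NOT stratified by class.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_accuracy(labels, k=10, seed=0):
    """Accuracy of ZeroR, pooled over the k held-out folds."""
    folds = k_fold_indices(len(labels), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        majority = zero_r_fit(train)  # fitted on the training part only
        correct += sum(1 for i in test_idx if labels[i] == majority)
    return correct / len(labels)


# Hypothetical toy labels for illustration only, NOT the study's data.
toy_labels = ["ckd"] * 30 + ["notckd"] * 20
baseline = cross_validated_accuracy(toy_labels, k=10)
print(f"ZeroR 10-fold accuracy on toy labels: {baseline:.3f}")
```

On these toy labels the majority class wins in every training split, so the baseline equals the majority-class proportion. A classifier such as NB, SMO, or J48 is informative only to the extent that it exceeds this figure.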

URI

http://hdl.handle.net/123456789/16803

Collections

Pusat Pengajian Kejuruteraan Mekanikal - Monograf
