Discussion
Our findings demonstrate that ML classification models are well suited for meaningful prediction of the initiation of maintenance dialysis in patients with CKD stages 3–5. The ANN method performed better than the other models for 1-year and 3-year prediction of dialysis commencement, with a higher AUC (0.96 and 0.92, respectively), good sensitivity (0.88 and 0.87) and specificity (0.75 and 0.79).
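For readers less familiar with these metrics, the following minimal Python sketch shows how sensitivity, specificity and AUC can be computed from predicted probabilities. It is illustrative only and is not the study's evaluation code: the function names, the toy data and the 0.5 cut-off are assumptions, and the AUC is computed with the standard rank-based (Mann-Whitney) definition.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = started dialysis)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data, not study data.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
# The 0.5 cut-off is an arbitrary assumption for illustration.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
model_auc = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen cut-off, whereas AUC summarises discrimination across all cut-offs, which is why the three values are reported together.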
In previous studies, AI models have been applied to predict CKD progression and the start of KRT. In 2015, Jamshid Norouzi et al used an adaptive neuro-fuzzy inference system to predict renal failure progression. Their model could accurately (>95%) predict the GFR for 6-month to 18-month intervals. However, only 465 patients with CKD were included in their study, and proteinuria was not an important feature in their model.8 In 2019, Jing Xiao et al developed ML models to predict CKD progression. Their model used only the patients’ demographics and biochemical blood features, not features derived from urinalysis. Moreover, the predictive power of the model was not high (AUC: 0.873, sensitivity: 0.83 and specificity: 0.82).17 Another model was built using only comorbidity data from 8492 patients to predict the onset of KRT, and its performance was even lower (AUC, sensitivity and specificity were only 0.773, 0.623 and 0.781, respectively).7 Recently, Qiong Bai et al also developed an ML model to predict the risk of ESRD. This model incorporated many important factors associated with the progression of CKD, including demographics, blood tests and comorbidities, but not proteinuria. However, its predictive performance did not improve on that of previous models.18 This could be explained by the fact that many patients in that study were in the early stages of CKD, so that only a small percentage progressed to ESRD during the short follow-up period. Such an imbalance in the outcome can significantly affect a model’s predictive power.
In this study, we focused only on patients with CKD stages 3–5, whose risk of progression to ESRD is high. Hence, predicting the time of their dialysis commencement is of practical value in daily clinical care. Moreover, we carefully identified the model features associated with CKD progression and KRT based on the clinical setting and traditional logistic regression analysis. Forty-five significant prognostic factors were selected, including patient demographics, comorbidities, routine blood and urine tests and commonly used medications. These choices may explain the higher predictive accuracy of our model.
In further analysis of the ANN model, we ranked all predictors according to their influence on the 1-year and 3-year models using SHAP values.19 Notably, each model had several distinct features. For example, age, comorbidity (CCI score), PLT counts and WBC counts were important contributing factors in the 1-year prediction model, whereas gender and medications such as proton pump inhibitors, beta-lactam antibacterial agents, organic nitrates and H2-receptor antagonists were relevant factors in the 3-year model. Important factors common to both models included eGFR at baseline, blood urea, serum creatinine and albuminuria (see figure 3). These are also key determinants for the risk classification of CKD according to the 2012 KDIGO guidelines.13 20 Other contributing factors in both models included serum Hgb level, TG or CHOL levels, hypertension, diabetes mellitus, diuretic use, antihypertensive agents and medications for controlling blood glucose levels (see figure 3).
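As an illustration of how predictors can be ranked from SHAP values, the sketch below orders features by their mean absolute SHAP value across patients. This is a widely used convention, but it is assumed here rather than taken from the study's methods. The function name, feature labels and numbers are hypothetical, and the SHAP values are supplied as a plain table rather than computed from a model.

```python
def rank_by_mean_abs_shap(feature_names, shap_rows):
    """Rank features by mean absolute SHAP value.

    shap_rows holds one row per patient and one SHAP value per feature,
    in the same order as feature_names.
    """
    n_patients = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n_patients
        for j, name in enumerate(feature_names)
    }
    # Largest average contribution first.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Made-up SHAP values for two patients and three features (not study data).
feature_names = ["eGFR", "blood_urea", "age"]
shap_rows = [
    [-0.4, 0.1, 0.05],
    [0.6, -0.3, 0.05],
]

ranking = rank_by_mean_abs_shap(feature_names, shap_rows)
```

Taking the absolute value before averaging matters: a feature that pushes the prediction strongly up for some patients and strongly down for others would otherwise appear unimportant.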
Anaemia typically develops during the course of CKD; a decrease in serum Hgb is significantly associated with the progression of CKD.21 22 Diabetic nephropathy is the leading cause of ESRD in adults.23 24 In patients with diabetic CKD, blood glucose levels are associated with poor outcomes such as serum creatinine doubling, ESRD and mortality, and intensive glycaemic control could reduce these risks.25–29 Additionally, several studies have demonstrated that dyslipidaemia is independently associated with rapid renal progression, KRT, all-cause mortality and cardiovascular death in predialysis patients.30–33 Hypertension may occur early during the course of CKD and is related to a more rapid decline of kidney function, the development of cardiovascular disease (CVD) and death in patients with CKD.34 35 Early intervention and tight control of blood pressure could lessen the risk of CVD and all-cause death in patients with and without CKD.36 37 Diuretics are an important part of guideline-directed medical therapy for patients with CKD who have hypertension, oedema or hyperkalaemia.38 In terms of adverse effects, whether diuretics are an independent risk factor for CKD progression remains controversial. However, these medicines played important roles in both our models.39–41 Therefore, diuretics should be used with caution in patients with CKD stages 3–5. Finally, the GFR decline rate is also influenced by some immutable patient factors. The Kidney Disease Outcomes Quality Initiative guideline has provided ample evidence that African-American race (not examined in this study), male gender and older age are related to a more rapid GFR reduction.20 In summary, our models take advantage of the important factors involved in the progression of CKD, are consistent with current clinical practice guidelines and are highly applicable. They could be a good screening tool to determine the likelihood of initiating long-term dialysis using the clinical data already available for a patient.
Several limitations need to be addressed. First, because patients’ weight and height data were unavailable, eGFR was not adjusted for body surface area. As a result, the determination of G-stage using unadjusted eGFR may be inaccurate for patients with a large body size. Second, the decline in eGFR from the previous measurement to baseline may not capture the progression of CKD as accurately as the annual decline in eGFR during the follow-up period. Consequently, this factor did not significantly contribute to our model. Third, we used only retrospective data from three hospitals in Taipei to create our models, and it is widely recognised that racial and regional variables also influence CKD progression. Further work should involve training and validating the models on multinational and multiracial data before clinical application is generalised. Fourth, we incorporated all important features into the prediction model, acknowledging that this approach might not be practical for clinical implementation. Nevertheless, these features underwent meticulous screening and hold varying degrees of significance in relation to CKD progression. Additionally, we assessed the model using only the top 10 important features and obtained comparable results (online supplemental tables S4 and S5, online supplemental figures S1 and S2).