Authors: Xiao-Qin Luo, Yi-Xin Kang, Shao-Bin Duan, Ping Yan, Guo-Bao Song, Ning-Ya Zhang, Shi-Kun Yang, Jing-Xin Li, Hui Zhang
J Med Internet Res. 2023. 25 (2023): e41142.
Commentary by:
Tommy Rappold, Jr. MD,* Jamie Sinton, MD,** J. “Nick” Pratap, MB, BChir*
*Divisions of Cardiac Anesthesiology and Cardiac Critical Care Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA
**Department of Anesthesia, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
Take-home points
What is already known:
- Cardiac surgery-associated acute kidney injury (CSA-AKI) is common following heart surgery in children and is associated with increased morbidity and mortality.
- Machine learning models used to predict CSA-AKI in adults have shown encouraging results.
- To date, the development of machine learning models to predict CSA-AKI in pediatric patients has been limited.
What this study adds:
- First development of a machine learning prediction model for CSA-AKI in infants and children.
- Identification of five major predictors for the development of CSA-AKI in children.
- External validation of the new model using two additional hospital cohorts.
Summary of Study
Cardiac surgery-associated acute kidney injury (CSA-AKI) is commonly encountered in children who have had open heart surgery, but details of risk factors remain uncertain. Luo et al. performed a retrospective cohort study aimed at developing and validating machine learning (ML) models to predict CSA-AKI in patients aged 1 month to 18 years undergoing cardiac surgery requiring cardiopulmonary bypass (CPB).1 A prediction model was developed using data from 3,278 patients who underwent cardiac surgery over 7 years at a large academic medical center of Central South University in China. Validation cohorts, totaling 585 patients, were created from data collected at two additional Central South University hospitals over a similar period. To be included, patients required at least one serum creatinine (SCr) measurement prior to surgery and one within 7 days following surgery. Patients with congenital renal malformations, a preoperative estimated glomerular filtration rate (eGFR) of 15 mL/min/1.73 m² or lower, or multiple surgeries within 7 days were excluded.
Using the 2012 “Kidney Disease: Improving Global Outcomes (KDIGO)” clinical practice guideline, the CSA-AKI outcome of interest was defined as an increase in SCr from baseline of ≥0.3 mg/dL within 48 hours of surgery or of ≥50% within 7 days after surgery. Potential predictor variables were extracted from the electronic medical record and selected by experts based on clinical relevance to the development of CSA-AKI; data were classified as preoperative (demographics, conditions, lab values, medications within 7 days prior to surgery) or intraoperative (including surgical and bypass times, lowest mean arterial pressure and temperature, intraoperative blood loss, RACHS-1 surgical risk score). The predictors were then narrowed to those found to add predictive power when supplied to the ML algorithms (K-nearest neighbors, naïve Bayes, support vector machines, random forest, extreme gradient boosting [XGBoost], and artificial neural networks). The predictive power of each model was evaluated using external validation sets from different hospitals associated with the same university.
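The SCr-based outcome definition described above can be sketched as a small function. This is a minimal illustration only: the function name `has_csa_aki` and the input layout (SCr in mg/dL, measurement times in hours after surgery) are assumptions made for the example and are not taken from the study's code. It covers only the two SCr criteria stated in this commentary.

```python
def has_csa_aki(baseline_scr, postop_scr):
    """Return True if either SCr criterion described above is met.

    baseline_scr: preoperative serum creatinine in mg/dL.
    postop_scr: list of (hours_after_surgery, scr_mg_dl) tuples.
    Both the name and the input format are hypothetical.
    """
    eps = 1e-9  # small tolerance to guard against floating-point rounding
    for hours, scr in postop_scr:
        # Criterion 1: absolute rise of >= 0.3 mg/dL within 48 hours.
        if hours <= 48 and scr - baseline_scr >= 0.3 - eps:
            return True
        # Criterion 2: relative rise of >= 50% within 7 days (168 hours).
        if hours <= 7 * 24 and scr >= 1.5 * baseline_scr - eps:
            return True
    return False
```

Because baseline SCr is low in small children, the relative (50%) criterion can be met by a rise well below 0.3 mg/dL; for example, a rise from 0.4 to 0.65 mg/dL on day 4 satisfies the second criterion but not the first.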
Overall, data from 3,278 patients were used to develop the models. In that cohort, 564 (17.2%) developed CSA-AKI. In the validation cohort of 585 patients, CSA-AKI was approximately half as common, with 51 (8.7%) patients developing the outcome of interest. To weigh the importance of pre- vs. intraoperative factors, the authors trained 2 sets of models, based on 25 preoperative variables only and on 20 preoperative plus 7 intraoperative variables, respectively. All ML models performed better using the combined preoperative plus intraoperative variables. The XGBoost model had the best performance, with a mean AUROC of 0.912 (95% CI 0.899-0.924) in the derivation cohort and 0.889 (95% CI 0.844-0.920) in the external validation cohort. The relative similarity between the AUROC values for the derivation and validation cohorts suggests that the algorithm did not substantially overfit, as discussed below.
Machine learning methodology
An appreciation of machine learning is increasingly important to clinical practice as more and more research makes use of this newer approach. The statistical underpinnings of ML are just as strong as those of techniques typically taught in undergraduate and medical school statistics classes, but the familiar p-values and confidence intervals are not the typical output of ML algorithms. The authors of the current study used supervised learning, in which the outcome of interest is known and potential predictors are evaluated; it can be considered an evolution of the better-known technique of multiple regression. Models created by ML algorithms often fit large, complex datasets better than regression and can account for nonlinear relationships. While there is a range of measures of model fit, the authors of this report used the area under the receiver-operating characteristic curve (AUROC), which is one of the most commonly employed. (The underlying graph may be familiar to readers as the one in which the false positive rate is plotted on the horizontal axis and the true positive rate on the vertical axis.)
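For readers who find code easier to follow than curves, the AUROC can also be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it, with ties counting as one half. The sketch below computes it that way in plain Python. The function name `auroc` and the example numbers are illustrative assumptions, not data or code from the study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative cases.

    labels: 1 for patients with the outcome, 0 for those without.
    scores: predicted risks, in the same order as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four patients (two with the outcome).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auroc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which gives some context for the values near 0.9 reported in the study.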
ML algorithms can ‘learn’ outliers in the training dataset so avidly that they make unhelpful predictions when encountering new but related data points. This phenomenon is known as overfitting. To give a concrete example, consider a single patient in a small case series who undergoes a low-risk procedure and develops CSA-AKI as a result of postoperative administration of a nephrotoxic drug. If the algorithm is trained on pre- and intraoperative data only, it is not ‘fed’ any postoperative data and is therefore not ‘aware’ of the nephrotoxic drug. When asked to predict risk for a similar case in the future, an ML algorithm would then suggest a much higher CSA-AKI risk than is truly appropriate. Such overfitting may be avoided both by training on large datasets and by following the best practice of evaluating a new model on data not previously ‘seen’ by the algorithm during model development. Just as a clinician scrutinizes new randomized clinical trial (RCT) evidence before deciding to administer a drug, it is essential that the process and results of model development are scrutinized before a new ML algorithm is deployed in clinical practice. In addition, just as with RCTs, the smart clinician must also ensure that the results are drawn from a similar population.
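The outlier scenario above can be made concrete with a deliberately tiny example. The sketch below uses a one-nearest-neighbor rule, which memorizes its training data, on invented numbers: a single feature (bypass time in minutes) and one short-bypass patient who nonetheless developed AKI. All values and names are hypothetical and chosen purely for illustration; this is not the study's data or method.

```python
# Each row is (bypass_minutes, developed_aki). Invented data for illustration.
train = [
    (30, 0), (35, 0), (40, 1),  # (40, 1) is the low-risk outlier
    (45, 0), (50, 0),
    (120, 1), (130, 1), (140, 1), (150, 1),
]
held_out = [(38, 0), (41, 0), (42, 0), (48, 0), (125, 1), (145, 1)]


def predict_1nn(training_rows, bypass_minutes):
    """Predict the label of the closest training case (1-nearest neighbor)."""
    nearest = min(training_rows, key=lambda row: abs(row[0] - bypass_minutes))
    return nearest[1]


def accuracy(training_rows, data):
    """Fraction of cases in `data` whose label is predicted correctly."""
    correct = sum(predict_1nn(training_rows, x) == y for x, y in data)
    return correct / len(data)


train_accuracy = accuracy(train, train)        # the model has memorized these
held_out_accuracy = accuracy(train, held_out)  # unseen cases near the outlier
```

On these made-up numbers the rule is perfect on its own training data but misclassifies every unseen patient who happens to sit close to the outlier. That gap between performance on training data and on unseen data is what evaluation on a separate cohort is meant to expose.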
Whereas the results of a multiple regression model include coefficients which can be scrutinized by clinicians in order to gain an understanding of how the model delivers predictions, the inner workings of most ML algorithms are more opaque. Shapley additive explanations (SHAP) is a newer approach employed by the authors of this study in an attempt to open up the ‘black box’ and therefore increase clinicians’ confidence in its predictions. In this way, they showed that baseline SCr, perfusion time, body length, operation time, and intraoperative blood loss had the highest value in predicting CSA-AKI.
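The idea behind Shapley values can be illustrated on a toy model. The sketch below computes exact Shapley values by brute force: it averages each feature's marginal contribution over every order in which features can be switched from a reference ("baseline") patient to the patient of interest. This does not reproduce the study's SHAP analysis, which was applied to the authors' own models. The risk function, its coefficients, and the two patients are hypothetical and have no clinical validity.

```python
from itertools import permutations


def toy_risk(features):
    """Hypothetical risk score with an interaction term (illustration only)."""
    scr, perfusion_min, blood_loss_ml = features
    return (0.2 * scr + 0.001 * perfusion_min
            + 0.0005 * scr * perfusion_min + 0.002 * blood_loss_ml)


def shapley_values(model, x, baseline):
    """Exact Shapley values of x relative to baseline, by enumeration.

    Features absent from a coalition are held at their baseline value.
    Only feasible for a handful of features (n! orderings).
    """
    n = len(x)
    contrib = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]          # switch feature i to the patient's value
            new = model(current)
            contrib[i] += new - prev   # marginal contribution in this ordering
            prev = new
    return [c / len(orders) for c in contrib]


# Invented values: [baseline SCr (mg/dL), perfusion time (min), blood loss (mL)]
x = [0.8, 120.0, 50.0]
baseline = [0.4, 60.0, 20.0]
phi = shapley_values(toy_risk, x, baseline)
```

The per-feature values in `phi` add up to the difference between the model's prediction for the patient and for the reference patient, which is the property that lets a single prediction be split into additive feature contributions.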
Opinion
Luo et al.1 developed a machine learning model with strong predictive performance for CSA-AKI in pediatric patients (excluding neonates) that incorporates preoperative as well as intraoperative data. From a clinical standpoint, intraoperative blood loss during pediatric cardiac surgery in the United States is often labeled as ‘unmeasurable’, making this predictor variable less useful in our current practice. However, the supervised machine learning concepts employed, and public data availability via GitHub, make this paper conceptually and practically valuable. Additionally, the authors plot the AUROC for classic regression models and for the ‘winning’ XGBoost model to provide visual evidence of the superiority of ML for this task.
This novel model for CSA-AKI joins a rapidly expanding range of ML models that may offer the potential to improve clinical care. Of interest to specialists in pediatric heart disease, these include early warning systems for clinical deterioration in the pediatric cardiac intensive care unit,2,3 as well as early identification of patients with postoperative complete heart block who are likely to benefit from implantation of a permanent pacemaker system.4 In addition, sophisticated models with robust validation to detect circulatory failure have been developed for adult practice.5
Despite the promise of these new approaches, it remains to be seen how ML will be incorporated into clinical practice. A possible use of the model devised by Luo et al. would be as a decision aid for intraoperative placement of a peritoneal dialysis catheter. We look forward to studies demonstrating improved patient outcomes following the application of ML in pediatric cardiac practice.
References:
1. Luo XQ, Kang YX, Duan SB, Yan P, Song GB, Zhang NY, Yang SK, Li JX, Zhang H. Machine Learning-Based Prediction of Acute Kidney Injury Following Pediatric Cardiac Surgery: Model Development and Validation Study. J Med Internet Res. 2023 Jan 5;25:e41142. doi: 10.2196/41142. PMID: 36603200; PMCID: PMC9893730.
2. Ruiz VM, Saenz L, Lopez-Magallon A, Shields A, Ogoe HA, Suresh S, Munoz R, Tsui FR. Early prediction of critical events for infants with single-ventricle physiology in critical care using routinely collected data. J Thorac Cardiovasc Surg. 2019 Jul;158(1):234-243.e3. doi: 10.1016/j.jtcvs.2019.01.130. Epub 2019 Feb 21. PMID: 30948317.
3. Zoodsma RS, Bosch R, Alderliesten T, Bollen CW, Kappen TH, Koomen E, Siebes A, Nijman J. Continuous Data-Driven Monitoring in Critical Congenital Heart Disease: Clinical Deterioration Model Development. JMIR Cardio. 2023 May 16;7:e45190. doi: 10.2196/45190. PMID: 37191988; PMCID: PMC10230358.
4. Duong SQ, Shi Y, Giacone H, Navarre BM, Gal DB, Han B, Sganga D, Ma M, Reddy CD, Shin AY, Kwiatkowski DM, Dubin AM, Scheinker D, Algaze CA. Criteria for Early Pacemaker Implantation in Patients With Postoperative Heart Block After Congenital Heart Surgery. Circ Arrhythm Electrophysiol. 2022 Nov;15(11):e011145. doi: 10.1161/CIRCEP.122.011145. Epub 2022 Oct 28. PMID: 36306332.
5. Hyland SL, Faltys M, Hüser M, Lyu X, Gumbsch T, Esteban C, Bock C, Horn M, Moor M, Rieck B, Zimmermann M, Bodenham D, Borgwardt K, Rätsch G, Merz TM. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat Med. 2020 Mar;26(3):364-373. doi: 10.1038/s41591-020-0789-4. Epub 2020 Mar 9. PMID: 32152583.