Study | Validation performance—reported from study | Validation performance—NYU data | Performance difference between study validation performance and NYU original validation performance | Validation performance—NYU retrained (95% CI) | Performance difference between study validation performance and NYU retrained validation performance |
Xie et al6 | AUROC=0.98 | AUROC=0.67 | AUROC difference=0.31 | AUROC=0.76 (0.72 to 0.80) | AUROC difference=−0.22 |
Yan et al7 | Avg F1=0.97 | Most recent values: Avg F1=0.63 | F1 difference=0.34 | Most recent values: AUROC=0.95 (0.93 to 0.96) | * |
Yan et al7 | * | Earliest values: Avg F1=0.51 | * | Earliest values: AUROC=0.70 (0.65 to 0.75) | * |
Mean | Mean AUROC=0.98 | Mean AUROC=0.67 | Mean AUROC difference=0.31 | Mean AUROC=0.82 | Mean AUROC difference=−0.22 |
*Value unavailable because authors did not provide an AUROC value when reporting validation performance.
AUROC, area under the receiver operating characteristic curve; NYU, New York University.