Navigating the machine learning pipeline: a scoping review of inpatient delirium prediction models

Tom Strating; Leila Shafiee Hanjani; Ida Tornvall; Ruth Hubbard; Ian A. Scott

doi:10.1136/bmjhci-2023-100767

Review

Tom Strating1,
Leila Shafiee Hanjani1,
Ida Tornvall1,
Ruth Hubbard1 and
Ian A. Scott1,2 (http://orcid.org/0000-0002-7596-0837)

¹Centre for Health Services Research, The University of Queensland Faculty of Medicine, Brisbane, Queensland, Australia
²Internal Medicine and Clinical Epidemiology, Princess Alexandra Hospital, Woolloongabba, Queensland, Australia

Correspondence to Professor Ian A. Scott; ian.scott{at}health.qld.gov.au

Abstract

Objectives Early identification of inpatients at risk of developing delirium and implementing preventive measures could avoid up to 40% of delirium cases. Machine learning (ML)-based prediction models may enable risk stratification and targeted intervention, but establishing their current evolutionary status requires a scoping review of recent literature.

Methods We searched ten databases up to June 2022 for studies of ML-based delirium prediction models. Eligibility criteria comprised: use of at least one ML prediction method in an adult hospital inpatient population; publication in English; and reporting of at least one performance measure (area under receiver-operator curve (AUROC), sensitivity, specificity, positive or negative predictive value). Included models were categorised by their stage of maturation and assessed for performance, utility and user acceptance in clinical practice.

Results Among 921 screened studies, 39 met eligibility criteria. In-silico performance was consistently high (median AUROC: 0.85); however, only six articles (15.4%) reported external validation, revealing degraded performance (median AUROC: 0.75). Three studies (7.7%) of models deployed within clinical workflows reported high accuracy (median AUROC: 0.92) and high user acceptance.

Discussion ML models have potential to identify inpatients at risk of developing delirium before symptom onset. However, few models were externally validated and even fewer underwent prospective evaluation in clinical settings.

Conclusion This review confirms a rapidly growing body of research into using ML for predicting delirium risk in hospital settings. Our findings offer insights for both developers and clinicians into strengths and limitations of current ML delirium prediction applications aiming to support but not usurp clinician decision-making.

artificial intelligence
decision making, computer-assisted
machine learning

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Introduction

Delirium is a common but underdiagnosed state of disturbed attention and cognition that afflicts one in four older hospital inpatients.1 It is independently associated with a longer length of hospital stay, mortality, accelerated cognitive decline2 and new-onset dementia.1 Since older people are particularly vulnerable to severe illness from COVID-19 infection, delirium emerged as a frequent acute geriatric syndrome during the pandemic.3 Predicting who is likely to develop delirium before symptom onset may facilitate the targeted implementation of preventive strategies that can avoid up to 40% of cases.4

Risk stratification models enable clinicians to identify patients at high risk of an adverse event and intervene where appropriate.5 The advent of wearables, genomics, and dynamic datasets within electronic health records (EHRs) provides big data to which machine learning (ML) can be applied to individualise clinical risk prediction.6 ML is a subset of artificial intelligence that uses advanced computer programmes to learn patterns and associations within large datasets and develop models (or algorithms), which can then be applied to new data in rapidly producing predictions or classifications, including diagnoses.7 Across developed nations, more than 150 ML applications are approved for use in routine clinical practice, and this number is projected to rise exponentially over the coming years.6 8

The key stages of the ML pipeline that models must traverse, from initial in-silico (computer-based) development to real-world deployment, comprise the following6 (figure 1): (1) data collection; (2) data preparation; (3) feature selection and engineering; (4) model training; (5) model validation, both internal and external; (6) deployment of the model within a working application; and (7) post-deployment monitoring and optimisation of the application. During the development phase (stages 1–3), researchers collect, clean and transform data into computable formats and select relevant features as model inputs. The model is then iteratively improved through several training cycles against static, retrospective datasets (stage 4). In stage 5, the model undergoes two processes of validation: internal validation for accuracy and reproducibility against a random sample from the original training dataset (‘hold out’ sample); and external validation, whereby researchers validate the model on a new external dataset derived from previously unencountered patients using the same performance metrics. In stage 6, the model is subject to prospective validation using live (or near-live) dynamic data in a form reflecting its future real-world deployment, integrated into a prototype application, and evaluated for its feasibility in clinical workflows. Then, it is assessed for its clinical utility within clinical trials, which compare application-guided patient care and outcomes with the current standard of care. Finally, stage 7 entails monitoring the effectiveness and safety of the model over its life cycle using surveillance data.
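
The distinction between internal and external validation in stage 5 can be illustrated with a minimal Python sketch. It is not drawn from any included study: the function name, patient identifiers, cohort sizes and the 20% hold-out fraction are all hypothetical and chosen only for illustration.

```python
import random

def hold_out_split(records, hold_out_fraction=0.2, seed=0):
    """Randomly split records into (training, hold_out) lists."""
    if not 0.0 < hold_out_fraction < 1.0:
        raise ValueError("hold_out_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_hold_out = round(len(shuffled) * hold_out_fraction)
    return shuffled[n_hold_out:], shuffled[:n_hold_out]

# Hypothetical patient identifiers from the development hospital.
development_cohort = [f"dev-{i}" for i in range(100)]

# Internal validation: the hold-out sample comes from the same population
# as the training data, but is never used for training.
train, hold_out = hold_out_split(development_cohort)

# External validation: a hypothetical cohort from a different hospital,
# made up of patients the model has never encountered.
external_cohort = [f"ext-{i}" for i in range(40)]

print(len(train), len(hold_out), len(external_cohort))
```

In this sketch the hold-out sample shares its source population with the training data, which is one reason internal validation alone may overstate how a model performs elsewhere.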

Figure 1

Machine learning pipeline.

ML models have enormous potential in facilitating more accurate risk stratification, preventive intervention and avoidance of incident delirium, but external validation, prospective evaluation and clinical adoption remain limited,6 and analysis of the clinical impact of deployed models on patient care is rarely performed.9 10 Previous systematic reviews of delirium prediction models have been limited to in-silico models focusing on performance metrics using static retrospective data,11 12 and the studies within these reviews are limited to those published before 2019. The objectives of this review were to: (1) provide a more contemporary overview of research on all ML delirium prediction models designed for use in the inpatient setting; (2) characterise them according to their stage of development, validation and deployment; and (3) assess the extent to which their performance and utility in clinical practice have been evaluated.

Methods

This review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines13 and is registered within the Open Science Framework (OSF) database (osf.io/8r5cd). A scoping review methodology was selected as it allows us to map the broad and emerging ML evidence base in a flexible but systematic manner.14

Literature search

The search strategy was developed by two authors (TS, LSH) and reviewed by a third author (IAS) and a librarian. We searched PubMed, EMBASE, IEEE Xplore, Scopus, Web of Science, CINAHL, PsycInfo, Cochrane, OSF pre-prints and the aiforhealth.app machine learning research dashboard between inception and 14 June 2022, using a mixture of medical subject headings (MeSH) and keywords related to delirium and ML (for the exact search terms, see online supplemental appendix 1). Additional studies were identified by perusing the reference lists of retrieved articles.

Supplemental material

[bmjhci-2023-100767supp001.pdf]

Study selection

Retrieved studies were imported into EndNote 20 and screened for relevance and duplicates in Covidence. Two reviewers (TS, IT) independently screened the titles and abstracts, and two authors (TS, LSH) reviewed the full-text articles. Disagreements between screening authors were resolved by discussion or settled by a third reviewer (IAS). We considered full-length original studies published in peer-reviewed journals, pre-prints and conference proceedings. Eligible studies had to fulfil all the following criteria: use of at least one ML method that predicts delirium; applied to an adult hospital inpatient population; published in English; and reporting at least one of the following performance measures: area under the receiver-operator curve (AUROC), sensitivity, specificity, positive predictive value (PPV) or negative predictive value (NPV). Studies were excluded if they were editorials, position statements, letters to the editor, conference abstracts or press releases; were conducted in non-hospital settings; or did not report any model performance metrics.

Data extraction and synthesis

One reviewer (TS) independently performed data extraction using a preagreed form designed in Covidence. The following data items were extracted: title; author; publication year; country (where data were collected); study aim and design; clinical setting; population characteristics; ML modelling method(s); reference standard used to diagnose delirium; frequency of delirium; data source and type; evolutionary stage and respective sample size; model performance measures (comprising, where reported, AUROC, sensitivity, specificity, PPV, NPV, Brier score and calibration plot concordance); primary outcome measures; comparison to standard care; principal discharge diagnosis; and length of stay. Qualitative information on user acceptance of deployed models was also recorded where reported.

We defined a model as being in the ‘development and internal validation’ stage if the dataset used for validating the model came from the same patient population as the training dataset. An ‘external validation’ study was one where the model was validated using a dataset from a population temporally or geographically separate from that used to provide the original training data. Finally, a ‘deployment’-level study was one where the model was evaluated in a routine clinical setting.

Corresponding authors were contacted for studies that did not report the reference standard used to define delirium in their dataset. Two authors (LSH, IT) cross-checked the data extracted for a random sample of 25% (n=10) of studies, and disagreement was managed through discussion.

A narrative approach was taken to synthesise the data extracted from the selected studies, including tabular and graphical representations, summarising the number of studies in each stage, year and country published, performance metrics, algorithm type, data type and stage of development. Descriptive statistics for continuous variables comprised mean and SD for normally distributed data and median and IQR for non-normally distributed data. All analyses and visualisations were done within R.15 As this was a scoping review, no attempt was made to assess the quality of individual study design or methods.
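
As a minimal sketch of the median and IQR summaries described above, the following Python snippet uses the standard library rather than R, which the review itself used. The AUROC values are hypothetical and are not taken from the included studies, and the quartile convention (`method="inclusive"`) is an assumption, since the review does not state which one was applied.

```python
import statistics

def median_iqr(values):
    """Return (median, first quartile, third quartile) of a list of numbers."""
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3

# Hypothetical AUROC values, for illustration only.
aurocs = [0.70, 0.75, 0.80, 0.85, 0.90]
median, q1, q3 = median_iqr(aurocs)
print(f"median AUROC {median:.2f} (IQR: {q1:.2f}-{q3:.2f})")
```

Different quartile conventions can give slightly different IQR bounds for small samples, so summaries of only a handful of studies are best read as approximate.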

Results

The search strategy identified a total of 921 records; after duplicate removal and title and abstract screening, 114 full-text studies were retrieved, of which 39 (references 16–54) met the selection criteria for inclusion in the final analysis (figure 2).

Figure 2

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow chart.

Study characteristics

Study characteristics are summarised in online supplemental table 1. Studies originated from the USA (n=12),17 19–23 25 41 43 50 51 54 Austria (n=9),24 28–31 33 39 47 48 China (n=6),26 32 35 49 52 53 Germany (n=3),37 45 46 South Korea (n=3),27 40 44 Canada (n=3),30 36 38 Brazil (n=1),16 Japan (n=1),34 Spain (n=1)18 and one study was labelled as international.42 Over the 6-year distribution of publications to June 2022, most studies were published in 2021 (n=10) and the first half of 2022 (n=12), indicating considerable growth in research in this area since the publication of previous reviews of studies published up to 2019.11 12 Study designs comprised retrospective cohort studies (n=25), prospective cohort studies (n=9), secondary analyses of trial data (n=2), prospective pilot studies (n=2) and a retrospective case-control study (n=1). Studies mostly used data from EHRs alone to develop their models (n=21), with the remainder including specified clinical assessments (eg, nursing assessment, n=8), compiled clinical databases (eg, data repository or open-access database, n=6), data from a clinical quality improvement registry (n=1), data from both EHRs and clinical assessments (n=1), data from EHRs and a clinical database (n=1) and data solely from electrocardiographs (n=1).

Supplemental material

[bmjhci-2023-100767supp002.pdf]

The median sample size of training datasets was 2389 (IQR: 371–27 377) participants, of whom, when reported as a percentage, a median of 20% (IQR: 20%–25%) was used as a ‘hold-out’ sample for internal validation. External validation and deployment studies had a median of 4765 (IQR: 2429–11 355) and 5887 (IQR: 3456–10 975) participants, respectively. The mean age of participants ranged from 54.4 to 84.4 years. Hospital inpatients were treated in surgical wards (n=14), medical wards (n=10), intensive care units (ICU) (n=7) or a combination of all three settings (n=8). The reported reference standards for verifying delirium cases in the training dataset comprised the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU) (n=10), International Classification of Diseases codes (n=14), the CAM (n=7) and the Diagnostic and Statistical Manual of Mental Disorders (n=3). Several alternative screening methods, such as the 4 A’s Test (n=2), were used infrequently, and three studies reported no information as to what reference standard was used. The prevalence of delirium in training and internal validation datasets ranged from 2.0% to 53.6%, and from 10% to 39% in external validation studies. Delirium prevalence was 1.5%28 and 31.2%31 for the two deployment studies that reported data on this outcome. Average length of stay ranged from 1.9 to 13.6 days, but was not reported in 27 (69%) studies.

Model characteristics

Thirty of the 39 publications described the training and internal validation of a delirium model,17 18 21–26 30 32–41 43 44 46–54 with investigators of six of these studies (20%) externally validating their model in a subsequent paper.16 19 20 27 29 42 Investigators of three studies (10%) implemented and evaluated their model in real-time clinical workflows,28 31 45 but no publications described monitoring or optimising a deployed model.

Figure 3 depicts the numbers of publications that used each type of model across each stage of application maturity. In total, random forest models were the most common (n=11), followed by logistic regression (n=6), gradient boosting (n=5) and artificial neural networks (n=4). Decision tree, L1-penalised regression and natural language processing models were each described in two papers, with another seven papers describing different models unique to the study.

Figure 3

Number of publications by machine learning method. If a study describes multiple models, only the best-performing (area under receiver-operator curve) model is shown. LEM, learning from examples module 2; LR, logistic regression; RBF, radial basis function; RF, random forest; SAINTENS, self-attention and intersample attention transformer; SVM, support vector machine.

Performance metrics of each model at their different stages of validation, when reported, are listed in online supplemental table 2. In the absence of any universal task-agnostic standard, we regarded values of AUROC>0.7, of sensitivity and specificity ≥80%, of PPV ≥30% and NPV ≥90%, of Brier scores <0.20 and calibration plots showing high concordance as being acceptable accuracy thresholds for clinical application. For internal validation, omitting two studies for which the AUROC statistic was not reported,40 44 the median AUROC for the remaining models was 0.85 (IQR: 0.78–0.90). For external validation and deployment studies, the reported median AUROC scores were 0.75 (IQR: 0.74–0.81) and 0.92 (IQR: 0.89–0.93), respectively.
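
The acceptability thresholds stated above can be expressed as a small lookup, sketched here in Python. The dictionary keys and function name are hypothetical. Calibration plot concordance is omitted because it is a visual judgement rather than a single number, and the example input uses the median external validation values reported in this review.

```python
# Thresholds as stated in the review; the metric names are hypothetical keys.
THRESHOLDS = {
    "auroc": lambda v: v > 0.7,
    "sensitivity": lambda v: v >= 0.80,
    "specificity": lambda v: v >= 0.80,
    "ppv": lambda v: v >= 0.30,
    "npv": lambda v: v >= 0.90,
    "brier": lambda v: v < 0.20,
}

def check_metrics(reported):
    """Map each reported metric to True or False; unreported metrics are skipped."""
    return {
        name: THRESHOLDS[name](value)
        for name, value in reported.items()
        if name in THRESHOLDS
    }

# Median values for external validation studies, as proportions.
external = {"auroc": 0.75, "sensitivity": 0.73, "specificity": 0.69}
print(check_metrics(external))
# → {'auroc': True, 'sensitivity': False, 'specificity': False}
```

Read this way, the median externally validated model clears the AUROC threshold while falling short on both sensitivity and specificity, which shows why a single summary metric can give an incomplete picture.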

Supplemental material

[bmjhci-2023-100767supp003.pdf]

Stratified by algorithm type, the median AUROC (models with >1 publication) for training and internal validation studies was highest for random forest models (0.91, IQR: 0.88–0.91). In order of decreasing performance were natural language processing (AUROC: 0.85, IQR: 0.83–0.91); decision trees (AUROC: 0.83, IQR: 0.78–0.89); artificial neural networks (AUROC: 0.81, IQR: 0.76–0.86); gradient boosting (AUROC: 0.81, IQR: 0.77–0.85); artificial neural networks (AUROC: 0.81, IQR: 0.75–0.87) and logistic regression models (AUROC: 0.80, IQR: 0.78–0.82).

With regard to external validation, a gradient boosting algorithm performed best (AUROC: 0.86), followed by random forest models (AUROC: 0.78, IQR: 0.75–0.80) and L1-penalised regression (AUROC: 0.75, IQR: 0.75–0.75). For prospective studies of deployed models, the best performance was observed in one study using natural language processing, with an AUROC of 0.94,45 while random forest models achieved a median AUROC of 0.89 (IQR: 0.87–0.90). The AUROC performance metrics for all models, stratified by stage of maturity, are presented graphically in figure 4.
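
For readers less familiar with the AUROC values compared above, the metric can be read as the probability that a randomly chosen patient who develops delirium receives a higher predicted risk than a randomly chosen patient who does not. The Python sketch below computes it by direct pairwise comparison. The risks and outcomes are hypothetical, and the included studies would typically have relied on statistical software rather than code like this.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (case, non-case) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))

# Hypothetical predicted risks and delirium outcomes (1 = delirium).
risks = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
outcomes = [1, 0, 1, 1, 0, 0]
print(round(auroc(risks, outcomes), 3))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The metric says nothing about whether the predicted probabilities themselves are well calibrated.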

Figure 4

Graphical representation of AUROC performance metrics stratified by stage of development. Son et al44 and Oh et al40 did not report AUROC but are included in the analysis as they reported other performance metrics. AUROC, area under receiver-operator curve.

The median sensitivity and specificity for training studies were 75% (IQR: 64.1%–82.3%) and 82.2% (IQR: 73.3%–90.4%), respectively. For external validation studies, median sensitivity and specificity dropped to 73% (IQR: 67.5%–81.7%) and 69% (IQR: 48%–72%), respectively. However, in deployment-level studies, median sensitivity and specificity were 87.1% (IQR: 80.6%–93.5%) and 86.4% (IQR: 84.3%–88.5%), respectively. PPV and NPV were reported for only 10 of 39 studies (26%) and ranged from 5.8% to 91.6% and from 90.6% to 99.5%, respectively.
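
Part of the wide spread in PPV may reflect differences in delirium prevalence, because predictive values depend on prevalence even when sensitivity and specificity are held fixed. The Python sketch below applies Bayes' theorem to the median deployment-level sensitivity and specificity at the two deployment prevalences reported in this review. The pairing is illustrative only and does not reproduce a result from any single study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Median deployment sensitivity (87.1%) and specificity (86.4%) from this
# review, combined with the two reported deployment prevalences.
for prevalence in (0.015, 0.312):
    ppv, npv = predictive_values(0.871, 0.864, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, PPV is roughly 9% at a prevalence of 1.5% and roughly 74% at 31.2%, while NPV stays above 90% in both cases. This suggests PPV and NPV are best interpreted alongside the prevalence of the population in which they were measured.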

Of the total, only 14 studies (35.9%) reported calibration metrics, which showed considerable variation. Using calibration plots, four studies reported poor calibration, an equal number reported reasonable calibration, while the remainder employed alternative calibration methods with variable results (see online supplemental table 1). The Brier score was reported for only five studies (13%) and ranged from 0.14 to 0.22.
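
The Brier score is the mean squared difference between each predicted probability and the observed outcome, so lower values are better. The Python sketch below uses hypothetical predictions and outcomes to show the calculation.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted, observed)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical predicted delirium risks and outcomes (1 = delirium).
print(round(brier_score([0.9, 0.2, 0.6, 0.1], [1, 0, 1, 0]), 3))

# A model that always predicts 0.5 scores 0.25 regardless of the outcomes.
print(brier_score([0.5, 0.5], [1, 0]))
```

Because an uninformative model that always predicts 0.5 scores 0.25, reported values between 0.14 and 0.22 sit closer to that baseline than to a perfect score of 0.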

Clinical application

Three articles from two investigator groups subjected their prototype model to prospective validation using live data in a form reflecting its future application to clinical workflows.28 31 45 Sun et al45 trained three separate models to predict delirium, acute kidney injury and sepsis. They found their delirium model performed slightly worse using live data from three hospitals at admission (AUROC decreased by 3.6%) and, when deployed in another participating hospital with data separate from that of the training set, performance dropped by another 0.8% at discharge. Sun et al reported user feedback only for the acute kidney injury model.

Jauk et al28 implemented their delirium prediction model in an Austrian hospital system for 7 months and thereafter for an additional month in the trauma surgery department of another affiliated hospital.31 The prediction model performed somewhat worse on prospective data (AUROC: 0.86) than it did on retrospective data used in the training and internal validation study33 (AUROC: 0.91). In addition, predictions of the random forest model used in this study correlated strongly with nurses' ratings of delirium risk in a sample of internal medicine patients (correlation coefficient (r)=0.81 for blinded and r=0.62 for non-blinded comparison). In the external validation study, the model achieved an AUROC value above 0.85 across three prediction times (on admission: 0.863; first evening: 0.851; second evening: 0.857). However, when the model was re-trained using local data, the AUROC value exceeded 0.92 for all three prediction times, and correctly predicted all 29 patients who were deemed high risk for delirium by a senior physician (sensitivity=100%, specificity=90.6%). In a qualitative survey, the 13 health professionals involved in the project perceived the ML application as useful and easy to use.

Discussion

This scoping review examined contemporary research around ML models for predicting delirium in adult inpatient settings and identified an additional 22 studies published since late 2019, the end date of previous reviews.11 12 We have mapped the development and implementation stage and associated performance metrics of these new models according to a seven-stage evolutionary ML pipeline. Importantly, we included three novel implementation studies which demonstrated good predictive accuracy and user acceptance, underscoring the potential clinical utility of ML models for delirium prediction.

However, our review reveals several limitations in the existing research that future studies need to address. First, training data in most studies comprised routinely collected data obtained retrospectively from EHRs which, while providing vast quantities of data for training complex models, suffer from inaccuracies and omissions relating to key predictor variables. Only a quarter of studies18 26 28 29 31 32 35 37 40 44 45 in this review sourced prespecified and prospectively collected data, such that missing or incomplete data relevant to model optimisation, and which could not be remedied using imputation methods, emerged as a critical limitation for many studies. For instance, the EHR-derived models of Zhao et al53 lacked microbiological, radiological and biomarker data relevant to delirium, limiting their predictive accuracy. Similarly, missing information about medication use and frailty indices posed a limiting factor in several other studies.17 22 49–51 Many studies also did not have access to demographic data of their study population, such as socioeconomic status, gender and race.26 53 Reliance on data sources with missing data and unrepresentative of target populations weakens model performance and introduces biases, generating models that may exacerbate healthcare inequities.7

Second, similar to the findings of previous reviews,11 12 most models described in our scoping review did not mature past the stage of internal validation. Only six studies validated their model on an external dataset16 19 20 27 29 42 despite evidence that models that perform well on ‘hold out’ training data usually have lower performance when applied to more noisy datasets from different institutions due to model overfitting.5

Third, of all 39 included studies, only those of Jauk et al28 31 and Sun et al45 subjected their models to a prospective evaluation using live data in clinical practice. The extent to which clinicians will adopt a model depends on their trust in its predictive accuracy and utility and the ease with which it can be integrated into clinical workflows.7 Sun and colleagues45 demonstrated their deep learning model performed equally well in training and prospective validation studies.29 In a subsequent case study, the authors demonstrated an instance where their application correctly predicted postoperative delirium in a patient with a negative preoperative CAM-ICU, demonstrating its clinical utility in a surgical ward.55 In addition, they found ML applications could be particularly useful for the early detection of delirium in wards where delirium screening is often not performed and delirium is underdiagnosed.1

Similarly, Jauk and colleagues28 analysed 5530 predictions over 7 months of deployment, finding their model performance was reliable and attracted high satisfaction ratings by a senior physician. In a later qualitative study, the 47 nurses and physicians associated with the project rated the delirium prediction model as useful, easy to use and interpretable without increasing workload.56 These favourable findings were replicated in a follow-up study where the random forest model was implemented in a separate hospital network.31 However, cross-hospital evaluations underscored the need to re-train the model with local data to mitigate declines in performance when applied to new clinical settings.31 45 Moreover, neither of these models has been subjected to clinical trials to establish impacts on patient care or outcomes.

Our review has some limitations. As our study was a scoping exercise, and in the absence of an agreed risk of bias assessment tool for ML prediction studies, we chose not to critically appraise the quality of individual studies. For similar reasons, and given the heterogeneity of the data source, model type and performance metrics reported in included studies, quantitative meta-analysis was not performed.

Conclusion

Prediction models derived using ML methods can potentially identify individuals at risk of developing delirium before symptom onset to whom preventive strategies can be targeted, which may, in turn, reduce incident delirium and improve patient outcomes. This scoping review identified all publications describing ML-based delirium prediction models over the last 5 years, evaluated their stage in the ML evolution pipeline, and assessed their performance and utility. Relatively few were subject to external validation, which, when performed, showed degraded model performance. In addition, while few studies underwent prospective evaluation in real-world clinical settings, performance and user acceptance seemed promising in those that did. However, given the limitations of current delirium prediction models, they should not be seen as substitutes for expert clinician judgement.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

Acknowledgments

We thank librarian Nicole Rayner for helping to construct the search strategy.

References

1. Richardson SJ, Davis DHJ, Stephan BCM, et al. Recurrent delirium over 12 months predicts dementia: results of the delirium and cognitive impact in dementia (DECIDE) study. Age and Ageing 2021;50:914–20. doi:10.1093/ageing/afaa244
2. Han JH, Shintani A, Eden S, et al. Delirium in the emergency department: an independent predictor of death within 6 months. Ann Emerg Med 2010;56:244–52. doi:10.1016/j.annemergmed.2010.03.003
3. Inouye SK. The importance of delirium and delirium prevention in older adults during lockdowns. JAMA 2021;325:1779–80. doi:10.1001/jama.2021.2211
4. Inouye SK, Westendorp RGJ, Saczynski JS. Delirium in elderly people. Lancet 2014;383:911–22. doi:10.1016/S0140-6736(13)60688-1
5. Haendel MA, Chute CG, Robinson PN. Classification, ontology, and precision medicine. N Engl J Med 2018;379:1452–62. doi:10.1056/NEJMra1615014
6. Scott IA. Demystifying machine learning: a primer for physicians. Intern Med J 2021;51:1388–400. doi:10.1111/imj.15200
7. Scott IA, Carter S, Coiera E. Clinician checklist for assessing suitability of machine learning applications in healthcare. BMJ Health Care Inform 2021;28:e100251. doi:10.1136/bmjhci-2020-100251
8. Muehlematter UJ, Daniore P, Vokinger KN. Approval of artificial intelligence and machine learning-based medical devices in the USA and Europe (2015-20): a comparative analysis. Lancet Digit Health 2021;3:e195–203. doi:10.1016/S2589-7500(20)30292-2
9. Scott I, Cook D, Coiera E. Evidence-based medicine and machine learning: a partnership with a common purpose. BMJ Evid Based Med 2021;26:290–4. doi:10.1136/bmjebm-2020-111379
10. Goldstein BA, Navar AM, Pencina MJ, et al. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J Am Med Inform Assoc 2017;24:198–208. doi:10.1093/jamia/ocw042
11. Chua SJ, Wrigley S, Hair C, et al. Prediction of delirium using data mining: a systematic review. J Clin Neurosci 2021;91:288–98. doi:10.1016/j.jocn.2021.07.029
12. Ruppert MM, Lipori J, Patel S, et al. ICU delirium-prediction models: a systematic review. Crit Care Explor 2020;2:e0296. doi:10.1097/CCE.0000000000000296
13. Tricco AC, Lillie E, Zarin W, et al. PRISMA extension for scoping reviews (PRISMA-SCR): checklist and explanation. Ann Intern Med 2018;169:467–73. doi:10.7326/M18-0850
14. Munn Z, Peters MDJ, Stern C, et al. Systematic review or scoping review? Guidance for authors when choosing between a systematic or scoping review approach. BMC Med Res Methodol 2018;18:143. doi:10.1186/s12874-018-0611-x
15. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2013. Available: http://www.R-project.org
16. Amador T, Saturnino S, Veloso A, et al. Early identification of ICU patients at risk of complications: regularization based on robustness and stability of explanations. Artif Intell Med 2022;128:102283. doi:10.1016/j.artmed.2022.102283
↵
1. Bishara A,
2. Chiu C,
3. Whitlock EL, et al
. Postoperative delirium prediction using machine learning models and preoperative electronic health record data. BMC Anesthesiol 2022;22:8. doi:10.1186/s12871-021-01543-y
↵
1. Cano‐Escalera G,
2. Graña M,
3. Irazusta J, et al
. Risk factors for prediction of delirium at hospital admittance. Expert Systems 2022;39:e12698. doi:10.1111/exsy.12698
↵
1. Castro VM,
2. Hart KL,
3. Sacks CA, et al
. Longitudinal validation of an electronic health record delirium prediction model applied at admission in COVID-19 patients. Gen Hosp Psychiatry 2022;74:9–17. doi:10.1016/j.genhosppsych.2021.10.005
OpenUrl
↵
1. Castro VM,
2. Sacks CA,
3. Perlis RH, et al
. Development and external validation of a delirium prediction model for hospitalized patients with Coronavirus disease 2019. J Acad Consult Liaison Psychiatry 2021;62:298–308. doi:10.1016/j.jaclp.2020.12.005
OpenUrl
↵
1. Coombes CE,
2. Coombes KR,
3. Fareed N
. A novel model to label delirium in an intensive care unit from clinician actions. BMC Med Inform Decis Mak 2021;21:97. doi:10.1186/s12911-021-01461-6
↵
1. Corradi JP,
2. Thompson S,
3. Mather JF, et al
. Prediction of incident delirium using a random forest Classifier. J Med Syst 2018;42:261. doi:10.1007/s10916-018-1109-0
Davoudi A, Ebadi A, Rashidi P, et al. Delirium prediction using machine learning models on preoperative electronic health records data. Proc IEEE Int Symp Bioinformatics Bioeng 2017;2017:568–73. doi:10.1109/BIBE.2017.00014
Gutheil J, Donsa K. SAINTENS: self-attention and intersample attention transformer for digital biomarker development using tabular healthcare real world data. Stud Health Technol Inform 2022;293:212–20. doi:10.3233/SHTI220371
Halladay CW, Sillner AY, Rudolph JL. Performance of electronic prediction rules for prevalent delirium at hospital admission. JAMA Netw Open 2018;1:e181405. doi:10.1001/jamanetworkopen.2018.1405
Hu X-Y, Liu H, Zhao X, et al. Automated machine learning-based model predicts postoperative delirium using readily extractable perioperative collected electronic data. CNS Neurosci Ther 2022;28:608–18. doi:10.1111/cns.13758
Hur S, Ko RE, Yoo J, et al. A machine learning-based algorithm for the prediction of intensive care unit delirium (PRIDE): retrospective study. JMIR Med Inform 2021;9:e23401. doi:10.2196/23401
Jauk S, Kramer D, Großauer B, et al. Risk prediction of delirium in hospitalized patients using machine learning: an implementation and prospective evaluation study. J Am Med Inform Assoc 2020;27:1383–92. doi:10.1093/jamia/ocaa113
Jauk S, Kramer D, Quehenberger F, et al. Information adapted machine learning models for prediction in clinical workflow. Stud Health Technol Inform 2019;260:65–72.
Jauk S, Kramer D, Schulz S, et al. Evaluating the impact of incorrect diabetes coding on the performance of multivariable prediction models. Stud Health Technol Inform 2018;251:249–52.
Jauk S, Veeranki SPK, Kramer D, et al. External validation of a machine learning based delirium prediction software in clinical routine. Stud Health Technol Inform 2022;293:93–100. doi:10.3233/SHTI220353
Ji M, Xing S, Yang Y. Pathophysiological factors of delirium among critically ill elders after non-cardiac surgery based on artificial neural networks: a pilot study. Anaesthesia, Pain and Intensive Care 2018;22:424–30.
Kramer D, Veeranki S, Hayn D, et al. Development and validation of a multivariable prediction model for the occurrence of delirium in hospitalized gerontopsychiatry and internal medicine patients. Stud Health Technol Inform 2017;236:32–9.
Kurisu K, Inada S, Maeda I, et al. A decision tree prediction model for a short-term outcome of delirium in patients with advanced cancer receiving pharmacological interventions: a secondary analysis of a multicenter and prospective observational study (phase-R). Palliat Support Care 2022;20:153–8. doi:10.1017/S1478951521001565
Li Q, Zhao Y, Chen Y, et al. Developing a machine learning model to identify delirium risk in geriatric internal medicine inpatients. Eur Geriatr Med 2022;13:173–83. doi:10.1007/s41999-021-00562-9
Lucini FR, Fiest KM, Stelfox HT, et al. Delirium prediction in the intensive care unit: a temporal approach. Annu Int Conf IEEE Eng Med Biol Soc 2020;2020:5527–30. doi:10.1109/EMBC44109.2020.9176042
Menzenbach J, Kirfel A, Guttenthaler V, et al. Pre-operative prediction of postoperative delirium by appropriate screening (PROPDESC) development and validation of a pragmatic POD risk screening score based on routine preoperative data. J Clin Anesth 2022;78:110684. doi:10.1016/j.jclinane.2022.110684
Mufti HN, Hirsch GM, Abidi SR, et al. Exploiting machine learning models and methods for the prediction of agitated delirium after cardiac surgery: models development and validation study. JMIR Med Inform 2019;7:e14993. doi:10.2196/14993
Netzer M, Hackl WO, Schaller M, et al. Evaluating performance and interpretability of machine learning methods for predicting delirium in gerontopsychiatric patients. Stud Health Technol Inform 2020;271:121–8. doi:10.3233/SHTI200087
Oh J, Cho D, Park J, et al. Prediction and early detection of delirium in the intensive care unit by using heart rate variability and machine learning. Physiol Meas 2018;39:035004. doi:10.1088/1361-6579/aaab07
Oosterhoff JHF, Karhade AV, Oberai T, et al. Prediction of postoperative delirium in geriatric hip fracture patients: a clinical prediction model using machine learning models. Geriatr Orthop Surg Rehabil 2021;12:21514593211062277. doi:10.1177/21514593211062277
Oosterhoff JHF, Oberai T, Karhade AV, et al. Does the SORG orthopaedic research group hip fracture delirium algorithm perform well on an independent intercontinental cohort of patients with hip fractures who are 60 years or older? Clin Orthop Relat Res 2022;480:2205–13. doi:10.1097/CORR.0000000000002246
Racine AM, Tommet D, D’Aquila ML, et al. Machine learning to develop and internally validate a predictive model for post-operative delirium in a prospective, observational clinical cohort study of older surgical patients. J Gen Intern Med 2021;36:265–73. doi:10.1007/s11606-020-06238-7
Son CS, Kang WS, Lee JH, et al. Machine learning to identify psychomotor behaviors of delirium for patients in long-term care facility. IEEE J Biomed Health Inform 2022;26:1802–14. doi:10.1109/JBHI.2021.3116967
Sun H, Depraetere K, Meesseman L, et al. Machine learning-based prediction models for different clinical risks in different hospitals: evaluation of live performance. J Med Internet Res 2022;24:e34295. doi:10.2196/34295
Sun H, Depraetere K, Meesseman L, et al. A scalable approach for developing clinical risk prediction applications in different hospitals. J Biomed Inform 2021;118:103783. doi:10.1016/j.jbi.2021.103783
Veeranki SPK, Hayn D, Jauk S, et al. An improvised classification model for predicting delirium. Stud Health Technol Inform 2019;264:1566–7. doi:10.3233/SHTI190537
Veeranki SPK, Hayn D, Kramer D, et al. Effect of nursing assessment on predictive delirium models in hospitalised patients. Stud Health Technol Inform 2018;248:124–31.
Wang Y, Lei L, Ji M, et al. Predicting postoperative delirium after microvascular decompression surgery with machine learning. J Clin Anesth 2020;66:109896. doi:10.1016/j.jclinane.2020.109896
Wong A, Young AT, Liang AS, et al. Development and validation of an electronic health record-based machine learning model to estimate delirium risk in newly hospitalized patients without known cognitive impairment. JAMA Netw Open 2018;1:e181018. doi:10.1001/jamanetworkopen.2018.1018
Xue B, Li D, Lu C, et al. Use of machine learning to develop and evaluate models using preoperative and intraoperative data to identify risks of postoperative complications. JAMA Netw Open 2021;4:e212240. doi:10.1001/jamanetworkopen.2021.2240
Xue X, Chen W, Chen X. A novel radiomics-based machine learning framework for prediction of acute kidney injury-related delirium in patients who underwent cardiovascular surgery. Comput Math Methods Med 2022;2022:4242069. doi:10.1155/2022/4242069
Zhao H, You J, Peng Y, et al. Machine learning algorithm using electronic chart-derived data to predict delirium after elderly hip fracture surgeries: a retrospective case-control study. Front Surg 2021;8:634629. doi:10.3389/fsurg.2021.634629
Zhao Y, Luo Y. Unsupervised learning to subphenotype delirium patients from electronic health records. In: 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); Houston, TX, USA, 2021. doi:10.1109/BIBM52615.2021.9669806
Fliegenschmidt J, Hulde N, Preising MG, et al. Artificial intelligence predicts delirium following cardiac surgery: a case study. J Clin Anesth 2021;75:110473. doi:10.1016/j.jclinane.2021.110473
Jauk S, Kramer D, Avian A, et al. Technology acceptance of a machine learning algorithm predicting delirium in a clinical setting: a mixed-methods study. J Med Syst 2021;45:52. doi:10.1007/s10916-021-01728-5

Supplementary materials

Supplementary Data

This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Data supplement 1
Data supplement 2
Data supplement 3

Footnotes

Contributors All authors contributed to the conception and design of the work. TS, LSH and IT contributed to database searching. TS, LSH, IT and IAS contributed to full-text screening. TS, LSH, IT and IAS contributed to data extraction and analysis. TS and IAS drafted the paper. All authors commented on and helped revise iterative drafts. IAS is the guarantor who accepts full responsibility for the finished article and had access to all data from literature searches. All authors gave final approval for the manuscript to be published and agreed to be accountable for all aspects of the work.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
