Artificial intelligence framework for simulating clinical decision-making: A Markov decision process approach

https://doi.org/10.1016/j.artmed.2012.12.003

Abstract

Objective

In the modern healthcare system, rapidly expanding costs/complexity, the growing myriad of treatment options, and exploding information streams that often do not effectively reach the front lines hinder the ability to choose optimal treatment decisions over time. The goal in this paper is to develop a general purpose (non-disease-specific) computational/artificial intelligence (AI) framework to address these challenges. This framework serves two potential functions: (1) a simulation environment for exploring various healthcare policies, payment methodologies, etc., and (2) the basis for clinical artificial intelligence – an AI that can “think like a doctor”.

Methods

This approach combines Markov decision processes and dynamic decision networks to learn from clinical data and develop complex plans via simulation of alternative sequential decision paths while capturing the sometimes conflicting, sometimes synergistic interactions of various components in the healthcare system. It can operate in partially observable environments (in the case of missing observations or data) by maintaining belief states about patient health status and functions as an online agent that plans and re-plans as actions are performed and new observations are obtained. This framework was evaluated using real patient data from an electronic health record.

Results

The results demonstrate the feasibility of this approach; such an AI framework easily outperforms the current treatment-as-usual (TAU) case-rate/fee-for-service models of healthcare. The cost per unit of outcome change (CPUC) was $189 vs. $497 for AI vs. TAU (where lower is considered optimal) – while at the same time the AI approach could obtain a 30–35% increase in patient outcomes. Tweaking certain AI model parameters could further enhance this advantage, obtaining approximately 50% more improvement (outcome change) for roughly half the costs.

Conclusion

Given careful design and problem formulation, an AI simulation framework can approximate optimal decisions even in complex and uncertain environments. Future work is described that outlines potential lines of research and integration of machine learning algorithms for personalized medicine.

Introduction

There are multiple major problems in the functioning and delivery of the modern healthcare system – rapidly expanding costs and complexity, the growing myriad of treatment options, and exploding information streams that often do not, or at most ineffectively, reach the front lines. Even the answer to the basic healthcare question of “What is wrong with this person” often remains elusive in the modern era – let alone clear answers on the most effective treatment for an individual or how we achieve lower costs and greater efficiency. With the expanding use of electronic health records (EHRs) and growth of large public biomedical datasets (e.g. GenBank, caBig), the area is ripe for applications of computational and artificial intelligence (AI) techniques in order to uncover fundamental patterns that can be used to predict optimal treatments, minimize side effects, reduce medical errors/costs, and better integrate research and practice [1].

These challenges represent significant opportunities for improvement. Currently, patients receive correct diagnoses and treatment less than 50% of the time (at first pass) [2]. There is stark evidence of a 13–17-year gap between research and practice in clinical care [3]. This reality suggests that the current methods for moving scientific results into actual clinical practice are lacking. Furthermore, evidence-based treatments derived from such research are often out-of-date by the time they reach widespread use and do not always account for real-world variation that typically impedes effective implementation [4]. At the same time, healthcare costs continue to spiral out of control, on pace to reach 30% of gross domestic product by 2050 at current growth rates [5]. Training a human doctor to understand/memorize all the complexity of modern healthcare, even in their specialty domain, is a costly and lengthy process – for instance, training a human surgeon now takes on average 10 years or 10,000 h of intensive involvement [6].

The goal in this paper is to develop a general purpose (non-disease-specific) computational/AI framework in an attempt to address these challenges. Such a framework serves two potential functions. First, it provides a simulation environment for understanding and predicting the consequences of various treatment or policy choices. Such simulation modeling can help improve decision-making and the fundamental understanding of the healthcare system and clinical process – its elements, their interactions, and the end result – by playing out numerous potential scenarios in advance. Secondly, such a framework can provide the basis for clinical artificial intelligence that can deliberate in advance, form contingency plans to cope with uncertainty, and adjust to changing information on the fly. In essence, we are attempting to replicate clinician decision-making via simulation. With careful design and problem formulation, we hypothesize that such an AI simulation framework can approximate optimal decisions even in complex and uncertain environments, and approach – and perhaps surpass – human decision-making performance for certain tasks. We test this hypothesis using real patient data from an EHR.

Combining autonomous AI with human clinicians may serve as the most effective long-term path. Let humans do what they do well, and let machines do what they do well. In the end, we may maximize the potential of both. Such technology has the potential to function in multiple roles: enhanced telemedicine services, automated clinician's assistants, and next-generation clinical decision support systems (CDSS) [7], [8].

In previous work, we have detailed computational approaches for determining optimal treatment decisions at single timepoints via the use of data mining/machine learning techniques. Initial results of such approaches have achieved success rates of near 80% in predicting optimal treatment for individual patients with complex, chronic illness, and hold promise for further improvement [7], [9]. Predictive algorithms based on such data-driven models are essentially an individualized form of practice-based evidence drawn from the live population. Another term for this is “personalized medicine”.

The ability to adapt specific treatments to fit the characteristics of an individual's disorder transcends the traditional disease model. Prior work in this area has primarily addressed the utility of genetic data to inform individualized care. However, it is likely that the next decade will see the integration of multiple sources of data – genetic, clinical, and socio-demographic – to build a more complete profile of the individual, their inherited risks, and the environmental/behavioral factors associated with disorder and the effective treatment thereof [10]. Indeed, we already see the trend of combining clinical and genetic indicators in prediction of cancer prognosis as a way of developing cheaper, more effective prognostic tools [11], [12], [13]. Such computational approaches can serve as a component of a larger potential framework for real-time data-driven clinical decision support, or “adaptive decision support”. This framework can be integrated into an existing clinical workflow, essentially functioning as a form of artificial intelligence that “lives” within the clinical system, can “learn” over time, and can adapt to the variation seen in the actual real-world population (Fig. 1). The approach is two-pronged – both developing new knowledge about effective clinical practices as well as modifying existing knowledge and evidence-based models to fit real-world settings [7], [9].

The focus of the current study is to extend the prior work beyond optimizing treatments at single decision points in clinical settings. This paper considers sequential decision processes, in which a sequence of interrelated decisions must be made over time, such as those encountered in the treatment of chronic disorders.

At a broad level, modeling of dynamic sequential decision-making in medicine has a long and varied history. Among these modeling techniques are the Markov-based approaches used here, originally described in terms of medical decision-making by Beck and Pauker [14]. Other approaches utilize dynamic influence diagrams [15] or decision trees [16], [17] to model temporal decisions. An exhaustive review of these approaches is beyond the scope of this article, but a general overview of simulation modeling techniques can be found in Stahl [17]. In all cases, the goal is to determine optimal sequences of decisions out to some horizon. The treatment of time – whether it is continuous or discrete, and (if the latter) how time units are determined – is a critical aspect in any modeling effort [17], as are the trade-offs between solution quality and solution time [15]. Problems can be either finite-horizon or infinite-horizon. In either case, utilities/rewards of various decisions can be undiscounted or discounted, where discounting increases the importance of short-term utilities/rewards over long-term ones [18].
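The effect of discounting can be illustrated with a minimal sketch. The reward values and the discount factor `gamma` below are illustrative assumptions, not values taken from this study.

```python
def discounted_return(rewards, gamma=1.0):
    """Sum of rewards where the reward at step k is weighted by gamma**k.

    gamma = 1.0 gives the undiscounted total; gamma < 1.0 weights
    near-term rewards more heavily than distant ones.
    """
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


# Hypothetical per-step utilities over a finite horizon of four steps.
rewards = [1.0, 1.0, 1.0, 1.0]

undiscounted = discounted_return(rewards)           # 1 + 1 + 1 + 1
discounted = discounted_return(rewards, gamma=0.5)  # 1 + 0.5 + 0.25 + 0.125
```

With `gamma = 0.5`, the reward at the final step contributes only one eighth of its face value, which is the sense in which discounting favors short-term utilities.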

Markov decision processes (MDPs) are one efficient technique for determining such optimal sequential decisions (termed a “policy”) in dynamic and uncertain environments [18], [19], and have been explored in medical decision-making problems in recent years [18], [20]. MDPs (and their partially observable cousins) directly address many of the challenges faced in clinical decision-making [17], [18]. Clinicians typically determine the course of treatment considering current health status as well as some internal approximation of the outcome of possible future treatment decisions. However, the effect of treatment for a given patient is non-deterministic (i.e. uncertain), and attempting to predict the effects of a series of treatments over time only compounds this uncertainty. A Markov approach provides a principled, efficient method to perform probabilistic inference over time given such non-deterministic action effects. Other complexities (and/or sources of uncertainty) include limited resources, unpredictable patient behavior (e.g., lack of medication adherence), and variable treatment response time. These sources of uncertainty can be directly modeled as probabilistic components in a Markov model [19]. Additionally, the use of outcome deltas, as opposed to clinical outcomes themselves, can provide a convenient history meta-variable for maintaining the central Markov assumption: that the state at time t depends only on the information at time t − 1 [17]. Currently, most treatment decisions in the medical domain are made via ad hoc or heuristic approaches, but there is a growing body of evidence that such complex treatment decisions are better handled through modeling rather than intuition alone [18], [21].
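To make the notion of a policy concrete, the following is a minimal sketch of standard value iteration on a toy two-state MDP. It is not necessarily the solver used in this study; the state names, actions, transition probabilities, and rewards are all hypothetical and chosen only for illustration.

```python
def q_value(s, a, V, P, R, gamma):
    """Expected return of taking action a in state s, then following values V."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())


def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-9):
    """Iterate the Bellman optimality update until values stop changing."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {s: max(q_value(s, a, V, P, R, gamma) for a in actions)
                 for s in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    # The policy picks, in each state, the action with the highest expected return.
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, P, R, gamma))
              for s in states}
    return V, policy


# Hypothetical toy model: treatment effects are uncertain (non-deterministic).
states = ["ill", "well"]
actions = ["treat", "wait"]
P = {  # P[state][action] -> {next_state: probability}
    "ill": {"treat": {"well": 0.7, "ill": 0.3}, "wait": {"well": 0.1, "ill": 0.9}},
    "well": {"treat": {"well": 0.95, "ill": 0.05}, "wait": {"well": 0.9, "ill": 0.1}},
}
R = {  # R[state][action]: assumed health benefit minus assumed treatment cost
    "ill": {"treat": -1.0, "wait": 0.0},
    "well": {"treat": 1.0, "wait": 2.0},
}

V, policy = value_iteration(states, actions, P, R)
```

Under these assumed numbers, the resulting policy treats when the patient is ill and waits when the patient is well, since treatment costs are only worth paying when they buy a meaningful chance of improvement.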

Partially observable Markov decision processes (POMDPs) extend MDPs by maintaining internal belief states about patient status, treatment effect, etc., similar to the cognitive planning aspects in a human clinician [22], [23]. This is essential for dealing with real-world clinical issues of noisy observations and missing data (e.g. no observation at a given timepoint). By using temporal belief states, POMDPs can account for the probabilistic relationship between observations and underlying health status over time and reason/predict even when observations are missing, while still using existing methods to perform efficient Bayesian inference. MDPs/POMDPs can also be designed as online AI agents – determining an optimal policy at each timepoint (t), taking an action based on that optimal policy, then re-determining the optimal policy at the next timepoint (t + 1) based on new information and/or the observed effects of performed actions [24], [25].
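The belief-state idea can be sketched as a discrete Bayesian filter. This is a minimal illustration under assumed numbers: the transition model `T`, the observation model `O`, and the state and observation names are hypothetical, and the function is not the paper's implementation. A missing observation is handled by applying only the prediction step.

```python
def update_belief(belief, action, observation, T, O):
    """One step of belief tracking over hidden patient states.

    belief: {state: probability}; observation may be None (missing data).
    """
    # Prediction: push the current belief through the transition model.
    predicted = {
        s2: sum(belief[s] * T[s][action].get(s2, 0.0) for s in belief)
        for s2 in belief
    }
    if observation is None:
        return predicted  # no evidence this timepoint; keep the prediction
    # Correction: reweight by how likely each state is to produce the observation.
    unnormalized = {s: O[s][observation] * predicted[s] for s in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical models for illustration only.
T = {  # T[state][action] -> {next_state: probability}
    "ill": {"treat": {"well": 0.7, "ill": 0.3}},
    "well": {"treat": {"well": 0.95, "ill": 0.05}},
}
O = {  # O[state] -> {observation: probability}
    "ill": {"low_score": 0.8, "high_score": 0.2},
    "well": {"low_score": 0.3, "high_score": 0.7},
}

prior = {"ill": 0.5, "well": 0.5}
with_obs = update_belief(prior, "treat", "high_score", T, O)
without_obs = update_belief(prior, "treat", None, T, O)
```

An online agent in this style would call `update_belief` after every action, then re-plan from the updated belief at the next timepoint.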

A challenge in applying MDPs/POMDPs is that they require a data-intensive estimation step to generate reasonable transition models – how belief states evolve over time – and observation models – how unobserved variables affect observed quantities. Large state/decision spaces are also computationally expensive to solve, particularly in the partially observable setting, and the model must adhere to the Markov assumption that the current timepoint (t) depends only on the previous timepoint (t − 1). Careful formulation of the problem and state space is necessary to handle such issues [17], [19].
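One simple way to perform such an estimation step is to count observed transitions in historical records. The sketch below is an assumption-laden illustration: the record format, the state and action names, and the use of additive (Laplace) smoothing are choices made here, not details taken from this study.

```python
from collections import Counter, defaultdict


def estimate_transitions(records, states, smoothing=1.0):
    """Estimate P(next_state | state, action) from (state, action, next_state) triples.

    Additive smoothing keeps rarely seen transitions from getting probability zero.
    """
    counts = defaultdict(Counter)
    for state, action, next_state in records:
        counts[(state, action)][next_state] += 1
    model = {}
    for key, counter in counts.items():
        total = sum(counter.values()) + smoothing * len(states)
        model[key] = {s2: (counter[s2] + smoothing) / total for s2 in states}
    return model


# Hypothetical records, one per pair of consecutive visits.
records = [
    ("ill", "treat", "well"),
    ("ill", "treat", "well"),
    ("ill", "treat", "ill"),
    ("well", "wait", "well"),
]
model = estimate_transitions(records, states=["ill", "well"])
```

With only a handful of records per state-action pair, the smoothed estimates remain close to uniform, which illustrates why the estimation step is data-intensive.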

There have been many applications in other domains, such as robotics, manufacturing, and inventory control [17], [19], [26]. However, despite such applicability of sequential decision-making techniques like MDPs to medical decision-making, there have been relatively few applications in healthcare [18], [19].

Here, we outline an MDP/POMDP simulation framework using agents based on clinical EHR data drawn from real patients in a chronic care setting. We attempt to optimize “clinical utility” in terms of cost-effectiveness of treatment (utilizing both outcomes and costs) while accurately reflecting realistic clinical decision-making. The focus is on the physician's (or physician agent's) optimization of treatment decisions over time. We compare the results of these computational approaches with existing treatment-as-usual approaches to test our primary hypothesis – whether we can construct a viable AI framework from existing techniques that can approach or even surpass human decision-making performance (see Section 1.2).

The framework is structured as a multi-agent system (MAS) for future potential studies, though at the current juncture this aspect is not fully leveraged. However, combining MDPs and MAS opens up many interesting opportunities. For instance, we can model personalized treatment simply by having each patient agent maintain their own individualized transition model (see Section 4). MAS can capture the sometimes synergistic, sometimes conflicting nature of various components of such systems and exhibit emergent, complex behavior from simple interacting agents [14], [27]. For instance, a physician may prescribe a medication, but the patient may not adhere to treatment [20].

Section snippets

Data

Clinical data, including outcomes, treatment information, demographic information, and other clinical indicators, was obtained from the electronic health record (EHR) at Centerstone for 961 patients who participated in the client-directed outcome-informed (CDOI) pilot study in 2010 [9], as well as patients who participated in the ongoing evaluation of CDOI post-pilot phase. This sample contained 5807 patients, primarily consisting of major clinical depression diagnoses, with a significant

General results

Nearly 100 different constructs were evaluated during this study. For brevity, a sampling of the main results is shown in Table 1 (with OSF = 0). In general, the purely probabilistic decision-making models performed poorly, and are not shown here. In all tables, results are based on averages/percentages across all patients (n = 500, see Section 2.1) in each construct simulation. As defined in Section 2.3, the goal here (i.e. optimality) is defined as maximizing patient improvement while minimizing

Summarization of findings

The goal in this paper was to develop a general purpose (non-disease-specific) computational/AI framework in an attempt to address fundamental healthcare challenges – rising costs, sub-optimal quality, difficulty moving research evidence into practice, among others. This framework serves two potential purposes:

  (1) A simulation environment for exploring various healthcare policies, payment methodologies, etc.

  (2) The basis for clinical artificial intelligence – an AI that can “think like a doctor”.

This

Conflict of interest statement

The authors have no conflict of interest related to the research presented herein.

Acknowledgements

This research is funded by the Ayers Foundation, the Joe C. Davis Foundation, and Indiana University. The funders had no role in the design, implementation, or analysis of this research. The author would like to acknowledge the support of the following Centerstone Research Institute staff in this work: Dr. Tom Doub, Dr. Dennis Morrison, Dr. April Bragg, and Dr. Rebecca Selove. The opinions expressed herein do not necessarily reflect the views of Centerstone, Indiana University, or their

References (39)

  • P.R. Orszag et al. The challenge of rising health care costs – a view from the Congressional Budget Office. New England Journal of Medicine (2007)
  • G.P. Jackson et al. How long does it take to train a surgeon? BMJ (2009)
  • C.C. Bennett et al. Data mining and electronic health records: selecting optimal clinical treatments in practice
  • C.C. Bennett et al. Data mining session-based patient reported outcomes (PROs) in a mental health setting: toward data-driven clinical decision support and personalized treatment
  • I.S. Kohane. The twin questions of personalized medicine: who are you and whom do you most resemble? Genome Medicine (2009)
  • Y. Sun et al. Improved breast cancer prognosis through the combination of clinical and genetic markers. Bioinformatics (2007)
  • O. Gevaert et al. Predicting the prognosis of breast cancer by integrating clinical and microarray data with Bayesian networks. Bioinformatics (2006)
  • A.L. Boulesteix et al. Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value. Bioinformatics (2008)
  • J.R. Beck et al. The Markov process in medical prognosis. Medical Decision Making (1983)