Original Research

Physician performance scores used to predict emergency department admission numbers and excessive admissions burden

Abstract

Background Overcrowding in hospitals is associated with a panoply of adverse events. Inappropriate decisions in the emergency department (ED) contribute to overcrowding. The performance of individual physicians as part of the admitting team is a critical factor in determining the overall rate of admissions. While previous attempts to model admission numbers have been based on a range of variables, none have included measures of individual staff performance. We construct reliable objective measures of staff performance and use these, among other factors, to predict the number of daily admissions. Such modelling will enable enhanced workforce planning and timely intervention to reduce inappropriate admissions and overcrowding.

Methods A database was created of 232 245 ED attendances at Meir Medical Center in central Israel, spanning the years 2016–2021. We use several measures of physician performance together with historic caseload data and other variables to derive statistical models for the prediction of ED arrival and admission numbers.

Results Our models predict arrival numbers with a mean absolute percentage error (MAPE) of 6.85%, and admission numbers with a MAPE of 10.6%, and provide a same-day alert for heavy admissions burden with 75% sensitivity for a false-positive rate of 20%. The inclusion of physician performance measures provides an essential boost to model performance.

Conclusions Arrival and admission numbers can be predicted with sufficient fidelity to enable interventions to reduce excess admissions and smooth patient flow. Individual staff performance has a strong effect on admission rates and is a critical variable for the effective modelling of admission numbers.

What is already known on this topic

  • Mathematical models can be used to predict the numbers of arrivals to an emergency department, and the numbers of subsequent inpatient admissions. Previously published models did not include any measures of staff performance in making their predictions, despite evidence that this is an important factor in the decision-making process.

What this study adds

  • This study demonstrates the creation and use of a predictive model for inpatient admissions, which includes measures of staff performance, and demonstrates the increased performance and practical utility of such a model in predicting a high burden of admissions.

How this study might affect research, practice or policy

  • Models that use staff performance to predict admissions can be used as a tool to reduce admissions, by ensuring an appropriate balance of staff performance, particularly when a high burden of admissions is predicted.

Introduction

Background

Overcrowding in hospital wards is associated with a panoply of adverse consequences for patients and staff alike.1–4 Internal medicine wards in Israeli hospitals are notorious for being overcrowded, with an annual nationwide average occupancy of 97% in 2018 and 2019.5 These wards are struggling with patients in the corridor and team burnout, which is reflected in reduced quality of care for the most vulnerable patients, and in the increasing challenge of recruiting high-quality, motivated medical staff. The key to improvement is a reduction in the occupancy of the wards, by minimising unnecessary emergency department (ED) admissions and employing rapid workup and early discharge, with emphasis on ambulatory treatment when possible. These approaches result in shorter inpatient stays, reduced inpatient mortality and increased patient and staff satisfaction.6 7

The appropriateness of admissions-related decisions in the ED has a direct effect on overcrowding, and excess admissions are specifically associated with poorer clinical and organisational outcomes.8 9 Multiple non-medical factors can influence the decision to admit an ED attendee, including the workload of the ED, hospital bed occupancy rates, the hour and day of week, the date and its correspondence to public events, individual staff performance, and even the weather.10 11

By applying statistical inference techniques, it is possible to create models for the prediction of admission caseload, based on social, institutional, staffing and stochastic factors. Hitherto dozens of models have been published that attempt to forecast ED demand,12 13 as well as predicting admission likelihood for individual patients.14 Only a few make any attempt to forecast overall admission numbers.15–19 No study hitherto has attempted to include all available factors, and particularly estimates of staff performance, in modelling admissions. No study has used such models as tools for optimisation of staffing policies in order to reduce admissions. Furthermore, very few of the published studies of any type have specifically involved the Israeli hospital system which, as with any locale, has its own unique set of demographic and administrative challenges.

The aim of this paper is to demonstrate the creation and testing of a model for the prediction of admission numbers, to confirm the hypothesis that measures of staff performance are important inputs into such models and to demonstrate the feasibility of using such models as tools to optimise staff allocation and reduce admissions.

For the current study, we use 5 years of ED case records at Meir Medical Center, a busy secondary-level general hospital in central Israel. Using these data, we construct metrics for individual staff performance, as well as time series for historic arrival and admission numbers, ED caseload and other factors. Based on these, we construct models to predict both daily ED arrivals and daily admissions from the ED to the internal medicine wards. We also derive a tool that can produce an alert when any given day is likely to see an excessive number of admissions.

Methods

Study population and data set

Meir Medical Center is a busy, secondary-level general hospital in Kefar Sava in central Israel. It serves a demographically diverse population of close to one million. Adult patients seeking urgent care are seen at triage, and either sent to the internal medicine ED or to one of several specialist surgical EDs. Starting in March 2020, due to the incipient COVID-19 epidemic, the internal medicine ED was split into two sections, one for patients for whom there is a high suspicion of COVID-19 and one for the rest. The determination as to which section a patient should attend was made at triage.

Our database comprises every attendance at the internal medicine ED between 1 February 2016 and 31 March 2021. The basic anonymised data set comprised the following: the date and time of ED presentation, which unit was attended (the COVID ED or the regular ED), the date and time of the patient’s release from the ED, the type of disposal (admission, discharge, self-discharge against advice, etc) and the name of the physician who signed the disposal. Additionally, for patients who were admitted: the name of the inpatient unit, and the date and time of subsequent discharge. Records for multiple attendances by the same individual were linked. The following derived data were calculated from the basic data and stored alongside it: ED occupancy at the time of ED receipt, ED occupancy at the time of ED release, whether the patient returned to the ED within 7 days of discharge and whether the patient was admitted for less than 24 hours (a zero-day admission).

For the 61 months covered by the study, there were 232 246 attendances by 128 570 individual patients, who were treated by 886 individual physicians. For each day covered by the study, hour-by-hour weather conditions for Kefar Sava were fetched from a commercial provider. All statistical analysis was carried out in R 4.1.2 64-bit.20 A list of R packages used is available in the supplementary material. This study received ethical approval from the local Helsinki committee. There was no patient or public involvement in the design of this study.

Measuring physician performance

To serve as an effective tool for predicting admission numbers, measures of physician performance must be stable, reproducible, simple and comparable. The most direct measure of an ED physician’s proclivity to admit patients is her or his historical admission rate. Other hypothesised measures of performance that we consider are: the physician’s rate of zero-day admissions, as a proxy for inappropriate admissions; the rate at which a physician’s discharged patients return to the ED within 7 days, as a proxy for inappropriate discharges; and the speed at which a physician works, measured in terms of chart signatures per hour, as a general performance correlate.

Since any physician’s performance is expected to vary with experience, it is not sufficient to calculate these statistics as a simple average over all time. Instead, each score is computed as an exponentially weighted average, which favours more recent performance over older data. Separate scores are calculated for work in the COVID ED and the regular ED, since working conditions and hence performance differ dramatically between them. In order to compare directly between each of the different performance measures, and in order to combine performance scores from the COVID and regular EDs, every performance measure is converted to a peer-referenced t-score, in which a score of zero represents performance equal to the mean of a physician’s peers, and a score of ±1 represents performance one SD above or below the mean. The performance score in any given category for a physician who works in both the regular and COVID EDs is a weighted average of the separate scores for each ED, reflecting his/her relative workload in each location.
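The scoring steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the smoothing constant `alpha`, the use of the population SD, the use of case counts as workload weights and all function names are assumptions made here for illustration.

```python
from statistics import mean, pstdev

def ewma_rate(outcomes, alpha=0.05):
    """Exponentially weighted average of 0/1 outcomes, oldest first.

    alpha is a hypothetical smoothing constant; larger values favour
    recent cases more strongly.
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    score = float(outcomes[0])
    for x in outcomes[1:]:
        score = alpha * x + (1 - alpha) * score
    return score

def peer_scores(raw):
    """Standardise raw scores against peers: 0 = peer mean, +/-1 = one SD."""
    mu = mean(raw.values())
    sd = pstdev(raw.values())  # assumption: population SD over the peer group
    if sd == 0:
        return {name: 0.0 for name in raw}
    return {name: (value - mu) / sd for name, value in raw.items()}

def combine_sites(score_regular, score_covid, n_regular, n_covid):
    """Workload-weighted average of one physician's scores in the two EDs.

    Assumption: workload is measured by the number of cases seen at each site.
    """
    total = n_regular + n_covid
    if total <= 0:
        raise ValueError("physician has no recorded workload")
    return (score_regular * n_regular + score_covid * n_covid) / total
```

For example, `ewma_rate([0, 1, 1, 0, 1])` gives a recency-weighted admission rate for a physician whose last five disposals were discharge, admit, admit, discharge, admit.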

For the purposes of relating admission numbers to the historical performance of the admitting team, we must construct a weighted average over that team, for each of these performance t-scores, for each day. The weights for this average are chosen such that the team-performance score for any given day reflects the predicted contribution of each individual team member to the day’s work, based on posted hours and the individual’s rate of seeing patients. Full details of this calculation are provided in online supplemental appendix S1.
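A team-level score of this kind might be sketched as follows. The exact weighting is given in the supplemental appendix and is not reproduced here; this sketch simply assumes that each member's weight is posted hours multiplied by that member's historical patients-per-hour rate, and the function name and tuple layout are hypothetical.

```python
def team_score(members):
    """Weighted team average of individual t-scores for one day.

    members: list of (t_score, posted_hours, patients_per_hour) tuples.
    Assumed weight: expected number of patients seen = hours * rate.
    """
    weights = [hours * rate for _, hours, rate in members]
    total = sum(weights)
    if total <= 0:
        raise ValueError("team has no expected workload")
    weighted = sum(w * score for w, (score, _, _) in zip(weights, members))
    return weighted / total
```

Under this assumption, a fast physician on a long shift pulls the team score towards her or his own score more strongly than a slower colleague on a short shift.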

Modelling ED arrival and admission numbers

For both arrival and admission numbers, we construct a predictive model based on a negative-binomial generalised linear regression. The independent variables for each model are listed in table 1. As in previous studies, the inclusion of historic arrivals and admissions data provides a key boost to model performance.17 The most accurate predictions are made using arrivals data as recent as the hours leading up to the prediction being made.
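For orientation, a negative-binomial regression for a daily count can be written in the generic form below. This assumes the conventional log link and quadratic variance function, which the text does not state explicitly; the symbols $y_t$ (count on day $t$), $x_{jt}$ (independent variables), $\beta_j$ (coefficients) and $\theta$ (dispersion) are notation introduced here for illustration.

$$
y_t \sim \mathrm{NegBin}(\mu_t, \theta), \qquad
\log \mu_t = \beta_0 + \sum_{j=1}^{p} \beta_j x_{jt}, \qquad
\operatorname{Var}(y_t) = \mu_t + \frac{\mu_t^{2}}{\theta}
$$

The extra term $\mu_t^{2}/\theta$ allows the variance to exceed the mean, which is the usual reason for preferring this form to a Poisson regression for count data.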

Table 1
Factors affecting daily ED arrivals (left column) and admission numbers (right column) at Meir Medical Center

Redundant independent variables were removed from each model by an iterative procedure. At each step, the least significant variable was removed and a new model generated, until all remaining variables had a statistically significant effect on model output. The final form of each model was shown to have similar predictive power to the initial model. In order to improve fitting performance and predictivity at the extremes of the range, at the expense of a slight loss in predictivity at the centre of the data, weighting was applied which favoured data for days with a number of arrivals that was far from the mean. Full details of the arrivals model and the fitting procedure are provided in online supplemental appendix S2, and full details of the admissions model in online supplemental appendix S3.
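The iterative removal of redundant variables can be sketched generically as backward elimination. The sketch below is independent of any particular fitting library: `fit_pvalues` is a hypothetical caller-supplied function that refits the model on the given variables and returns their p-values, and the 0.05 threshold is an assumption rather than a value stated in the text.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.05):
    """Drop the least significant variable until all remaining are significant.

    variables:   list of candidate variable names.
    fit_pvalues: callable taking a list of names, refitting the model and
                 returning {name: p_value} (hypothetical, supplied by caller).
    alpha:       significance threshold (assumed, not taken from the study).
    """
    kept = list(variables)
    while kept:
        pvals = fit_pvalues(kept)            # refit on the current variable set
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] < alpha:             # every remaining variable is significant
            break
        kept.remove(worst)                   # remove the least significant one
    return kept
```

Because the model is refitted after every removal, a variable that looks redundant early on can become significant once a collinear competitor has been dropped.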

Modelling crises in admission numbers

A useful consequence of our modelling efforts would be the ability to predict the occurrence of admission numbers sufficiently large that they pose a danger to the continued effective operation of the inpatient system—we shall term this situation a crisis. If a crisis can be predicted with sufficient early warning, it may be that a managerial intervention, such as the deployment of extra staff, can be enacted.

For our purposes, we shall define a crisis as any day on which the number of admissions is above the 90th percentile. At our hospital, this corresponds approximately to a rate of emergency admissions which requires each resident doctor on duty in the internal medicine wards to work non-stop with minimal rest and little capacity to deal with emergent situations.
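The crisis definition can be expressed as a simple labelling rule. This is a minimal sketch: the nearest-rank percentile method and the function names are choices made here, and the text does not say which percentile convention was used.

```python
import math

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (one of several common conventions)."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def label_crises(daily_admissions, pct=90):
    """Flag each day whose admission count exceeds the chosen percentile."""
    threshold = percentile_nearest_rank(daily_admissions, pct)
    return [count > threshold for count in daily_admissions]
```

By construction, roughly one day in ten is labelled a crisis, which fixes the overall pretest probability at about 0.1.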

We construct a binary logistic regression model, where the outcome variable is whether any particular day meets the above definition for a crisis. We include in our base model the same independent variables as per the ‘Modelling ED arrival and admission numbers’ section, and in addition, for each of the 14 days prior, whether or not that given day hosted a crisis. We then use the same procedure as described above to eliminate extraneous variables. Full details of the resulting model are provided in online supplemental appendix S4. As before, to make the most accurate predictions, the model requires ED arrivals data from the morning of the day in question.
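The 14 lagged crisis indicators can be built from the daily crisis flags as in the sketch below. Only the lag construction is shown; the other independent variables and the logistic fit itself are omitted, and the function name is hypothetical.

```python
def lagged_crisis_features(crisis_flags, n_lags=14):
    """Build one row of lag indicators per day that has a full history.

    crisis_flags: list of booleans, one per consecutive day, oldest first.
    Returns rows where row[k-1] is 1 if the day k days earlier was a crisis.
    The first n_lags days are skipped because their history is incomplete.
    """
    rows = []
    for t in range(n_lags, len(crisis_flags)):
        rows.append([int(crisis_flags[t - k]) for k in range(1, n_lags + 1)])
    return rows
```

Each returned row would be appended to the other independent variables for the corresponding day before fitting.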

Results

Predicting arrival numbers

Figure 1 details the performance of the arrivals model. The actual number of arrivals is shown to be well correlated with the predicted number, across the full range of values. The mean absolute percentage error (MAPE) is 6.85%.
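For reference, the MAPE is the average of the absolute prediction errors, each expressed as a percentage of the actual value. A minimal sketch with a hypothetical function name:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Days with an actual count of zero must be excluded beforehand,
    since the percentage error is undefined for them.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)
```

For instance, predictions of 110 and 180 against actual counts of 100 and 200 are each off by 10%, so the MAPE for those two days is 10%.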

Figure 1

Plot of predicted number of ED arrivals, versus actual number of arrivals, for the arrivals model described in ‘Modelling ED arrival and admission numbers’. Data for a perfectly predictive model would be aligned exactly along the black line. The grey ribbon is for the two-tailed 90% confidence interval associated with the predictions.

To test the ability of the model to retain predictive power when confronted by data on which it has not been trained, an alternative model was created, which is identical to the full model in all respects, except that it was trained on a randomly selected subset of 80% of the full data. Predictions were generated for the remaining 20% of the data set, on which the model was not trained. The performance of the model remained unchanged.
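The hold-out check described above amounts to a random 80/20 partition of the days. A minimal sketch, in which the fixed seed and the function name are assumptions added for reproducibility of the illustration:

```python
import random

def holdout_split(days, train_fraction=0.8, seed=0):
    """Randomly partition days into a training set and a held-out test set."""
    rng = random.Random(seed)        # fixed seed so the split is repeatable
    shuffled = list(days)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

The model is then fitted on the first part only, and its error is measured on the second part.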

Predicting admission numbers

Figure 2 shows the performance of the admissions model. The actual number of admissions is shown to be well correlated with the predicted number, across the full range of values. The MAPE is 10.6%.

Figure 2

Plot of predicted number of admissions, versus actual number of admissions, for the admissions model described in ‘Modelling ED arrival and admission numbers’. Data for a perfectly predictive model would be aligned exactly along the dotted line. The grey ribbon is for the two-tailed 90% confidence interval associated with the predictions.

Of all the physician performance metrics considered in ‘Measuring physician performance’ section, only the team admission rate metric was included in the final analysis. Although the zero-day admissions rate and the physician speed are indeed correlated with admission numbers, they have no correlation independent of team admission rate and therefore provide no extra predictive value to the model, though we note that clinician speed remains indirectly incorporated into the team admission rate metric as described in the ‘Measuring physician performance’ section. ED returnee rate was not found to correlate with admission numbers at all. Likewise, the amount of rainfall was not found to have an independent influence on admissions and was not included in the final model. Examination of model diagnostics shows that the negative binomial approximation is a good fit to the data, and there is no evidence for non-linearity or collinearity in the model variables. As with the arrivals model, the performance of the admissions model remained unchanged when confronted by data on which it had not been trained. To demonstrate the utility of clinician performance as an input variable, the same model was generated without including any performance variables: the MAPE is degraded to 11.1%.

It is possible to use the model as a tool to estimate the effect of prospective staffing changes on the number of admissions, by hypothetically varying the composition of the ED team, and observing the effect on output predictions. A preliminary analysis shows that simply by swapping physicians between shifts in a single week, it is possible to strengthen the team performance sufficiently on those days predicted to be busiest (Sundays and days after public holidays) to provide a 12% across-the-board reduction in admissions, and a 40% reduction in the variance in admission numbers.

Predicting crises in admission numbers

The response variable of the model from the ‘Modelling crises in admission numbers’ section is a probability for encountering a crisis on the given day in question. A crisis is predicted when this probability exceeds some prespecified value, and the sensitivity of the test is fixed by choosing this value. The resulting receiver operating characteristic (ROC) curve for this test is shown in the left panel of figure 3, and has an area under the curve (AUC) of 0.903.
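The construction of a ROC curve and its AUC from predicted probabilities can be sketched as follows. This is a generic illustration with hypothetical function names; here the horizontal axis is 1 − specificity, the conventional ROC quantity.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold.

    scores: predicted crisis probabilities; labels: 1 for crisis, 0 otherwise.
    A day is classed as positive when its score is at or above the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative day")
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and not y)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

Lowering the threshold moves the operating point up and to the right along the curve, trading more false alerts for higher sensitivity.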

Figure 3

Plots illustrating the performance of the binary classifier model for predicting crises in admissions numbers, as described in ‘Modelling crises in admission numbers’. The left panel shows a ROC curve for the model, while the right panel shows a curve of PPV vs sensitivity.

The dashed line in the right panel of figure 3 shows the sensitivity versus positive predictive value (PPV) for this test, when it is applied indiscriminately to every single day. Despite the high intrinsic performance, we see that the practical performance of this test is rather poor. For instance, if a false-positive rate of at most 0.25 (ie, a PPV of at least 0.75) is required, then the sensitivity falls to around 0.2. This occurs because the pretest probability of a crisis is low (by definition 0.1) when the test is used in this way.

The practical performance of the test can be improved by applying the test discriminately to those days with a higher likelihood of hosting a crisis. Our data show that Sunday (which is the first day of the working week in Israel) and any weekday following a public holiday were found to significantly increase the odds of a crisis. Indeed, half of all crises occur on these days alone. Conversely, almost no crises occurred at weekends or during public holidays.

The solid line in the right panel of figure 3 shows the sensitivity versus PPV for the test when applied only to Sundays or weekdays following public holidays, for which the pretest probability of a crisis is 0.35. Again considering a false-positive rate of at most 0.2, the sensitivity of the test is now improved to 0.75. The test thus becomes a practical tool for enabling same-day interventions to either directly reduce, or to mitigate the effects of, increased admissions on days at heightened risk of a crisis.
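The dependence of PPV on pretest probability follows from Bayes' theorem and can be sketched as below. The sensitivity and specificity used in the example are hypothetical values chosen only for illustration and are not results of this study; only the two pretest probabilities, 0.1 and 0.35, are taken from the text.

```python
def ppv(sensitivity, specificity, pretest_probability):
    """Positive predictive value from Bayes' theorem."""
    true_pos = sensitivity * pretest_probability
    false_pos = (1 - specificity) * (1 - pretest_probability)
    if true_pos + false_pos == 0:
        raise ValueError("the test never returns a positive result")
    return true_pos / (true_pos + false_pos)

# Hypothetical operating point, for illustration only.
SENS, SPEC = 0.75, 0.90
ppv_all_days = ppv(SENS, SPEC, 0.10)    # test applied to every day
ppv_high_risk = ppv(SENS, SPEC, 0.35)   # test applied to high-risk days only
```

At a fixed operating point, restricting the test to days with a higher pretest probability raises the PPV, which is the mechanism behind the difference between the dashed and solid curves.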

As before, we can examine the effect on model predictions of hypothetical changes to ED team composition. By swapping shifts to strengthen the busiest days at the expense of quieter ones, we find a dramatic reduction of 87% in the number of crises on the busiest days, and an overall reduction of around 33%.

Discussion

Using a database of 230 000 ED attendances spanning 61 months at Meir Medical Center, Israel, we have created a regression model that is capable of predicting total arrival numbers for the concurrent 24-hour period. The model uses historical, environmental and calendric factors and historical arrival numbers as input. It is most accurate when provided with arrivals data for the hours prior to the prediction being made, and the MAPE of the model is 6.85%.

Having postulated that historical physician performance can predict admissions-related decisions, we have constructed several measures of team performance which are intended to correlate with these decisions. We find that physicians’ historical admission rate, rate of zero-day admissions and throughput in terms of patients seen per hour, are highly predictive. However, these measures are strongly collinear and in practice only the historical admission rate acts as an independent predictive variable. We find that the rate of ED returnees has no predictive value for admissions-related decisions.

Using the historical admission rate for the individuals working that shift, together with historical admissions and arrivals data, and environmental and calendric factors, we have constructed a regression model capable of predicting same-day admission numbers. The MAPE of the model is 10.6% and the inclusion of physician performance variables provides a clear boost to performance.

Using similar techniques, a mathematical test was developed that is able to provide a same-day prediction for when numbers of admissions rise above that with which the inpatient system is able to safely cope. When applied to those days at the highest risk of such a crisis, the sensitivity of the test is around 0.75 for detecting an impending crisis, for a false-positive rate of 0.25. Such advance warning would enable same-day interventions to mitigate the effect of such a crisis.

In relation to the existing literature, the performance of our model is at least as good as those previously published.12 13 The use of team performance metrics to augment admissions modelling is new in the medical literature, and results in a MAPE that already improves on the best previously published.15–19 The application of our model to predict crises in admissions, and the efficacy of attempts to mitigate their consequences, is new. The application of these techniques to medical admissions data is also new in the context of Israeli healthcare.

The main limitation of the present study is that the formulated models are directly applicable only to our own institution, although similar models could be derived easily for any other ED given the requisite data. Although our principal objective is the modelling of medical admissions, there is no reason why a very similar approach cannot be taken regarding surgical or orthopaedic admissions.

Given the large size of our primary dataset, it is unlikely that we can improve the performance of either our arrivals or admissions model by expanding its size. Conversely, the inclusion of hitherto unmodelled variables, such as local traffic patterns or individual ward occupancy, could well reduce residual variance and enhance predictivity. Technological upgrades involving newer machine learning techniques may offer some improved model performance.

In summary, we have shown that consideration of physician performance is vital for models that predict ED admission numbers. A preliminary analysis shows that dramatic reductions in admissions for minimal outlay are possible by using such models as tools to optimise ED staffing. Our most immediate task is to demonstrate the practical application of our work to ED workforce planning in our institution.