Discussion
This single-centre data informatics feasibility study demonstrated for the first time that forecasting ICU bed availability was possible using only routinely collected hospital bed management data without the need for detailed and sensitive clinical data. Our data-driven bed availability prediction performed better over shorter prediction time frames.
We are the first to present a novel reformulation of the bed occupancy problem: classifying whether one or more beds will be available on a future date. This is an important distinction from trying to predict the exact number of available beds, as it is a simpler problem to solve. Although more limited in clinical scope, this formulation has specific applications when assessing whether a bed will be available for upcoming admissions to the unit, while also being easier to interpret and use. Now that we have confirmed that our model can forecast the availability of at least one PICU bed, the methodology could in the future be generalised to classify any number of available beds. For example, forecasting the availability of at least two ICU beds would account for keeping an emergency bed available if elective surgery planned for the day were to proceed. Forecasting ICU bed availability allows time for elective surgery to be rescheduled and resources to be reallocated, thereby reducing wastage. Preventing last-minute surgery cancellation may help to reduce distress for the patient and their family. Future clinical translation and implementation of any data-driven bed availability forecasting model will depend on achieving a forecasting accuracy or an AUC level that is of clinical importance. Currently, there is no agreed clinically significant accuracy or AUC level in the literature for these types of bed availability forecasting models, and this warrants future joint investigation by clinicians and data scientists.
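The reformulation described above can be illustrated with a minimal sketch. It assumes a daily series of available-bed counts and derives a binary target for a chosen horizon and bed threshold. The function name `availability_labels`, its parameters and the example counts are hypothetical and are not taken from the study's pipeline.

```python
from typing import List, Optional


def availability_labels(
    available_beds: List[int], horizon: int, min_beds: int = 1
) -> List[Optional[int]]:
    """Label day t with 1 if at least `min_beds` beds are available on day
    t + horizon, else 0. The final `horizon` days have no known future
    value and are labelled None."""
    labels: List[Optional[int]] = []
    for t in range(len(available_beds)):
        future = t + horizon
        if future >= len(available_beds):
            labels.append(None)  # target day lies beyond the recorded data
        else:
            labels.append(1 if available_beds[future] >= min_beds else 0)
    return labels


# Hypothetical daily counts of available beds (illustration only).
daily_available = [2, 0, 1, 3, 0, 0, 1, 2]

# "At least one bed tomorrow" versus "at least two beds tomorrow".
one_bed_next_day = availability_labels(daily_available, horizon=1)
two_beds_next_day = availability_labels(daily_available, horizon=1, min_beds=2)
```

Raising `min_beds` from 1 to 2 changes only the threshold, not the structure of the task, which is why the same classification approach could in principle be reused for any number of beds.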
Using hospital bed management data to predict hospital-wide bed availability has been reported previously,7 19 20 but we are the first to use this approach to predict daily ICU bed capacity specifically. Previously, the prediction of future intensive care bed availability relied on detailed and sensitive clinical data.21 The majority of previous work in bed availability prediction required, at the very least, detailed patient LoS data, for example, summary LoS statistics such as the mean22 23 or used techniques such as compartment modelling24 and discrete event simulation.25 26 Other approaches used detailed patient-level clinical, demographic and physiology data to predict individual remaining LoS.18 However, they could only predict short horizons into the future, based on the LoS of patients in the unit with specific pathologies. Extracting and analysing detailed patient-level data is also challenging, requiring complex ethical and regulatory considerations. Applying state-of-the-art forecasting methods only to the temporal trends present in non-sensitive, non-patient-based, routinely collected hospital management data, as we did in our project, offers a clear advantage for wider uptake. This is because secondary analysis of routinely collected hospital bed management data can be performed by all hospitals in Britain as quality improvement initiatives.
For the classification task, gradient boosted tree methods had the best performance over 7 and 14 days, and logistic regression performed best for the 1-day prediction. In the regression task, simpler linear models performed as well as the complex non-linear methods. This was a surprising finding, especially as other similar work in the literature has moved towards non-linear neural network-based models. Kutafina et al7 used a recursive neural network-based model architecture while Kumar et al20 used a standard neural network, with both showing improved results over older literature which used linear models.19 It is well known that recurrent neural network-based models such as LSTMs are typically very suitable for long-term forecasting because of their ability to memorise key temporal patterns.27 In contrast, in our study, LSTM performance only equalled that of the simpler linear models, as did that of other state-of-the-art forecasting models such as DLinear, which automatically extract frequency-based information such as seasonality.28 It is possible that the propensity of these models to overfit the data, coupled with the large number of available hyperparameters and the high relative variance observed in this dataset, led to difficulty in finding an optimal model. The strong performance of the simpler models did, however, provide a degree of explainability, as it was possible to directly evaluate feature importance with both linear and XGBoost models.
All the top-performing models in this project had access to features relating not only to previous PICU occupancy, but also to resource data such as PBB, admissions and discharges/transfers for all units and wards, and to temporal information such as days, weeks and public/school holidays. One advantage of the models used was that, with scaled data, it was possible to directly understand the importance of specific features for the predictions. The predictions from the regression models relied most heavily on recent PICU-related information (figure 4), while the most important features used by the classification models were mostly temporal features (figure 5), in particular whether or not it was a weekend. The models in our study also predominantly used PICU features as opposed to features from other wards, indicating that, in general, resource information from wards outside the PICU was of lesser importance for predicting future bed availability.
Figure 4. Feature importance showing the 10 features with the largest-magnitude weights in the top-performing regression models used to determine whether a PICU bed will be available in (A) 1 day, (B) 7 days and (C) 14 days. Features with larger magnitude weights have a bigger impact on the model predictions. Numbers in brackets indicate features lagged by that many days. PBB, potential bed base; PICU, paediatric intensive care unit; TCI, to come in.
Figure 5. Feature importance showing the 10 features with the largest-magnitude weights in the top-performing logistic regression models used to determine whether a PICU bed will be available in (A) 1 day, (B) 7 days and (C) 14 days. Features with larger magnitude weights have a bigger impact on the model predictions. Numbers in brackets indicate features lagged by that many days, and the weeks refer to weeks of the year: for example, week 28 is a week in the Edinburgh school summer holidays, week 32 is the first week of the Edinburgh autumn school term, and week 41 is the autumn half-term week in October of the Edinburgh school term. PBB, potential bed base; PICU, paediatric intensive care unit.
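The idea of reading feature importance from weight magnitudes on scaled data can be sketched as follows. This is a minimal illustration rather than the study's pipeline: the feature names, the synthetic 14-day data set and the small gradient-descent logistic regression are all assumptions. The data are deliberately constructed so that only the weekend flag carries signal.

```python
import math


def standardise(column):
    """Scale a feature to zero mean and unit variance so weights are comparable."""
    mu = sum(column) / len(column)
    sd = math.sqrt(sum((x - mu) ** 2 for x in column) / len(column))
    return [(x - mu) / sd for x in column]


def fit_logistic(rows, labels, lr=0.1, steps=500, l2=0.01):
    """Batch gradient descent for L2-regularised logistic regression."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        weights = [w - lr * (g / n + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical features over two weeks (Monday to Sunday, twice).
names = ["is_weekend", "picu_occupancy_lag1", "other_ward_occupancy_lag1"]
columns = [
    [0, 0, 0, 0, 0, 1, 1] * 2,
    [10, 12, 10, 12, 11, 10, 12] * 2,
    [20, 22, 21, 20, 22, 22, 20] * 2,
]
# Synthetic target: a bed is available exactly on weekends (assumption).
bed_available = [0, 0, 0, 0, 0, 1, 1] * 2

scaled = [standardise(c) for c in columns]
rows = [list(r) for r in zip(*scaled)]
weights, _ = fit_logistic(rows, bed_available)

# Rank features by absolute weight, largest first.
ranking = sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)
```

Because every feature is standardised before fitting, the absolute weights in `ranking` are on a common scale. Without scaling, a feature measured in larger units would receive a smaller weight for the same effect, and the comparison would be misleading.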
A main challenge encountered was that the forecasting tasks became increasingly difficult over longer forecasting horizons. This was in contrast with other similar work in the literature,7 which successfully predicted full hospital occupancy up to 60 days in the future. There are several possible explanations: (1) Our work forecasts bed availability specifically in the PICU on a daily basis, which is a challenging problem when the unit contains only 12 beds, compared with a whole hospital containing hundreds of beds. (2) The random nature of unplanned admissions and discharges means there is a large amount of random variance relative to the total number of beds. This is in contrast to other studies, which forecast the occupancy of beds in an entire hospital7 19 20 and aggregate the data into weekly totals to reduce variance.19 20 Aggregation into low-granularity data was judged to reduce the clinical usefulness of this work in the PICU setting, and other smoothing techniques were applied instead.
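The trade-off between weekly aggregation and smoothing at daily granularity can be sketched as below. The smoothing techniques actually applied in this work are not specified in this section, so the trailing moving average here is an assumption chosen only for illustration, as are the function names and the occupancy values.

```python
from statistics import mean


def weekly_means(series):
    """Aggregate a daily series into non-overlapping 7-day means.
    This reduces variance but leaves only one value per week."""
    return [mean(series[i:i + 7]) for i in range(0, len(series) - 6, 7)]


def trailing_mean(series, window=3):
    """Smooth with a trailing moving average that uses only the current
    and previous days, keeping one value per day."""
    smoothed = []
    for t in range(len(series)):
        start = max(0, t - window + 1)
        smoothed.append(mean(series[start:t + 1]))
    return smoothed


# Hypothetical daily occupancy of a 12-bed unit over two weeks.
occupied = [10, 12, 9, 11, 12, 8, 10, 11, 12, 12, 9, 10, 11, 12]

weekly = weekly_means(occupied)                # 2 values for 14 days
smoothed = trailing_mean(occupied, window=3)   # 14 values, one per day
```

A trailing window is used rather than a centred one so that the smoothed value for a given day never depends on later days, which matters when the series is used as an input to a forecast.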
While this project uses data from the PICU as a case study for critical care bed availability prediction, the methodology may be used to predict bed availability in other wards and clinical areas. This is because hospital bed management data, for example, the number of beds occupied and available and the expected admissions and discharges, are generic data collected across all clinical areas and hospitals, regardless of the patients' age. PICUs and children's hospitals are generally smaller than adult ICUs and hospitals, and forecasting errors are generally more significant when the bed capacity of a unit is smaller. Thus, forecasting was expected to be more challenging in the PICU case. For this reason, when generalising this method to forecast adult ICU bed availability, we would expect it to be slightly easier to achieve good forecasting performance relative to bed capacity, as adult ICUs usually have more than 12 beds.
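The point about error relative to bed capacity can be made concrete with a small sketch. The metric below (mean absolute error divided by capacity) and the 30-bed adult unit are illustrative assumptions. The 12-bed figure is the PICU size reported above.

```python
def capacity_normalised_mae(actual, predicted, capacity):
    """Mean absolute error expressed as a fraction of the unit's bed capacity."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / capacity


# Hypothetical occupancy and forecasts with a mean absolute error of one bed.
actual = [10, 11, 12, 9]
predicted = [11, 11, 10, 10]

picu_error = capacity_normalised_mae(actual, predicted, capacity=12)
# The same absolute error in a hypothetical 30-bed adult ICU.
adult_error = capacity_normalised_mae(actual, predicted, capacity=30)
```

The same one-bed average error is about 8% of capacity in a 12-bed unit but about 3% in a 30-bed unit, which is the sense in which a larger unit is expected to be easier to forecast relative to its size.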
We treated negative bed availability (ie, the use of unfunded beds) as zero available beds in our models. This was a pragmatic decision agreed between our research team and the clinical management team before commencing this study, because no prior study in the literature had used hospital bed management data to forecast ICU bed availability. We therefore wanted to simplify the methodology to answer the main research question of whether ICU bed availability forecasting could be achieved using hospital bed management data and time series modelling. Now that we have answered this important question, future work developing a model for a specific ward/unit could either treat unfunded beds as they are recorded in the hospital bed management data or treat them as zero, as in this work. That should be a joint decision between the data team and the key stakeholders of the ward/unit where the model may be applied.
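The handling of unfunded beds described above amounts to clipping the availability series at zero, as in this minimal sketch. The function name and the recorded values are hypothetical.

```python
def clip_unfunded(available_beds):
    """Treat negative availability (unfunded beds in use) as zero available beds."""
    return [max(0, n) for n in available_beds]


# Hypothetical recorded availability; negative values mean unfunded beds are in use.
recorded = [1, -2, 0, 3, -1]
clipped = clip_unfunded(recorded)  # [1, 0, 0, 3, 0]

# The "at least one bed available" label is identical either way,
# because zero and negative values both fall below the threshold of one.
labels_recorded = [int(n >= 1) for n in recorded]
labels_clipped = [int(n >= 1) for n in clipped]
```

Clipping therefore affects only the regression target, that is, the number of available beds. The binary classification target is unchanged by this choice.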