Modelling admission lengths within psychiatric intensive care units
Abstract
Objectives To examine whether discharge destination is a useful predictor variable for the length of admission within psychiatric intensive care units (PICUs).
Methods A clinician-led process separated PICU admissions by discharge destination into three types and suggested other possible variables associated with length of stay. Subsequently, a retrospective study gathered proposed predictor variable data from a total of 368 admissions from four PICUs. Bayesian models were developed and analysed.
Results Clinical patient-type grouping by discharge destination displayed better intraclass correlation (0.37) than any other predictor variable (next highest was the specific PICU to which a patient was admitted (0.0585)). Patients who were transferred to further secure care had the longest PICU admission length. The best model included both patient type (discharge destination) and unit as well as an interaction between those variables.
Discussion Patient typing based on clinical pathways shows better predictive ability of admission length than clinical diagnosis or a specific tool that was developed to identify patient needs. Modelling admission lengths in a Bayesian fashion could be expanded and be useful within service planning and monitoring for groups of patients.
Conclusion Variables previously proposed to be associated with patient need did not predict PICU admission length. Of the proposed predictor variables, grouping patients by discharge destination contributed the most to length of stay in four different PICUs.
What is already known on this topic
A specific mandatory tool has previously been developed to identify patient needs by clustering patient presentations.
What this study adds
Grouping of patient needs by clinical staff in psychiatric intensive care services is a better indicator of need (as measured by admission length) than the established tool.
How this study might affect research, practice or policy
Bayesian modelling can aid service monitoring and planning by considering different proposed predictor variables. By grouping patient need into categories, staff can develop more appropriate care pathway plans for patients.
Background
Psychiatric intensive care units (PICUs) admit patients in acutely disturbed phases of a severe mental disorder. Patients display increased risk to themselves or others that does not enable safe, therapeutic management within less secure conditions. Despite most admissions coming from acute psychiatric wards, over a third can come from the community or police custody.1
In England and Wales, PICU standards state admission length must be appropriate to clinical need and risk, but generally should not exceed 8 weeks.2 Despite this, lengths of stay range from a day to over a year.3 Some studies have shown association between diagnosis and admission length in acute psychiatric services or PICU.4 5 Although diagnosis can indicate what medical treatment is required, acute management and its duration also depend on severity, comorbidity, periodicity and other features.
A specific tool, the Mental Health Clustering Tool (MHCT), was developed to identify patient needs by clustering presentations (NHS England 2016).6 This is based on the Health of the Nation Outcome Scale7 and was put forward as a solution to assess local service needs.8 MHCT usage has been questioned by some clinicians who described distinct clinical patient types within PICUs.9–11 Patients with ‘typical’ needs are subsequently transferred to acute psychiatric wards or discharged. Those transferred to forensic psychiatry settings have different, specific needs, and those discharged to more specialised services have particular, complex requirements. Discharge destination is a measure of observed need, is crucial to care planning and is something that clinical teams should be aware of early in a patient’s admission (if not beforehand).
The primary objective of this study was to use statistical techniques to determine whether discharge destination is a useful predictor variable for length of stay (LoS) within PICUs.
Methods
Patient types
A clinical reference group developed mutually exclusive PICU patient types based on where patients are discharged to:
Typical: adult acute psychiatric inpatient services or community settings.
Longer secure care: longer term secure environments (eg, Medium or Low Secure Units, ‘Locked Rehabilitation’ Units).
Other: other psychiatric settings (eg, older adult wards, mother and baby units, another PICU or other specialist services).
Predictor variables
The aim was to choose routinely recorded, easy to collect and clinically relevant data. The number of variables was minimised to ensure clinical usefulness (table 1).
Table 1

Predictor variables
Diagnosis was measured categorically using International Classification of Diseases (ICD) criteria.12 Clusters and diagnoses may have some relation to length of PICU admission, but effects could interact with each other and/or with discharge destinations. It could be that gender influences time spent in a PICU and there may be different causes for agitated behaviour in the elderly compared with the younger population. Age was transformed into a categorical variable (following discussion within the reference group).
Outcome variable
LoS was measured in days. Clinical experience suggests that the distribution will be skewed by a minority of patients remaining for extended periods. To remedy this, admission lengths were transformed using natural logarithms. The number of days+1 was used when performing transformations to avoid potentially taking a logarithm of zero.
Method
When deciding sample size, previous research provided approximations for power analysis.11 Estimates were obtained from that paper using classifications similar to those it described. The mean ln(admission length_{(days)} + 1) for a typical (type 1) admission was approximated to be 3.00, for a longer secure care (type 2) admission 4.25, and for the other (type 3) patient-type group 3.65. Overall variance was estimated at 1.2. Power analysis using these figures for a one-way analysis of variance (ANOVA) suggested that each group should contain 25 patients for a power of 0.95 at a 5% significance level.
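As a rough illustration of the inputs to such a power analysis, the quoted means and variance can be turned into a standardised effect size. The sketch below assumes that the quoted overall variance is the within-group variance and that groups are of equal size; the use of Cohen's f and all variable names are assumptions for illustration, not the authors' stated procedure.

```python
import math

# Group means of ln(admission days + 1) quoted in the text (types 1, 2, 3)
group_means = [3.00, 4.25, 3.65]
overall_variance = 1.2   # quoted variance, assumed here to be within-group
n_per_group = 25         # group size suggested by the power analysis


def cohens_f(means, variance):
    """Cohen's f for equal-sized groups: SD of group means / within-group SD."""
    grand_mean = sum(means) / len(means)
    between_var = sum((m - grand_mean) ** 2 for m in means) / len(means)
    return math.sqrt(between_var / variance)


effect_size = cohens_f(group_means, overall_variance)

# Noncentrality parameter of the ANOVA F test: f squared times total sample size
noncentrality = effect_size ** 2 * n_per_group * len(group_means)

print(f"Cohen's f: {effect_size:.3f}")
print(f"Noncentrality at n = {n_per_group} per group: {noncentrality:.2f}")
```

An effect size of this magnitude is conventionally regarded as large, which is in keeping with a modest group size being sufficient for high power.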
In this study, attempts were made to collect data retrospectively from each PICU on 40 consecutive discharges of patients in each patient-type group. Due to circumstances surrounding COVID-19, the date of last discharge was taken to be 29 February 2020 (before the pandemic impacted on UK healthcare). Data collection was limited to a retrospective period of 4 years.
Bayesian models were developed that examined LoS distribution. Using standards and clinical experience, a prior for the mean admission length was deemed to be 6 weeks (42 days). So ln(admission length+1) = ln(43) = 3.76. This was rounded down to 3.75 for the prior.
In a normal distribution, heuristically nearly all observations lie within three SDs of the mean. If the minimum value is zero and assuming normality with a mean of 3.75, 3σ=3.75, thus σ=1.25. The 0.975 quantile of admission length (the upper limit of the central 95% interval) = e^{3.75+(1.96×1.25)} = e^{6.2} = 492 days. Patients rarely spend more than a year in a PICU (but it is not unheard of), so 1.25 seemed an acceptable value to use for the SD of the weakly informative prior on ln(admission length+1).
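The arithmetic behind this prior can be checked in a few lines. This is a minimal sketch that reproduces the figures quoted above; the variable names are illustrative only.

```python
import math

prior_mean_days = 42                        # 6 weeks
log_mean = math.log(prior_mean_days + 1)    # ln(43), the log-scale mean

prior_mean = 3.75                           # value adopted for the prior mean
prior_sd = prior_mean / 3                   # three SDs span from zero to the mean

# Upper limit of the central 95% interval on the log scale, then in days
upper_log = prior_mean + 1.96 * prior_sd
upper_days = math.exp(upper_log)

print(f"ln(43) = {log_mean:.2f}")
print(f"prior SD = {prior_sd:.2f}")
print(f"upper limit = e^{upper_log:.1f} = {upper_days:.1f} days")
```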
The first model allowed separate intercepts for each patient type (based on discharge destination):
Priors were:
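One form consistent with this description is sketched below. It is an assumption for illustration rather than the authors' exact specification: the symbols $y_i$, $\mu_i$, $\alpha_j$ and $\sigma$ are introduced here, the prior on the intercepts uses the mean of 3.75 and SD of 1.25 derived above, and the prior on the residual SD $\sigma$ is not stated in the text and is left unspecified.

```latex
\begin{aligned}
y_i &= \ln(\text{admission length}_i + 1) \\
y_i &\sim \mathrm{Normal}(\mu_i,\ \sigma) \\
\mu_i &= \alpha_{\mathrm{type}[i]} \\
\alpha_j &\sim \mathrm{Normal}(3.75,\ 1.25), \qquad j = 1, 2, 3
\end{aligned}
```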
Similar models were developed using diagnosis (ICD-10 chapters) and cluster instead of patient type.
When estimating prior distributions for variance in hierarchical models, if the number of groups is less than 5, the half-t family of prior distributions is recommended.13 Half-Cauchy priors were used for spread of variable means in this project. Based on initial data inspection, an appropriate prior (that allowed for flexibility) was decided to be a half-Cauchy with parameter 2 (a value slightly higher than expected for the SD of the underlying mean). Given uncertainty of what variables contributed most and in order to make priors extremely weak, it was decided to use this prior for an SD of all means in all proposed hierarchical models.
The first model can be improved on by considering the dependency between LoS for each unit. This was handled using mixed-linear modelling by specifying a by-unit intercept, thus allowing each unit to have a general level of variability. This strategy corresponds to the second model:
Here, priors were:
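A sketch of how a by-unit intercept could enter the model follows, again as an assumption rather than the authors' exact notation. Here $y_i = \ln(\text{admission length}_i + 1)$, $\alpha_j$ is the intercept for patient type $j$, $u_k$ is the by-unit intercept and $\sigma_{\mathrm{unit}}$ is its SD (all symbols introduced for illustration); the half-Cauchy scale of 2 is the value described in the text.

```latex
\begin{aligned}
y_i &\sim \mathrm{Normal}(\mu_i,\ \sigma) \\
\mu_i &= \alpha_{\mathrm{type}[i]} + u_{\mathrm{unit}[i]} \\
\alpha_j &\sim \mathrm{Normal}(3.75,\ 1.25) \\
u_k &\sim \mathrm{Normal}(0,\ \sigma_{\mathrm{unit}}), \qquad k = 1, \dots, 4 \\
\sigma_{\mathrm{unit}} &\sim \mathrm{HalfCauchy}(0,\ 2)
\end{aligned}
```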
A third model adds a varying intercept for the interaction between unit and patient type:
With priors:
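The interaction model might be written as below. This is a hedged sketch with assumed symbols: $y_i = \ln(\text{admission length}_i + 1)$, $\alpha_j$ is the patient-type intercept, $u_k$ the by-unit intercept and $v_{k,j}$ the varying intercept for each unit and patient-type combination. It should not be read as the authors' exact specification.

```latex
\begin{aligned}
y_i &\sim \mathrm{Normal}(\mu_i,\ \sigma) \\
\mu_i &= \alpha_{\mathrm{type}[i]} + u_{\mathrm{unit}[i]} + v_{\mathrm{unit}[i],\,\mathrm{type}[i]} \\
\alpha_j &\sim \mathrm{Normal}(3.75,\ 1.25) \\
u_k &\sim \mathrm{Normal}(0,\ \sigma_{\mathrm{unit}}), \qquad
v_{k,j} \sim \mathrm{Normal}(0,\ \sigma_{\mathrm{unit} \times \mathrm{type}}) \\
\sigma_{\mathrm{unit}},\ \sigma_{\mathrm{unit} \times \mathrm{type}} &\sim \mathrm{HalfCauchy}(0,\ 2)
\end{aligned}
```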
Similar methods could be established to examine other factors (such as diagnosis or clustering) that could affect admission length. Age and gender variables were added to models in a hierarchical fashion in conjunction with the patient-type variable to assess whether there were interactions which improved the model.
Models gave flexibility by allowing error variance to change between patient types, units and interaction between the two. Partial pooling of information could account for uncertainty when estimating group-level effects and provide stable estimates with the aid of weakly informative priors.
Data analysis
Data analysis was performed using the statistical package R14 with specific additional packages.15–26
Model generation
Stan20 is a programming language for specifying statistical models and is used as a Markov chain Monte Carlo sampler for Bayesian analyses. The brms package15 allows R code to interface with Stan and was used within this project. Four chains were used in every model simulation, each run for 5000 iterations, of which the first 1000 were warm-up. For reproducibility, a seed of 123 was used.
Model verification and comparisons
Autocorrelations of parameter values with previous draws, measured by successive lags from each chain, were obtained. Graphical displays of posterior predictive checks were performed and Q–Q plots of residuals (comparing observed and expected residuals) were generated.
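For readers unfamiliar with the autocorrelation check, the quantity inspected at each lag can be sketched as follows. This is a generic sample autocorrelation written for illustration; it is not the routine used in the study, and the simulated chain is a toy stand-in for real posterior draws.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of one chain of draws at a given lag."""
    n = len(chain)
    if lag < 0 or lag >= n:
        raise ValueError("lag must be between 0 and len(chain) - 1")
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    denominator = sum(c * c for c in centred)
    if denominator == 0:
        raise ValueError("chain has zero variance")
    numerator = sum(centred[t] * centred[t + lag] for t in range(n - lag))
    return numerator / denominator


# Toy chain: an autoregressive series standing in for draws of one parameter
rng = random.Random(123)
toy_chain = [0.0]
for _ in range(3999):
    toy_chain.append(0.5 * toy_chain[-1] + rng.gauss(0.0, 1.0))

for lag in range(6):
    print(f"lag {lag}: {autocorrelation(toy_chain, lag):.3f}")
```

Values that fall towards zero within a few lags suggest that successive draws carry nearly independent information.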
In Bayesian statistics, rather than considering a posterior point estimate, it is more useful to work with posterior distributions, p_{post}(θ) = p(θ | y), and summarise predictive accuracy by using log pointwise predictive densities (lppd). This is calculated by evaluating the expectation using draws from posterior simulations, p_{post}(θ), and labelling θ^{s}, s=1,2,…,S. The expected log pointwise predictive density (elpd_{loo}) is:26
Model comparisons were performed by applying leave-one-out cross-validation (LOOCV) using training and validation data sets: observations were left out one at a time, so the training set used N−1 observations and the remaining observation served as the validation sample. With LOOCV, predictive accuracy is evaluated by first computing a pointwise predictive measure and then taking the sum of these values over all observations to obtain a single measure.27
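For reference, the standard definitions of these quantities in the leave-one-out literature are given below. The notation ($N$ observations, $y_{-i}$ for the data with observation $i$ removed) is assumed here and may differ from the authors' own display.

```latex
\begin{aligned}
\mathrm{lppd} &= \sum_{i=1}^{N} \log\!\left( \frac{1}{S} \sum_{s=1}^{S} p\!\left(y_i \mid \theta^{s}\right) \right) \\
\mathrm{elpd}_{\mathrm{loo}} &= \sum_{i=1}^{N} \log p\!\left(y_i \mid y_{-i}\right),
\qquad
p\!\left(y_i \mid y_{-i}\right) = \int p\!\left(y_i \mid \theta\right)\, p\!\left(\theta \mid y_{-i}\right)\, d\theta
\end{aligned}
```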
The LOO Information Criterion (LOOIC) (using the loo package within brms25) estimates expected lppd by integrating over uncertainty in the parameters and, thus, does not assume that the posterior distribution is multivariate normal.
Although individually, the elpd_{loo} and LOOIC estimates have little intrinsic value, it is helpful to compare values between models. Difference is computed relative to the model with the highest elpd_{loo}/LOOIC. When that difference is positive, the expected predictive accuracy is higher for the model with the largest positive difference. The model with the largest negative valued difference is the worst-performing model.
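The relationship between the two measures can be illustrated with a small sketch. LOOIC is the elpd_{loo} estimate multiplied by −2, so a difference in one measure is twice the difference in the other. The elpd values below are hypothetical placeholders rather than results from this study, and the function names are illustrative.

```python
def looic(elpd_loo):
    """LOOIC is the elpd_loo estimate on the deviance scale."""
    return -2.0 * elpd_loo


def compare_to_reference(elpd_by_model, reference):
    """Differences of each model from a chosen reference model.

    Positive values indicate better expected predictive accuracy
    than the reference on both scales.
    """
    ref_elpd = elpd_by_model[reference]
    return {
        name: {
            "elpd_diff": elpd - ref_elpd,
            "looic_diff": looic(ref_elpd) - looic(elpd),
        }
        for name, elpd in elpd_by_model.items()
    }


# Hypothetical elpd_loo estimates for three candidate models
hypothetical_elpd = {"model_1": -510.0, "model_2": -500.0, "model_3": -495.0}

comparison = compare_to_reference(hypothetical_elpd, reference="model_1")
for name, diffs in comparison.items():
    print(name, diffs)
```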
Results
From the four units involved, data were acquired from 368 patients (table 2).
Table 2

Patient records used in data analysis
Data from 40 type 1 (typical) patients were easily collected from each unit. For the other types, the number collected from each unit over the maximum 4-year period varied.
Survival plots for patient types are shown in figure 1A. These show lengthier admissions for the longer secure care group (patient type 2). Assuming an exponential distribution, hazard of discharge for this group is significantly lower (at the 5% level) than the other two groups. A log-rank test for differences (using patient-type 1 as the reference) gave a value of 97.8 on two degrees of freedom (p≤2×10^{−6}). Survival plots of patients from the different units are shown in figure 1B. These plots are closer together than the patient-type survival distributions but do indicate differences between unit distributions. Indeed, a log-rank test for differences (using unit 1 as reference) gave a value of 20.6 on three degrees of freedom (p=0.001). As survival plots are initial inspection tools, to make them easier to view and to avoid doubts concerning the power of log-rank tests, only group subsets containing 20 or more patients were considered for other variables (figure 1C–E). The log-rank test for differences in survival distributions gave p=0.0007 for ICD chapter differences, p=0.1 for cluster and p=0.03 for gender differences.
Figure 1. Plots of admission lengths and predictor variables (only groups containing at least 20 patients depicted).
Intraclass correlation (ICC) measures how similar outcomes of individuals within a group are likely to be, relative to those from other groups. Measurement is based on ANOVA, assuming normal distributions. Using ln(admission length_{(days)} + 1) as the outcome variable, the ICC for patient type was 0.37, unit=0.0585, cluster=0.0425, ICD-10 chapter=0.0259 and gender=0.014. None can be regarded as good; however, in the context of this data set, the ICC for patient type was more than six times higher than any other.
ICC does not account for the effect of interactions between groups. For example, the effect that patient type has may be confounded by the effect that unit has on admission length. Figure 1F shows how the patient-type variable varies with unit. This indicated that any model should include an interaction term between them.
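An ANOVA-based ICC of this kind can be sketched as follows. The exact estimator used in the study is not stated, so the one-way random-effects form below (with an adjustment for unequal group sizes) is an assumption, and the toy values are illustrative rather than study data.

```python
def icc_oneway(groups):
    """One-way ANOVA intraclass correlation for a list of groups of values."""
    n_groups = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (total - n_groups)

    # Effective group size; equals the common size when groups are balanced
    n0 = (total - sum(n * n for n in sizes) / total) / (n_groups - 1)

    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Toy log-scale admission lengths for three hypothetical groups
toy_groups = [
    [2.8, 3.1, 3.0, 2.9],
    [4.0, 4.4, 4.3, 4.1],
    [3.5, 3.7, 3.6],
]
print(f"ICC = {icc_oneway(toy_groups):.3f}")
```

An ICC near 1 means that most variation lies between groups, whereas a value near 0 means that group membership explains little of the outcome.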
From the three models described above, both elpd_{loo} and LOOIC favoured model 3. Compared with model 1, the LOOIC difference for model 2 was 25.5 and for model 3, 42.4. Differences in elpd_{loo} were 12.8 and 21.2, respectively. In models analogous to model 1, patient type was favoured above using either ICD-10 chapters or clustering measurements. There was an elpd_{loo} difference of 49.3 for patient type versus clustering and 50.5 versus ICD-10 chapter. LOOIC differences were 98.5 and 100.9, respectively.
Adding further parameters to the model did not improve model comparison measures and there was some suggestion of model overfitting (with some models having a higher R^{2} value but negative differences in elpd_{loo} and LOOIC compared with model 3).
The middle plot in the last row of figure 2 displays autocorrelation from chain 1 of model 3 (other chains are similar). It shows that correlation settled after 3–4 lags (ideally, it should be around zero from lag 1 onwards). It was decided that this was satisfactory and that no thinning or increase in draws was needed.27
Posterior predictive checks give the model’s predictive distribution for a replication of y, denoted y_{rep}. The first plot in figure 2’s final row shows posterior predictive distributions of 1000 draws from model 3. It shows a reasonable fit.
One method of plotting residuals, using the posterior predictive distribution, is to use what Kay has termed a probability residual.28 Here, for each observation, the predicted probability of generating a value less than or equal to the actual observation is calculated: p_{i} = Pr(y_{rep,i} ≤ y_{i}). If the predictive distribution is well calibrated, these probabilities should be uniform and, if the inverse cumulative distribution function of the standard normal distribution (Φ^{−1}) is applied to these probability residuals, the result should be approximately standard normal. These are quantile residuals: z_residual_{i} = Φ^{−1}(p_{i}). The final plot in figure 2 shows a Q–Q plot of these residuals, which indicates an acceptable fit for model 3.
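The two-step calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the probability is estimated as the fraction of posterior predictive draws at or below the observation, and it is clamped away from 0 and 1 so that the normal quantile stays finite. The clamping rule and function names are choices made here, not part of the cited method.

```python
from statistics import NormalDist


def probability_residual(predictive_draws, observed):
    """Fraction of posterior predictive draws at or below the observed value."""
    n_draws = len(predictive_draws)
    p = sum(1 for d in predictive_draws if d <= observed) / n_draws
    half_step = 0.5 / n_draws
    # Keep p strictly inside (0, 1) so the inverse normal CDF is finite
    return min(max(p, half_step), 1.0 - half_step)


def quantile_residual(predictive_draws, observed):
    """Standard normal quantile of the probability residual."""
    return NormalDist().inv_cdf(probability_residual(predictive_draws, observed))


# Toy predictive draws for one observation on the log scale
toy_draws = [3.1, 3.4, 3.6, 3.9, 4.2, 4.4, 4.8, 5.0]
print(f"p = {probability_residual(toy_draws, 3.9):.3f}")
print(f"z = {quantile_residual(toy_draws, 3.9):.3f}")
```

Repeating this for every observation gives the set of z values that would be displayed in a Q–Q plot against the standard normal distribution.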
Actual mean values can be calculated (noting the model gives values of log_{e}(days admitted+1)). Table 3 displays means, 0.025 and 0.975 quantiles of actual admission lengths for types of patients from different units.
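The back-transformation from the model scale to days is the inverse of the ln(days + 1) transform. In the sketch below, the input values are hypothetical log-scale summaries rather than entries from table 3. Note that exponentiating a mean on the log scale gives a geometric-type average, which may be lower than the arithmetic mean of a skewed distribution.

```python
import math


def log_scale_to_days(log_value):
    """Invert y = ln(days + 1) to recover days."""
    return math.exp(log_value) - 1.0


# Hypothetical posterior summaries on the ln(days + 1) scale
hypothetical_summary = {"mean": 3.75, "q0.025": 2.90, "q0.975": 4.60}

summary_in_days = {
    label: log_scale_to_days(value)
    for label, value in hypothetical_summary.items()
}
for label, days in summary_in_days.items():
    print(f"{label}: {days:.1f} days")
```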
Table 3

Model 3—admission lengths
Discussion
Results showed that patient typing based on clinical pathways has better predictive ability of admission length than clinical diagnosis or a specific tool developed to identify patient needs.
In one analysis of acute psychiatric admissions using ICD criteria and patient demographics, a model accounted for 15% of admission length variation.29 This current project’s less detailed and perhaps more pragmatic model accounted for over double that figure (37%). Although still not sizeable, this type of modelling could well give rise to useful avenues to pursue.
One question is whether the correct data were collected and used appropriately. As always, a balance needed to be struck between pragmatism and level of detail. There may well be other clinical or non-clinical variables that are useful (eg, level of patient engagement with treatment, severity and amount of behavioural disturbance, or number of previous admissions), but the ones collected are routinely measured by clinicians, thus making it easier to compare different PICUs. One variable measured but not used in this project was from where an individual was admitted. To check this was not an ‘important’ variable, a classification tree method was used to ensure that it did not separate the data. As expected, the most important independent variables were patient type and unit. These were followed by age and ICD chapter. It seems reasonable to conclude that the variables collected were the easiest to obtain and most relevant for this project.
Models were developed hierarchically. Partial pooling of units was implemented. As well as defining intercept LoS parameters for each patient type, this model allows each unit to have a mean LoS associated with it that comes from a global distribution. Since the units are a random sample in themselves, interaction terms modelling the pattern of their effect on LoS from one patient type to another were expressed as random effects. There are then two levels of random effects: one for unit and one for the patient type within each unit.
It may be that the choice of priors (especially for the SD of means) could have been better. This is reflected by wide intervals between 0.025 and 0.975 quantiles in table 3. Clinical judgement and a wish to have weakly informative priors gave rise to initial priors. When performing analysis using other priors, brms default settings (flat priors for patient type and Student t distributions centred on zero with 3 degrees of freedom and a 2.5 SD for the sigmas) gave rise to divergent transitions. However, more defined prior specifications, with patient-type means of N(3.0, 0.5), N(4.0, 0.75) and N(3.0, 0.75) on the ln(admission length_{(days)}+1) scale, respectively, and half-Cauchy(1) distributions for the SDs of the unit and unit-by-patient-type effects, gave the figures within brackets in table 3. This showed stability of means with noticeable decreases of widths between 0.025 and 0.975 quantiles. Therefore, in any further analysis, narrower priors should be considered.
Despite this limitation, it was satisfying to use a clinically pragmatic model that separated at least one of the patient types and had greater accuracy than more complicated models. Discharge planning is integral in patient management and it has transpired that discharge destination gives some indication of LoS. This could be used within service planning and monitoring for groups of patients.
Limitations
Sampling over different time periods could have confounding effects: changes in unit practice or fabric may have occurred over the time needed to collect data for one specific group, compared with time needed for another.
The data collection time period limit may have been too wide. This required balancing against numbers needed to perform effective analysis.
This pragmatic project was designed to maximise clinician input. Unless explained appropriately, Bayesian analysis could potentially alienate clinicians.
Prior specifications were too vague. However, the project provides a useful starting point for development of clinical patient types.
Conclusions
This study aimed to assess the contribution of clinical patient typing to PICU admission length. Variables previously proposed to be associated with clinical severity and, therefore, time spent on PICUs were found not to be as useful as patient typing, which contributed most to admission length. The specific unit to which a patient was admitted also had an influence on LoS. A further avenue to explore would be why units differ in distributions of admission lengths for specific patient types.