Article Open access Peer-reviewed

Generalizability of a Simple Approach for Predicting Hospital Admission From an Emergency Department

2013; Wiley; Volume: 20; Issue: 11 Language: English

10.1111/acem.12244

ISSN

1553-2712

Authors

Jordan Peck, Stephan A. Gaehde, Deborah Nightingale, David Y. Gelman, David S. Huckins, Mark F. Lemons, Eric W. Dickson, James C. Benneyan

Topic(s)

Pneumonia and Respiratory Infections

Abstract

The objective was to test the generalizability, across a range of hospital sizes and demographics, of a previously developed method for predicting and aggregating, in real time, the probabilities that emergency department (ED) patients will be admitted to a hospital inpatient unit. Logistic regression models were developed that estimate inpatient admission probabilities of each patient upon entering an ED. The models were based on retrospective development (n = 4,000 to 5,000 ED visits) and validation (n = 1,000 to 2,000 ED visits) data sets from four heterogeneous hospitals. Model performance was evaluated using retrospective test data sets (n = 1,000 to 2,000 ED visits). For one hospital the developed model also was applied prospectively to a test data set (n = 910 ED visits) coded by triage nurses in real time, to compare results to those from the retrospective single investigator–coded test data set. The prediction models for each hospital performed reasonably well and typically involved just a few simple-to-collect variables, which differed for each hospital. Areas under receiver operating characteristic curves (AUC) ranged from 0.80 to 0.89, R2 correlation coefficients between predicted and actual daily admissions ranged from 0.58 to 0.90, and Hosmer-Lemeshow goodness-of-fit statistics of model accuracy had p > 0.01 with one exception. Data coded prospectively by triage nurses produced comparable results. The accuracy of regression models to predict ED patient admission likelihood was shown to be generalizable across hospitals of different sizes, populations, and administrative structures. Each hospital used a unique combination of predictive factors that may reflect these differences. This approach performed equally well when hospital staff coded patient data in real time versus the research team retrospectively. 

Nationally, emergency department (ED) crowding continues to prevent patients from receiving timely high-quality care.1, 2 The prevalence of this issue has led researchers from many domains (medical, managerial, engineering) to seek innovative solutions to ED crowding.3-6 These solutions include predictive modeling to improve coordination between EDs and inpatient units. The goal of this coordination is to minimize "boarding time," the amount of time a patient remains in the ED once evaluation and treatment are completed, which is considered a primary contributor to ED crowding.1-7

Prediction models are used in many organizations to forecast demand and manage resources.8-12 Long-term forecasting allows a hospital manager to schedule appropriate resources over time. However, when resource flexibility is limited, changes to processes, organizational structure, and human factors may improve flow.13-15 On an hourly or daily basis, when total resources may be fixed, real-time ED-to-inpatient unit admission prediction can help hospital staff organize, prioritize, and adapt their work to manage short-term spikes in demand that exceed long-range average daily demand forecasts. Some previous studies have used data collected at patient arrival to an ED or before (such as at triage or even in an ambulance) to predict whether that individual patient eventually will require admission to an inpatient unit.
While a patient is receiving ED treatment, this predictive information can be used to mobilize admission resources and reduce boarding time.16-20 A recent study compared several methods for predicting admissions at ED triage within one Veterans Health Administration (VHA) hospital.20 Logistic regression with a logit transform was found to perform best and used only a few predictive variables, suggesting potential for broader implementation in other ED settings. Given organizational differences between hospitals, exploring the generalizability of this approach across multiple hospitals is worthwhile.21 The objective of this article therefore is to assess the performance of this method across four hospitals, and when data coding is performed by nurses rather than the research team.

Retrospective patient visit data were collected from four hospitals: two VHA Medical Centers (VHA 1 and VHA 2), a private hospital, and a public hospital. The visit data were separated into development, validation, and test data sets for each hospital. In each case the development set was used to create logistic regression models using several combinations of factors. These models then were applied to the validation data sets to select the best set of independent variables and corresponding models for each hospital. The selected models were applied to the test data sets to compare performance across hospitals. All data sets were coded retrospectively by a single investigator (JSP). To study model performance when implemented by hospital personnel, an additional test data set was coded by triage nurses prospectively at VHA 1. All data analysis was performed using MATLAB (MathWorks, Inc., Natick, MA) and Excel 2007 (Microsoft Corp., Redmond, WA). All portions of this study were approved or granted exemption by the respective institutional review board of each hospital. Approval for the complete study design and implementation was provided by the institutional review board of VHA 1.
The four hospitals in this study all are located in the northeastern United States. They include two Veterans hospitals (VHA 1 and VHA 2), one private community hospital (Private), and one public teaching hospital (Public). Table 1 summarizes the characteristics of the participating hospitals and corresponding EDs.

In the earlier single-facility study at VHA 1, six factors collected at triage were found to predict hospital admission: patient age, primary complaint, ED provider, designation (fast track vs. ED), arrival mode, and urgency using the Emergency Severity Index (ESI) score.20 The model-building strategy for this study mimics that of the prior study as follows. Not all four hospitals collect the same data at ED triage, and different data are collected even among the two VHA hospitals. Table 2 summarizes patient data collected at triage by each hospital, which in turn affects the specific model created for each facility. Using the development data sets from each hospital, separate models were created for each possible combination of factors collected at that particular hospital. For example, different models were created for patient age; patient age and primary complaint; patient age, primary complaint, and ED provider; and so forth. This resulted in a maximum of 63 combinations per hospital (fewer for those that did not track all six factors). A validation data set was used for each site to assess the performance of each of the generated models and select the best final model. The performance of this final model, used for comparison between hospitals, then was determined using the test data sets. Table 3 summarizes the attributes of these data sets, including the test data set prospectively coded by triage nurses. The minimum data set sizes were selected to be similar to those used in the initial study, and when possible additional data were used.
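The exhaustive search over factor combinations described above can be sketched as a simple subset enumeration (a minimal illustration, not the authors' MATLAB code; the factor names follow the earlier VHA 1 study):

```python
from itertools import combinations

# The six triage factors found predictive in the earlier VHA 1 study.
factors = ["age", "complaint", "provider", "designation",
           "arrival_mode", "urgency"]

# Every non-empty subset of factors defines one candidate regression model.
candidate_sets = [c for r in range(1, len(factors) + 1)
                  for c in combinations(factors, r)]
print(len(candidate_sets))  # 63 candidate models (2**6 - 1)
```

Hospitals that do not track all six factors get correspondingly fewer candidates, matching the "maximum of 63 combinations" noted above.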
Following the approach developed previously, the probability of admission was estimated from the historical development data sets for each value of each categorized factor, denoted as P[Admit|Factor], and used as the independent variable values in the regression.20 Table 4 illustrates examples of these probability values for factors that are tracked at each hospital. To create the regression models, some data needed to be coded into categories. Age was categorized into decades, and primary complaint was categorized using a modified version of a previously published ED complaint coding system; all other factors already were in categorical formats.22 The investigator doing the coding (JSP) was blinded to the final admission status of patients.

The predictive value of each regression model was assessed in two ways. The first was the ability of each model to predict admission of individual patients. Individual admission predictability enables the model to be used in the current work flow to order an inpatient bed ahead of a provider decision. The ability of each model to perform in this way was measured using the area under the receiver operating characteristic curve (AUC) and the Hosmer-Lemeshow goodness-of-fit test.23 The second was the ability of each model to predict an aggregate number of admissions across multiple patients. Summing individual probabilities generates an expected total bed demand. This total bed demand can inform decisions of inpatient unit staff and bed managers, such as expediting discharges.20 Denoting Pn as the probability of admission for patient n, the expected bed demand E at any point in time is calculated as E = P1 + P2 + … + Pn. To illustrate, if three patients each have an admission probability of 45%, then the expected bed demand that would be communicated to inpatient unit staff is 0.45 + 0.45 + 0.45 = 1.35 beds.
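The per-category P[Admit|Factor] estimation and the aggregation of individual probabilities into expected bed demand can be sketched as follows (illustrative Python; the paper's analysis used MATLAB, and the record field names here are hypothetical):

```python
from collections import defaultdict

def admit_prob_by_category(visits, factor):
    """Estimate P[Admit | category] for one triage factor from a
    development data set of visit records."""
    totals, admits = defaultdict(int), defaultdict(int)
    for v in visits:
        totals[v[factor]] += 1
        admits[v[factor]] += v["admitted"]
    return {cat: admits[cat] / totals[cat] for cat in totals}

def expected_bed_demand(probs):
    """Aggregate bed demand E = P1 + P2 + ... + Pn."""
    return sum(probs)

def threshold_bed_orders(probs, threshold=0.5):
    """Beds ordered under an individual yes/no rule with a probability cutoff."""
    return sum(p >= threshold for p in probs)

# The paper's worked example: three patients at 45% each.
probs = [0.45, 0.45, 0.45]
print(round(expected_bed_demand(probs), 2))  # 1.35 expected beds
print(threshold_bed_orders(probs))           # 0 beds at a 50% cutoff
```

The last two lines contrast the aggregate view with an individual yes/no rule: the same three patients represent 1.35 expected beds in aggregate, yet none of them crosses a 50% cutoff.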
In contrast, implementing the models using individual predictions with a 50% bed order threshold would result in no beds ordered. The accuracy of the models, when generating aggregate predictions, was evaluated by comparing estimated versus actual numbers of daily admissions via scatter plots, R2 statistics, and residuals. For the purpose of comparison, the larger-volume hospitals (Private and Public) were evaluated for R2 by quarter-days. Although R2 and the resulting residuals are limited measures of prediction accuracy on their own, the combination provides useful insight. For each hospital the final regression model was chosen based on several criteria, first selecting, from all models with Hosmer-Lemeshow goodness-of-fit p > 0.01, the model with the highest R2 value. If two models had similar R2 values, then the model with the lowest absolute residual value was used. This selection criterion always resulted in models with relatively high AUC values and for each hospital produced the following prediction independent variables: VHA 1—patient age, primary complaint, designation, and mode of arrival; VHA 2—patient age, primary complaint, and mode of arrival; Private—patient age, primary complaint, mode of arrival, and urgency; and Public—primary complaint, designation, and urgency.

Table 5 summarizes the regression coefficients for the final models selected for each hospital (all with p < 0.01). Table 6 summarizes the accuracies of each final prediction model when applied to their respective test data sets, using the measures described in the data analysis section. As can be seen, all models perform reasonably well using any of the criteria. One exception is the goodness-of-fit for VHA 2, which had a p = 0.03 when applied to the validation set and p = 0.002 when applied to the test set. Figure 1 illustrates the ability of each model to categorize admission likelihoods into probability deciles.
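The decile breakdown of the kind shown in Figure 1 can be sketched as follows (an illustrative reimplementation, not the authors' code): patients are binned by predicted probability, and the observed admission rate is computed per bin.

```python
def calibration_by_decile(preds, outcomes):
    """Observed admission rate within each predicted-probability decile.

    Returns a list of 10 rates (None for empty deciles); a well-calibrated
    model yields rates close to each decile's midpoint.
    """
    bins = [[] for _ in range(10)]
    for p, y in zip(preds, outcomes):
        i = min(int(p * 10), 9)  # 0.0-0.1 -> decile 0, ..., 0.9-1.0 -> decile 9
        bins[i].append(y)
    return [sum(b) / len(b) if b else None for b in bins]
```

A perfectly calibrated model would trace the ideal 45-degree line when these rates are plotted against the decile midpoints.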
The figure was created by categorizing patients based on probability ranges (horizontal axis and vertical bars) and calculating what percentage of those patients was eventually admitted (vertical axis and lines). Goodness-of-fit is a summary measure of this fit; however, simple visual examination of Figure 1 shows that the models categorize patients well, although none generates the ideal 45-degree line. Figures 2 and 3 illustrate scatter plots of actual daily admissions versus the aggregated daily predictions for each of the hospitals' test data sets. These would ideally depict a one-to-one relationship, which would lead to an R2 of 1. The lowest R2 value, 0.58, while not as strong as desired, can still be considered good. The figures also show residuals between actual and predicted admissions, which may show some small biases but are generally well scattered, suggesting random errors.

Admission predictions can help inform mobilization of resources to make an inpatient unit bed available before a patient's ED treatment is complete, in turn helping improve flow, reduce bed waits, and alleviate ED crowding. Previous models to predict ED admission have focused mostly on an individual patient, typically estimating a yes/no admission and accordingly placing a bed order or not.18, 19 In contrast, aggregating these predictions into a running expected total bed demand can provide more useful information to bed managers. At the individual patient level, all models had AUC greater than 80%. All models except that for VHA 2 had high goodness-of-fit results, and in that case model accuracy still was shown by the other measures presented in the data analysis section and the visual analysis of Figure 1. Implementing the prediction models at the individual patient level would require further analysis to choose a prediction probability threshold for ordering a bed that best balances the positive effects of early action against the negative effects of false-positives.
At the aggregate level, predicting total admissions performs better due to risk pooling across patients, with R2 statistics and residuals indicating consistently high accuracy. As a more direct connection between the ED and inpatient unit demand, total bed demand accounts for the unique properties of patients in the ED at a specific point in time and can complement ED crowding scales for informing inpatient staff of ED demand.24, 25

Despite big-picture similarities, not all hospitals follow the same processes for facilitating ED to inpatient unit flow. Data collection, coding, and logistics differences also can affect results. This might explain the variation in the predictive value of each factor across the four hospitals shown in Table 4 and in coefficient values across hospitals in Table 5. For example, differences exist in the use of ED versus fast track in VHA 1 and the use of north ward, south ward, and urgent care at the public hospital. Differences in the details of ESI implementation or bed assignment methods also may explain why the ESI 1 admission probability was low for VHA 2. Only the non-VHA hospitals had urgency (ESI level) as important independent variables. The increased predictive value of ESI in these hospitals may be coincidence, a reflection of the patient population, or the result of ESI implementation strategies. Both the public hospital and VHA 1 tracked patient location, which was a statistically significant predictor in both cases. It therefore may be valuable for other hospitals to track this information, if separate location designations exist, to use in a prediction model. Arrival mode also was a significant variable for all hospitals except the public hospital, which suggests it may be useful to track this information as well. Another result of differences between hospital processes and data collection is that a generalized model could not be created across all hospitals.
Instead it is shown that the approach was generalizable for creating accurate logistic regression prediction models using multiple factors collected at triage. Finally, all retrospective data in our study were coded by a single researcher, while in practice the person doing the coding would likely be a medical practitioner. Because this may affect results, we also included a test data set that was prospectively coded by triage nurses at VHA 1. When applied to this data set, the model for VHA 1 had comparably strong results, again suggesting approach generalizability and usability. Further, the prediction method used here, ordinary logistic regression, is relatively simple and relies on a manageable amount of data and little statistical expertise. Future work might focus on further validating this approach at other hospitals, developing dynamic models that update predictions as a patient progresses through the ED, and creating strategies for effectively using predictive information to improve hospital patient flow. Whether using aggregate predictions or individual predictions, there must be clear policies for responding to a prediction based on the trade-offs of reacting to uncertain information.

Although the same approach worked well in all four hospitals, this probably is too few test settings to yet claim broad generalizability. Because the significant independent variables can differ for any given ED, implementation in each new setting will require collecting data and rerunning the regression analysis. Even in the same hospital setting, these models also may need periodic updating over time. Finally, for convenience a preexisting primary complaint coding methodology was used, whereas a different coding system could improve results.22

This study illustrated the generalizability of a logistic regression approach for predicting ED-to-inpatient unit bed demand in four hospitals of varying size, patient populations, and operating structures.
Results of those models can be used to influence desirable bed management behaviors and make resource allocation decisions that improve patient flow between EDs and inpatient units. While the independent predictive variables differ by facility, the general approach appears to perform well in each setting. In one hospital it was shown that the approach also performs equally well when coding is prospectively conducted by triage nurses rather than retrospectively by the research team.

The authors thank the staff of the participating hospitals for their input, support, and participation; Professor Stephen Graves for general feedback; and Stephanie Triplett, who developed the Web interface for collecting nurse-coded data.
