Cardiac surgery report cards: making the grade
2001; Elsevier BV; Volume: 72; Issue: 6; Language: English
10.1016/s0003-4975(01)03333-1
ISSN: 1552-6259
Authors: Gary L. Grunkemeier, Kathryn J Zerr, Ruyun Jin
Topic(s): Healthcare Quality and Management
Abstract: For several years, data-driven methodologies have been used in an attempt to improve performance in cardiac surgery programs. The article by Shahian and his colleagues [1: Shahian M, Norman S.-L, Torchiana D.F, et al. Cardiac surgery report cards: comprehensive review and statistical critique. Ann Thorac Surg. 2001; 72 ([Review]): 2155-2168] in this issue of The Annals provides a thoughtful and thorough comparison of "Report Cards" and continuous quality improvement (CQI) initiatives. Their conclusions are that CQI, including multi-disciplinary team site visits to identify and share processes and systems, has been proven to be effective. In contrast, "Report Cards" as currently implemented are not satisfactory and have the potential to do harm. Their arguments are cogent and comprehensive. They claim that these cardiac surgery report cards are based on "sophisticated mathematical models" which engender "an exaggerated aura of scientific accuracy". Examining some of the deficiencies and limitations of the risk models used for cardiac surgery, from which the report card "Grades" are derived, provides support for their claim.

The mathematical models in question are constructed using multivariable regression, which yields a formula that uses the risk factors for an individual patient to provide an estimate (expected value) of that patient's outcome. Building such a model is not a fixed, reproducible exercise, and there are at least 9 reasons why different investigators with the same data set would produce different risk models [2: Naftel D.C. Do different investigators sometimes produce different multivariable equations from the same data? J Thorac Cardiovasc Surg. 1994; 107: 1528-1529]. For a continuous outcome (cost, length of stay, etc), such a model can exactly predict, or at least come close to, a patient's observed value.
But operative mortality is a binary outcome, and an ideal formula would classify each patient as alive or dead. Instead, logistic regression provides the expected mortality, the probability that the patient will be an operative death. This probability is always strictly between 0% and 100%, so it will never match the observed mortality, which is either 0%, for survivors, or 100%, for deaths (Fig 1). The discrimination of such a model is measured by the C-index (area under the ROC curve). The C-index for the model in Figure 1 is 0.80. (Shahian notes that most cardiac surgery models have C-indices between 0.76 and 0.82.) A C-index of 0.50 indicates no discrimination and an index of 1.00 indicates perfect discrimination, so a value of 0.80 is only 60% of the way from worthless to perfect: (0.80 − 0.50)/(1.00 − 0.50) = 0.60. Only in aggregate can we achieve agreement between observed and expected mortality. For example, if we have 20 patients with an expected mortality of 0.05 (5%) and one of them dies, we consider the model successful. But note that we do not say which of the 20 will die; this seems unsatisfactory, but that is the best we can do.

Another reason for the poor performance of CABG risk models may be the complex pathways of the outcome (death) compared to the relatively few risk factors available to predict it. Most current models use a stepwise regression procedure to admit only statistically significant factors. In general, the larger the sample size, the more risk factors can be found (an informal rule says the number of risk factors identified should not exceed the number of deaths in the training set divided by 10). Table 1 summarizes recently published multivariable risk models based on at least 2,000 patients [3: Shroyer A.L, Plomondon M.E, Grover F.L, Edwards F.H. The 1996 coronary artery bypass risk model: The Society of Thoracic Surgeons Adult Cardiac National Database. Ann Thorac Surg. 1999; 67: 1205-1208. 4: Hannan E.L, Kilburn Jr H, Racz M, Shields E, Chassin M.R. Improving the outcomes of coronary artery bypass surgery in New York State. JAMA. 1994; 271: 761-766. 5: Ghali W.A, Quan H, Brant R. Coronary artery bypass grafting in Canada: national and provincial mortality trends, 1992–1995. CMAJ. 1998; 159: 25-31. 6: Ghali W.A, Ash A.S, Hall R.E, Moskowitz M.A. Statewide quality improvement initiatives and mortality after cardiac surgery. JAMA. 1997; 277: 379-382. 7: Weintraub W.S, Wenger N.K, Jones E.L, et al. Changing clinical characteristics of coronary surgery patients. Differences between men and women. Circulation. 1993; 88: II79-II86. 8: Grover F.L, Johnson R.R, Marshall G, Hammermeister K.E. Factors predictive of operative mortality among coronary artery bypass subsets. Ann Thorac Surg. 1993; 56: 1296-1306. 9: Iyer V.S, Russell W.J, Leppard P, Craddock D. Mortality and myocardial infarction after coronary artery surgery. A review of 12,003 patients. Med J Aust. 1993; 159: 166-170. 10: Ivanov J, Tu J.V, Naylor C.D. Ready-made, recalibrated, or remodeled? Issues in the use of risk indexes for assessing mortality after coronary artery bypass graft surgery. Circulation. 1999; 99: 2098-2104. 11: Higgins T.L, Estafanous F.G, Loop F.D, et al. ICU admission score for predicting morbidity and mortality risk after coronary artery bypass grafting. Ann Thorac Surg. 1997; 64: 1050-1058. 12: Mozes B, Olmer L, Galai N, Simchen E. A national study of postoperative mortality associated with coronary artery bypass grafting in Israel. ISCAB Consortium. Israel Coronary Artery Bypass Study. Ann Thorac Surg. 1998; 66: 1254-1262. 13: DeLong E.R, Peterson E.D, DeLong D.M, et al. Comparing risk-adjustment methods for provider profiling. Statistics in Medicine. 1997; 16: 2645-2664. 14: O'Connor G.T, Plume S.K, Olmstead E.M, et al. Multivariate prediction of in-hospital mortality associated with coronary artery bypass graft surgery. Northern New England Cardiovascular Disease Study Group. Circulation. 1992; 85: 2110-2118. 15: Reich D.L, Bodian C.A, Krol M, et al. Intraoperative hemodynamic predictors of mortality, stroke, and myocardial infarction after coronary artery bypass surgery. Anesth Analg. 1999; 89: 814-822]. There are 44 different risk factors cited (the original number was larger, but some similar categories were combined). The number of independent risk factors cited by any one paper varied from 5 to 29, with the largest numbers being found by the largest series.

Table 1. Recently Published Risk Models for Coronary Bypass Surgical Mortality (a)

Reference number    Number of patients    Number of risk factors
3                   174,210               29
4                   57,187                16
5                   50,357                13
6                   17,128                7
7                   13,368                6
8                   12,712                9
9                   12,003                5
10                  7,491                 9
11                  4,918                 7
12                  4,835                 9
13                  3,654                 9
14                  3,055                 10
15                  2,152                 8

Risk factor                         Sum (number of models citing it)
Age                                 12
Gender                              9
Surgical urgency                    8
Ejection fraction                   8
Renal dysfunction/Creatinine        7
Previous CABG                       6
NYHA class                          6
Left main disease                   6
Diseased coronary vessels           6
Peripheral vascular disease         5
Diabetes mellitus                   5
Cerebrovascular disease             4
Intraop/postop variables            4
Myocardial infarction               4
Body size                           3
Preoperative IABP                   3
Cardiogenic shock/unstable          3
COPD                                2
PTCA                                2
Angina                              2
Intravenous nitrates                2
Arrhythmias                         2
History of heart operation          2
Hemodynamic instability             2
Charlson comorbidity score          2
Dialysis dependence                 2
Pulmonary hypertension              2
Diuretics                           2
Systemic hypertension               1
Serum albumin                       1
Race                                1
Previous CHF                        1
Myocardial infarction timing        1
Cardiac index                       1
LV end-diastolic pressure           1
CVA timing                          1
Liver disease                       1
Neoplasia/Metastatic disease        1
Ventricular aneurysm                1
Steroids                            1
Digitalis                           1
Thrombolytic therapy                1
Arterial bicarbonate                1
Calcified ascending aorta           1

(a) Mortality risk models are sorted by the number of patients. The risk factors are sorted by the sum of their appearances in these models.
CABG = coronary artery bypass grafting; CHF = congestive heart failure; COPD = chronic obstructive pulmonary disease; CVA = cerebrovascular accident; IABP = intraaortic balloon pump; LV = left ventricular; NYHA = New York Heart Association; PTCA = percutaneous transluminal coronary angioplasty.
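As a rough illustration of the C-index discussed in the text preceding Table 1, the sketch below computes it directly from its pairwise definition: the fraction of (death, survivor) pairs in which the patient who died had the higher expected mortality. The function names and the six patients are hypothetical and are not taken from any of the cited models; the second function merely restates the "60% of the way" arithmetic.

```python
from itertools import product


def c_index(expected, died):
    """C-index (area under the ROC curve) from its pairwise definition.

    expected: expected mortality per patient (probabilities)
    died:     1 for an operative death, 0 for a survivor
    Ties in expected mortality count as half a concordant pair.
    """
    deaths = [p for p, d in zip(expected, died) if d == 1]
    survivors = [p for p, d in zip(expected, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p_death, p_survivor in product(deaths, survivors):
        if p_death > p_survivor:
            concordant += 1.0
        elif p_death == p_survivor:
            concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


def fraction_of_way_to_perfect(c):
    """Rescale a C-index so that 0.50 maps to 0 and 1.00 maps to 1."""
    return (c - 0.5) / 0.5


# Hypothetical patients, for illustration only.
expected = [0.02, 0.03, 0.05, 0.10, 0.20, 0.40]
died = [0, 0, 1, 0, 0, 1]

print(c_index(expected, died))                      # 6 of 8 pairs concordant
print(round(fraction_of_way_to_perfect(0.80), 2))   # the 0.80 case from the text
```

Note that a death with an expected mortality of only 0.05 still contributes to a respectable C-index here, which is the point made about Figure 1: good discrimination in aggregate does not mean any individual outcome is predicted.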
What is the correct number, and which particular factors are essential? The principle of parsimony says to use the simplest possible model to explain a phenomenon. But some investigators, based on clinical as well as technical arguments, recommend using all known risk factors, whether statistically significant or not, rather than using a stepwise variable selection procedure [16: Bernstein A.D, Parsonnet V. Bedside estimation of risk as an aid for decision-making in cardiac surgery. Ann Thorac Surg. 2000; 69: 823-828. 17: Spiegelhalter D.J. Probabilistic prediction in patient management and clinical trials. Statistics in Medicine. 1986; 5: 421-433. 18: Steyerberg E.W, Eijkemans M.J, Harrell Jr F.E, Habbema J.D. Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med. 2000; 19: 1059-1079]. Indeed, a recently published textbook states that "Stepwise variable selection … violates every principle of statistical estimation and hypothesis testing", and goes on to list 7 serious problems associated with stepwise procedures [19: Harrell Jr F.E. Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. Springer-Verlag, New York; 2001, page 56]. An optimal model no doubt would include many more risk factors than do any of the above models.
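The informal rule quoted earlier (no more risk factors than the number of deaths in the training set divided by 10) can be written down in a few lines. This is only a sketch of that rule of thumb: the function name, the series size, and the mortality rate below are hypothetical, since the cited papers' death counts are not given here.

```python
def max_risk_factors(deaths, deaths_per_factor=10):
    """Upper bound on risk factors under the informal deaths-divided-by-10 rule."""
    if deaths < 0 or deaths_per_factor <= 0:
        raise ValueError("deaths must be >= 0 and deaths_per_factor must be > 0")
    return deaths // deaths_per_factor


# Hypothetical series: 2,000 patients with 3% operative mortality.
patients = 2000
mortality = 0.03
deaths = round(patients * mortality)

print(deaths, max_risk_factors(deaths))
```

Under these assumed numbers a series at the 2,000-patient threshold would support only a handful of factors, which illustrates why the count of factors a data set can support grows with the size of the series.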
To grade a provider (doctor, clinic), the expected number of deaths (the sum of the expected mortality for all the patients in the group, based on the risk model) is compared to the observed number of deaths. Dividing by the number of patients yields the provider's expected (E) and observed (O) mortality, respectively. To make comparisons, the E's are usually considered to be without error (!), and sampling error is attached to the O's by assuming, incorrectly, that they are statistically independent. This has the effect of identifying too many outliers (in either direction). Hierarchical models, as advocated by Shahian, extend the traditional risk model structure to compensate for this by reducing the overly optimistic precision of the estimates [20: Iezzoni L.I. Risk adjustment for measuring healthcare outcomes. Second edition. Health Administration Press, Chicago; 1997, pages 507 to 511]. Incorporating hospital or surgeon as another dimension of variability results in more realistic estimates of the provider effects, dampening or shrinking them towards the mean value for all providers.

Figure 2 illustrates the use of a hierarchical model, using data from nine hospitals. For comparison, a conventional logistic regression model was used to provide expected mortality for each hospital, and the O/E ratios ranged from 0.62 to 1.62 (horizontal axis). Confidence intervals were computed for each hospital [20: Iezzoni L.I. Risk adjustment for measuring healthcare outcomes. Second edition. Health Administration Press, Chicago; 1997, page 488]. The 95% confidence intervals (solid horizontal lines) for 2 of the hospitals do not include the value 1 (vertical line), so they would be considered abnormal by this simplistic analysis, and the "significantly high" one would get a "flunking" grade.
However, if 99.4% confidence intervals (dashed horizontal lines) are used instead to ensure protection against type I error (finding spurious significance due to multiple simultaneous comparisons), then all of the hospitals would get passing grades.

Fig 1. Comparison of observed to expected mortality for individual patients, based on a risk model using Providence Health System data. For a continuous variable, one can expect the data points to be clustered around the line of identity, but for a binary outcome variable, this is not possible. The points fall far from this line, yet the risk model from which the expected mortality was derived has a C-index of 0.80.

The hierarchical model, which incorporated hospital as a random factor, produced O/E ratios as shown by the vertical axis. The value for each hospital moved towards the mean of all hospitals, and the interhospital range was greatly reduced (0.82 to 1.25). This "shrinkage" is greater for the smaller hospitals since they have less evidence to establish behavior different from the mean. The 95% confidence intervals (not shown) for these estimates all include the value of 1, so again all of the hospitals make the grade.

The current methods used to determine risk factors, construct risk models and compute variability result in anti-conservative comparisons of provider effects. The report by Shahian and his colleagues provides a masterful discussion of these and related issues. It deserves a careful reading.
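The provider comparison described for Figure 2 can be illustrated with a short sketch. It makes several assumptions that should be read as such: the 99.4% level is treated as a Bonferroni adjustment of a 5% overall error rate across nine hospitals (the arithmetic matches, but the text does not name the method), and the shrinkage function is the textbook precision-weighted average toward the overall mean, not the hierarchical model actually fitted. All function names and the variance values are hypothetical.

```python
def observed_expected_ratio(expected, died):
    """O/E: observed deaths divided by the sum of expected mortalities."""
    expected_deaths = sum(expected)
    if expected_deaths <= 0:
        raise ValueError("expected deaths must be positive")
    return sum(died) / expected_deaths


def bonferroni_confidence_level(n_comparisons, overall_alpha=0.05):
    """Per-interval confidence level that keeps the overall type I error
    at overall_alpha across n_comparisons simultaneous comparisons."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return 1.0 - overall_alpha / n_comparisons


def shrink_toward_mean(estimate, variance, overall_mean, between_variance):
    """Precision-weighted shrinkage: noisier estimates move closer to the mean."""
    if variance < 0 or between_variance <= 0:
        raise ValueError("variances must be non-negative, between_variance positive")
    weight = between_variance / (between_variance + variance)
    return overall_mean + weight * (estimate - overall_mean)


# The aggregate example from the text: 20 patients at 5% expected mortality, one death.
oe = observed_expected_ratio([0.05] * 20, [1] + [0] * 19)

# Nine hospitals compared simultaneously at an overall 5% error rate.
level = bonferroni_confidence_level(9)

# Same raw O/E of 1.62, but hypothetical sampling variances for a small and a large hospital.
small_hospital = shrink_toward_mean(1.62, variance=0.09, overall_mean=1.0, between_variance=0.03)
large_hospital = shrink_toward_mean(1.62, variance=0.01, overall_mean=1.0, between_variance=0.03)

print(round(oe, 3), round(level * 100, 1), round(small_hospital, 3), round(large_hospital, 3))
```

With these assumed variances the small hospital's ratio is pulled much closer to 1 than the large hospital's, mirroring the observation that shrinkage is greater for smaller hospitals.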