Article · Open access · Peer reviewed

Net Reclassification Index: Measuring the Incremental Value of Adding a New Risk Factor to an Existing Risk Model

2015; Elsevier BV; Volume: 99; Issue: 2; Language: English

10.1016/j.athoracsur.2014.10.084

ISSN

1552-6259

Authors

Gary L. Grunkemeier, Ruyun Jin

Topic(s)

Healthcare cost, quality, practices

Abstract

A recent study by Scrascia and colleagues [1: Scrascia G, Guida P, Caparrotti SM, et al. Incremental value of anemia in cardiac surgical risk prediction with the European system for cardiac operative risk evaluation (EuroSCORE) II model. Ann Thorac Surg. 2014; 98: 869-875], based on 4,594 cardiac surgeries from 5 Italian hospitals, demonstrated that the addition of preoperative anemia significantly improved the predictive ability of the European system for cardiac operative risk evaluation II (ES2) risk model for cardiac surgery operative mortality.

The authors used hemoglobin as the measure of anemia and investigated this claim very thoroughly. To document its significance, they used p values for the estimated coefficients of anemia, the increase in the C statistic (area under the receiver operating characteristic curve [AUC]) with anemia added, and a relatively new measure, the Net Reclassification Index (NRI).

We wondered whether the same is true for The Society of Thoracic Surgeons (STS) risk models for operative mortality; that is, whether the addition of preoperative anemia would improve their predictive ability. We also wished to take this opportunity to introduce the NRI, an interesting but not well-known reclassification statistic, to the readers of The Annals.

Clinical Material

This project involved only the study of deidentified, existing records and thus qualified for exemption from Institutional Review Board approval under 45 CFR 46.101(b)(4).

From January 2008 through June 2014, 28,683 cardiac surgeries were performed in 14 Providence Health & Services (PHS) hospitals. While ES2 is a single risk model applicable to all cardiac surgery [2: Nashef SA, Roques F, Sharples LD, et al. EuroSCORE II. Eur J Cardiothorac Surg. 2012; 41: 734-745], there are separate STS risk models for operative mortality of only 3 pure subgroups: isolated coronary artery bypass grafting (CABG) [3: Shahian DM, O'Brien SM, Filardo G, et al. The Society of Thoracic Surgeons 2008 cardiac surgery risk models: part 1–coronary artery bypass grafting surgery. Ann Thorac Surg. 2009; 88: S2-S22], isolated valve [4: O'Brien SM, Shahian DM, Filardo G, et al. The Society of Thoracic Surgeons 2008 cardiac surgery risk models: part 2–isolated valve surgery. Ann Thorac Surg. 2009; 88: S23-S42], and CABG plus valve [5: Shahian DM, O'Brien SM, Filardo G, et al. The Society of Thoracic Surgeons 2008 cardiac surgery risk models: part 3–valve plus coronary artery bypass grafting surgery. Ann Thorac Surg. 2009; 88: S43-S62]. The STS does not yet have risk models for the other, more complex surgeries. Thus, an STS risk prediction was available for only 21,312 (74%) of PHS patients, who were predominantly CABG patients, and for none of the more complex "Other" types (Table 1). The differences in overall operative mortality (Table 1) reflect the different case-load distributions (the mortality for PHS patients in the Other category, without an STS risk model, was 6.7%).

Table 1. Comparison of Patients and Surgery Types

Variable            Scrascia et al    PHS
Surgery dates       2011–2013         2008–2014
No. of hospitals    5                 14
No. of patients     4,594             23,312
Mean (SD) age       67 (11)           67 (11)
Percent female      34%               29%
Surgery type
  Isolated CABG     36%               60%
  Isolated valve    23%               25%
  CABG + Valve      9%                15%
  Other             31%               0%
Mortality           5.9%              2.4%

CABG = coronary artery bypass grafting; PHS = Providence Health & Services.

Scrascia and colleagues [1] used preoperative hemoglobin to measure anemia. The STS database does not contain hemoglobin, so we used preoperative hematocrit divided by 3 to approximate the hemoglobin level. The distributions of preoperative hematocrit by sex, and the categories of anemia corresponding to those of Scrascia and colleagues, are shown in Figure 1. Based on this categorization, the distribution of anemia levels was quite similar between the data of Scrascia and colleagues and the PHS data (Table 2). For both groups the mortality approximately doubles for each incremental step in anemia severity (Table 2).

Table 2. Comparison of Anemia: Frequency and Associated Mortality

            Scrascia et al            PHS
Anemia      Frequency   Mortality     Frequency   Mortality
None        64.9%       3.4%          61.9%       1.5%
Mild        22.2%       7.7%          26.2%       2.9%
Moderate    12.6%       11.6%
Severe      0.3%        15.7%         0.2%        6.0%

PHS = Providence Health & Services.

Statistical Methods

The statistical significance of adding an additional risk factor to a model is immediately determined by the p value associated with its coefficient or odds ratio. Another measure of statistical significance can be derived from the increase in the C statistic (AUC) between the models without and with the additional risk factor. The C statistic measures discrimination, the ability of the risk model to assign higher scores to those who will die than to those who will live. However, it often happens that a highly significant additional risk factor improves the C statistic by only a small amount; eg, from 0.852 to 0.860 with the addition of anemia in the study by Scrascia and colleagues [1]. One would like to know the clinical impact of the improvement in the risk model achieved by the additional risk factor when the increase in the C statistic is so small (only 0.008). To address this, Scrascia and colleagues used a relatively new method of measuring clinical importance, the NRI.

There are 2 versions of the NRI statistic: (1) the original (categorical) NRI, introduced by Pencina and colleagues in 2008 [6: Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27: 157-172]; and (2) the non-categorical (continuous) version, introduced by Pencina and colleagues in 2011 [7: Pencina MJ, D'Agostino RB Sr, Steyerberg EW. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med. 2011; 30: 11-21]. Scrascia and colleagues used the latter, but we will briefly describe this methodology in the chronologic order in which the 2 versions were introduced.

Categorical Net Reclassification Index

Say the outcome of a risk model for mortality is categorical: high risk, medium risk, and low risk. These may be the ranges for which clinical decisions can be made; eg, if high, take some additional action; if low, do not; and if medium, have a discussion to consider taking some action. Thus, it would be advantageous if the risk model correctly placed more of the patients at higher risk of death into the highest category and more of those likely to survive into the lowest.
Unlike the C statistic, which can be computed for a single model, the NRI needs 2 models; it is a comparison statistic. Here is how it works: suppose we are comparing the risk prediction from an existing standard model (M) to that of a proposed model with an additional risk factor (M+). One way to compare the clinical value of the 2 risk models is to compare how the patients who were distributed into those 3 mortality categories by the 2 models actually fared, survival-wise. The categorical NRI does just that: it measures the net improvement, of M+ over M, in placing patients into the appropriate category; that is, more deaths into a higher category and more survivors into a lower category.

Category-Free (Continuous) NRI

A limitation of the NRI described above is its dependence on the definition of categories. That is, it is not an absolute value, but will vary depending on the number of categories (3 in the above discussion) and the cut points used to define the categories (low, medium, and high in the above discussion).

The way around this limitation is to eliminate the categories entirely, by treating each individual prediction as its own "category," as precisely as it is defined. Thus, for every patient who died (D), the M+ model is considered superior if it assigned that patient a higher risk (DH) than the original M model, and is considered inferior if it assigned that patient a lower risk (DL). And for patients who survived (S), the M+ model is better if it assigns a lower risk (SL) than did the original M model, and worse if it assigns a higher risk (SH).

Then, DH − DL is the net improvement in predictions for those who died, and SL − SH is the net improvement in predictions for those who survived.
The category-free NRI is the sum of these 2 components, taken as proportions of the D and S patients, respectively: (DH − DL)/D is the net proportion of dead patients who are assigned a higher risk by M+, and (SL − SH)/S is the net proportion of survivors who are assigned a lower risk. The M+ model will show improved prediction if both of these components are positive, as it then assigns higher probabilities of dying to the deaths and lower probabilities of dying to the survivors. The sum of these proportions is the category-free NRI (see the Appendix for a graphical derivation).

Results

Scrascia and colleagues [1] found that the ES2 risk model had poorer discrimination in anemic patients. The C statistics for the ES2 risk model were 0.862, 0.797, and 0.796 for none, mild, and moderate to severe anemia, respectively. We found a similar result: the STS risk models had C statistics of 0.808, 0.771, and 0.729 for the same 3 anemia subgroups, respectively (Fig 2).

Fig 2. Receiver operating characteristic curves for The Society of Thoracic Surgeons mortality risk prediction models, according to level of preoperative anemia. (AUC = area under the curve.)

Adding Anemia to the Risk Models

When added to ES2, anemia was significantly associated with mortality, with odds ratios of 1.6 (p = 0.004) for mild and 2.1 (p < 0.001) for moderate-to-severe anemia [1].
The STS risk models performed similarly, with odds ratios of 1.3 (p = 0.034) for mild and 1.6 (p < 0.001) for moderate-to-severe anemia.

When anemia was added to ES2, the risk model improved in discrimination; the C statistic increased from 0.852 to 0.860 (p = 0.007). The category-free NRI was 0.257 for deaths and 0.335 for survivors; this means that there was a 25.7% net improvement in the predictions of deaths and a 33.5% net improvement in the prediction of survivors. The total NRI was 0.592 (p < 0.001).

Similarly, adding anemia to the STS risk model improved its discrimination; the C statistic improved from 0.806 to 0.809 (p = 0.052). The category-free NRI is graphically represented in Figure 3 and its components are summarized in Table 3. The category-free NRI was 0.266 for deaths and 0.093 for survivors; this means that there was a 26.6% net improvement in the predictions of deaths and a 9.3% net improvement in the prediction of survivors. The total NRI was 0.360 (p < 0.001), with a 95% confidence interval of 0.272 to 0.447.

Comment

Our investigation had 2 objectives. First, we wanted to see whether the conclusion of Scrascia and colleagues [1], that preoperative "anemia provides significant incremental value" to the ES2 risk model for mortality, was also true for the STS risk models. All the standard statistical approaches, using the STS risk models on our PHS data, concurred with their conclusion.
In establishing this conclusion the authors also used the relatively novel NRI statistic; our secondary objective was to investigate and explain this measure of reclassification.

NRI Was Introduced Recently and Quickly Became Popular

• The original version of NRI was introduced in 2008 [6], and endorsed in 2011 as "rapidly becoming the recommended approach" [8: Sundström J, Byberg L, Gedeborg R, Michaëlsson K, Berglund L. Useful tests of usefulness of new risk factors: tools for assessing reclassification and discrimination. Scand J Public Health. 2011; 39: 439-441].
• The NRI was extended in 2011 to the category-free version, touted as "the most objective and versatile measure of improvement in risk prediction" [7].
• A 2011 critique of papers using reclassification statistics refers to NRI as the "currently most popular reclassification metric" [9: Tzoulaki I, Liberopoulos G, Ioannidis JP. Use of reclassification for assessment of improved prediction: an empirical evaluation. Int J Epidemiol. 2011; 40: 1094-1105].
• NRI has become widely used. A 2014 review [10: Kerr KF, Wang Z, Janes H, McClelland RL, Psaty BM, Pepe MS. Net reclassification indices for evaluating risk prediction instruments: a critical review. Epidemiology. 2014; 25: 114-121] states that there are approximately 500 papers that contain "net reclassification index."

However, There Have Been Some Very Recent Papers Critiquing Use of the NRI

• A review of the original NRI, the category-free NRI, and another reclassification metric, the Integrated Discrimination Improvement (IDI), also introduced by Pencina and colleagues [6], concluded the following: "Of the new metrics, the IDI … best represents the value of a new test" [11: Pickering JW, Endre ZH. New metrics for assessing diagnostic potential of candidate biomarkers. Clin J Am Soc Nephrol. 2012; 7: 1355-1364].
• "The present paper shows IDI and NRI themselves to offer guidance that cannot be trusted" [12: Hilden J, Gerds TA. A note on the evaluation of novel biomarkers: do not rely on integrated discrimination improvement and net reclassification index. Stat Med. 2014; 33: 3405-3414].
• "In addition, the category-free index can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the net reclassification index" [10].
• "Conclusions about biomarker performance that are based primarily on a statistically significant NRI statistic should be treated with skepticism. Use of NRI P values in scientific reporting should be halted" [13: Pepe MS, Janes H, Li CI. Net risk reclassification p values: valid or misleading? J Natl Cancer Inst. 2014; 106: dju041].

A recent paper from the authors of the original NRI papers [6, 7] discusses many aspects of the implementation of NRI and IDI [14: Xanthakis V, Sullivan LM, Vasan RS, et al. Assessing the incremental predictive performance of novel biomarkers over standard predictors. Stat Med. 2014; 33: 2577-2584]. Among their recommendations is that "authors report the effect of a new biomarker after controlling for standard predictors modeled as individual variables." Scrascia and colleagues did this and found that the NRI was only 0.517, compared with 0.592 when the ES2 model was represented by only a single predictor variable (the Predictive Index).

Scrascia and colleagues [1] demonstrated that anemia can improve the ES2 risk model's mortality prediction for cardiac surgery, using data from Italian patients. Our study, using the STS risk models and patients from the western US, replicates their conclusion.
Scrascia and colleagues used a relatively novel statistical tool, NRI, to support their claim. We discussed the derivation and meaning of that statistic and supplied its graphic implementation (Appendix). There is currently much active investigation into the use of the NRI as a measure of risk model improvement.
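
The category-free NRI described under Statistical Methods reduces to counting, so a short sketch may help make the definition concrete. The Python function below is illustrative only; it is not the code used in this study or by Scrascia and colleagues. The function name `category_free_nri`, its argument names, the toy data, and the choice to count tied predictions as neither an improvement nor a worsening are all assumptions made for the example.

```python
from typing import NamedTuple, Sequence


class NRIResult(NamedTuple):
    nri_deaths: float     # (DH - DL) / D
    nri_survivors: float  # (SL - SH) / S
    nri_total: float      # sum of the two components


def category_free_nri(died: Sequence[bool],
                      risk_m: Sequence[float],
                      risk_m_plus: Sequence[float]) -> NRIResult:
    """Category-free NRI of model M+ over model M.

    died[i]        -- True if patient i died, False if patient i survived
    risk_m[i]      -- predicted mortality risk from the standard model M
    risk_m_plus[i] -- predicted risk from M+ (M with the added risk factor)
    """
    if not (len(died) == len(risk_m) == len(risk_m_plus)):
        raise ValueError("all inputs must have the same length")

    dh = dl = sl = sh = 0        # the four counts defined in the text
    n_died = n_survived = 0      # D and S

    for dead, old, new in zip(died, risk_m, risk_m_plus):
        if dead:
            n_died += 1
            if new > old:
                dh += 1          # death moved to a higher risk: better
            elif new < old:
                dl += 1          # death moved to a lower risk: worse
        else:
            n_survived += 1
            if new < old:
                sl += 1          # survivor moved to a lower risk: better
            elif new > old:
                sh += 1          # survivor moved to a higher risk: worse
        # Assumption: identical predictions (ties) add to neither count.

    if n_died == 0 or n_survived == 0:
        raise ValueError("need at least one death and one survivor")

    nri_d = (dh - dl) / n_died
    nri_s = (sl - sh) / n_survived
    return NRIResult(nri_d, nri_s, nri_d + nri_s)


if __name__ == "__main__":
    # Hypothetical toy data: 4 deaths followed by 5 survivors.
    died = [True, True, True, True, False, False, False, False, False]
    risk_m = [0.10, 0.20, 0.05, 0.30, 0.10, 0.20, 0.05, 0.30, 0.15]
    risk_m_plus = [0.15, 0.25, 0.08, 0.20, 0.05, 0.10, 0.04, 0.35, 0.15]
    print(category_free_nri(died, risk_m, risk_m_plus))
```

In the toy data, 3 of the 4 deaths move to a higher risk and 1 to a lower risk, while 3 of the 5 survivors move lower, 1 moves higher, and 1 is unchanged. Each component can range from −1 to 1, so the total ranges from −2 to 2; this is why the reported totals are not percentages of patients.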
