Alirocumab, Decreased Mortality, Nominal Significance, P Values, Bayesian Statistics, and the Duplicity of Multiplicity
2019; Lippincott Williams & Wilkins; Volume: 140; Issue: 2; Language: English
10.1161/circulationaha.119.041496
ISSN: 1524-4539
Topic(s): Statistical Methods in Clinical Trials
A Bays-on-Bayes Editorial
Harold Edward Bays, MD
Louisville Metabolic and Atherosclerosis Research Center, KY
Correspondence: Harold Edward Bays, MD, Louisville Metabolic and Atherosclerosis Research Center, 3288 Illinois Ave, Louisville, KY 40213. Email [email protected]
Originally published 8 Jul 2019. https://doi.org/10.1161/CIRCULATIONAHA.119.041496. Circulation. 2019;140:113–116
This article is a commentary on the following: Effect of Alirocumab on Mortality After Acute Coronary Syndromes. Article, see p 103

Some clinicians may find challenges in translating statistical jargon into clinical application. This invited Editorial provides some context to a common, but often misunderstood application of P values.

What is "Nominal Significance"?

The ODYSSEY OUTCOMES trial (Alirocumab and Cardiovascular Outcomes After Acute Coronary Syndrome) evaluated 18 924 patients with acute coronary syndrome and elevated atherogenic lipoproteins.
Alirocumab (a proprotein convertase subtilisin/kexin type 9 inhibitor) achieved the primary end point of a statistically significant reduction in recurrent ischemic cardiovascular events in patients with previous acute coronary syndrome receiving high-intensity statins.1 In a follow-up publication, "Effect of Alirocumab on Mortality After Acute Coronary Syndromes: An Analysis of the ODYSSEY OUTCOMES Randomized Clinical Trial," Steg et al2 concluded, "Alirocumab added to intensive statin therapy has the potential to reduce death after acute coronary syndrome."2

In the original ODYSSEY OUTCOMES publication,1 Table 2 listed the 7 major secondary study end points. The hierarchical statistical testing order was as follows: any coronary heart disease event; major coronary heart disease event; any cardiovascular event; composite of death resulting from any cause, nonfatal myocardial infarction, or nonfatal ischemic stroke; death from coronary heart disease; death from any cardiovascular causes; and death from any cause. The first 4 secondary end points achieved statistical significance. But although death from coronary heart disease was numerically reduced (alirocumab, 205 coronary heart disease deaths; placebo, 222 coronary heart disease deaths), this did not achieve statistical significance (P=0.38).1 Because the a priori hierarchical analysis stopped after the first nonsignificant outcome result, Table 2 did not list P values for the remaining secondary end points of death resulting from any cardiovascular causes and death resulting from any cause. In the follow-up report, Steg et al2 noted that death resulting from any cause was reduced (hazard ratio, 0.85), with a nominally significant P=0.03.

A common statistical challenge involves multiplicity of data.3 Unfettered comparisons of data and indiscriminate application of P values increase the chances that false-positive findings will be reported as statistically significant.
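As a rough illustration of why multiplicity matters, the sketch below computes the chance of at least one false-positive result when several tests are each run at an unadjusted α. It assumes that every null hypothesis is true and that the tests are independent, which correlated clinical end points generally are not, so the numbers are indicative only. The function name is hypothetical, and the choice of 7 tests simply mirrors the count of secondary end points mentioned above.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n_tests
    independent tests of true null hypotheses, each run at level alpha.

    Independence is an assumption made for illustration only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # P(no false positive in any test) = (1 - alpha) ** n_tests
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for k in (1, 7, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:>2} tests at alpha=0.05 -> chance of >=1 false positive: {rate:.3f}")
```

Under these assumptions, 7 unadjusted tests at α=0.05 already carry roughly a 30% chance of at least one spurious "significant" result.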
A Bonferroni correction (desired α/number of tests) is a common method to address problems of multiplicity, especially when a large number of tests are conducted without a preplanned hypothesis.4 A Bonferroni correction (α=0.05) evaluating the responsive expression of 20 000 genes would require a value of P<0.0000025 (0.05/20 000) to be statistically significant. In clinical outcomes trials, a more common approach is a hierarchical analysis, which lists outcomes based on best-guess ordering of clinically important results thought most likely to achieve statistical significance. Going from top to bottom, once a listed outcome is assigned a nonsignificant P value, P values are no longer applied to the remaining outcomes on the list. If P values are applied to outcomes outside the a priori statistical plan, then an unadjusted P value below the traditional 0.05 is called nominally significant (nominally means in name only).

P values applied outside the a priori statistical plan are best considered hypothesis generating (Figure). More than a decade ago, observational and secondary trial analyses suggested that statins improve osteoporosis. When these hypothesis-generating data were evaluated in a prospective randomized controlled trial in postmenopausal women, atorvastatin had no effect on bone mineral density.5 Hence, hypothesis-generating findings are best evaluated via dedicated clinical trials. Once a trial is designed, it is equally important to adhere to the a priori protocol and statistical plan because variance from the application of clinical trial methodology and data analysis may "allow presenting anything as significant."6

Figure. Approach to the nominally significant finding. The figure assumes that because the investigators included the nominally significant comparison on an a priori basis, the finding was scientifically plausible.
In such a case, if a comparison P value calculation is less than or equal to the unadjusted α (traditionally 0.05) yet falls outside the a priori statistical plan for a primary or secondary end point (eg, representing a tertiary or exploratory end point or a test result outside a hierarchical analysis), then it is best considered hypothesis-generating.

For example, by intentionally failing to collect and analyze data in compliance with a rational a priori method or statistical plan, investigators were able to prove chronological rejuvenation. Through researcher degrees of freedom involving flexibility in statistical application, these investigators found that after University of Pennsylvania undergraduates listened to the song "When I'm Sixty-Four" by the Beatles (a rock-and-roll band of the 1960s), test subjects were 1½ years younger (P=0.040).6

In another example, in the International Study of Infarct Survival trial, aspirin demonstrated a statistically significant benefit after acute myocardial infarction (P<0.00001). However, no statistically significant benefits were found in study subgroups having the astrological signs of Gemini and Libra.7 These examples highlight the importance of crafting study conduct and statistical plans before the start of the clinical trial. After the trial, it is best to honor the a priori protocol and statistical plan and to avoid P value frenzy, even if doing otherwise might enhance speaking invitations to investigators.

Assuming that the null hypothesis is true (no actual difference), a P value reflects the probability that one would still find a difference equal to or greater than the reported finding. But that is not how P values are often perceived.
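The definition above can be made concrete with a small permutation test, offered here only as a sketch. It shuffles group labels to mimic a world in which the null hypothesis is true, then counts how often the shuffled difference in means is at least as large as the observed one. The function name, the two-sided choice, the number of permutations, and the toy data are all assumptions for illustration; they are not taken from any trial discussed in this Editorial.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Monte Carlo estimate of a two-sided permutation P value for the
    difference in means between two groups (illustrative sketch)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a

    def abs_mean_diff(values):
        return abs(sum(values[:n_a]) / n_a - sum(values[n_a:]) / n_b)

    observed = abs_mean_diff(pooled)
    as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling labels simulates "no actual difference" between groups.
        rng.shuffle(pooled)
        if abs_mean_diff(pooled) >= observed - 1e-12:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical measurements, invented for this example.
    treated = [1, 2, 3, 4, 5, 6, 7, 8]
    control = [11, 12, 13, 14, 15, 16, 17, 18]
    print(permutation_p_value(treated, control))
```

The returned value estimates how often chance alone would produce a difference this large if the groups did not truly differ; it does not give the probability that the finding is correct.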
A value of P=0.05 is often misinterpreted as meaning a 95% chance of accuracy.8 To many assessing drug trials, a value of P=0.051 means the drug is clinically worthless, a value of P=0.049 means the drug is clinically effective, and a value of P=0.00001 means it is the best drug ever.9 Some argue that exceptions should be made to P values outside an a priori hierarchical analysis when it comes to verifiable hard end points such as death. However, a priori hierarchical statistical plans are intended to address potential problems of multiplicity, not the diagnostic accuracy of the findings.3

With this said, it does seem myopic to focus only on P values in analyzing clinical trial data. The "eyeball" analytical method remains useful. In the interpretation of data distribution (eg, for normality), it is often useful to look at the curves. In the interpretation of results, it is often useful to look at the descriptive statistics. In the assessment of clinical meaning, it is often useful to view the context of the investigation before, during, and after the clinical trial.

Thus, What Role Does Common Sense Play in the Interpretation of Clinical Trial Data?

In 2003, investigators performed a systematic review of randomized controlled trials evaluating the effectiveness of parachutes to prevent death and major trauma related to gravitational challenge. They were unable to identify any randomized controlled trials of parachute intervention. Their conclusion? "The effectiveness of parachutes has not been subjected to rigorous evaluation by using randomised controlled trials."10 This illustrates the fallacy of antiseptically removing all rational thought from interventions that affect the health of individuals. The problem here is that what may be common sense to one person may be anathema to someone else.
What may have seemed intuitive yesterday (eg, juice for all infants, estrogens for all postmenopausal women, disco) may not seem as intuitive today or tomorrow.

In ODYSSEY OUTCOMES, death caused by coronary heart disease did not achieve statistical significance. So how might a reduction in overall mortality with alirocumab be scientifically plausible? According to Steg et al,2 "The majority of the difference in noncardiovascular deaths between groups (27 deaths) was related to a reduction in pulmonary deaths (14 fewer with alirocumab)." Given that myocardial infarction combined with lung disease increases mortality,11 patients experiencing fewer cardiovascular events may be less debilitated, have fewer frailty-related (pulmonary) deaths, and thus have fewer noncardiovascular deaths.

Given the inconsistency in total mortality results from placebo-controlled cholesterol-lowering drug trials,12 even with statins,13 it is perhaps time to suggest a complementary statistical approach that incorporates validated beliefs or rational thought in the planning stages of new trials. Current cardiovascular disease outcomes drug trials are usually conducted to achieve regulatory approval for an indicated use and often are designed to achieve a desired result quickly. They are not typically optimized to assess effects on mortality. Many recent major cardiovascular disease outcomes trials have had a median follow-up of only 2 to 3 years (the median follow-up of ODYSSEY OUTCOMES was 2.8 years), potentially creating challenges in demonstrating mortality benefits.

Many have advocated that clinical trials should move beyond strictly a frequentist approach.
Instead, many suggest incorporating bayesian statistics, wherein the probability of a result incorporates a degree of belief in a result based on previous knowledge, which can then be potentially updated on the basis of new (a posteriori) clinical trial results.14,15 This approach may be especially appealing in assessing potential mortality benefits, which are often elusive with current clinical trial designs. If on the basis of previous clinical trials a belief was established about the probability of a mortality benefit with cholesterol-lowering drugs in general, or proprotein convertase subtilisin/kexin type 9 inhibitors or alirocumab specifically, then incorporating a posteriori input from ODYSSEY OUTCOMES into a bayesian model might have provided complementary statistical support to predict the degree by which alirocumab may reduce overall mortality.

So was the focus of Steg et al on a nominally significant finding an attempt to publish a claim not published in Table 2 of the original article? Or is this an illustrative example of how pointy-headed statistician purists seem to have a life goal to ruin everything for everybody? My sense is neither is correct. The Steg et al publication was an opportunity to more comprehensively examine the reduction in mortality with alirocumab compared with the original publication. The authors rightfully concluded that "alirocumab added to intensive statin therapy has the potential to reduce death after acute coronary syndrome." The authors did not claim that alirocumab reduced death. But they also did not dismiss a potentially lifesaving finding. Yes, this nominally significant finding is best described as hypothesis generating.
However, it does seem scientifically plausible to conclude that, whether it be cardiac or noncardiac mortality, a person not having a heart attack has a greater chance of not dying than a person having a heart attack.

Such is the duplicity of multiplicity.

Disclosures

Dr Bays has received research grants from and served as a consultant for Sanofi, Regeneron, Amgen, and Esperion. Dr Bays has served as a speaker for Sanofi, Regeneron, and Amgen.

Footnotes

The opinions expressed in this article are not necessarily those of the editors or of the American Heart Association.
Guest Editor for this article was Christie M. Ballantyne, MD.

References

1. Schwartz GG, Steg PG, Szarek M, Bhatt DL, Bittner VA, Diaz R, Edelberg JM, Goodman SG, Hanotin C, Harrington RA, Jukema JW, Lecorps G, Mahaffey KW, Moryusef A, Pordy R, Quintero K, Roe MT, Sasiela WJ, Tamby JF, Tricoci P, White HD, Zeiher AM; ODYSSEY OUTCOMES Committees and Investigators. Alirocumab and cardiovascular outcomes after acute coronary syndrome. N Engl J Med. 2018;379:2097–2107. doi: 10.1056/NEJMoa1801174
2. Steg PG, Szarek M, Bhatt D, Bittner VA, Brégeault M-F, Dalby AJ, Diaz R, Edelberg JM, Goodman SG, Hanotin C, Harrington RA, Jukema JW, Lecorps G, Mahaffey KW, Moryusef A, Ostadal P, Parkhomenko A, Pordy R, Roe MT, Tricoci P, Vogel R, White HD, Zeiher AM, Schwartz GG; for the ODYSSEY OUTCOMES Committees and Investigators. Effect of alirocumab on mortality after acute coronary syndromes: an analysis of the ODYSSEY OUTCOMES randomized clinical trial. Circulation. 2019;140:103–112. doi: 10.1161/CIRCULATIONAHA.118.038840
3. Pocock SJ, McMurray JJV, Collier TJ.
Statistical controversies in reporting of clinical trials: part 2 of a 4-part series on statistics for clinical trials. J Am Coll Cardiol. 2015;66:2648–2662. doi: 10.1016/j.jacc.2015.10.023
4. Armstrong RA. When to use the Bonferroni correction. Ophthalmic Physiol Opt. 2014;34:502–508. doi: 10.1111/opo.12131
5. Bone HG, Kiel DP, Lindsay RS, Lewiecki EM, Bolognese MA, Leary ET, Lowe W, McClung MR. Effects of atorvastatin on bone in postmenopausal women with dyslipidemia: a double-blind, placebo-controlled, dose-ranging trial. J Clin Endocrinol Metab. 2007;92:4671–4677. doi: 10.1210/jc.2006-1909
6. Simmons JP, Nelson LD, Simonsohn U. False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol Sci. 2011;22:1359–1366. doi: 10.1177/0956797611417632
7. Sleight P. Debate: Subgroup analyses in clinical trials: fun to look at - but don't believe them! Curr Control Trials Cardiovasc Med. 2000;1:25–27. doi: 10.1186/cvm-1-1-025
8. Gagnier JJ, Morgenstern H. Misconceptions, misuses, and misinterpretations of P values and significance testing. J Bone Joint Surg Am. 2017;99:1598–1603. doi: 10.2106/JBJS.16.01314
9. Kim J, Bang H. Three common misuses of P values. Dent Hypotheses. 2016;7:73–80. doi: 10.4103/2155-8213.190481
10. Smith GC, Pell JP. Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials. BMJ. 2003;327:1459–1461. doi: 10.1136/bmj.327.7429.1459
11. Rothnie KJ, Quint JK. Chronic obstructive pulmonary disease and acute myocardial infarction: effects on presentation, management, and outcomes. Eur Heart J Qual Care Clin Outcomes. 2016;2:81–90. doi: 10.1093/ehjqcco/qcw005
12.
Navarese EP, Robinson JG, Kowalewski M, Kolodziejczak M, Andreotti F, Bliden K, Tantry U, Kubica J, Raggi P, Gurbel PA. Association between baseline LDL-C level and total and cardiovascular mortality after LDL-C lowering: a systematic review and meta-analysis. JAMA. 2018;319:1566–1579. doi: 10.1001/jama.2018.2525
13. DuBroff R, de Lorgeril M. Cholesterol confusion and statin controversy. World J Cardiol. 2015;7:404–409. doi: 10.4330/wjc.v7.i7.404
14. Gupta SK. Use of Bayesian statistics in drug development: advantages and challenges. Int J Appl Basic Med Res. 2012;2:3–6. doi: 10.4103/2229-516X.96789
15. Yin G, Lam CK, Shi H. Bayesian randomized clinical trials: from fixed to adaptive design. Contemp Clin Trials. 2017;59:77–86. doi: 10.1016/j.cct.2017.04.010
© 2019 American Heart Association, Inc. PMID: 31283369
Keywords: statistics; Editorials; Bayes theorem; mortality
Subjects: Lipids and Cholesterol; Mortality/Survival; Myocardial Infarction; Pharmacology; Quality and Outcomes