Use of the GRADE approach in systematic reviews and guidelines
2019; Elsevier BV; Volume: 123; Issue: 5; Language: English
DOI: 10.1016/j.bja.2019.08.015
ISSN: 1471-6771
Authors: Anders Granholm, Waleed Alhazzani, Morten Hylander Møller
Topic(s): Meta-analysis and systematic reviews
ResumoThe Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach is a systematic and transparent approach for rating the certainty of evidence in systematic reviews and clinical practice guidelines, and for developing and determining the strength of clinical practice recommendations.1Guyatt G.H. Oxman A.D. Vist G.E. et al.GRADE: an emerging consensus on rating quality of evidence and strength of recommendations.BMJ. 2008; 336: 924-926Crossref PubMed Google Scholar While use of GRADE in systematic reviews is currently only mandated by a few (∼4%) journals within anaesthesia and intensive care medicine,2Butler E. Granholm A. Aneman A. Trustworthy systematic reviews-Can journals do more?.Acta Anaesthesiol Scand. 2019; 63: 558-559Google Scholar it is becoming a de facto standard for high-quality systematic reviews, and it is an essential component of trustworthy guidelines. GRADE has been adopted by more than 100 organisations worldwide, including the Cochrane Collaboration, the WHO, UpToDate®, the UK National Institute for Health and Clinical Excellence, and many societies within the fields of anaesthesia and critical care.3GRADE Working Group. GRADE home. Available from: http://www.gradeworkinggroup.org/(accessed 15 August 2019).Google Scholar Knowledge about GRADE is therefore necessary not only for researchers and guideline panel members, but also for clinicians who use and rely on systematic reviews and guidelines for their clinical practice. Here, we provide an introduction and overview of the GRADE approach after the publication of a narrative review on nitrous oxide in the British Journal of Anaesthesia4Buhre W. Disma N. Hendrickx J. et al.European society of Anaesthesiology task force on nitrous oxide: a narrative review of its role in clinical practice.Br J Anaesth. 
2019; 122: 587-604Abstract Full Text Full Text PDF PubMed Scopus (40) Google Scholar and subsequent discussion related to its apparent use of GRADE and concerns about the methodological adequacy.5Imberger G. McGain F. GRADE quality of evidence: a systematic and objective assessment, not an expression of opinion.Br J Anaesth. 2019; 123: e479-e480Abstract Full Text Full Text PDF Scopus (3) Google Scholar, 6Muret J. Fernandes T. Gerlach H. et al.Environmental impacts of nitrous oxide - No laughing matter! Comment on Br J Anaesth 2019.Br J Anaesth. 2019; 123: e481-e482Abstract Full Text Full Text PDF Scopus (14) Google Scholar, 7Buhre W. Disma N. Hendrickx J. et al.Response to comments on ‘The European Society of Anaesthesiology Task Force review on the place of nitrous oxide in current clinical practice (Br J Anaesth 2019).Br J Anaesth. 2019; 123: e482-e483Abstract Full Text Full Text PDF Scopus (3) Google Scholar An overview of the GRADE approach is presented in Figure 1.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar The process of developing a systematic review or clinical practice guideline starts with assemblance of a review group or guideline panel, which should ideally include academic and frontline clinicians, methodologists, and, for guidelines, other key stakeholders including patient representatives. The initial phases consist of selection of the topics and settings of interest, formulating population, intervention, comparator, outcomes (PICO) questions, prioritising outcomes (focusing on patient-important outcomes), and systematically summarising the evidence base by conducting or updating a systematic review.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. 
Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar This is followed by an assessment of the certainty of evidence, and for guidelines subsequently by issuing recommendations and rating their strength, which requires consideration of multiple factors described below. This last step is omitted in systematic reviews, as their purpose is to summarise the evidence base only, and not to make recommendations. Finally, the systematic review or guideline undergoes peer review, is published and disseminated, and, when necessary, updated. Assessing the certainty of evidence (also referred to as the quality of evidence or the confidence in the effect estimates) is central in GRADE, and the process is somewhat different for systematic reviews and guidelines. For systematic reviews, the certainty refers to how certain review authors are that an effect estimate represents the true effect, while for guidelines, the certainty refers to how certain the guideline panel is that the evidence is sufficient to support a particular recommendation.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar Importantly, the certainty of evidence is assessed per outcome measure using pooled estimates from the included studies and not for individual studies. Further, in systematic reviews each outcome is considered individually and the threshold between benefit (decreased risk) and harm (increased risk) is usually central in this assessment. In guidelines, outcomes are considered together and the certainty assessment is often more focused on a clinically important decision threshold chosen based on trade-offs between benefit and harm, costs, inconveniences, and adverse effects of an intervention.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. 
Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 12Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines 6. Rating the quality of evidence - imprecision.J Clin Epidemiol. 2011; 64: 1283-1293Abstract Full Text Full Text PDF PubMed Scopus (1533) Google Scholar While the certainty of evidence should be considered a continuum from absolute certainty to no certainty at all, it is ultimately categorised as either very low, low, moderate, or high certainty of evidence,13Balshem H. Helfand M. Schünemann H.J. et al.GRADE guidelines: 3. Rating the quality of evidence.J Clin Epidemiol. 2011; 64: 401-406Abstract Full Text Full Text PDF PubMed Scopus (4303) Google Scholar which eases interpretation and communication of results. Generally, randomised clinical trials (RCTs) start as high certainty evidence, while observational studies start as low certainty evidence. When using GRADE for studies of diagnostic test accuracy or prognostic factors, observational studies start as high certainty evidence.9Schünemann H.J. Oxman A.D. Brozek J. et al.GRADE: grading of quality of evidence and strength of recommendations for diagnostic tests and strategies.BMJ. 2008; 336: 1106-1110Crossref PubMed Google Scholar, 10Iorio A. Spencer F.A. Falavigna M. et al.Use of GRADE for assessment of evidence about prognosis: rating confidence in estimates of event rates in broad categories of patients.BMJ. 2015; 350: h870Crossref PubMed Scopus (384) Google Scholar From this initial rating, the certainty of evidence can be rated down by one or two levels when there are serious or very serious concerns, respectively, in any of the following five domains: risk of bias, inconsistency, indirectness, imprecision, or publication bias.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 13Balshem H. Helfand M. Schünemann H.J. 
et al.GRADE guidelines: 3. Rating the quality of evidence.J Clin Epidemiol. 2011; 64: 401-406Abstract Full Text Full Text PDF PubMed Scopus (4303) Google Scholar It is also possible to rate up the certainty of evidence, although this is less frequently done and practically only relevant for high-quality observational studies.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 13Balshem H. Helfand M. Schünemann H.J. et al.GRADE guidelines: 3. Rating the quality of evidence.J Clin Epidemiol. 2011; 64: 401-406Abstract Full Text Full Text PDF PubMed Scopus (4303) Google Scholar, 14Guyatt G.H. Oxman A.D. Sultan S. et al.GRADE guidelines: 9. Rating up the quality of evidence.J Clin Epidemiol. 2011; 64: 1311-1316Abstract Full Text Full Text PDF PubMed Scopus (797) Google Scholar The process ultimately ends with an overall rating of the certainty of evidence for each outcome, which is presented in an evidence profile or summary of findings table together with the study types, the number of studies and participants, the assessments of each domain (only in evidence profiles), and the relative and absolute effects for each outcome.15Guyatt G. Oxman A.D. Akl E.A. et al.GRADE guidelines: 1. Introduction - GRADE evidence profiles and summary of findings tables.J Clin Epidemiol. 2011; 64: 383-394Abstract Full Text Full Text PDF PubMed Scopus (5017) Google Scholar Details on the assessment of each individual domain are presented in the next sections. Risk of bias is also referred to as limitations in study design or execution,8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 16Guyatt G.H. Oxman A.D. Vist G. et al.GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias).J Clin Epidemiol. 
2011; 64: 407-415Abstract Full Text Full Text PDF PubMed Scopus (1761) Google Scholar and should be assessed using formal tools and with explicit reasonings for the judgments presented, as the process commonly involves some degree of subjective judgment. For RCTs, sources of risk of bias relate to inadequate randomisation or allocation concealment, lack of blinding (especially when ‘subjective’ outcomes are assessed), loss to follow-up, selective reporting of results, and other factors such as early stopping.16Guyatt G.H. Oxman A.D. Vist G. et al.GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias).J Clin Epidemiol. 2011; 64: 407-415Abstract Full Text Full Text PDF PubMed Scopus (1761) Google Scholar For observational studies, sources of bias include selection bias, measurement errors, confounding, and missing data or loss to follow-up.16Guyatt G.H. Oxman A.D. Vist G. et al.GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias).J Clin Epidemiol. 2011; 64: 407-415Abstract Full Text Full Text PDF PubMed Scopus (1761) Google Scholar It is important that observational studies are not further rated down for inherent limitations of observational studies (such as lack of random allocation or blinding), as this is the reason why observational studies start as low certainty of evidence; for some tools used to assess risk of bias (e.g. the Risk of Bias In Non-randomised Studies - of Interventions (ROBINS-I) tool), it is thus appropriate to start with a high certainty rating for observational studies and subsequently rate down for these inherent sources of bias.16Guyatt G.H. Oxman A.D. Vist G. et al.GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias).J Clin Epidemiol. 2011; 64: 407-415Abstract Full Text Full Text PDF PubMed Scopus (1761) Google Scholar When assessing the influence of risk of bias, the contribution (i.e. 
study weights in a meta-analysis) of studies at high risk of bias should be considered, and subgroup analyses stratified by risk of bias should be conducted whenever possible.16Guyatt G.H. Oxman A.D. Vist G. et al.GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias).J Clin Epidemiol. 2011; 64: 407-415Abstract Full Text Full Text PDF PubMed Scopus (1761) Google Scholar, 17Jakobsen J.C. Wetterslev J. Winkel P. Lange T. Gluud C. Thresholds for statistical and clinical significance in systematic reviews with meta-analytic methods.BMC Med Res Methodol. 2014; 14: 120Crossref PubMed Scopus (405) Google Scholar If results from low (or lower if an intervention is assessed where it is considered impossible to blind everybody involved, for example) risk of bias studies differ substantially from results from high(er) risk of bias studies, the primary results should be based on the low(er) risk of bias subgroup without rating down for risk of bias. This should be accompanied by presentation of results from all studies with certainty rated down for risk of bias. If results from low and high risk of bias studies are similar, conclusions can be based on all studies (with increased precision), without necessarily rating down for risk of bias.16Guyatt G.H. Oxman A.D. Vist G. et al.GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias).J Clin Epidemiol. 2011; 64: 407-415Abstract Full Text Full Text PDF PubMed Scopus (1761) Google Scholar Consistency of results is generally assessed on the relative effect scale, as there is often heterogeneity on the absolute scale because of different baseline risks.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 18Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines: 7. Rating the quality of evidence - inconsistency.J Clin Epidemiol. 
2011; 64: 1294-1302Abstract Full Text Full Text PDF PubMed Scopus (1358) Google Scholar Consistency is assessed by considering heterogeneity of point estimates, confidence intervals (CIs), and statistical measures such as I2-values.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 18Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines: 7. Rating the quality of evidence - inconsistency.J Clin Epidemiol. 2011; 64: 1294-1302Abstract Full Text Full Text PDF PubMed Scopus (1358) Google Scholar Inconsistency (also known as heterogeneity) is present when the magnitude or the direction of estimates vary between studies without a clear explanation.18Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines: 7. Rating the quality of evidence - inconsistency.J Clin Epidemiol. 2011; 64: 1294-1302Abstract Full Text Full Text PDF PubMed Scopus (1358) Google Scholar Where inconsistency can be explained (e.g. by differences in populations, interventions, or comparators used), subgroup analyses and separate GRADE assessments for each subgroup are preferred, and rating down for inconsistency should thus be reserved for situations where heterogeneity cannot be explained by meaningful subgroup analyses.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar Indirectness is present when included studies do not entirely match the PICO question, and is thus mostly relevant when direct evidence is unavailable or sparse.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 19Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines: 8. Rating the quality of evidence - indirectness.J Clin Epidemiol. 
2011; 64: 1303-1310Abstract Full Text Full Text PDF PubMed Scopus (1093) Google Scholar In such situations, it can be necessary to rely on indirect evidence from different populations (including different age groups), for different outcomes, or for different interventions (extrapolation), e.g. when two interventions of interest have only been compared with a common third intervention (including placebo/no treatment) and not directly with each other. Additional considerations apply for indirect comparisons in the context of network meta-analyses.20Brignardello-Petersen R. Bonner A. Alexander P.E. et al.Advances in the GRADE approach to rate the certainty in estimates from a network meta-analysis.J Clin Epidemiol. 2018; 93: 36-44Abstract Full Text Full Text PDF PubMed Scopus (287) Google Scholar Imprecision is assessed by considering the number of included patients and events and the confidence interval (CI), which is—somewhat simplified—interpreted as the range of plausible effect sizes.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 12Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines 6. Rating the quality of evidence - imprecision.J Clin Epidemiol. 2011; 64: 1283-1293Abstract Full Text Full Text PDF PubMed Scopus (1533) Google Scholar When assessing imprecision, it is recommended to focus primarily on absolute effects.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 12Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines 6. Rating the quality of evidence - imprecision.J Clin Epidemiol. 
2011; 64: 1283-1293Abstract Full Text Full Text PDF PubMed Scopus (1533) Google Scholar If the plausible range of effect sizes encompasses both considerable benefits and harms (in a systematic review), or is wide enough to both support and not support a recommendation (in a guideline), imprecision is present. As the risk of random errors is increased when few patients or events have been included, it is justified to rate down for imprecision regardless of the CI, especially if the optimal information size (the minimum sample size required to confirm or reject an effect of interest in a single, adequately powered study) has not been achieved and the sample size is not very large.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 12Guyatt G.H. Oxman A.D. Kunz R. et al.GRADE guidelines 6. Rating the quality of evidence - imprecision.J Clin Epidemiol. 2011; 64: 1283-1293Abstract Full Text Full Text PDF PubMed Scopus (1533) Google Scholar Similarly, the required information size or adjusted CIs from trial sequential analyses21Wetterslev J. Jakobsen J.C. Gluud C. Trial Sequential Analysis in systematic reviews with meta-analysis.BMC Med Res Methodol. 2017; 17: 39Crossref PubMed Scopus (538) Google Scholar may be used,17Jakobsen J.C. Wetterslev J. Winkel P. Lange T. Gluud C. Thresholds for statistical and clinical significance in systematic reviews with meta-analytic methods.BMC Med Res Methodol. 2014; 14: 120Crossref PubMed Scopus (405) Google Scholar although no official GRADE guidance for applying trial sequential analysis currently exists. Small studies without statistically significant results are less likely to be published, and publication is often delayed.22Hopewell S. Loudon K. Clarke M.J. Oxman A.D. Dickersin K. Publication bias in clinical trials due to significance of trial results.Cochrane Database Syst Rev. 
2009; 1 (MR000006)Google Scholar, 23Guyatt G.H. Oxman A.D. Montori V. et al.GRADE guidelines: 5. Rating the quality of evidence - publication bias.J Clin Epidemiol. 2011; 64: 1277-1282Abstract Full Text Full Text PDF PubMed Scopus (1098) Google Scholar This risk is increased when financial or other conflicts of interest are present, such as in industry-funded studies.22Hopewell S. Loudon K. Clarke M.J. Oxman A.D. Dickersin K. Publication bias in clinical trials due to significance of trial results.Cochrane Database Syst Rev. 2009; 1 (MR000006)Google Scholar, 23Guyatt G.H. Oxman A.D. Montori V. et al.GRADE guidelines: 5. Rating the quality of evidence - publication bias.J Clin Epidemiol. 2011; 64: 1277-1282Abstract Full Text Full Text PDF PubMed Scopus (1098) Google Scholar The risk of publication bias can be assessed by visual assessment or statistical tests of funnel plot asymmetry if at least 10 trials have been published,8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar but as publication bias is difficult to prove (this would require access to the unpublished studies), it is recommended to maximally rate down the certainty of evidence by one level for this domain.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 23Guyatt G.H. Oxman A.D. Montori V. et al.GRADE guidelines: 5. Rating the quality of evidence - publication bias.J Clin Epidemiol. 
2011; 64: 1277-1282Abstract Full Text Full Text PDF PubMed Scopus (1098) Google Scholar It may be appropriate to rate up the certainty of evidence if: 1) the effect size is large (and there is not substantial imprecision, with small effect sizes also being plausible); 2) a dose-response gradient is present; 3) when plausible residual confounding would reduce a demonstrated effect or suggest a spurious effect when the effect shown is neutral; or 4) rarely in other special situations.14Guyatt G.H. Oxman A.D. Sultan S. et al.GRADE guidelines: 9. Rating up the quality of evidence.J Clin Epidemiol. 2011; 64: 1311-1316Abstract Full Text Full Text PDF PubMed Scopus (797) Google Scholar These criteria are, for all practical purposes, only relevant for high-quality observational studies and not RCTs, and are less frequently applied than the five criteria for rating down certainty. Further, one should generally be conservative rating up, especially when the certainty has already been rated down.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 14Guyatt G.H. Oxman A.D. Sultan S. et al.GRADE guidelines: 9. Rating up the quality of evidence.J Clin Epidemiol. 2011; 64: 1311-1316Abstract Full Text Full Text PDF PubMed Scopus (797) Google Scholar In clinical practice guidelines, guideline members issue recommendations either for or against an intervention. To ensure that guidelines are useful, recommendations should be made whenever possible, and be both specific and actionable.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar However, issuing recommendations may sometimes be inappropriate (e.g. when there is no or minimal evidence and the certainty is very low).8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. 
Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 24Andrews J. Guyatt G. Oxman A.D. et al.GRADE guidelines: 14. Going from evidence to recommendations: the significance and presentation of recommendations.J Clin Epidemiol. 2013; 66: 719-725Abstract Full Text Full Text PDF PubMed Scopus (820) Google Scholar Recommendations are GRADEd as either strong (usually phrased as ‘we recommend’) or conditional (previously referred to as weak and usually phrased as ‘we suggest’) depending not only on the certainty of evidence across all outcomes, but also on the balance between benefits and harms for all patient-important outcomes, patients' values and preferences, cost and resources, feasibility, acceptability, and impact on equity.24Andrews J. Guyatt G. Oxman A.D. et al.GRADE guidelines: 14. Going from evidence to recommendations: the significance and presentation of recommendations.J Clin Epidemiol. 2013; 66: 719-725Abstract Full Text Full Text PDF PubMed Scopus (820) Google Scholar, 25Andrews J.C. Schünemann H.J. Oxman A.D. et al.GRADE guidelines: 15. Going from evidence to recommendation - determinants of a recommendation’s direction and strength.J Clin Epidemiol. 2013; 66: 726-735Abstract Full Text Full Text PDF PubMed Scopus (747) Google Scholar Thus, while there may be high certainty of evidence that an intervention improves outcomes compared with another intervention, the recommendation may be weak, e.g. if the treatment is costly, infeasible, or associated with substantial inconvenience for patients.25Andrews J.C. Schünemann H.J. Oxman A.D. et al.GRADE guidelines: 15. Going from evidence to recommendation - determinants of a recommendation’s direction and strength.J Clin Epidemiol. 
2013; 66: 726-735Abstract Full Text Full Text PDF PubMed Scopus (747) Google Scholar Issuing strong recommendations based on low or very low certainty evidence is generally recommended against, but may be considered in some very select situations.25Andrews J.C. Schünemann H.J. Oxman A.D. et al.GRADE guidelines: 15. Going from evidence to recommendation - determinants of a recommendation’s direction and strength.J Clin Epidemiol. 2013; 66: 726-735Abstract Full Text Full Text PDF PubMed Scopus (747) Google Scholar The distinction between strong and conditional recommendations has direct implications for patients, clinicians, and policymakers, as outlined in Table 1.Table 1Implications of recommendations according to their strength. Implications of strong and conditional (previously referred to as weak) recommendations for patients, clinicians and policymakers according to GRADE guidelines.24Andrews J. Guyatt G. Oxman A.D. et al.GRADE guidelines: 14. Going from evidence to recommendations: the significance and presentation of recommendations.J Clin Epidemiol. 
2013; 66: 719-725Abstract Full Text Full Text PDF PubMed Scopus (820) Google ScholarImplicationsStrong recommendationConditional recommendationFor patientsThe recommended course of action would be preferred by all or almost all informed patients.The recommended course of action would be preferred by most informed patients, but a substantial proportion of patients would prefer a different course of action.For cliniciansAs almost all patients would prefer the recommended course of action, clinicians should – generally – spend less time on discussing implications and alternatives with patients and focus more on other issues such as implementation and adherence.As a substantial proportion of patients would not prefer the recommended course of action, clinicians should spend more time on shared decision making, and ensure that patients are adequately informed about the implications of the recommended course of action and alternatives, in order to ensure that individual patients make a choice that best reflects their values and preferences.For policymakersAs almost all patients would prefer the recommended course of action, policymakers can assume that large variations in clinical practice are unlikely, and the recommendation can be used as policy and as a performance indicator in most situations.As a substantial proportion of patients would not prefer the recommended course of action, policymakers can assume that large variations in clinical practice are likely, and that this will depend on multiple factors, including patients' preferences and values. Use of the recommendation as policy or as a performance indicator is inappropriate. Open table in a new tab In rare situations where it is considered difficult or impossible to formally summarise and GRADE the evidence, and guideline panel members are confident that there is unequivocal benefit or harm, a good practice statement may, under strict criteria, be produced instead.26Guyatt G.H. Alonso-Coello P. Schünemann H.J. 
et al.Guideline panels should seldom make good practice statements: guidance from the GRADE Working Group.J Clin Epidemiol. 2016; 80: 3-7Abstract Full Text Full Text PDF PubMed Scopus (102) Google Scholar Such statements should be reserved for special situations, used with caution, and clearly labelled as good practice statements and not as GRADEd recommendations.26Guyatt G.H. Alonso-Coello P. Schünemann H.J. et al.Guideline panels should seldom make good practice statements: guidance from the GRADE Working Group.J Clin Epidemiol. 2016; 80: 3-7Abstract Full Text Full Text PDF PubMed Scopus (102) Google Scholar While GRADE provides a systematic and transparent approach to assessing the certainty of evidence and strength of recommendations, it is important to acknowledge that using GRADE will commonly involve some subjective judgments, and assessments may vary between individuals.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 15Guyatt G. Oxman A.D. Akl E.A. et al.GRADE guidelines: 1. Introduction - GRADE evidence profiles and summary of findings tables.J Clin Epidemiol. 2011; 64: 383-394Abstract Full Text Full Text PDF PubMed Scopus (5017) Google Scholar The inter-rater agreement for GRADE assessments by different, untrained individuals is limited; however, it is higher than judgments made without a systematic approach, and reproducibility substantially increases with training, calibration exercises, clear instructions, and when performed by two (or more) people.27Mustafa R.A. Santesso N. Brozek J. et al.The GRADE approach is reproducible in assessing the quality of evidence of quantitative evidence syntheses.J Clin Epidemiol. 2013; 66: 736-742Abstract Full Text Full Text PDF PubMed Scopus (212) Google Scholar, 28Kumar A. Miladinovic B. Guyatt G.H. Schünemann H.J. Djulbegovic B. 
GRADE guidelines system is reproducible when instructions are clearly operationalized even among the guidelines panel members with limited experience with GRADE.J Clin Epidemiol. 2016; 75: 115-118Abstract Full Text Full Text PDF PubMed Scopus (21) Google Scholar Common examples of where individual judgments may vary include when GRADE users are uncertain about the exact ratings for multiple domains or when the same issue affects multiple domains.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar, 15Guyatt G. Oxman A.D. Akl E.A. et al.GRADE guidelines: 1. Introduction - GRADE evidence profiles and summary of findings tables.J Clin Epidemiol. 2011; 64: 383-394Abstract Full Text Full Text PDF PubMed Scopus (5017) Google Scholar In the first situation, GRADE users should consider the domains together and choose the worst rating considered in one domain and the best rating considered in the other8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar; in the second situation, GRADE emphasises that one should not ‘double count’ or ‘penalise twice’, for example if inconsistency is explained by risk of bias, this should only lead to rating down in the risk of bias domain.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar Because the exact judgments may vary between individual GRADE users, explicitly presenting the reasonings for all judgments is paramount to ensure transparency.8Schünemann H, Brożek J, Guyatt G, Oxman A, eds. GRADE Handbook. 
Available from: https://gdt.gradepro.org/app/handbook/handbook.html (accessed 15 August 2019).Google Scholar While transparency is helpful to understand the rationale behind judgments, it does not protect against the issue of biased assessments, and subjectivity in judgments could make room for flawed certainty assessments. One method that can help minimise this issue is duplicate assessment of certainty of evidence; however, this is not mandated when using the GRADE approach. Use of GRADE in systematic reviews and guidelines comes with clear advantages. First, as discussed above, GRADE increases reproducibility compared with less systematic approaches. Second, GRADE provides a framework for the entire process of conducting a systematic review or developing a guideline. This is important, as the adequacy of every step in the process depends on the systematic and transparent conduct of each previous step: trustworthy recommendations require trustworthy assessments of the certainty of evidence, which require that the evidence base has been systematically assessed, which requires that adequate PICO questions were formulated to begin with. Appropriate use of GRADE thus requires all steps to be performed in a systematic and transparent manner. Third, GRADE assessments have direct implications for both practice and research, with certainty of evidence assessments highlighting where the evidence base is adequate or where either more or better research is needed. Finally, the widespread use of GRADE, the extensive guidance available, and the use of plain language in certainty of evidence assessments and recommendations makes GRADE recognisable and easy to use and interpret. In conclusion, GRADE provides a systematic, transparent, and explicit approach for assessing the certainty of evidence and issuing practice recommendations, and has become an essential component of high-quality systematic reviews and clinical practice guidelines. 
It is necessary to recognise that recommendations in clinical practice guidelines will not apply to every patient in every case, and considering individual patient conditions, preferences, and values remains absolutely essential when practising evidence-based medicine.
Drafted the first version of this editorial: AG
Substantially contributed to critically revising the editorial for important intellectual content: WA, MHM
Approved the final version: all authors
AG and WA are members of the GRADE Working Group. WA is chair and MHM a member of the Guidelines in Intensive Care, Development and Evaluation (GUIDE) Group, which uses, advocates, and teaches the use of the GRADE approach; all authors are involved in systematic reviews and/or clinical practice guidelines developed by the GUIDE Group. MHM is chair of the Clinical Practice Committee of the Scandinavian Society of Anaesthesiology and Intensive Care Medicine (SSAI), which has adopted the GRADE approach for all guidelines, and AG has been involved in guideline work within the SSAI.
Reference(s)