Development, Validation, and Evaluation of a Pediatric Ulcerative Colitis Activity Index: A Prospective Multicenter Study
2007; Elsevier BV; Volume: 133; Issue: 2; Language: English
10.1053/j.gastro.2007.05.029
ISSN: 1528-0012
Authors: Dan Turner, Anthony Otley, David R. Mack, Jeffrey S. Hyams, Joep de Bruijne, Krista Uusoue, Thomas D. Walters, Mary Zachos, Petar Mamula, Dorcas Beaton, A. Hillary Steinhart, Anne M. Griffiths
Topic(s): Intestinal Malrotation and Obstruction Disorders
Abstract: Background & Aims: Colonoscopic appearance, the primary measure of disease activity in adult ulcerative colitis, is less acceptable to children. Our aim was to develop a noninvasive activity index of pediatric ulcerative colitis. Methods: Item selection was performed judgmentally using a Delphi group of 36 experts in pediatric inflammatory bowel disease. Item weighting was performed by regression modeling using a prospective cohort of 157 pediatric ulcerative colitis patients. Validation was assessed on a separate prospective cohort of 48 children with ulcerative colitis undergoing complete colonoscopy. Responsiveness was evaluated at a follow-up visit of 75 children using effect size statistics and diagnostic utility approaches. Results: A list of 41 items was generated and reduced to 11 by rank order. Two physicians completed the Pediatric Ulcerative Colitis Activity Index (PUCAI) on each of the patients in the weighting cohort. Six clinical items were significant in the regression analysis; the laboratory items and an endoscopic appearance item did not improve the PUCAI performance. In the validation cohort, the PUCAI was highly correlated with the Physician’s Global Assessment (r = 0.91, P < .001), Mayo score (r = 0.95, P < .001), and colonoscopic appearance (r = 0.77, P < .001). Correlations were higher than 2 noninvasive adult indices calculated concurrently. Interobserver and test-retest reliability were excellent (intraclass correlation coefficient = 0.95; 95% CI: 0.93–0.97). Cut-off points were established using receiver operator characteristic curves on the full cohort. Excellent responsiveness was found at repeated visits (effect size = 1.9, area under the receiver operator characteristic curve = 0.97). Conclusions: The rigorously developed PUCAI is a noninvasive, valid, highly reliable, and responsive index with which to assess disease activity in pediatric ulcerative colitis.
See Hakonarson H et al on page 972 in the August 2007 issue of CGH.
Ulcerative colitis (UC) develops during childhood in up to 25% of patients [1: Griffiths A.M. Specificities of inflammatory bowel disease in childhood. Best Pract Res Clin Gastroenterol. 2004; 18: 509-523]. In comparison with adult-onset disease, pediatric-onset UC is more often extensive and therefore more likely to be associated with severe acute exacerbations [1]. The tendency to spontaneous worsening and improvement of intestinal inflammation mandates that the efficacy of therapies be established in the randomized controlled trial setting. Because new biologic therapies for inflammatory bowel disease (IBD) are rapidly emerging, both adult and pediatric investigators recognize the need to optimize and standardize methodology for assessment of disease activity in clinical trials [2: Sands B.E. Biological therapies for ulcerative colitis. Acta Gastroenterol Belg. 2001; 64: 205-209].
In contrast to Crohn’s disease, in which the Crohn’s Disease Activity Index (CDAI) [3: Best W.R., Becktel J.M., Singleton J.W., Kern F. Jr. Development of a Crohn’s disease activity index. National Cooperative Crohn’s Disease Study. Gastroenterology. 1976; 70: 439-444] has become the internationally accepted measure of clinical outcome, no single instrument has been consistently employed in UC [2]. Moreover, with the exception of the Seo Index, which has seldom been selected for use in clinical trials, existing multiitem measures in UC have not been rigorously developed and evaluated [4: Seo M., Okada M., Yao T., Ueki M., Arima S., Okumura M. An index of disease activity in patients with ulcerative colitis. Am J Gastroenterol. 1992; 87: 971-976] [5: Truelove S.C., Witts L.J. Cortisone in ulcerative colitis; final report on a therapeutic trial. Br Med J. 1955: 1041-1048] [6: Rachmilewitz D. Coated mesalazine (5-aminosalicylic acid) versus sulphasalazine in the treatment of active ulcerative colitis: a randomised trial. BMJ. 1989; 298: 82-86] [7: Schroeder K.W., Tremaine W.J., Ilstrup D.M. Coated oral 5-aminosalicylic acid therapy for mildly to moderately active ulcerative colitis. A randomized study. N Engl J Med. 1987; 317: 1625-1629] [8: Lichtiger S., Present D.H. Preliminary report: cyclosporin in treatment of severe active ulcerative colitis. Lancet. 1990; 336: 16-19] [9: Powell-Tuck J., Bown R.L., Lennard-Jones J.E. A comparison of oral prednisolone given as single or multiple daily doses for active proctocolitis. Scand J Gastroenterol. 1978; 13: 833-837] [10: Walmsley R.S., Ayres R.C., Pounder R.E., Allan R.N. A simple clinical colitis activity index. Gut. 1998; 43: 29-32] [11: Azzolini F., Pagnini C., Camellini L., Scarcelli A., Merighi A., Primerano A.M., Bertani A., Antonioli A., Manenti F., Rigo G.P. Proposal of a new clinical index predictive of endoscopic severity in ulcerative colitis. Dig Dis Sci. 2005; 50: 246-251]. The most widely used UC activity index in adult clinical trials is the Mayo score, which requires sigmoidoscopy at each assessment [12: D’Haens G., Sandborn W.J., Feagan B.G., Geboes K., Hanauer S.B., Irvine E.J., Lemann M., Marteau P., Rutgeerts P., Scholmerich J., Sutherland L.R. A review of activity indices and efficacy end points for clinical trials of medical therapy in adults with ulcerative colitis. Gastroenterology. 2007; 132: 763-786]. Indeed, direct examination of the colonic mucosa has become the gold standard for assessing disease activity in adult UC clinical trials [2, 12], but this is less acceptable to pediatric patients and their parents.
Pediatric gastroenterologists participating in the 2004 Crohn’s and Colitis Foundation of America Pediatric IBD clinical trials workshop reviewed the existing measures of UC activity and concluded that a novel noninvasive instrument for use in pediatric patients should be developed; the present study is a product of the workshop’s conclusions. According to the accepted standards of health index development [13: Kirshner B., Guyatt G. A methodological framework for assessing health indices. J Chronic Dis. 1985; 38: 27-36], the introduction of any new index should follow a multistep process of item generation, reduction, grading, and weighting and then evaluation. An instrument that measures disease activity should be valid for its purpose, reliable, and responsive. In this article, we describe the development and further assessment of the Pediatric Ulcerative Colitis Activity Index (PUCAI).
A Delphi group of pediatric IBD experts was established to aid with item generation, reduction, and gradation.
A Delphi technique involves an iterative, multistep process to reach a consensus for a specified question, when the experts are not physically together. The responses of all members of the Delphi group are summarized by an expert panel, processed, and resent to the group for further comments until consensus is reached.
Between November 2005 and November 2006, 2 separate prospective cohorts of UC children enrolled at 5 pediatric IBD centers in North America (The Hospital for Sick Children, Toronto; Izaak Walton Killam Hospital, Halifax; Children’s Hospital of Eastern Ontario, Ottawa; Connecticut Children’s Medical Center, Hartford; Children’s Hospital of Philadelphia, Pennsylvania) were utilized for the purposes of item weighting and subsequent index evaluation. At all times, UC diagnosis was confirmed using established criteria [14: Schachter H., Kirsner J.B. Definitions of inflammatory bowel disease of unknown etiology. Gastroenterology. 1975; 68: 591-600]. Because some of the clinical features of UC do not apply to the same extent in ulcerative proctitis, patients with inflammation confined to the rectum were excluded. Left-sided and extensive colitis were defined as macroscopic inflammation ending, respectively, distally or proximally to the splenic flexure [15: Silverberg M.S., Satsangi J., Ahmad T., Arnott I.D., Bernstein C.N., Brant S.R., Caprilli R., Colombel J.F., Gasche C., Geboes K., Jewell D.P., Karban A., Loftus E.V. Jr., Pena A.S., Riddell R.H., Sachar D.B., Schreiber S., Steinhart A.H., Targan S.R., Vermeire S., Warren B.F. Toward an integrated clinical, molecular and serological classification of inflammatory bowel disease: Report of a Working Party of the 2005 Montreal World Congress of Gastroenterology. Can J Gastroenterol. 2005; 19: 5-36]. For the item weighting and reliability assessment phase, hospitalized and ambulatory newly diagnosed and chronic UC patients 2–18 years of age were eligible for enrollment. For the purposes of validation, a second similar cohort of UC children, undergoing full colonoscopy, was enrolled. Responsiveness and test-retest reliability were assessed using the subset of patients who were seen twice during the enrollment period. This study was approved by the Institutional Review Board of each participating center. In addition to informed parental consent, age-appropriate patient consent or assent was also obtained.
The items were first generated through a systematic review of the literature to identify variables in existing UC indices. The preliminary list of items was distributed to a Delphi group of 48 pediatric gastroenterologists in North America, experienced in IBD, who were asked to comment on the list and identify further indicators. The Delphi group members were asked to rank each item from 0 (not important) to 3 (extremely important). The results were analyzed using mean ratings and rank order of ratings. The 11 highest ranked variables were sent again to the Delphi group for further ranking. This clinimetric strategy has been used previously in the development of numerous scales [16: Marx R.G., Bombardier C., Hogg-Johnson S., Wright J.G. Clinimetric and psychometric strategies for development of a health measurement scale. J Clin Epidemiol. 1999; 52: 105-111] and found to be robust [17: Marx R.G., Bombardier C., Hogg-Johnson S., Wright J.G. How should importance and severity ratings be combined for item reduction in the development of health status instruments? J Clin Epidemiol. 1999; 52: 193-197].
To minimize interobserver variability, we established logical gradations and clear definitions of items. Gradations used in other instruments were tabulated and reviewed. Proposed gradation schemes for each item of the PUCAI were distributed to the Delphi group, with the final instrument reflecting consensus opinions.
The aim of this phase was to finalize the index by removing unnecessary items and to weight the gradations using a large prospective cohort. The purpose of a disease activity index is to establish an assessment of overall disease activity that is reproducible between different assessors. Therefore, the primary outcome variable for this step was the Physician’s Global Assessment (PGA), as measured by a visual analogue scale (0–100 mm) with anchors at the extremes (none and fulminant). We used a physician-based rather than a patient-based approach because we wished the outcome to reflect “disease activity,” which is best judged by experienced physicians [18: Wright J.G., Feinstein A.R. A comparative contrast of clinimetric and psychometric methods for constructing indexes and rating scales. J Clin Epidemiol. 1992; 45: 1201-1218]. For each patient, 2 independent pediatric gastroenterologists completed a PGA of disease activity following complete history taking and physical examination but before scoring the PUCAI. The mean PGA score served as the dependent variable of the model and the PUCAI items as the explanatory variables. Additional baseline and laboratory data were recorded by a research coordinator. The resulting mathematical draft PUCAI was sent to the Delphi group and was slightly revised, also based on maximizing the R² of the regression model [18]. Interobserver reliability was assessed by comparing the 2 independent PUCAI assessments of the 2 physicians.
In the second cohort, 4 constructs of disease activity were used to achieve a comprehensive evaluation (Figure 1): (1) colonoscopic appearance, (2) PGA, (3) a reference invasive index (the most commonly used adult score, the Mayo score [7]), and (4) the Truelove and Witts classification of disease activity [5]. Activity of macroscopic colitis was scored 0–3 at each of the colon segments (ascending, transverse, descending, sigmoid, and rectum) using Beattie’s grading system of quiescent, mild, moderate, or severe [19: Beattie R.M., Nicholls S.W., Domizio P., Williams C.B., Walker-Smith J.A. Endoscopic assessment of the colonic response to corticosteroids in children with ulcerative colitis. J Pediatr Gastroenterol Nutr. 1996; 22: 373-379]. A final score (0–15) was calculated by adding all segment scores [20: Mary J.Y., Modigliani R.; Groupe d’Etudes Therapeutiques des Affections Inflammatoires du Tube Digestif (GETAID). Development and validation of an endoscopic index of the severity for Crohn’s disease: a prospective multicentre study. Gut. 1989; 30: 983-989] [21: Gomes P., du Boulay C., Smith C.L., Holdstock G. Relationship between disease activity indices and colonoscopic findings in patients with colonic inflammatory bowel disease. Gut. 1986; 27: 92-95]. A priori, we expected good correlation with the macroscopic colonoscopy score, the PGA, and the invasive Mayo score. Two other noninvasive adult indices (Seo [4] and Lichtiger [8]) were calculated simultaneously for comparison with the PUCAI performance. The physicians who calculated the PUCAI in the validation cohort followed the same instructions outlined in the PUCAI user guide (Appendix 1).
Responsiveness and test-retest reliability were established utilizing patients who were seen twice during the study period. At the second visit, in addition to the procedures described for the weighting cohort, the 2 physicians were also asked to answer the question “how much do you think the disease activity of this patient changed since the last visit?” The change was graded using a 7-point Likert scale of change (ie, 1, significantly worse; 2, moderately worse; 3, mildly worse; 4, almost the same; 5, mildly improved; 6, moderately improved; and 7, significantly improved) [22: Guyatt G.H., Deyo R.A., Charlson M., Levine M.N., Mitchell A. Responsiveness and validity in health status measurement: a clarification. J Clin Epidemiol. 1989; 42: 403-408] [23: Kundhal P.S., Critch J.N., Zachos M., Otley A.R., Stephens D., Griffiths A.M. Pediatric Crohn’s Disease Activity Index: responsive to short-term change. J Pediatr Gastroenterol Nutr. 2003; 36: 83-89]. Physicians were encouraged to review the notes from the previous visit but were blinded to the previous PUCAI score.
Our Likert scale of change met the required criteria for use as an external criterion, defined by Guyatt et al [24: Schunemann H.J., Puhan M., Goldstein R., Jaeschke R., Guyatt G.H. Measurement properties and interpretability of the Chronic Respiratory Disease Questionnaire (CRQ). COPD. 2005; 2: 81-89] [25: Guyatt G.H., Norman G.R., Juniper E.F., Griffith L.E. A critical look at transition ratings. J Clin Epidemiol. 2002; 55: 900-908]. The minimal clinically important difference (MCID) was set as the change score on the PUCAI with the highest combined sensitivity and specificity (obtained with a receiver operating characteristic [ROC] curve) to differentiate improved (defined as “at least moderate improvement,” ie, score of ≥6 on the 7-point Likert scale) vs nonimproved patients [23] [26: Hays R.D., Woolley J.M. The concept of clinically meaningful difference in health-related quality-of-life research. How meaningful is it? Pharmacoeconomics. 2000; 18: 419-423] [27: Beaton D.E., Boers M., Wells G.A. Many faces of the minimal clinically important difference (MCID): a literature review and directions for future research. Curr Opin Rheumatol. 2002; 14: 109-114]. Two other change cutoffs corresponding to physician grading of at least “5, mildly improved” and “7, significantly improved” in the Likert scale of change were also calculated. Data are presented as means ± standard deviation or medians (interquartile range) and compared using 1-way analysis of variance (ANOVA) or Kruskal–Wallis test on ranks, as appropriate for the normality of the distribution.
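The MCID rule described above, taking the PUCAI change score with the highest combined sensitivity and specificity against the physician's Likert rating, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name `mcid_from_roc`, the convention that a patient is classified as improved when ΔPUCAI is at or above the cutoff, and the sample values are all hypothetical.

```python
from typing import List, Tuple


def mcid_from_roc(delta: List[float], improved: List[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity.

    delta    -- change scores (initial minus follow-up, so positive = better)
    improved -- external criterion, e.g. Likert rating >= 6
    A patient is predicted "improved" when delta >= cutoff (an assumed convention).
    """
    if len(delta) != len(improved):
        raise ValueError("delta and improved must have the same length")
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both improved and nonimproved patients")

    best = None
    # Every observed change score is a candidate cutoff (one ROC point each).
    for cutoff in sorted(set(delta)):
        tp = sum(1 for d, imp in zip(delta, improved) if imp and d >= cutoff)
        tn = sum(1 for d, imp in zip(delta, improved) if not imp and d < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        # Strict ">" keeps the smallest cutoff when several tie.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


if __name__ == "__main__":
    # Hypothetical follow-up data, not taken from the study.
    likert = [7, 6, 6, 5, 4, 4, 3, 6, 4, 5]
    delta_pucai = [40, 25, 20, 10, 0, 5, -10, 30, 25, 5]
    improved = [score >= 6 for score in likert]
    print(mcid_from_roc(delta_pucai, improved))
```

The same function gives the two alternative cutoffs mentioned in the text if the criterion is changed to a Likert rating of at least 5 or equal to 7.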
Correlations between individual parameters were sought using Spearman or Pearson correlations, as appropriate for the normality of the distribution. To determine the weighting of each item, a multivariate regression analysis was used with the PUCAI items as the explanatory variables and PGA as the dependent variable. To test the contribution of endoscopic items to the total variance of the PUCAI, a similar model was fitted with an invasive version of the PUCAI as the dependent variable and the clinical items only as the explanatory variables. The model R² was used to determine the added value of the endoscopic score to the total variance. The gradations of “stool blood” and “stool frequency” followed linear escalation and, thus, were entered as continuous items. The assumptions of the final model were confirmed by plotting the residual distribution. Serial ROC curves (±95% confidence intervals [CI]) were used to optimize gradations of individual PUCAI items and to define cutoff scores for categorical disease activity (none, mild, moderate, and severe). Construct validation was based on correlation analyses between the index score and the construct. A correlation r of 0–0.25 was considered as lack of correlation, 0.25–0.5 as fair, 0.5–0.75 as moderate to good, and >0.75 as very good to excellent correlation.
Interobserver reliability was assessed on the 2 PUCAI scores calculated independently by 2 physicians at the same visit. Test-retest reliability was established by comparing the first and the follow-up PUCAI scores of patients whose disease activity was thought to be unchanged. The intraclass correlation coefficient (ICC; using Shrout and Fleiss’s 2,1 2-way random ANOVA model ± 95% CI) [28: Shrout P., Fleiss J. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979; 86: 420-428] was used to assess reliability of each item and the total score.
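For readers who want to see what the ICC(2,1) named above computes, the following minimal sketch derives it from the mean squares of a 2-way ANOVA without replication. It assumes a complete subjects-by-raters table; the function name `icc_2_1` and the ratings are illustrative and are not study data, and the 95% CI is not computed.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): 2-way random effects, single rater, absolute agreement.

    ratings[i][j] is the score given to subject i by rater j.
    ICC(2,1) = (BMS - EMS) / (BMS + (k - 1) * EMS + k * (JMS - EMS) / n)
    with BMS, JMS, EMS the between-subject, between-rater, and residual
    mean squares, n subjects, and k raters.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)
    jms = ss_raters / (k - 1)
    ems = ss_error / ((n - 1) * (k - 1))

    denominator = bms + (k - 1) * ems + k * (jms - ems) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (bms - ems) / denominator


if __name__ == "__main__":
    # Illustrative table: 6 subjects rated by 4 raters (the study itself used 2).
    example = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    print(round(icc_2_1(example), 2))
```

Because the rater mean square enters the denominator, systematic differences between raters lower ICC(2,1), which is why it suits an index meant to be reproducible across assessors.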
ICC > 0.9 was considered as excellent reliability (and sufficient for individual patient level) and 0.8–0.9 as good reliability. Changes in scores (labeled with the Greek letter Δ) were determined by subtracting the follow-up score from the initial score. Responsiveness was assessed using 3 different approaches and 6 statistical methods [29: Deyo R.A., Centor R.M. Assessing the responsiveness of functional scales to clinical change: an analogy to diagnostic test performance. J Chronic Dis. 1986; 39: 897-906]:
1. Effect size statistics: represent the magnitude of change in the evaluated index (signal [observed change] to noise [some measure of variance] ratio tests). The following statistical tests were used: (1) paired Student t test, comparing the baseline and follow-up PUCAI in the improved group; (2) standardized response mean, calculated by dividing the mean ΔPUCAI by its standard deviation [30: Liang M.H., Fossel A.H., Larson M.G. Comparisons of five health status instruments for orthopedic evaluation. Med Care. 1990; 28: 632-642]; (3) standardized effect size, calculated by dividing the mean ΔPUCAI score by the baseline standard deviation [31: Stratford P.W., Binkley F.M., Riddle D.L. Health status measures: strategies and analytic methods for assessing change scores. Phys Ther. 1996; 76: 1109-1123]; and (4) Guyatt et al’s responsiveness statistic, calculated by dividing the PUCAI’s MCID by the standard deviation of the ΔPUCAI of the unchanged patients [32: Guyatt G., Walter S., Norman G. Measuring change over time: assessing the usefulness of evaluative instruments. J Chronic Dis. 1987; 40: 171-178]. A higher effect size statistic indicates a greater change effect, and, as a general rule, a score of >0.8 is considered a large effect, 0.5–0.8 as moderate, and 0.2–0.5 as small [23]. We expected to find a large effect (>0.8) in the group of patients whose disease activity was changed and a small effect (0.2–0.5) in the unchanged group.
2. Correlational: correlation with external measure of change (ΔPUCAI with ΔPGA and Likert scale of change).
3. Diagnostic utility tests: to differentiate the patients who changed at least moderately from unchanged patients (using ROC curves, sensitivity, and specificity [29]). Area under the ROC curve of over 0.7 was considered indicative of “fair” discriminative ability, 0.8 as “good,” and over 0.9 as “excellent.”
Sample size calculations were based on 10 patients per degree of freedom [33: Babyak M.A. What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom Med. 2004; 66: 411-421] for the weighting phase (n = 160) and a further 47 patients for the validation phase (the correlation between PUCAI and the colonoscopy score was expected to be greater than 0.75; a sample size of 47 achieves 80% power to detect a difference of 0.25 between the alternative hypothesis and the null hypothesis correlation of 0.5 using a 2-sided hypothesis test with a significance level of 0.05). Data were entered into a preprogrammed Access (Microsoft Corporation, Redmond, WA) database with 3% missing data occurring at random, for which imputation was performed using the hot deck method [34: Streiner D.L. The case of the missing data: methods of dealing with dropouts and other research vagaries. Can J Psychiatry. 2002; 47: 68-75]. No outcome data (ie, PGA, colonoscopic score, Mayo score) were missing. All comparisons were made using 2-sided significance levels of P < .05. Statistical analyses were performed using SAS V9.1 (SAS Institute, Cary, NC) and SPSS V12.0 (SPSS Inc, Chicago, IL).
Of the 48 pediatric IBD experts contacted, 36 responded and served as the Delphi group. A list of 21 potential items was generated by the expert panel, and the Delphi group added another 20. Eleven variables, which were ranked significantly higher than the others, were graded by the Delphi group from 1 (most important) to 11 (least important) (Table 1). One item, “rectal urgency,” was removed because of perceived difficulties in applying this variable to young children. C-reactive protein (CRP), although not suggested by the Delphi group, was added to the list because of its importance in the UC literature [19] [35: Lindgren S.C., Flood L.M., Kilander A.F., Lofberg R., Persson T.B., Sjodahl R.I. Early predictors of glucocorticosteroid treatment failure in severe and moderately severe attacks of ulcerative colitis. Eur J Gastroenterol Hepatol. 1998; 10: 831-835] [36: Travis S.P., Farrant J.M., Ricketts C., Nolan D.J., Mortensen N.M., Kettlewell M.G., Jewell D.P. Predicting outcome in severe ulcerative colitis. Gut. 1996; 38: 905-910]. The final gradation schemes reflect consensus opinion within the Delphi group.
Based on the above stages, an initial draft PUCAI was used in the weighting and evaluation phase.
Table 1. Items That Reflect Disease Activity in Pediatric UC, Ranked Highest by the Delphi Group, and Their Corresponding β Coefficients and P Values From the Modeling of the Weighting Cohort
Item | Delphi group rank order (a) | β coefficient (b) | P value
Rectal bleeding | 2.3 | 9.9 | < .001
Stool frequency | 3.1 | 3.9 | .004
Nocturnal stools | 4.4 | 7.7 | .002
Abdominal pain | 4.5 | 6.1 | < .001
Hemoglobin (excluded a priori) | 4.8 | — | —
Abdominal tenderness (excluded post hoc) | 5.2 | — | —
Stool consistency | 6.0 | 3.2 | .05
Fever (excluded post hoc) | 6.2 | — | —
Albumin | 6.6 | 1.2 | .39
Rectal urgency (excluded a priori) | 6.6 | — | —
Activity level | 7 | 5.8 | < .001
CRP (forced a priori into the model) | — | 0.5 | .68
(a) Experts were asked to rank the items from 1 (most important) to 11 (least important), thus lower mean scores represent higher ranked items.
(b) For simplicity, the β coefficients represent the score of the item, entered as continuous variable; however, the actual PUCAI weighting was based on explicit dummy variables.