Endometrial Carcinoma Biomarker Discovery and Verification Using Differentially Tagged Clinical Samples with Multidimensional Liquid Chromatography and Tandem Mass Spectrometry
2007; Elsevier BV; Volume: 6; Issue: 7; Language: English
10.1074/mcp.m600378-mcp200
ISSN 1535-9484
Authors: Leroi V. DeSouza, Jörg Grigull, Shaun Ghanny, Valérie Dubé, Alexander Romaschin, Terence J. Colgan, K. W. Michael Siu
Topic(s): Metabolomics and Mass Spectrometry Studies
Abstract: The utility of differentially expressed proteins discovered and identified in an earlier study (DeSouza, L., Diehl, G., Rodrigues, M. J., Guo, J., Romaschin, A. D., Colgan, T. J., and Siu, K. W. M. (2005) Search for cancer markers from endometrial tissues using differentially labeled tags iTRAQ and cleavable ICAT with multidimensional liquid chromatography and tandem mass spectrometry. J. Proteome Res. 4, 377–386) to discriminate malignant and benign endometrial tissue samples was verified in a 40-sample iTRAQ (isobaric tags for relative and absolute quantitation) labeling study involving normal proliferative and secretory samples and Types I and II endometrial cancer samples. None of these proteins had the sensitivity and specificity to be used individually to discriminate between normal and cancer samples. However, a panel of pyruvate kinase, chaperonin 10, and α1-antitrypsin achieved the best results with a sensitivity, specificity, predictive value, and positive predictive value of 0.95 each in a logistic regression analysis. In addition, three new potential markers were discovered, whereas two other proteins showed promising trends but were not detected in sufficient numbers of samples to permit statistical validation. Differential expressions of some of these candidate biomarkers were independently verified using immunohistochemistry.

Differential tagging with isotopic reagents, such as ICAT (1 Gygi S.P. Rist B. Gerber S.A. Turecek F. Gelb M.H. Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol.
1999; 17: 994-999) or the more recent variation that uses isobaric tagging reagents, iTRAQ [Footnote 1: The abbreviations used are: iTRAQ, isobaric tags for relative and absolute quantitation; PCM, potential cancer marker; EmCa, endometrial carcinoma; SCX, strong cation exchange; RP, reverse phase; PIGR, polymeric immunoglobulin receptor; PK, pyruvate kinase; Cpn 10, chaperonin 10; PV, predictive value; PPV, positive predictive value; AUC, area under the curve; MIF, macrophage migration-inhibitory factor; AAT, α1-antitrypsin; CKB, creatine kinase chain B; PEBP, phosphatidylethanolamine-binding protein; hnRNP D0, heterogeneous nuclear ribonucleoprotein D0; WFDC2, WAP four-disulfide core domain protein 2; PMN, polymorphonuclear neutrophil; MMP-2, metalloproteinase-2; RA, retinoic acid; P-LAP, placental leucine aminopeptidase; IL, interleukin; LAP3, leucine aminopeptidase 3; SLPI, secretory leucocyte protease inhibitor; NE, neutrophil elastase.]
(Applied Biosystems, Foster City, CA), followed by multidimensional LC and MS/MS analysis, is quickly being recognized as one of the more powerful methodologies in the search for biomarkers for various disease states. Our recent studies using both ICAT and iTRAQ reagents as means to facilitate the identification and relative quantification of proteins from endometrial tissue homogenates have resulted in some interesting potential cancer markers (PCMs) (2 DeSouza L. Diehl G. Yang E.C.C. Guo J. Rodrigues M.J. Romaschin A.D. Colgan T.J. Siu K.W.M. Proteomic analysis of the proliferative and secretory phases of the human endometrium: protein identification and differential protein expression. Proteomics. 2005; 5: 270-281, 3 DeSouza L. Diehl G. Rodrigues M.J. Guo J. Romaschin A.D. Colgan T.J. Siu K.W.M. Search for cancer markers from endometrial tissues using differentially labeled tags iTRAQ and cICAT with multidimensional liquid chromatography and tandem mass spectrometry. J. Proteome Res. 2005; 4: 377-386). Those studies, however, were performed on small sample sets. This study describes the results of a more detailed investigation using a larger cohort of 40 samples and the iTRAQ technology and was aimed at validating the earlier results as well as expanding the panel of biomarkers.

Endometrial carcinoma (EmCa), a cancer of the lining of the uterus, is the fourth most common cancer in Canadian women [Footnote 2: National Cancer Institute of Canada, Canadian Cancer Statistics 2006, www.ncic.cancer.ca/vgn/images/portal/cit_86751114/31/23/935505938cw_2006stats_en.pdf.pdf]. Current methods of diagnosis rely on invasive techniques (biopsy and curettage), and no screening is available.
A panel of biomarkers that helps in early diagnosis would, therefore, be useful especially for high risk groups, e.g. women who are on tamoxifen treatment or have hereditary nonpolyposis colorectal cancer syndrome. Although the eventual diagnostic testing for such biomarkers would be most facile from bodily fluids, such as blood or urine, the iTRAQ experiments performed thus far have been on resected EmCa from uterine tissues (hysterectomy specimens) (2, 3). The rationale for this approach is that the concentration of any biomarker is most likely highest in the cancerous tissue itself and not when diluted in the bodily fluids, thus facilitating discovery. In addition, the use of the cancerous tissue reduces the intrinsic need to demonstrate that any differentially expressed protein detected originates from the endometrial cancer. By contrast, the origins of differentially expressed protein in the blood include a variety of potential sites other than the actual tumor.

The use of homogenized tissues yields a heterogeneous sample with the proteome being contributed by the stroma, vasculature, blood, and malignant/benign epithelium. This heterogeneity may attenuate, and even mask, the variation in protein expression levels characteristic of cancerous epithelial cells.
One remedy for this drawback that we have adopted is the use of laser capture microdissection to procure the specific, malignant epithelial cells from the samples (5 Guo J. Colgan T.J. DeSouza L.V. Rodrigues M.J. Romaschin A.D. Siu K.W.M. Direct analysis of laser capture microdissected endometrial carcinoma and epithelium by matrix-assisted laser desorption/ionization mass spectrometry. Rapid Commun. Mass Spectrom. 2005; 19: 2762-2766). This approach, however, is not practical when 10³–10⁴ cells/sample are required for current proteomics techniques in a global biomarker discovery strategy. Thus far, the types of differentially expressed proteins discovered (2, 3) are primarily medium to high abundance proteins as universal detection methods, including the MS/MS technologies that were used, are much more efficient in detecting major rather than minor components in a complex mixture.

As described previously, our strategy in the search for EmCa markers requires a comparison between the cancerous endometrium and the two major phases, proliferative and secretory, of the normal reproductive aged endometrium (3, 6 Buckley C.H. Normal endometrium and non-proliferative conditions of the endometrium. In: Fox H. Wells M. Obstetrical and Gynaecological Pathology. 5th Ed. Churchill-Livingstone, London, 2003: 391-442). EmCa occurs primarily in postmenopausal women; normal atrophic endometrium contains few glands and few epithelial cells. Thus we chose normal proliferative and secretory samples as controls. The multiplexing ability afforded by the iTRAQ reagents, which are available in four different tags, is well suited for such a simultaneous comparison especially in view of the fact that endometrial carcinoma itself can have two distinct morphologic and physiologic types. Type I cancers are endometrioid in histologic typing, well differentiated, and estrogen-dependent and have typically a better prognosis. By contrast, Type II carcinomas are serous and clear cell carcinomas, hormone-independent, and aggressive and have generally a poorer clinical outcome (7 Cao Q.J. Belbin T. Socci N. Balan R. Prystowsky M.B. Childs G. Jones J.G. Distinctive gene expression profiles by cDNA microarrays in endometrioid and serous carcinomas of the endometrium. Int. J. Gynecol. Pathol. 2004; 23: 321-329). In the current study, we differentiated between these two EmCa categories, thus fully utilizing all four iTRAQ labeling categories in the analysis.

A key to mass tagging is the ability to mix the samples and perform subsequent isotope dilution mass analysis. Mixing ensures that any differences in expression levels of individual proteins are a result of initial differences between the samples in the set and not an artifact of differences in sample handling or processing.
To ameliorate natural and unavoidable variations, such as the inherent interindividual variations that would exist between clinical samples from different patients, the experiments were performed in 10 sets of four samples each. Additionally the 10 normal proliferative samples that were used as the controls in each of the sets were also separately labeled and compared with each other in a separate set of experiments. This permitted the evaluation of the extent of variation attributable to interindividual expression levels for each of the PCMs as well as provided a means of normalizing the ratios of individual protein levels in different sets to the average expression level in the normal proliferative samples.

Endometrial tissues were retrieved from an in-house, dedicated, research endometrial tissue bank. With patient consent, samples from hysterectomy specimens had been flash frozen in liquid nitrogen within 20 min of devitalizing. The patient consent forms and tissue-banking procedures were approved by the Research Ethics Boards of York University, Mount Sinai Hospital, University Health Network, and North York General Hospital. These frozen samples were sectioned and stored at −80 °C. The histologic diagnosis for each sample was confirmed using microscopic examination of a hematoxylin- and eosin-stained frozen section of each research tissue block. The tissue from the mirror face of the histologic section was then washed three times in ∼1 ml of PBS with a mixture of protease inhibitors as described previously (1 mM 4-(2-aminoethyl)benzenesulfonyl fluoride, 10 μM leupeptin, 1 μg/ml aprotinin, and 1 μM pepstatin) (3).
The washed tissue was then homogenized in 0.5 ml of PBS with protease inhibitors using a hand-held homogenizer. These homogenates were then flash frozen in liquid nitrogen and stored at −80 °C until use. Samples were thawed and clarified by centrifugation, and the protein concentration was determined by a Bradford-type assay using Bio-Rad's protein quantification reagent. Two hundred micrograms of each of the 40 samples were then labeled individually with an iTRAQ tag. As we were using double the manufacturer's suggested amounts (Applied Biosystems), we used two individual vials of each tag for labeling each sample. Trypsin digestion and labeling were performed according to the manufacturer's protocol. Normal proliferative, normal secretory, Type I cancer, and Type II cancer samples were labeled with the 114, 115, 116, and 117 tags, respectively. The trypsin-digested and labeled samples were then mixed in sets of four with each set containing one of each type of label, thus resulting in 10 sets in total.

Each set of labeled samples was then separated by SCX fractionation using an HP1050 HPLC instrument (Agilent, Palo Alto, CA) with a 2.1-mm-internal diameter × 100-mm-length PolyLC Polysulfoethyl A column packed with 5-μm beads with 300-Å pores (The Nest Group, Southborough, MA). A 2.1-mm-internal diameter × 10-mm-length guard column of the same material was fitted immediately upstream of the analytical column. Separation was performed as described previously (3).
Briefly, each pooled sample set was diluted with the loading buffer (15 mM KH2PO4 in 25% acetonitrile, pH 3.0) to a total volume of 2 ml, and the pH was adjusted to 3.0 with phosphoric acid. Samples were then filtered using a 0.45-μm syringe filter (Millipore, Cambridge, Ontario, Canada) before loading onto the column. Separation was performed using a linear binary gradient over 1 h. Buffer A was identical in composition to the loading buffer; Buffer B was Buffer A containing 350 mM KCl. Fractions were collected every 2 min using an SF-2120 Super Fraction Collector (Advantec MFS, Dublin, CA) after an initial wait of 2 min to accommodate the void volume. This resulted in a total of 30 SCX fractions per sample set. These fractions were dried by speed vacuuming (Thermo Savant SC110 A, Holbrook, NY) and resuspended in 30 μl of 0.1% formic acid each.

The fractions from 6 to 25 were then analyzed by nano-LC-MS/MS using the LC Packings Ultimate instrument (Amsterdam, The Netherlands) fitted with a 1-μl sample loop. Samples were loaded onto a 5-mm reverse phase (RP) C18 precolumn (LC Packings) at 50 μl/min and washed for 4 min before switching the precolumn in line with the separation column. The separation column used was either a 75-μm-internal diameter × 150-mm-length PepMap RP column from LC Packings packed with 3-μm C18 beads with 100-Å pores or an in-house equivalent packed with similar beads from Kromasil (The Nest Group). The flow rate used for separation on the RP column was 200 nl/min, and the gradient was as shown in Table I.

Table I. Gradient used for LC-MS/MS
Time (min): 0, 10, 15, 125, 145, 150, 160, 162, 188
B (%): 5, 5, 15, 35, 60, 80, 80, 5, Stop

Samples were analyzed on a Q-STAR Pulsar i mass spectrometer (Applied Biosystems/MDS SCIEX, Foster City, CA) in information-dependent acquisition mode with the scan cycles set up to perform a 1-s MS scan followed by five MS/MS scans of the five most abundant peaks for 2 s each.
For the first set of runs, the acquisition method was set up to allow one repetition of any m/z followed by a dynamic exclusion for a period of 60 s. The method was also set up to select the smallest peaks in the MS scan that are nearest to a threshold of 10 counts on every fourth scan. The last set of runs was performed using the same method but without any repetitions and with a dynamic exclusion of 30 s. Each sample was run a minimum of two times and a maximum of three times. The last run for each sample was performed using an inclusion list populated by m/z values that corresponded to peptides that appear to be proteotypic (8 Craig R. Cortens J.P. Beavis R.C. The use of proteotypic peptide libraries for protein identification. Rapid Commun. Mass Spectrom. 2005; 19: 1844-1850, 9 Kuster B. Schirle M. Mallick P. Aebersold R. Scoring proteomes with proteotypic peptide probes. Nat. Rev. Mol. Cell. Biol. 2005; 6: 577-583) for proteins that were deemed to be of interest after evaluating the results of the first set of runs.

Relative protein abundances were determined using the MS/MS scans of iTRAQ-labeled peptides (3). The iTRAQ-labeled peptides fragmented under CID conditions to give reporter ions at 114.1, 115.1, 116.1, and 117.1 Th. Larger, sequence information-rich fragment ions were also produced under these conditions and gave the identity of the protein from which the peptide was analyzed. The ratios of peak areas of the iTRAQ reporter ions reflect the relative abundances of the peptides and the proteins in the samples.
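As a minimal sketch of how reporter-ion peak areas translate into relative abundances, the Python snippet below divides each reporter area by the area of the 114 reporter, which this study assigned to the normal proliferative control. The function name, the dictionary layout, and the peak areas are hypothetical and chosen only for illustration; they are not taken from the study or from the ProQUANT software.

```python
# Tag-to-sample assignment as described in the text.
REPORTER_LABELS = {
    114: "normal proliferative",
    115: "normal secretory",
    116: "Type I cancer",
    117: "Type II cancer",
}


def reporter_ratios(peak_areas, reference_tag=114):
    """Return each reporter peak area divided by the reference reporter area.

    peak_areas maps the nominal reporter mass (114-117) to its peak area.
    This is an illustrative helper, not the vendor's algorithm.
    """
    reference_area = peak_areas[reference_tag]
    if reference_area <= 0:
        raise ValueError("reference reporter ion has no signal")
    return {
        tag: area / reference_area
        for tag, area in peak_areas.items()
        if tag != reference_tag
    }


# Hypothetical peak areas for a single MS/MS spectrum (made-up numbers).
example_areas = {114: 200.0, 115: 220.0, 116: 500.0, 117: 100.0}
ratios = reporter_ratios(example_areas)

for tag, ratio in sorted(ratios.items()):
    print(f"{tag}:114 ({REPORTER_LABELS[tag]} vs. control) = {ratio:.2f}")
```

In this made-up spectrum the peptide would appear elevated in the Type I cancer sample and reduced in the Type II cancer sample relative to the proliferative control; the actual software additionally weights and normalizes peptide ratios, as the text goes on to describe.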
The software used (Applied Biosystems/MDS SCIEX) for data acquisition for the first set of runs was Analyst 1.0 SP8, whereas the software for the second run onward was Analyst 1.1. Data were analyzed using ProQUANT 1.0 or 1.1, respectively, and the database searched was the Celera human database (human KBMS 20041109) with a total of 178,243 entries provided by Applied Biosystems. Tolerance for the searches was set to 0.4 Da for the MS and 0.2 Da for the MS/MS spectra. The two parameters used to evaluate the quality of the peptide matches were the score and the confidence, which are described in detail in the literature accompanying the software. Briefly, the score is a ProQUANT-generated value based on the number of ions that match the theoretical list of fragments of the peptide in question, whereas the confidence, also a ProQUANT-generated value, is calculated from empirical data. The algorithm used to calculate the confidence incorporates the distance score calculated for the peptide as well as factors such as the total number of results returned in the search. The distance score itself is calculated by determining the difference between the score of the particular peptide and that of the seventh highest scoring peptide for that particular MS/MS spectrum and is a measure of the confidence of the match. Only those peptides with a score higher than 20 and a confidence higher than 75 were retained in the ProQUANT search.

The ProQUANT results were then grouped using ProGroup viewer, which reports the lowest number of non-redundant protein identities that would account for the peptides identified along with the ratios for the relative abundance of these proteins after normalizing. Normalizing was performed by first calculating the median ratio of all proteins reported.
Peptides that contribute to the protein identification but with ratios of the iTRAQ signature peaks smaller than 40 counts between the pair of labeled peaks in question were excluded from this calculation. The resulting median ratio was the normalizing factor used and was termed the applied bias. This normalizing factor is based on the assumption that most of the protein levels in the test samples should be similar to those in the control with the exception of those that are specific to the condition of the test sample itself (i.e. malignant or benign), thus minimizing any systematic error. When the ratio for a protein from a set of constituent peptides is calculated, peptide ratios with smaller errors (better ion-counting statistics) are weighted more heavily by the program. All peptides used for this calculation were unique to the given protein; peptides that were common to other isoforms or proteins of the same family were ignored. ProGroup also assesses the confidence of the protein identities reported. The ProGroup confidence score cutoff used was 1.3, which corresponds to a confidence limit of 95%. On occasion, the ratios of some proteins that were not automatically given by the ProGroup software were also reported using the ratios returned by the ProQUANT searches. These were typically instances in which the confidence in the sequence of the identifying peptides was lower than the specified cutoff for reporting by ProGroup but for which we had more confident results for the same peptides from a different sample run. Identities of these peptides were manually verified prior to inclusion. Lastly the ratios for each of the potential markers were averaged across all the runs in which they were identified. As mentioned previously, the 10 normal proliferative samples were also compared against each other in a separate series of experiments. Samples for this second series of experiments were grouped in three sets. 
The first of these sets contained the proliferative samples used in the first four sets of samples in the experiments comparing the cancerous samples, i.e. P1–P4; the second set comprised proliferative samples P4–P7; and the third set comprised P7–P10. In cases where the particular protein of interest was identified in all three sample sets in these proliferative sample comparisons, the expression ratios were all recalculated relative to one proliferative sample, typically P1. These adjusted ratios were then used to calculate the average normal proliferative ratio, which was in turn used to normalize all the individual normal proliferative ratios themselves. This calculation was also performed on the individual expression ratios for the EmCa sample comparisons, thus permitting them to first be expressed relative to P1 and then relative to the average normal proliferative level.

Verification of the differential expression levels of potential markers discovered using iTRAQ analysis was provided by dot-blot analyses and/or immunohistochemical analyses using antibodies specific to the protein of interest. Dot-blot analysis was performed by spotting 2 μg of each homogenate on a nitrocellulose filter (Bio-Rad); after blocking with 5% (w/v) skimmed milk in TBS (20 mM Tris, pH 7.5, 150 mM NaCl), each filter was probed by incubating it with a primary antibody in 5% bovine serum albumin in TBS with 0.1% Tween 20 overnight with shaking. An additional blot was probed with antibody specific for β-actin. Additionally selected proteins identified in the iTRAQ study were verified and localized using immunohistochemistry of proliferative, secretory, and EmCa tissues fixed in 10% buffered formalin and embedded in paraffin blocks. These tissues were different from those used in the iTRAQ work. The antibodies were applied in an appropriate dilution determined through a pilot study and immunohistochemically visualized using a diaminobenzidine chromogen.
Interpretations of the immunohistochemically stained sections were conducted using a standardized microscopic review to assess positive staining (brown) for the targeted proteins in four tissue components: epithelium/carcinoma, endometrial stroma, any white blood cells, and glandular secretions. Antibodies used for these verifications were purchased from various commercial sources: β-actin, Cell Signaling Technologies (Pickering, Ontario, Canada); polymeric immunoglobulin receptor (PIGR), Cedarlane Laboratories (Hornby, Ontario, Canada); pyruvate kinase (PK) M2, ScheBo Biotech AG (Giessen, Germany); and chaperonin 10 (Cpn 10), Stressgen (Victoria, British Columbia, Canada).

Evaluation of differential expression in the iTRAQ analyses was performed using two statistical approaches. A preliminary evaluation of the data was carried out using a power analysis. The deviation from unity, Δ, beyond which differential expression is indicated is given by $\Delta = [2 \times \mathrm{S.D.}^2 \times (Z_\alpha + Z_\beta)^2 / N]^{0.5}$, where S.D. is the standard deviation, $(Z_\alpha + Z_\beta)^2$ is the power index, and N is the number of sample sets (10 Motulsky H. Intuitive Biostatistics. Oxford University Press, New York, 1995: 198-199). The standard deviations of the cytoplasmic structural proteins, actin and β5-tubulin, were used to estimate the variation of protein concentrations between individual patients and sets. These averaged to be ∼0.3 over many iTRAQ analyses (see e.g. Table III). A power index of 10.5 was used for confidence limits of 95% for Type I error (α) and 90% for Type II error (β) (10). Type I error occurs when a difference is assumed (i.e. when the null hypothesis is rejected) where there is none; by contrast, Type II error occurs when no difference is assumed (i.e. when the null hypothesis is accepted) where there actually is.
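As a worked check of the threshold expression above, the short Python sketch below evaluates Δ for the values quoted in the text (S.D. ≈ 0.3 and a power index of 10.5) at N = 2 and N = 10 sample sets. The function name is an illustrative choice, not something taken from the study.

```python
import math


def differential_expression_delta(sd, power_index, n_sets):
    """Delta = [2 * SD^2 * (Z_alpha + Z_beta)^2 / N]^0.5.

    power_index stands for (Z_alpha + Z_beta)^2, as in the text.
    """
    return math.sqrt(2.0 * sd ** 2 * power_index / n_sets)


SD = 0.3            # approximate S.D. of actin and beta5-tubulin ratios (from the text)
POWER_INDEX = 10.5  # power index quoted in the text

for n_sets in (2, 10):
    delta = differential_expression_delta(SD, POWER_INDEX, n_sets)
    print(f"N = {n_sets:2d}: delta = {delta:.3f}, ratio threshold (1 + delta) = {1 + delta:.2f}")
```

With these inputs the ratio threshold (1 + Δ) evaluates to about 1.97 for N = 2 and about 1.43 for N = 10.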
Thus for N = 2, the ratios must be (1 + Δ) = 1.97 to indicate differential expression; for N = 10, the criterion relaxes to 1.43.

The three most significant and consistent biomarkers were then chosen as explanatory input variables in a logistic regression model as a discriminator between malignant and normal samples. If p denotes the predicted probability that a case i whose observed marker values are given by the vector x(i) = (x(i, marker 1), x(i, marker 2), x(i, marker 3)) is malignant, then the logistic regression discriminator has the form

$$p(\text{case } i \text{ is malignant} \mid x(i)) = \frac{\exp\left(\alpha + \sum_j \beta_j x(i,j)\right)}{1 + \exp\left(\alpha + \sum_j \beta_j x(i,j)\right)} \quad \text{(Eq. 1)}$$

where the index "i" denotes the individual sample and "j" is a summation index that runs over the markers. Analogously logistic regression discriminators were defined for each of the three markers individually. For a training set S of marker values x(i) (i = 1, … , n) the model parameters α and βj were determined by maximizing the multiplicative likelihood over S using R Statistics (version 2.0.1). The discriminators were trained using the average observed iTRAQ ratios as marker observations in the malignant and benign cases. Here the malignant cases comprise the 20 Type I and Type II cancer cases, whereas the benign cases comprise 10 normal proliferative and 10 normal secretory cases. Receiver operating characteristic curves were calculated from the predictive scores of the parametrized logistic regression model by varying thresholds for "positive" calls between 0 and 1. Sensitivities, specificities, predictive values (PVs), and positive predictive values (PPVs) were calculated using a cutoff value of 0.5 on the logistic regression predictor. For any given receiver operating characteristic curve, the area under the curve (AUC) value was determined using the Mann-Whitney statistic (11 Pepe M.S. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, New York, 2003, 12 Pepe M.S.
Evaluating technologies for classification and predicti
Reference(s)