Combined Functional and Positional Gene Information for the Identification of Susceptibility Variants in Celiac Disease
2007; Elsevier BV; Volume: 134; Issue: 3; Language: English
10.1053/j.gastro.2007.11.041
ISSN: 1528-0012
Authors: Ainara Castellanos-Rubio, Ainhoa Martín-Pagola, Izortze Santín, Idoia Hualde, Ana M. Aransay, Luís Castaño, Juan Carlos Vitoria, José Ramón Bilbao
Topic(s): Digestive system and related health
Abstract: Background & Aims: Celiac disease is a complex, immune-mediated disorder of the intestinal mucosa with a strong genetic component. HLA-DQ2 is the major determinant of risk, but other minor genes, still to be identified, also are involved. Methods: We designed a strategy that combines gene expression profiling of intestinal biopsy specimens, linkage region information, and different bioinformatics tools for the selection of potentially regulatory single-nucleotide polymorphisms (SNPs) involved in the disease. We selected 361 SNPs from 71 genes that fulfilled stringent functional (changes in expression level) and positional criteria (located in regions that have been linked to the disease, other than HLA). These polymorphisms were genotyped in 262 celiac patients and 214 controls. Results: We detected strong evidence of association with several SNPs (the most significant were rs6747096, P = 2.38 × 10⁻⁵; rs7040561, P = 6.55 × 10⁻⁵; and rs458046, P = 1.35 × 10⁻⁴) that pinpoint novel candidate determinants of predisposition to the disease in previously identified linkage regions (eg, SERPINE2 in 2q33, and PBX3 or PPP6C in 9q34). Conclusions: Our study shows that the combination of function and position is a valid strategy for the genetic dissection of complex traits.

See editorial on page 869.

Celiac disease (CD [OMIM# 212750]) is a chronic, immune-mediated enteropathy caused by intolerance to ingested gluten that develops in genetically susceptible individuals. It is one of the most common lifelong disorders affecting Caucasians, and recent studies have estimated its prevalence at close to 1:120 [1: Castano L, Blarduni E, Ortiz L, et al. Prospective population screening for celiac disease: high prevalence in the first 3 years of life. J Pediatr Gastroenterol Nutr. 2004;39:80-84]. CD is a complex multifactorial disease, and familial clustering has long been known to occur. The major genetic determinant of CD maps to the HLA class II region on 6p21, and more than 90% of celiac patients carry at least one copy of the HLA-DQ2 heterodimer (encoded by alleles HLA-DQA1*05 and HLA-DQB1*02, either in cis or in trans), or less frequently (∼6%) the HLA-DQ8 molecule encoded by HLA-DQA1*03 and HLA-DQB1*0302 [2: Van Heel DA, Hunt K, Greco L, et al. Genetics in coeliac disease. Best Pract Res Clin Gastroenterol. 2005;19:323-339]. These associations are concordant with the ability of these HLA molecules to present immunogenic, deamidated gliadin peptides to T lymphocytes, which accounts for their central role in the pathogenesis of the disease [3: Molberg O, Mcadam SN, Korner R, et al. Tissue transglutaminase selectively modifies gliadin peptides that are recognized by gut-derived T cells in celiac disease. Nat Med. 1998;4:713-717]. Nevertheless, the HLA-DQ2 variant also is frequent in the general population, and is present in more than 30% of Caucasians, suggesting that HLA alone is not sufficient to explain all the genetic susceptibility to CD. Current knowledge suggests that HLA is responsible for around 40% of the heritability of the disease, and that several other minor genetic determinants contribute to the genetic predisposition.

It has been suggested that there may be additional loci in the major histocompatibility complex (MHC) region that modify disease risk independently from HLA-DQ [4: Lie BA, Sollid LM, Ascher H, et al. A gene telomeric of the HLA class I region is involved in predisposition to both type 1 diabetes and coeliac disease. Tissue Antigens. 1999;54:162-168]. The profusion of potentially relevant immune response genes in this region, together with the differential frequency of A1-B8-DR3-DQ2 and A30-B18-DR3-DQ2 extended haplotypes in DQ2-homozygous patients with either CD or type 1 diabetes, supports the presence of non-HLA disease-specific variants in 6p21 [5: Bilbao JR, Calvo B, Aransay AM, et al. Conserved extended haplotypes discriminate HLA-DR3-homozygous Basque patients with type 1 diabetes mellitus and celiac disease. Genes Immun. 2006;7:550-554]. However, the strong linkage disequilibrium within the MHC greatly complicates the search for candidate genes, and the issue is still an open question.

Candidate genes outside 6p21, especially those involved in the immune response, such as CTLA4 or IFNG, also have been investigated, but, in general, contradictory results have been obtained [6: Djilali-Saiah I, Schmitz J, Harfouch-Hammoud E, et al. CTLA-4 gene polymorphism is associated with predisposition to coeliac disease. Gut. 1998;43:187-189; 7: Martin-Pagola A, Perez de Nanclares G, Vitoria JC, et al. No association of CTLA4 gene with celiac disease in the Basque population. J Pediatr Gastroenterol Nutr. 2003;37:142-145; 8: Rueda B, Martinez A, Lopez-Nevot MA, et al. A functional variant of IFNgamma gene is associated with coeliac disease. Genes Immun. 2004;5:517-519; 9: Wapenaar MC, Van Belzen MJ, Fransen JH, et al. The interferon gamma gene in celiac disease: augmented expression correlates with tissue damage but no evidence for genetic susceptibility. J Autoimmun. 2004;23:183-190]. Discrepancies between studies in candidate gene association analyses could be caused by population heterogeneity, but also could reflect the lack of power of undersized studies to detect small gene effects. A recently completed whole-genome association study in CD (>300,000 genotyped SNPs) found very strong evidence of association in a region spanning the interleukin-2 (IL2) and IL21 genes on chromosome 4q27 [10: Van Heel DA, Franke L, Hunt KA, et al. A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21. Nat Genet. 2007;39:827-829].

Several genome-wide linkage studies also have been performed during the past decade, and more than 10 different chromosomal regions that could harbor secondary susceptibility genes have been mapped [11: Zhong F, McCombs CC, Olson JM, et al. An autosomal screen for genes that predispose to celiac disease in the western counties of Ireland. Nat Genet. 1996;14:329-333; 12: Greco L, Corazza G, Babron MC, et al. Genome search in celiac disease. Am J Hum Genet. 1998;62:669-675; 13: King AL, Yiannakou JY, Brett PM, et al. A genome-wide family-based linkage study of coeliac disease. Ann Hum Genet. 2000;64:479-490; 14: Liu J, Juo SH, Holopainen P, et al. Genomewide linkage analysis of celiac disease in Finnish families. Am J Hum Genet. 2002;70:51-59; 15: Popat S, Bevan S, Braegger CP, et al. Genome screening of coeliac disease. J Med Genet. 2002;39:328-331; 16: Garner CP, Ding YC, Steele L, et al. Genome-wide linkage analysis of 160 North American families with celiac disease. Genes Immun. 2007;8:108-114]. Two loci have been replicated in different studies and are considered more likely to contain disease-predisposing variants. CELIAC2, on linkage region 5q [11, 12, 14], contains a cluster of cytokine genes and other immune-related candidates that also have been implicated in Crohn's disease, but studies in CD have not detected associated variants [17: Ryan AW, Thornton JM, Brophy K, et al. Haplotype variation at the IBD5/SLC22A4 locus (5q31) in coeliac disease in the Irish population. Tissue Antigens. 2004;64:195-198; 18: Ryan AW, Thornton JM, Brophy K, et al. Chromosome 5q candidate genes in coeliac disease: genetic variation at IL4, IL5, IL9, IL13, IL17B and NR3C1. Tissue Antigens. 2005;65:150-155]. The locus on 19q13 [11, 13], termed CELIAC4, harbors a functional candidate implicated in intestinal barrier integrity (MYO9B) that was associated initially with CD and also with inflammatory bowel disease [19: Monsuur AJ, de Bakker PI, Alizadeh BZ, et al. Myosin IXB variant increases the risk of celiac disease and points toward a primary intestinal barrier defect. Nat Genet. 2005;37:1341-1344; 20: Van Bodegraven AA, Curley CR, Hunt KA, et al. Genetic variation in myosin IXB is associated with ulcerative colitis. Gastroenterology. 2006;131:1768-1774]. Again, attempts to replicate these findings in other populations have been contradictory [21: Hunt KA, Monsuur AJ, McArdle WL, et al. Lack of association of MYO9B genetic variants with coeliac disease in a British cohort. Gut. 2006;55:969-972; 22: Amundsen SS, Monsuur AJ, Wapenaar MC, et al. Association analysis of MYO9B gene polymorphisms with celiac disease in a Swedish/Norwegian cohort. Hum Immunol. 2006;67:341-345; 23: Giordano M, Marano C, Mellai M, et al. A family-based study does not confirm the association of MYO9B with celiac disease in the Italian population. Genes Immun. 2006;7:606-608; 24: Cirillo G, Di Domenico MR, Corsi I, et al. Do MYO9B genetic variants predispose to coeliac disease? An association study in a cohort of South Italian children. Dig Liver Dis. 2007;39:228-231]. More recently, KIR2DL5B, a member of the killer immunoglobulin-like receptor family, which also maps to 19q13, has been shown to predispose to CD [25: Santin I, Castellanos-Rubio A, Perez de Nanclares G, et al. Association of KIR2DL5B gene with celiac disease supports the susceptibility locus on 19q13.4. Genes Immun. 2007;8:171-176]. In general, however, there is a marked lack of consistency across linkage studies, and the genuine contributors to disease susceptibility are difficult to identify.

Gene expression profiling analyses of the intestinal mucosa also have been performed in CD, and the 2 microarray-based studies performed to date have contributed to our understanding of the functional mechanisms of disease progression [26: Juuti-Uusitalo K, Maki M, Kaukinen K, et al. cDNA microarray analysis of gene expression in coeliac disease jejunal biopsy samples. J Autoimmun. 2004;22:249-265; 27: Diosdado B, Wapenaar MC, Franke L, et al. A microarray screen for novel candidate genes in coeliac disease pathogenesis. Gut. 2004;53:944-951]. Genes and pathways that might be important in the pathogenesis of the disease have been highlighted, but again, the genetic implication of these findings, in terms of discovering disease-associated variants within those particular genes, is still pending. The observed alterations in expression levels are, in the majority of cases, secondary to (and not causative of) the disease process, and do not correspond with structural changes in the genomic sequence.

CD is the result of a combination of mutations (in more than one gene) upon which environmental factors act, altogether modulating the degree of predisposition to, as well as the severity and clinical manifestations of, the disease. Each single, primary alteration of gene expression levels associated with the disease thus is expected to be modest. Although some changes might be caused by SNPs located within coding regions, and cause amino acid substitutions or stop codons that have a dramatic impact on protein function, gene variants associated with complex diseases are likely to be located in noncoding regulatory regions of genes and produce more subtle effects on the intensity or timing of gene expression. These regulatory SNPs might alter the level of gene expression because they modify transcription factor binding sites or because they affect the structure of messenger RNA (mRNA) in ways that alter splicing, posttranscriptional processing, and stability [28: Prokunina L, Alarcon-Riquelme ME. Regulatory SNPs in complex diseases: their identification and functional validation. Expert Rev Mol Med. 2004;6:1-15].

In the present work, we have taken advantage of the extraordinary power of whole-genome expression profiling for the identification of genes and pathways involved in CD and combined these results with the genomic regions that previously have shown strong evidence of linkage to CD.
Our aim was to identify target genes for disease-association studies on the basis of functional and positional criteria. Finally, we selected regulatory SNPs that might be responsible for the changes observed in expression and performed a case-control, disease-association study.

The study was approved by the institutional ethics board and informed consent was obtained from all subjects or their parents. An outline of the strategy used in this study is shown in Figure 1.

CD was diagnosed according to the European Society of Pediatric Gastroenterology, Hepatology, and Nutrition criteria, including antigliadin, anti-endomysium, and antitransglutaminase antibody determinations as well as a confirmatory small bowel biopsy. Two biopsy specimens from the distal duodenum of each patient were obtained using a double-port small intestinal pediatric capsule; one of the samples served for clinical pathology examination and diagnosis of CD and the other sample was used in the present study.

We investigated changes in gene expression profiles of the intestinal mucosa induced by chronic exposure to dietary gluten, as well as the acute effects of gliadin on CD-susceptible intestinal epithelium using an in vitro stimulation model of intestinal biopsy specimens. For the analysis of the effects of chronic, or long-term, exposure to gliadin, biopsy specimens from 9 newly diagnosed CD patients with clinically active disease (positive for CD-associated antibodies and presenting atrophy of intestinal villi with crypt hyperplasia) who were on a nonrestricted (gluten-containing) diet at that time were compared with tissue samples from 9 normalized CD patients (asymptomatic, antibody-negative, and with a recovered intestinal epithelium) who had been on a strict gluten-free diet for more than 2 years. Biopsy samples were immediately frozen and stored in liquid nitrogen until RNA was extracted.
For the acute response experiment, duodenal biopsy specimens were obtained from 10 gluten-free diet–treated CD patients (as described earlier) and each tissue sample was cut into 2 portions, which were incubated separately in 1 mL of RPMI medium, with and without the addition of 10 μg/mL gliadin (cat no. G3375; Sigma, St. Louis, MO), at 37°C and 5% CO₂ for 4 hours [29: Martin-Pagola A, Perez-Nanclares G, Ortiz L, et al. MICA response to gliadin in intestinal mucosa from celiac patients. Immunogenetics. 2004;56:549-554]. Samples subsequently were frozen and stored in liquid nitrogen until RNA was extracted.

Frozen tissue samples were disrupted with disposable plastic pellet pestles (Kontes, Vineland, NJ) in 1.5-mL microcentrifuge tubes and homogenized using a QIAshredder column (QIAGEN GmbH, Hilden, Germany). Total RNA was isolated using the RNeasy Micro kit (QIAGEN), treated with DNase I, and stored at −70°C until use. RNA was quantified by ultraviolet spectrophotometry and the quality of the sample was verified using the RNA 6000 Nano assay on a BioAnalyzer system (Agilent Technologies Inc., Santa Clara, CA). On average, 3 μg of total RNA was isolated from each whole biopsy specimen (used for the chronic exposure experiment) and 0.5 μg from each incubated portion (used for the acute response experiment).

Gene expression profiling was performed using the Human U133 Plus 2.0 array (Affymetrix, Santa Clara, CA). In the chronic exposure experiment, double-stranded complementary DNA (cDNA) and biotinylated complementary RNA were synthesized from 2 μg of total RNA using the One-Cycle cDNA synthesis kit and the IVT labeling kit (both from Affymetrix). For the acute response experiment, the Two-Cycle cDNA synthesis kit (Affymetrix) was used, starting with 200 ng of total RNA. Array hybridization, washing, and staining procedures were performed according to the manufacturer's protocols.
By using the GeneChip Operating Software v1.2 (Affymetrix), results from each individual microarray were scaled so that the average probe set hybridization signal intensity value (target intensity) was 50, and raw image files were processed to .CEL files, which were used for the comparison of gene expression profiles. Experimental results (in .CEL file format) are available from the authors upon request. A more extensive description of genes and biological pathways identified by microarray experiments is currently in preparation.

Differentially expressed transcripts were identified using MUSC ArrayQuest, a web-accessible (http://proteogenomics.musc.edu/arrayquest.html) genomic analysis process controller from the Medical University of South Carolina, to which .CEL files are uploaded and where user-specified analysis methods are performed on computers loaded with the R programming language and Bioconductor software packages (available at http://www.bioconductor.org/) [30: Argraves GL, Jani S, Barth JL, et al. ArrayQuest: a web resource for the analysis of DNA microarray data. BMC Bioinformatics. 2005;6:287]. For the present study, method no. 12 of the MUSC ArrayQuest Methods Library was selected. A detailed description is available at the web site; in brief, this method normalizes hybridization data using Robust Multichip Average, and differentially expressed genes can be filtered based on fold-change, t test, and/or false discovery rate thresholds, all of which can be adjusted by the user to obtain a reasonable output in terms of the number and characteristics of the genes that are considered differentially expressed.

Quantitative reverse-transcription polymerase chain reaction (RT-PCR) was used to replicate the microarray results in independent RNA samples in a randomly selected set of altered genes.
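The threshold-based filtering described above (P value, fold-change, and false discovery rate) can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure for the false discovery rate, which the text does not specify for ArrayQuest method no. 12, and it takes per-transcript P values and log2 fold-changes as given rather than recomputing the t test. The function names and the example numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values (q values) in input order.

    Assumption: BH is used here only for illustration; the source does not
    state which FDR procedure was applied.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def select_transcripts(log2_fold_changes, pvalues,
                       max_p=0.01, max_fdr=1.0, min_fold=1.0):
    """Return indices of transcripts passing all three thresholds.

    min_fold is a ratio (e.g. 1.3) applied to both up- and down-regulation.
    """
    qvalues = benjamini_hochberg(pvalues)
    selected = []
    for idx, (lfc, p, q) in enumerate(zip(log2_fold_changes, pvalues, qvalues)):
        fold = 2 ** abs(lfc)  # convert log2 difference to a fold ratio
        if p <= max_p and q <= max_fdr and fold >= min_fold:
            selected.append(idx)
    return selected


if __name__ == "__main__":
    # Hypothetical values for five transcripts.
    example_lfc = [1.0, 0.2, -1.5, 0.5, 2.0]
    example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
    print(select_transcripts(example_lfc, example_p,
                             max_p=0.01, max_fdr=0.05, min_fold=1.3))
```

Keeping the three thresholds as independent parameters mirrors the way the analysis lets the user tighten or relax each criterion until the output list has a workable size.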
For the chronic exposure experiment, total RNA was isolated from biopsy specimens from 6 additional patients with active CD and 6 other patients with treated CD. The mRNA levels of IFNG, UBD, EPHX1, TAP1, ACAA2, CD47, RNASE4, ACOT7, HIP1R, NOD27, and PSME2 were measured in individual samples. In the acute response experiment, expression levels of HDAC4, TMEM37, PDZK1, TREH, RAB6IP2, ALDOB, SSX2IP, SLC38A1, SLC25A16, and APOC2 were analyzed in RNA pools prepared from 10 gliadin-stimulated and nonstimulated biopsy pairs from treated CD patients, to ensure enough material for replication of all the genes. Primers and probes for each gene were purchased as commercial Assay-on-Demand sets (Applied Biosystems, Foster City, CA) and RT-PCR reactions were performed in duplicate on an ABI PRISM 7900 Sequence Detection System (Applied Biosystems), with a single-reaction enzyme mixture using the QuantiTect Probe RT-PCR kit (QIAGEN GmbH) as previously described [29]. The expression of the housekeeping gene RPLP0 (large ribosomal protein) was quantified simultaneously in each experiment and used as an endogenous control of input RNA, and relative expression of each gene was calculated using the accurate cycle threshold method.

For the selection of candidate genes for the genetic association study, we selected chromosomal positions that either had been identified by at least 2 independent genome-wide linkage studies or, although described only once, showed a logarithm of odds score greater than 2 [11-16]. The MHC region on chromosome 6p21 intentionally was excluded from this analysis because of its genetic complexity.

The selection of polymorphisms for association studies was performed under the assumption that changes observed in expression levels in disease or in response to external stimuli (gliadin) are not exclusively the consequence of nonsynonymous nucleotide polymorphisms, so the 4 SNP search tools (FastSNP [http://fastsnp.ibms.sinica.edu.tw], PupaSuite [http://pupasuite.bioinfo.cipf.es/], SNPselector [http://snpselector.duhs.duke.edu/], and TAMAL [http://neoref.ils.unc.edu/tamal/]) [31: Yuan HY, Chiou JJ, Tseng WH, et al. FASTSNP: an always up-to-date and extendable service for SNP function analysis and prioritization. Nucleic Acids Res. 2006;34:635-641; 32: Conde L, Vaquerizas JM, Dopazo H, et al. PupaSuite: finding functional single nucleotide polymorphisms for large-scale genotyping purposes. Nucleic Acids Res. 2006;34:621-625; 33: Xu H, Gregory SG, Hauser ER, et al. SNPselector: a web tool for selecting SNPs for genetic association studies. Bioinformatics. 2005;21:4181-4186; 34: Hemminger BM, Saelim B, Sullivan PF. TAMAL: an integrated approach to choosing SNPs for genetic studies of human complex traits. Bioinformatics. 2006;22:626-627] were adjusted to prioritize putative regulatory SNPs, including those located in predicted transcription factor binding sites, CpG islands, exon-intron boundaries, promoter regions, and splicing enhancers, as well as those forming triplex DNA structures (Figure 1). In addition, only SNPs having a unique mapping location in the NCBI b35 human genome assembly and a minor allele frequency greater than 5% in Caucasians (according to genotyping results deposited in dbSNP [http://www.ncbi.nlm.nih.gov/SNP/index.html] release 125) were included. SNPs that could not be designed for an Illumina GoldenGate assay (Illumina Inc., San Diego, CA) were replaced, when possible, by tag SNPs from release 21 of the HapMap project (http://www.hapmap.org/).

The study population for genetic association studies consisted of 264 patients with CD (152 girls and 112 boys; mean age at diagnosis, 3.2 y; range, 0.7–13.7 y) who currently are being followed up at the Pediatric Gastroenterology Units of the 2 participating hospitals, and 214 healthy adult volunteers from the general population (110 women and 104 men) with no personal or family history of CD. Both patients and controls were residents of the Basque Country and of Caucasian ethnic origin.

Oligonucleotide pools for 361 SNPs were purchased as a GoldenGate Custom Panel (Illumina), and DNA samples were genotyped and plates were processed following the manufacturer's instructions (http://www.illumina.com/downloads/GOLDENGATEASSAY.pdf). The genotype confidence score of the assay for keeping allele calls was set to 0.25.
SNPs with an ambiguous genotype in more than 10% of the samples were removed from subsequent analysis. Interassay reproducibility was assessed by including 1 of the samples in the 5 genotyping experiments performed.

Analysis of the genotyping results was performed using the PLINK v.0.99r association analysis toolset (http://pngu.mgh.harvard.edu/∼purcell/plink/) [35: Purcell S, Neale B, Todd-Brown K, et al. PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet. 2007;81:559-575]. Before disease-association studies, SNPs with a minor allele frequency lower than 0.01 in the complete dataset and those that failed the Hardy–Weinberg equilibrium test (P < .001) in either cases or controls were eliminated. The Cochran–Armitage case/control genotypic test for trend was used for single-marker disease-association analysis, and P values (uncorrected and after adjusting for multiple testing using the Bonferroni correction [number of tests = number of SNPs = 330]), odds ratios, and 95% confidence intervals are reported. To perform gene-based multimarker association analyses, the sliding-window option in PLINK was used. In this approach, haplotypes in sliding windows of a fixed number of SNPs, shifting 1 SNP at a time (haplotype length ranging from 2 SNPs to the total number of SNPs in each particular gene), were inferred for every individual using the expectation-maximization algorithm. Subsequently, a case/control disease association test was performed with each haplotype, and asymptotic P values were calculated.

Nonsupervised hierarchical clustering of microarray results of intestinal biopsy specimens from newly diagnosed CD patients and from patients on a gluten-free diet separated the two conditions efficiently, indicating that the 2 groups are relatively homogeneous and uniform, regardless of background expression levels in each of the individuals (data not shown).
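The single-marker association step in the statistical methods (Cochran–Armitage trend test followed by Bonferroni correction over 330 SNPs) can be sketched as follows. This is a textbook formulation with additive weights 0, 1, 2 and an asymptotic chi-square P value; it is not PLINK's code, and the genotype counts in the example are hypothetical. Only the uncorrected P value of rs6747096 (2.38 × 10⁻⁵) and the number of tests (330) come from the text.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    Counts are ordered by the number of copies of the tested allele (0, 1, 2).
    Returns (chi-square statistic with 1 df, asymptotic two-sided P value).
    Monomorphic markers (zero variance) are not handled in this sketch.
    """
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n = n_cases + n_controls

    sum_t_cases = sum(t * r for t, r in zip(weights, case_counts))
    sum_t_all = sum(t * c for t, c in zip(weights, totals))
    sum_t2_all = sum(t * t * c for t, c in zip(weights, totals))

    numerator = n * (n * sum_t_cases - n_cases * sum_t_all) ** 2
    denominator = n_cases * n_controls * (n * sum_t2_all - sum_t_all ** 2)
    chi2 = numerator / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical genotype counts (0, 1, 2 copies of the tested allele).
    chi2, p = cochran_armitage_trend([20, 50, 30], [40, 45, 15])
    print(f"chi2 = {chi2:.3f}, P = {p:.2e}")
    # Correcting the strongest reported association for 330 tests.
    print(f"corrected P = {bonferroni(2.38e-5, 330):.3e}")
```

With 330 tests, an uncorrected P value must fall below roughly 0.05 / 330 ≈ 1.5 × 10⁻⁴ to remain below .05 after correction.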
Because newly diagnosed and treated patients separated cleanly, stringent selection parameters based on false discovery rates could be used to identify differentially expressed genes in the chronic exposure experiment: with a fixed P value of .01, false discovery rate thresholds of 0.025, 0.05, and 0.1 yielded 1453, 1647, and 3305 differentially expressed sequences, respectively. In contrast, the 2 portions of the same biopsy specimen (incubated with and without gliadin in the acute experiment) tended to cluster together, indicating that background similarities in each sample were stronger than the differences provoked by the gliadin insult. Consequently, less stringent thresholds, based on fold-change ratio, were used. With a fixed P value of .01, varying the fold-change ratio between 2 and 1.2 resulted in a number of sequences with altered expression levels ranging from 11 to 137, respectively. To obtain a reasonable list of genes for SNP selection and disease-association studies, thresholds of false discovery rate less than 0.05 (chronic exposure experiment) and fold-change ratio greater than 1.3 (acute response experiment) were established, resulting in a list of 1647 differentially expressed sequences in the chronic exposure experiment and 96 in the in vitro stimulated biopsy specimens (including 1287 and 87 known genes, respectively) (supplementary Ta
Reference(s)