Gene-Expression Variation Within and Among Human Populations
2007; Elsevier BV; Volume: 80; Issue: 3; Language: English
10.1086/512017
ISSN 1537-6605
Authors: John D. Storey, Jennifer Madeoy, Jeanna Strout, Mark M. Wurfel, James Ronald, Joshua M. Akey
Topic(s): Bioinformatics and Genomic Networks
Abstract: Understanding patterns of gene-expression variation within and among human populations will provide important insights into the molecular basis of phenotypic diversity and the interpretation of patterns of expression variation in disease. However, little is known about how gene-expression variation is apportioned within and among human populations. Here, we characterize patterns of natural gene-expression variation in 16 individuals of European and African ancestry. We find extensive variation in gene-expression levels and estimate that ∼83% of genes are differentially expressed among individuals and that ∼17% of genes are differentially expressed among populations. By decomposing total gene-expression variation into within- versus among-population components, we find that most expression variation is due to variation among individuals rather than among populations, which parallels observations of extant patterns of human genetic variation. Finally, we performed allele-specific quantitative polymerase chain reaction to demonstrate that cis-regulatory variation in the lymphocyte adaptor protein (SH2B adapter protein 3) contributes to differential expression between European and African samples. These results provide the first insight into how human population structure manifests itself in gene-expression levels and will help guide the search for regulatory quantitative trait loci.

Gene expression is the primary mechanism by which information encoded in the genome is converted into developmental, morphological, and physiological phenotypes [1: Rifkin SA, Kim J, White KP. Evolution of gene expression in the Drosophila melanogaster subgroup. Nat Genet. 2003; 33: 138-144]. Gene expression is also an important source of evolutionary change within and among species [2: King MC, Wilson AC. Evolution at two levels in humans and chimpanzees. Science. 1975; 188: 107-116], and aberrant gene expression has been implicated in the pathogenesis of numerous diseases [3: Knight JC. Regulatory polymorphisms underlying complex disease traits. J Mol Med. 2005; 83: 97-109; 4: Yan H, Zhou W. Allelic variations in gene expression. Curr Opin Oncol. 2004; 16: 39-43]. Thus, understanding the amount, structure, and patterns of gene-expression variation is of fundamental importance to both biomedical research and evolutionary biology [5: Oleksiak MF, Churchill GA, Crawford DL. Variation in gene expression within and among natural populations. Nat Genet. 2002; 32: 261-266].

Although it is well known that 85%–95% of human genetic variation is due to variation among individuals within a population, whereas 5%–15% is attributable to variation among populations [6: Barbujani G, Magagni A, Minch E, Cavalli-Sforza LL. An apportionment of human DNA diversity. Proc Natl Acad Sci USA. 1997; 94: 4516-4519; 7: Akey JM, Zhang G, Zhang K, Jin L, Shriver MD. Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002; 12: 1805-1814; 8: Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, Feldman MW. Genetic structure of human populations. Science. 2002; 298: 2381-2385; 9: Excoffier L, Hamilton G. Comment on “Genetic structure of human populations.” Science. 2003; 300: 1877], it remains unclear whether similar levels of within- versus among-population components of variation will extend to higher-level phenotypes such as gene-expression levels. Whereas some prior work on gene-expression differences among human populations has been done in the context of disease studies [10: He XS, Ji X, Hale MB, Cheung R, Ahmed A, Guo Y, Nolan GP, Pfeffer LM, Wright TL, Risch N, et al. Global transcriptional response to interferon is a determinant of HCV treatment outcome and is modified by race. Hepatology. 2006; 44: 352-359; 11: Dysvik B, Vasstrand EN, Lovlie R, Elgindi OA, Kross KW, Aarstad HJ, Johannessen AC, Jonassen I, Ibrahim SO. Gene expression profiles of head and neck carcinomas from Sudanese and Norwegian patients reveal common biological pathways regardless of race and lifestyle. Clin Cancer Res. 2006; 12: 1109-1120], to our knowledge, there have been no systematic and quantitative attempts to apportion natural variation in gene-expression levels into within- and among-population components akin to several excellent studies in model organisms [5; 12: Oleksiak MF, Roach JL, Crawford DL. Natural variation in cardiac metabolism and gene expression in Fundulus heteroclitus. Nat Genet. 2005; 37: 67-72; 13: Townsend JP, Cavalieri D, Hartl DL. Population genetic variation in genome-wide gene expression. Mol Biol Evol. 2003; 20: 955-963; 14: Jin W, Riley RM, Wolfinger RD, White KP, Passador-Gurgel G, Gibson G. The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet. 2001; 29: 389-395].

To begin to address these issues, we used Affymetrix Human Focus Arrays to study gene-expression levels in B lymphoblastoid cells derived from eight unrelated individuals of northern and western European ancestry (CEU) and eight unrelated individuals from the Yoruba of Ibadan, Nigeria (YRI). These samples are a subset of the CEU and YRI individuals used in the International HapMap project [15: The International HapMap Consortium. A haplotype map of the human genome. Nature. 2005; 437: 1299-1320], and cell lines were obtained from the Coriell Cell Repositories (samples GM06995, GM07029, GM07349, GM10845, GM10851, GM10856, GM10857, GM10860, GM19138, GM18516, GM18859, GM18871, GM18501, GM18504, GM18507, and GM18522). All study individuals were males, to eliminate the potential confounding effects of sex on gene-expression levels. We performed tissue culture and RNA extraction as described elsewhere [16: Monks SA, Leonardson A, Zhu H, Cundiff P, Pietrusiak P, Edwards S, Phillips JW, Sachs A, Schadt EE. Genetic inheritance of gene expression in human cell lines. Am J Hum Genet. 2004; 75: 1094-1105; 17: Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen KY, Morley M, Spielman RS. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet. 2003; 33: 422-425] and assessed RNA integrity by measuring the optical density 260/280 ratio and subjecting the sample to analysis with the Agilent Bioanalyzer 2100. Extracted RNA was labeled and hybridized according to the manufacturer’s protocol (Affymetrix). We performed quantile normalization and used the RMA algorithm to combine probe-set intensities into a single measure of expression for each gene [18: Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003; 31: e15]. Low-intensity probe sets that were deemed absent in ≥50% of the arrays with use of the algorithms implemented in MAS5 [19: Hubbell E, Liu WM, Mei R. Robust estimators for expression analysis. Bioinformatics. 2002; 18: 1585-1592] were excluded from subsequent statistical analyses, resulting in 5,194 analyzable probe sets.
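As an illustration of the quantile-normalization step mentioned above, here is a minimal Python sketch. It is not the authors' pipeline: the function name `quantile_normalize` and the toy intensities are hypothetical, ties are broken by position (a simplification), and the RMA background-correction and probe-set summarization steps are not shown.

```python
from statistics import mean


def quantile_normalize(arrays):
    """Give every array the same empirical intensity distribution.

    `arrays` is a list of equal-length lists, one list of probe
    intensities per microarray (hypothetical input layout).
    """
    n_probes = len(arrays[0])
    # Sort each array, then average across arrays at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [mean(column) for column in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # order[r] is the index of the probe that holds rank r in this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: two arrays, three probes each.
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(toy)
```

After normalization both toy arrays contain the same set of values (the rank-wise means), while each probe keeps its within-array rank.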
All reported results were robust to different normalization methods and definitions of low-intensity genes (results not shown). Technical replicates were obtained for each individual, resulting in a total of 32 microarrays. Of the ∼8,500 genes on the array, 5,194 were expressed in lymphoblastoid cells, which is comparable with previous observations [17].

We used a fully nested, mixed-model analysis to identify genes differentially expressed among individuals within populations and genes differentially expressed among populations (see appendix A). This model allows tests of differential expression among individuals to be performed while properly accounting for population effects and technical variation. Similarly, it allows tests of differential expression among populations while properly accounting for individual effects and technical variation. We used methodology described elsewhere [22: Storey JD, Tibshirani R. Statistical significance for genome-wide studies. Proc Natl Acad Sci USA. 2003; 100: 9440-9445; 25: Storey JD. A direct approach to false discovery rates. J R Stat Soc Ser B. 2002; 64: 479-498] to analyze the complete distribution of P values resulting from tests of differential expression, to estimate the proportion of all genes that are differentially expressed either within or between the CEU and YRI samples. Under the null hypothesis of no differential expression, we expect the P values to be uniformly distributed between 0 and 1. Conversely, if the data set contains differentially expressed genes, the distribution of P values will be skewed toward 0 [22]. We estimated that ∼83% of genes are differentially expressed among individuals and ∼17% of genes are differentially expressed between the CEU and YRI samples (fig. 1). That these two percentages add up to 100% is a coincidence; we found that the significance of tests for differential expression within and among populations was uncorrelated (see appendix A). The estimated proportion of genes that possess interindividual variation is consistent with previous studies describing pervasive cis-regulatory variation in humans [26: Yan H, Yuan W, Velculescu VE, Vogelstein B, Kinzler KW. Allelic variation in human gene expression. Science. 2002; 297: 1143; 27: Lo HS, Wang Z, Hu Y, Yang HH, Gere S, Buetow KH, Lee MP. Allelic variation in gene expression is common in the human genome. Genome Res. 2003; 13: 1855-1862; 28: Bray NJ, Buckland PR, Owens MJ, O’Donovan MC. Cis-acting variation in the expression of high proportion of genes in human brain. Hum Genet. 2003; 113: 149-153]. To our knowledge, there have been no systematic studies of gene-expression differences among human populations. Thus, these results demonstrate substantial natural variation in gene-expression levels both within and among populations and show that population structure exists in levels of transcript abundance.

We next investigated the magnitude of expression differences observed within and among populations, which, in general, was relatively small (fig. 2). For example, of the ∼83% of genes estimated to be differentially expressed among individuals, 1,210 were significant at a false-discovery rate (FDR) ≤1%. These genes varied by an average factor of 1.25 across individuals (see fig. 2), although 20 loci varied by a factor of 2.
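The proportions of differentially expressed genes reported above are derived from the overall shape of the P-value distribution rather than from a count of individually significant genes. A minimal sketch of that idea follows, assuming the simplest fixed-threshold form of the estimator described in reference 22. The threshold `lam=0.5`, the function name `estimate_pi0`, and the toy P values are assumptions made for illustration; the published software selects the threshold more carefully.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimate the proportion of true null hypotheses (pi0).

    Null P values are uniform on (0, 1), so a fraction (1 - lam) of them
    should exceed lam. P values above lam are assumed to be mostly null.
    """
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    return min(1.0, n_above / (m * (1.0 - lam)))


# Toy data: 80 evenly spaced "null" P values plus 20 very small ones.
null_p = [(i + 0.5) / 80 for i in range(80)]
alt_p = [0.001] * 20
pi0 = estimate_pi0(null_p + alt_p)

# Estimated proportion of differentially expressed genes.
proportion_de = 1.0 - pi0
```

Because some truly non-null P values can also fall above the threshold, an estimate of this kind tends to overstate pi0 and therefore understate the proportion of differentially expressed genes.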
Of the ∼17% of genes estimated to be differentially expressed between the CEU and YRI samples, 50 were significant at an FDR ≤20%. The average absolute log2 difference in mean expression levels between samples for these 50 genes was 0.73 (corresponding to a 1.65-fold change). Although the majority of observed gene-expression differences within and between populations are modest, even small perturbations in expression can have significant functional and phenotypic consequences [29: Yan H, Dobbie Z, Gruber SB, Markowitz S, Romans K, Giardiello FM, Kinzler KW, Vogelstein B. Small changes in expression affect predisposition to tumorigenesis. Nat Genet. 2002; 30: 25-26; 30: Bray NJ, Buckland PR, Williams NM, Williams HJ, Norton N, Owen MJ, O’Donovan MC. A haplotype implicated in schizophrenia susceptibility is associated with reduced COMT expression in human brain. Am J Hum Genet. 2003; 73: 152-161]. The results for all tests of differential expression are presented in a tab-delimited txt file (online only).

To get a broad overview of the types of pathways in which differentially expressed genes participate, we tested whether they were overrepresented among PANTHER biological pathways [31: Thomas PD, Kejariwal A, Campbell MJ, Mi H, Diemer K, Guo N, Ladunga I, Ulitsky-Lazareva B, Muruganujan A, Rabkin S, Vandergriff JA, Doremieux O. PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucl Acids Res. 2003; 31: 334-341]. In this analysis, we considered the top 10% of genes differentially expressed either between individuals or between populations. Only two pathways were nominally significant (P=.05) for genes differentially expressed among individuals, and no pathways remained significant after correction for multiple hypothesis tests (table 1).
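The text does not state which statistical test underlies the pathway-enrichment P values. As a hedged illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for overrepresentation analysis, followed by a Bonferroni adjustment. The function names and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the pathway of interest."""
    total = comb(N, n)
    favorable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favorable / total


def bonferroni(p, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p * n_tests)


# Toy numbers: 20 genes in total, 5 in the pathway, 4 genes selected,
# 2 of the selected genes fall in the pathway.
p_enrich = hypergeom_upper_tail(k=2, K=5, n=4, N=20)
p_adjusted = bonferroni(p_enrich, n_tests=10)
```

In a real analysis, `N` would be the number of expressed genes with pathway annotation, `n` the size of the selected gene list, and `n_tests` the number of pathways examined.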
Thus, gene-expression differences among individuals are found in a wide variety of pathways, which is consistent with our estimate that ∼83% of genes are differentially expressed among individuals. Examples of genes with large interindividual variation in expression include RAGE (MIM *605762) and LRAP (MIM *609497), the expression levels of which correlate with diabetic complications [32: dos Santos KG, Canani LH, Gross JL, Tschiedel B, Pires Souto KE, Roisenberg I. The -374A allele of the receptor for advanced glycation end products gene is associated with a decreased risk of ischemic heart disease in African-Brazilians with type 2 diabetes. Mol Genet Metab. 2005; 85: 149-156] and improper antigen processing [33: Fruci D, Ferracuti S, Limongi MZ, Cunsolo V, Giorda E, Fraioli R, Sibilio L, Carroll O, Hattori A, van Endert PM, et al. Expression of endoplasmic reticulum aminopeptidases in EBV-B cell lines from healthy donors and in leukemia/lymphoma, carcinoma, and melanoma cell lines. J Immunol. 2006; 176: 4869-4879], respectively. Genes differentially expressed between the CEU and YRI samples were strongly enriched in inflammatory pathways, even after a strict Bonferroni correction for multiple-hypothesis tests (table 1). Included in this set of genes are several cytokines and chemokine receptors (CCL22 [MIM *602957], CCL5 [MIM *187011], CCR2 [MIM *601267], CCR7 [MIM *600242], and CXCR3 [MIM *300574]) that have been implicated in numerous cardiovascular, infectious, and immune-related diseases [34: Charo IF, Peters W. Chemokine receptor 2 (CCR2) in atherosclerosis, infectious diseases, and regulation of T-cell polarization. Microcirculation. 2003; 10: 259-264; 35: Leung TF, Tang NL, Lam CW, Li AM, Fung SL, Chan IH, Wong GW. RANTES G-401A polymorphism is associated with allergen sensitization and FEV1 in Chinese children. Respir Med. 2005; 99: 216-219].

Table 1. Enrichment of PANTHER Biological Pathways among Differentially Expressed Genes

Sample Comparison and PANTHER Biological Pathway          P
Individuals:
  Inflammation mediated by chemokine and cytokine         1.91×10^−2
  T-cell activation                                       3.01×10^−2
Populations:
  Inflammation mediated by chemokine and cytokine         2.91×10^−4
  Histamine H1 receptor–mediated signaling pathway        3.90×10^−3
  Toll-receptor signaling pathway                         1.02×10^−2
  Fibroblast growth factor–signaling pathway              1.11×10^−2
  Vascular endothelial growth factor–signaling pathway    1.14×10^−2
  T-cell activation                                       1.32×10^−2
  EGF receptor–signaling pathway                          1.53×10^−2
  B-cell activation                                       2.70×10^−2
  Notch-signaling pathway                                 2.99×10^−2
  Enkephalin release                                      2.99×10^−2
  5HT2 type receptor–mediated signaling pathway           4.24×10^−2

Note: All pathways nominally enriched at P=.05 are shown; in the original table, bold type indicates significance after a Bonferroni correction for multiple hypothesis tests.

Simply identifying genes differentially expressed within or among populations may provide an incomplete view of the quantitative details of gene-expression variation. For instance, we found examples where expression variation was observed primarily between populations but not individuals, both among individuals and between populations, or among individuals but not between populations (fig. 3). Classifying genes by differential versus no differential expression fails to accurately reflect the quantitative patterns of how expression variation is apportioned into within- and among-population components; therefore, it is important to investigate how much of total gene-expression variation is explained by individual and population effects.
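One way to make this apportionment concrete is a method-of-moments estimate from a balanced nested design (replicate arrays within individuals, individuals within populations). The following is a sketch under stated assumptions, not the estimator used in appendix A: the function name `nested_variance_components`, the requirement of a balanced design, and the toy data are all illustrative.

```python
from statistics import mean


def nested_variance_components(data):
    """Variance components for one gene in a balanced nested design.

    data[p][i] is the list of replicate measurements for individual i
    in population p. Returns (var_pop, var_ind, var_err).
    """
    a = len(data)        # number of populations
    b = len(data[0])     # individuals per population
    r = len(data[0][0])  # replicates per individual

    ind_means = [[mean(reps) for reps in pop] for pop in data]
    pop_means = [mean(means) for means in ind_means]
    grand_mean = mean(pop_means)

    # Sums of squares for each level of the hierarchy.
    ss_pop = b * r * sum((pm - grand_mean) ** 2 for pm in pop_means)
    ss_ind = r * sum(
        (im - pm) ** 2
        for means, pm in zip(ind_means, pop_means)
        for im in means
    )
    ss_err = sum(
        (x - im) ** 2
        for pop, means in zip(data, ind_means)
        for reps, im in zip(pop, means)
        for x in reps
    )

    ms_pop = ss_pop / (a - 1)
    ms_ind = ss_ind / (a * (b - 1))
    ms_err = ss_err / (a * b * (r - 1))

    # Method-of-moments estimates; negative estimates are truncated at 0.
    var_err = ms_err
    var_ind = max(0.0, (ms_ind - ms_err) / r)
    var_pop = max(0.0, (ms_pop - ms_ind) / (b * r))
    return var_pop, var_ind, var_err


# Toy data: 2 populations x 2 individuals x 2 technical replicates.
toy = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[9.0, 11.0], [13.0, 15.0]],
]
var_pop, var_ind, var_err = nested_variance_components(toy)

# Share of biological (non-technical) variation due to individuals.
share_individual = var_ind / (var_ind + var_pop)
```

For the toy data the three components are 28, 7, and 2, so individuals account for 7/(7+28) = 0.2 of the non-technical variation. The toy values were chosen for easy arithmetic and are not meant to resemble real expression data.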
To this end, for each gene, we estimated the proportion of total gene-expression variation due to either differences among individuals or differences between populations, while properly taking into account technical variation (see appendix A). The median proportion of variation due to interindividual variation is 0.85 (fig. 4), which is nearly identical to levels of population structure observed in extant patterns of human genetic variation [6; 7; 8; 9]. In addition, similar to estimates of genetic structure at individual loci [7; 36: Weir BS, Cardon LR, Anderson AD, Nielsen DM, Hill WG. Measures of human population structure show heterogeneity among genomic regions. Genome Res. 2005; 15: 1468-1476], the distribution of population structure in expression levels across genes varies considerably (fig. 4).

Figure 4. Distribution of the proportion of total gene-expression variation explained by variation among individuals. The percentile of the proportion of all genes (X-axis) versus the proportion of total expression variation explained by interindividual differences (Y-axis) is shown for the observed (solid black line) and randomized (solid gray line) data. For example, the median (50th percentile) (dashed red line) proportion of variation explained by differences among individuals is 85%, leaving 15% explained by differences between populations. The dashed blue lines indicate the interquartile range. Note that the magnitude of total gene-expression variation attributable to interindividual differences is considerably greater in the observed data than in the randomized data.

To fully understand the genetic architecture of gene-expression levels and how population structure influences patterns of gene-expression variation, it will be necessary both to confirm predictions from microarray data and to delineate the molecular mechanisms governing regulatory variation. To begin to address these issues, we performed quantitative allele-specific PCR (qPCR) on SH2B3 (MIM *605093), which encodes an adaptor protein that regulates growth factor and cytokine receptor-mediated pathways [37: Fitau J, Boulday G, Coulon F, Quillard T, Charreau B. The adaptor molecule Lnk negatively regulates tumor necrosis factor-α-dependent VCAM-1 expression in endothelial cells through inhibition of the ERK1 and -2 pathways. J Biol Chem. 2006; 281: 20148-20159] and was in the top 1% of genes differentially expressed between the CEU and YRI samples. We generated double-stranded cDNA and used TaqMan (Applied Biosystems) allelic discrimination assays to interrogate the expression level of each allele for a SNP (rs1107853) in the coding region of SH2B3, as described elsewhere [38: Ronald J, Brem RB, Whittle J, Kruglyak L. Local regulatory variation in Saccharomyces cerevisiae. PLoS Genet. 2005; 1: e25]. We constructed a dilution series of heterozygous genomic DNA to estimate dye effects and differences in allele-specific hybridization efficiency, as described elsewhere [38]. To obtain the overall expression level for each individual, we summed the expression levels of the two alleles after adjusting for dye effects and hybridization effects. All qPCR experiments were performed in triplicate.

In our microarray experiments, SH2B3 showed an average log2-fold change between the CEU and YRI samples of 0.52 (P=6.5×10^−4; FDR=0.134). Consistent with the microarray data, the qPCR results also demonstrate that SH2B3 is differentially expressed between the CEU and YRI samples (P=.0157) (fig. 5a). To better understand the molecular basis for the observed difference in expression, we asked whether the expression level of one allele was different from the other in heterozygous individuals. If so, this provides evidence of cis-regulatory effects [26]. There was a significant difference (P=1.18×10^−3) in expression between alleles in heterozygous cDNA versus genomic DNA, strongly suggesting cis-regulatory effects (fig. 5b). Interestingly, these observations coincide with patterns of genetic variation at SH2B3, since there are 13 SNPs with large allele-frequency differences (FST≥0.45) between the CEU and YRI samples (fig. 5c). Five of these highly differentiated SNPs occur in conserved regions, as determined by alignment of 17 vertebrate genomes, making them strong candidates for future functional studies.
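For readers unfamiliar with FST, the following sketch computes it for a single biallelic SNP from the allele frequencies in two populations, using the heterozygosity-based definition FST = (H_T − H_S)/H_T with equal population weights. The text does not specify which FST estimator was used, so the choice of definition, the function name `fst_biallelic`, and the example frequencies are assumptions.

```python
def fst_biallelic(p1, p2):
    """Heterozygosity-based FST for one biallelic SNP.

    p1 and p2 are the frequencies of the same allele in two populations,
    which are weighted equally (no sample-size correction is applied).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # The SNP is monomorphic overall, so there is no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical frequencies: the allele is common in one sample, rare in the other.
example_fst = fst_biallelic(0.9, 0.1)
```

With these hypothetical frequencies the result is about 0.64, which would exceed the FST≥0.45 threshold mentioned above, whereas identical frequencies in both populations give 0.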
We calculated the empirical probability of observing a SNP with a pairwise FST≥0.45 between the CEU and YRI samples, on the basis of all autosomal markers contained in HapMap release 21, to be ∼0.05, and this magnitude of allele-frequency difference is consistent with a signature of local adaptation [7; 39: Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, Nickerson DA, Kruglyak L. Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol. 2004; 2: e286]. SH2B3 also possesses unusually large levels of linkage disequilibrium compared with the rest of the genome [40: Wang ET, Kodama G, Baldi P, Moyzis RK. Global landscape of recent inferred Darwinian selection for Homo sapiens. Proc Natl Acad Sci USA. 2006; 103: 135-140], which provides additional support for the hypothesis that this locus has been subject to adaptive evolution, although additional studies will be necessary to make more-definitive inferences about its evolutionary history.

In summary, consistent with previous studies of model organisms [5; 12; 13; 14], our results demonstrate that considerable natural variation in gene-expression levels exists within and among human populations. Genes differentially expressed among populations may be particularly relevant to explore as candidate susceptibility loci for diseases whose prevalence varies as a function of ethnicity and may be amenable to genetic dissection by admixture linkage-disequilibrium mapping [41: Halder I, Shriver MD. Measuring and using admixture to study the genetics of complex diseases. Hum Genomics. 2003; 1: 52-62]. Importantly, we also show that simply focusing on differentially expressed genes can lead to an incomplete understanding of how gene-expression variation is apportioned within and among human populations. By decomposing expression variation into its component sources, we find that, similar to that observed for genetic variation, the majority of gene-expression variation is due to differences among individuals rather than among populations.

These observations are subject to several caveats, including the fact that a relatively small number of individuals and populations were studied and that expression levels were measured only in a single (transformed) cell type. For example, of the ∼17% of genes expressed in B lymphoblasts that we estimate to be differentially expressed between the CEU and YRI samples, 50 could be identified at an FDR [...] 500,000 statistics from the null distribution by permuting the individual labels within each population and recomputing the generalized likelihood-ratio statistics on these permuted data.
The observed and null statistics were then used to estimate an FDR Q value (J.D.S.'s Web site) for each gene as described elsewhere [22]. This also provides a conservative estimate of the total proportion (83%) of interindividual differentially expressed genes.

We performed several diagnostic procedures to make sure that the results derived from tests of differential expression within and among populations were genuine and not confounded by technical artifacts. First, we observed that the significance of these two types of differential expression appeared to be independent. Genes showed both types of differential expression, only one type, or neither type in the proportions expected by chance if the two types of differential expression occur independently. Second, the correlation of log[p/(1−p)] of the two sets of P values was only 5%, which is well within the range observed under random permutations of the P values. Third, the error-variance estimates did not show any functional relationships with the estimated proportion of variation due to interindividual or population differences, indicating that our model successfully separated the biological signal of interest from the technical and measurement errors.

Addendum. The American Journal of Human Genetics, June 2007. In brief: Storey et al. (March 2007 80:502–509).
Reference(s)