Article Open access Peer-reviewed

Identification of Quantitative Trait Loci Underlying Proteome Variation in Human Lymphoblastoid Cells

2010; Elsevier BV; Volume: 9; Issue: 7 Language: English

10.1074/mcp.m900378-mcp200

ISSN

1535-9484

Authors

Nikhil Garge, Huaqin Pan, Megan D. Rowland, Benjamin J. Cargile, Xinxin Zhang, Phillip Cooley, Grier P. Page, Maureen K. Bunger

Topic(s)

Bioinformatics and Genomic Networks

Abstract

Population-based variability in protein expression patterns, especially in humans, is often observed but poorly understood. Moreover, very little is known about how interindividual genetic variation contributes to protein expression patterns. To begin to address this, we describe elements of technical and biological variations contributing to expression of 544 proteins in a population of 24 individual human lymphoblastoid cell lines that have been extensively genotyped as part of the International HapMap Project. We determined that expression levels of 10% of the proteins were tightly correlated with cell doubling rates. Using the publicly available genotypes for these lymphoblastoid cell lines, we applied a genetic association approach to identify quantitative trait loci associated with protein expression variation. Results identified 24 protein forms corresponding to 15 proteins for which genetic elements were responsible for >50% of the expression variation. The genetic variation associated with protein expression levels was located in cis with the gene coding for the transcript of the protein for 19 of these protein forms. Four of the genetic elements identified were coding non-synonymous single nucleotide polymorphisms that resulted in migration pattern changes in the two-dimensional gel. This is the first description of large-scale proteomics analysis demonstrating the direct relationship between genome and proteome variations in human cells.

The abbreviations used are: eQTL, expression quantitative trait locus (loci if plural); LCL, lymphoblastoid cell line; 2D, two-dimensional; BH-FDR, Benjamini and Hochberg false discovery rate; CEPH, Centre d'Etude du Polymorphisme; CEU, Caucasians of European descent living in Utah; GWA, genome-wide association; LD, linkage disequilibrium; NS-SNP, non-synonymous single nucleotide polymorphism; PDT, population doubling time; peQTL, protein expression quantitative trait locus (loci if plural); QTL, quantitative trait locus (loci if plural); SNP, single nucleotide polymorphism; ID, identity; LCP1, L-plastin; CPNE1, Copine-1; HCLS1, hematopoietic cell-specific lyn substrate 1; UTR, untranslated region; EXOC4, exocyst complex component 4.

Recent research has shown that gene expression variation in both humans and model organisms behaves as a complex genetic trait (1. Monks S.A. Leonardson A. Zhu H. Cundiff P. Pietrusiak P. Edwards S. Phillips J.W. Sachs A. Schadt E.E. Genetic inheritance of gene expression in human cell lines. Am. J. Hum. Genet. 2004; 75: 1094-1105; 2. Bergen A.W. Baccarelli A. McDaniel T.K. Kuhn K. Pfeiffer R. Kakol J. Bender P. Jacobs K. Packer B. Chanock S.J. Yeager M. cis sequence effects on gene expression. BMC Genomics. 2007; 8: 296; 3. Brem R.B. Kruglyak L. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl. Acad. Sci. U.S.A. 2005; 102: 1572-1577; 4. Brem R.B. Yvert G. Clinton R. Kruglyak L. Genetic dissection of transcriptional regulation in budding yeast. Science. 2002; 296: 752-755; 5. Gilad Y. Rifkin S.A. Pritchard J.K. Revealing the architecture of gene regulation: the promise of eQTL studies. Trends Genet. 2008; 24: 408-415). The genetic basis of gene expression can be treated in a manner similar to that of other complex quantitative phenotypes, such as body weight and blood pressure. The identification of quantitative trait loci associated with individual mRNA expression levels (expression quantitative trait loci (eQTL)) has been reported in yeast and mice using recombinant inbred strains (6. Ronald J. Brem R.B. Whittle J. Kruglyak L. Local regulatory variation in Saccharomyces cerevisiae. PLoS Genet. 2005; 1: e25; 7. Cervino A.C. Li G. Edwards S. Zhu J. Laurie C. Tokiwa G. Lum P.Y. Wang S. Castellani L.W. Lusis A.J. Carlson S. Sachs A.B. Schadt E.E. Integrating QTL and high-density SNP analyses in mice to identify Insig2 as a susceptibility gene for plasma cholesterol levels. Genomics. 2005; 86: 505-517; 8. Davis R.C. Schadt E.E. Cervino A.C. Péterfy M. Lusis A.J. Ultrafine mapping of SNPs from mouse strains C57BL/6J, DBA/2J, and C57BLKS/J for loci contributing to diabetes and atherosclerosis susceptibility. Diabetes. 2005; 54: 1191-1199; 9. Ghazalpour A. Doss S. Sheth S.S. Ingram-Drake L.A. Schadt E.E. Lusis A.J. Drake T.A. Genomic analysis of metabolic pathway gene expression in mice. Genome Biol. 2005; 6: R59; 10. Schadt E.E. Monks S.A. Drake T.A. Lusis A.J. Che N. Colinayo V. Ruff T.G. Milligan S.B. Lamb J.R. Cavet G. Linsley P.S. Mao M. Stoughton R.B. Friend S.H. Genetics of gene expression surveyed in maize, mouse and man. Nature. 2003; 422: 297-302; 11. Doss S. Schadt E.E. Drake T.A. Lusis A.J. Cis-acting expression quantitative trait loci in mice. Genome Res. 2005; 15: 681-691). It has also been described in humans using lymphocytes, Epstein-Barr virus-transformed lymphoblastoid cell lines (LCLs), and tissue biopsies (10; 12. Emilsson V. Thorleifsson G. Zhang B. Leonardson A.S. Zink F. Zhu J. Carlson S. Helgason A. Walters G.B. Gunnarsdottir S. Mouy M. Steinthorsdottir V. Eiriksdottir G.H. Bjornsdottir G. Reynisdottir I. Gudbjartsson D. Helgadottir A. Jonasdottir A. Jonasdottir A. Styrkarsdottir U. Gretarsdottir S. Magnusson K.P. Stefansson H. Fossdal R. Kristjansson K. Gislason H.G. Stefansson T. Leifsson B.G. Thorsteinsdottir U. Lamb J.R. Gulcher J.R. Reitman M.L. Kong A. Schadt E.E. Stefansson K. Genetics of gene expression and its effect on disease. Nature. 2008; 452: 423-428; 13. Schadt E.E. Molony C. Chudin E. Hao K. Yang X. Lum P.Y. Kasarskis A. Zhang B. Wang S. Suver C. Zhu J. Millstein J. Sieberts S. Lamb J. GuhaThakurta D. Derry J. Storey J.D. Avila-Campillo I. Kruger M.J. Johnson J.M. Rohl C.A. van Nas A. Mehrabian M. Drake T.A. Lusis A.J. Smith R.C. Guengerich F.P. Strom S.C. Schuetz E. Rushmore T.H. Ulrich R. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 2008; 6: e107; 14. Dermitzakis E.T. Stranger B.E. Genetic variation in human gene expression. Mamm. Genome. 2006; 17: 503-508; 15. Stranger B.E. Forrest M.S. Clark A.G. Minichiello M.J. Deutsch S. Lyle R. Hunt S. Kahl B. Antonarakis S.E. Tavaré S. Deloukas P. Dermitzakis E.T. Genome-wide associations of gene expression variation in humans. PLoS Genet. 2005; 1: e78; 16. Stranger B.E. Forrest M.S. Dunning M. Ingle C.E. Beazley C. Thorne N. Redon R. Bird C.P. de Grassi A. Lee C. Tyler-Smith C. Carter N. Scherer S.W. Tavaré S. Deloukas P. Hurles M.E. Dermitzakis E.T. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007; 315: 848-853; 17. Stranger B.E. Nica A.C. Forrest M.S. Dimas A. Bird C.P. Beazley C. Ingle C.E. Dunning M. Flicek P. Koller D. Montgomery S. Tavaré S. Deloukas P. Dermitzakis E.T. Population genomics of human gene expression. Nat. Genet. 2007; 39: 1217-1224; 18. Cheung V.G. Conlin L.K. Weber T.M. Arcaro M. Jen K.Y. Morley M. Spielman R.S. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat. Genet. 2003; 33: 422-425; 19. Cheung V.G. Spielman R.S. Ewens K.G. Weber T.M. Morley M. Burdick J.T. Mapping determinants of human gene expression by regional and genome-wide association. Nature. 2005; 437: 1365-1369; 20. Morley M. Molony C.M. Weber T.M. Devlin J.L. Ewens K.G. Spielman R.S. Cheung V.G. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004; 430: 743-747; 21. Spielman R.S. Bastone L.A. Burdick J.T. Morley M. Ewens W.J. Cheung V.G. Common genetic variants account for differences in gene expression among ethnic groups. Nat. Genet. 2007; 39: 226-231). Several recent studies have also used eQTL analysis to narrow the list of candidate genes for Type I diabetes and coronary artery disease susceptibility in obese mouse strains (7, 12, 13).
These initial studies have established that more than 80% of mRNA expression phenotypes are heritable and that, on average, 30% of the variation is due to genetic sources. As a result, the mapping of eQTL responsible for gene expression variation in disease suggests that eQTL association studies may be an additional powerful method for uncovering the causal genetic determinants of disease and disease susceptibility (5, 7, 11, 12, 18, 20; 22. Pastinen T. Ge B. Hudson T.J. Influence of human genome polymorphism on gene expression. Hum. Mol. Genet. 2006; 15: R9-16; 23. Pastinen T. Ge B. Gurd S. Gaudin T. Dore C. Lemire M. Lepage P. Harmsen E. Hudson T.J. Mapping common regulatory variants to human haplotypes. Hum. Mol. Genet. 2005; 14: 3963-3971; 24. Pastinen T. Sladek R. Gurd S. Sammak A. Ge B. Lepage P. Lavergne K. Villeneuve A. Gaudin T. Brändström H. Beck A. Verner A. Kingsley J. Harmsen E. Labuda D. Morgan K. Vohl M.C. Naumova A.K. Sinnett D. Hudson T.J. A survey of genetic and epigenetic variation affecting human gene expression. Physiol. Genomics. 2004; 16: 184-193; 25. Dixon A.L. Liang L. Moffatt M.F. Chen W. Heath S. Wong K.C. Taylor J. Burnett E. Gut I. Farrall M. Lathrop G.M. Abecasis G.R. Cookson W.O. A genome-wide association study of global gene expression. Nat. Genet. 2007; 39: 1202-1207; 26. Drake T.A. Schadt E.E. Davis R.C. Lusis A.J. Integrating genetic and gene expression data to study the metabolic syndrome and diabetes in mice. Am. J. Ther. 2005; 12: 503-511).

Most disease phenotypes have both causal and reactive relationships to several hundred protein changes across multiple tissue types and environmental variables (26; 27. Cookson W. Liang L. Abecasis G. Moffatt M. Lathrop M. Mapping complex disease traits with global gene expression. Nat. Rev. Genet. 2009; 10: 184-194; 28. Schadt E.E. Exploiting naturally occurring DNA variation and molecular profiling data to dissect disease and drug response traits. Curr. Opin. Biotechnol. 2005; 16: 647-654). Consequently, the genetic components that are involved in complex disease could number in the hundreds. However, disease phenotype variations must relate to variation of individual proteins. One hypothesis is that each protein trait should be less complexly associated with the genome than is a clinical disease phenotype. Therefore, determining associations between genetic polymorphism and individual protein traits affords a method of building the complex relationship between genome and disease from the bottom up.
Furthermore, an understanding of this relationship has additional practical implications for disease and drug response protein biomarker research in humans, where the identification of associated polymorphisms could be used as covariates in the discovery process.

The regulation of any given protein level is influenced by transcript abundance and by post-translational modifications that can increase or decrease protein turnover in response to physiological conditions. This additional regulation at the protein level means that, frequently, transcript levels do not correlate well with corresponding protein expression levels (29. Gygi S.P. Rochon Y. Franza B.R. Aebersold R. Correlation between protein and mRNA abundance in yeast. Mol. Cell. Biol. 1999; 19: 1720-1730). Furthermore, many disease phenotypes are directly related to the degree and type of protein post-translational modifications, such as alterations in protein folding and ubiquitin modifications (30. Xie H. Vucetic S. Iakoucheva L.M. Oldfield C.J. Dunker A.K. Obradovic Z. Uversky V.N. Functional anthology of intrinsic disorder. 3. Ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins. J. Proteome Res. 2007; 6: 1917-1932; 31. Vucetic S. Xie H. Iakoucheva L.M. Oldfield C.J. Dunker A.K. Obradovic Z. Uversky V.N. Functional anthology of intrinsic disorder. 2. Cellular components, domains, technical terms, developmental processes, and coding sequence diversities correlated with long disordered regions. J. Proteome Res. 2007; 6: 1899-1916; 32. Xie H. Vucetic S. Iakoucheva L.M. Oldfield C.J. Dunker A.K. Uversky V.N. Obradovic Z. Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J. Proteome Res. 2007; 6: 1882-1898).
Therefore, a deeper understanding of the molecular relationship between genotype and disease can be ascertained by studying the relationship between the genome and proteome.

Efforts to identify genetic elements associated with protein expression variation in yeast and mouse brain tissue have been described (33. Melzer D. Perry J.R. Hernandez D. Corsi A.M. Stevens K. Rafferty I. Lauretani F. Murray A. Gibbs J.R. Paolisso G. Rafiq S. Simon-Sanchez J. Lango H. Scholz S. Weedon M.N. Arepalli S. Rice N. Washecka N. Hurst A. Britton A. Henley W. van de Leemput J. Li R. Newman A.B. Tranah G. Harris T. Panicker V. Dayan C. Bennett A. McCarthy M.I. Ruokonen A. Jarvelin M.R. Guralnik J. Bandinelli S. Frayling T.M. Singleton A. Ferrucci L. A genome-wide association study identifies protein quantitative trait loci (pQTLs). PLoS Genet. 2008; 4: e1000072; 34. Foss E.J. Radulovic D. Shaffer S.A. Ruderfer D.M. Bedalov A. Goodlett D.R. Kruglyak L. Genetic basis of proteome variation in yeast. Nat. Genet. 2007; 39: 1369-1375; 35. Klose J. Nock C. Herrmann M. Stühler K. Marcus K. Blüggel M. Krause E. Schalkwyk L.C. Rastan S. Brown S.D. Büssow K. Himmelbauer H. Lehrach H. Genetic analysis of the mouse brain proteome. Nat. Genet. 2002; 30: 385-393). Using linkage analysis in yeast, Foss et al. (34) compared results for both transcript- and protein-associated QTL in identical samples resulting from an intercross of two divergent yeast strains. This study established that proteomic variation is just as heritable as transcript variation; however, loci associated with transcript variation were more readily mapped than those associated with protein variation.
This implies that either more technical or stochastic noise is present in a proteomic evaluation or that proteomic regulation is more complexly related to the genome than is transcript variation. In support of the latter assertion, more of the linkages that were identified for protein regulation were located outside of the region of the genome that codes for the transcript (i.e. in trans). In other words, alleles not associated with the transcript coding region were more likely to appear as the primary regulators of protein expression in this system. A similar observation was made by Klose et al. (35) in describing the identification of linkages associated with shifts in migration patterns on two-dimensional (2D) gels of mouse brain tissue.

In humans, although there is a significant body of literature describing efforts to identify transcript QTL, efforts to identify proteome QTL have lagged behind. In one study, Melzer et al. (33) describe experiments to associate single nucleotide polymorphisms (SNPs) with levels of 42 serum proteins detected in clinical assays in a population of 1200 human subjects.
This effort resulted in the identification of protein QTL associated with eight serum proteins. In contrast to the work described for yeast and mouse linkage analysis by Klose et al. (35) and Foss et al. (34), the QTL identified in this study were mostly located in the region surrounding the transcript-coding allele (i.e. in cis). Collectively, these previous studies highlight the additional complexities of identifying genetic elements associated with proteins as compared with genetic elements associated with transcripts. Therefore, success for such studies in the future will be highly dependent on identifying the most appropriate methods for quantitatively evaluating the proteome.

Large-scale quantitative proteomics is a challenging methodological process. Shotgun proteomics approaches, in which proteins are digested in mixtures and separated by liquid chromatography upstream of mass spectrometric detection, are increasingly used for large-scale unbiased proteomics. In this approach, relative protein quantitation can be carried out using either differential labeling with isotopic reagents or label-free methods, such as peak height determination or spectral counting (36. Mann M. Functional and quantitative proteomics using SILAC. Nat. Rev. Mol. Cell Biol. 2006; 7: 952-958; 37. Gygi S.P. Rist B. Gerber S.A. Turecek F. Gelb M.H. Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 1999; 17: 994-999; 38. Finney G.L. Blackler A.R. Hoopmann M.R. Canterbury J.D. Wu C.C. MacCoss M.J. Label-free comparative analysis of proteomics mixtures using chromatographic alignment of high-resolution μLC-MS data. Anal. Chem. 2008; 80: 961-971; 39. Zhang B. VerBerkmoes N.C. Langston M.A. Uberbacher E. Hettich R.L. Samatova N.F. Detecting differential and correlated protein expression in label-free shotgun proteomics. J. Proteome Res. 2006; 5: 2909-2918; 40. Carvalho P.C. Hewel J. Barbosa V.C. Yates 3rd, J.R. Identifying differences in protein expression levels by spectral counting and feature selection. Genet. Mol. Res. 2008; 7: 342-356; 41. Washburn M.P. Ulaszek R. Deciu C. Schieltz D.M. Yates 3rd, J.R. Analysis of quantitative proteomic data generated via multidimensional protein identification technology. Anal. Chem. 2002; 74: 1650-1657). When using proteomics to uncover the relationship between genomic polymorphisms and proteomic variation, a more appealing approach is to target intact proteins, for which, in contrast to the shotgun approach, differences in isoforms and post-translational modifications can be reliably monitored. Currently, the most reliable method of unbiased quantitative analysis of intact proteins is 2D DIGE. This approach takes advantage of the quantitative properties of fluorescent protein labeling dyes coupled with large format 2D DIGE (42. Lilley K.S. Friedman D.B. All about DIGE: quantification technology for differential-display 2D-gel proteomics. Expert Rev. Proteomics. 2004; 1: 401-409; 43. Friedman D.B. Lilley K.S. Optimizing the difference gel electrophoresis (DIGE) technology. Methods Mol. Biol. 2008; 428: 93-124).
In this study, we use 2D DIGE to describe technical, biological, and genetic sources of proteomic variation in a small cohort of human LCLs derived from the Centre d'Etude du Polymorphisme (CEPH) cohort of Caucasians of European descent living in Utah (CEU). The purpose of this work is to identify genetic elements responsible for variation in protein expression in human cells. As mentioned above, 2D DIGE is unique among proteomics methods in allowing differential modifications of intact proteins to be monitored quantitatively and independently. Therefore, this approach should allow for observation of regulatory polymorphisms influencing protein expression levels in addition to other polymorphisms that could influence the pI and molecular weight of any given protein. To determine how much variation in protein expression levels detected by 2D DIGE analysis can be explained by genetic differences between the cell lines, we performed a protein expression quantitative trait locus (peQTL) analysis of 544 basal protein expression phenotypes using publicly available genotype data and both cis-only and genome-wide association (GWA) approaches. As a result, we identified a total of 24 peQTL associated with individual protein phenotypes. These results affirm that variability in cellular protein expression in human LCLs is influenced by polymorphisms in the genome and provide the first broad scale proteomics analysis to complement previous eQTL efforts using this model system. Furthermore, our results demonstrate that 2D DIGE has both the sensitivity and specificity to be used in understanding sources of biological variation in the proteome.

Sodium chloride, Trizma (Tris base), bromphenol blue, chloroform, N,N-dimethylformamide, phosphatase inhibitor, sodium carbonate, sodium bicarbonate, ammonium bicarbonate, ammonium monobasic phosphate, l-lysine, and α-cyano-4-hydroxycinnamic acid were obtained from Sigma-Aldrich.
Urea, thiourea, SDS, DTT, Tris, and iodoacetamide were obtained from GE Healthcare. Protein labeling dyes (CyDye DIGE fluors) with an N-hydroxysuccinimide ester reactive group for reaction with the ε amino group of lysine of proteins were obtained from GE Healthcare. CHAPS was obtained from USB Corp. Complete Mini protease inhibitor mixture tablets were obtained from Roche Applied Science. Methanol, acetonitrile, and HPLC grade water were obtained from Burdick & Jackson. Acetic acid was obtained from Mallinckrodt Baker. CEPH-CEU LCLs were acquired from the Coriell Institute for Medical Research. Twenty-four cell lines from the CEPH-CEU collection were obtained from the Coriell repository. Identification numbers for the 24 cell lines are GM12057, GM07345, GM12145, GM10860, GM11829, GM12056, GM11840, GM11830, GM12004, GM12144, GM10846, GM07357, GM11839, GM12003, GM06994, GM11993, GM10856, GM11992, GM07000, GM07348, GM10854, GM07029, GM10838, and GM10851. Cells were cultur

Reference(s)