Only a Small Subset of the Horizontally Transferred Chromosomal Genes in Escherichia coli Are Translated into Proteins
2004; Elsevier BV; Volume: 3; Issue: 8; Language: English
DOI: 10.1074/mcp.m400030-mcp200
ISSN: 1535-9484
Authors: Masato Taoka, Yoshio Yamauchi, Takashi Shinkawa, Hiroyuki Kaji, Wakana Motohashi, Hiroshi Nakayama, Nobuhiro Takahashi, Toshiaki Isobe
Topic(s): RNA and protein synthesis mechanisms
Abstract: Horizontally transferred genes are believed to play a critical role in the divergence of bacterial strains from a common ancestor, but whether all of these genes express functional proteins in the cell remains unknown. Here, we used an integrated LC-based protein identification technology to analyze the proteome of Escherichia coli strain K12 (JM109) and identified 1,480 expressed proteins, equivalent to ∼35% of the total open reading frames predicted in the genome. This subset contained proteins with cellular abundances ranging from several dozen to hundreds of thousands of copies, and included nearly all types of proteins in terms of chemical characteristics, subcellular distribution, and function. Interestingly, the subset also contained 138 of 164 gene products that are currently known to be essential for bacterial viability (84% coverage). However, the subset contained only a very small population (10%) of protein products from genes mapped within K-loops, which are “hot spots” for the integration of foreign DNA within the K12 genome. On the other hand, these genes in K-loops appeared to be transcribed into RNA almost as efficiently as the native genes in the bacterial cell, as monitored by DNA microarray analysis, raising the possibility that most of the recently acquired foreign genes are poorly matched to the translational machinery that serves the native genes and do not generate functional proteins within the cell.

It is generally accepted that the genetic diversity of organisms arose from a number of mechanisms for obtaining new genes, including lateral gene transfer from other species, creation of mosaic genes from parts of other genes, duplication of pre-existing genes, and de novo invention of genes from DNA that previously had a noncoding sequence (1. Wolfe K.H. Li W.H. Molecular evolution meets the genomics revolution. Nat. Genet. 2003; 33: 255-265). Unlike eukaryotes, which evolve principally through the modification of pre-existing genetic information, bacteria have obtained a significant proportion of their genetic diversity through the acquisition of genes from distantly related organisms (1, 2. Doolittle R.F. Biodiversity: Microbial genomes multiply. Nature. 2002; 416: 697-700, 3. Koonin E.V. Makarova K.S. Aravind L. Horizontal gene transfer in prokaryotes: Quantification and classification. Annu. Rev. Microbiol. 2001; 55: 709-742, 4. Ochman H. Lawrence J.G. Groisman E.A. Lateral gene transfer and the nature of bacterial innovation. Nature. 2000; 405: 299-304, 5. Boucher Y. Douady C.J. Papke R.T. Walsh D.A. Boudreau M.E. Nesbo C.L. Case R.J. Doolittle W.F. Lateral gene transfer and the origins of prokaryotic groups. Annu. Rev. Genet. 2003; 37: 283-328). This lateral, or horizontal, gene transfer has been studied by genetic approaches that compare gene content in a variety of genomes (6. Nelson K.E. Clayton R.A. Gill S.R. Gwinn M.L. Dodson R.J. Haft D.H. Hickey E.K. Peterson J.D. Nelson W.C. Ketchum K.A. McDonald L. Utterback T.R. Malek J.A. Linher K.D. Garrett M.M. Stewart A.M. Cotton M.D. Pratt M.S. Phillips C.A. Richardson D. Heidelberg J. Sutton G.G. Fleischmann R.D. Eisen J.A. White O. Salzberg S.L. Smith H.O. Venter J.C. Fraser C.M. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature. 1999; 399: 323-329, 7. Aravind L. Tatusov R.L. Wolf Y.I. Walker D.R. Koonin E.V. Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet. 1998; 14: 442-444, 8. Rivera M.C. Jain R. Moore J.E. Lake J.A. Genomic evidence for two functionally distinct gene classes. Proc. Natl. Acad. Sci. U. S. A. 1998; 95: 6239-6244) or find genes with atypical G+C content and patterns of codon usage (9. Lawrence J.G. Ochman H. Amelioration of bacterial genomes: Rates of change and exchange. J. Mol. Evol. 1997; 44: 383-397, 10. Lawrence J.G. Ochman H. Molecular archaeology of the Escherichia coli genome. Proc. Natl. Acad. Sci. U. S. A. 1998; 95: 9413-9417). These studies showed that the genetic diversity of bacteria results not only from errors in DNA replication and repair but also from horizontal exchange and recombination of DNA sequences from similar and disparate species.
It is believed that horizontally transferred genes underlie the rapid adaptation of bacteria to novel environments and effectively change the pathogenic and ecological character of bacterial species (1-5). In Escherichia coli, for example, several hundred genes among the 4,289 predicted genes in the genome (11. Blattner F.R. Plunkett 3rd, G. Bloch C.A. Perna N.T. Burland V. Riley M. Collado-Vides J. Glasner J.D. Rode C.K. Mayhew G.F. Gregor J. Davis N.W. Kirkpatrick H.A. Goeden M.A. Rose D.J. Mau B. Shao Y. The complete genome sequence of Escherichia coli K-12. Science. 1997; 277: 1453-1474) have been acquired from other organisms since this species diverged from the Salmonella lineage 100 million years ago (12. Doolittle R.F. Feng D.F. Tsang S. Cho G. Little E. Determining divergence times of the major kingdoms of living organisms with a protein clock. Science. 1996; 271: 470-477). Also, the subsequent multistep transfer of gene clusters conferring virulence characteristics, such as O-loops, transformed a benign strain of E. coli into a pathogenic strain (13. Perna N.T. Plunkett 3rd, G. Burland V. Mau B. Glasner J.D. Rose D.J. Mayhew G.F. Evans P.S. Gregor J. Kirkpatrick H.A. Posfai G. Hackett J. Klink S. Boutin A. Shao Y. Miller L. Grotbeck E.J. Davis N.W. Lim A. Dimalanta E.T. Potamousis K.D. Apodaca J. Anantharaman T.S. Lin J. Yen G. Schwartz D.C. Welch R.A. Blattner F.R. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature. 2001; 409: 529-533). Thus, horizontal gene transfer appears to have a large impact on bacterial evolution; however, whether all of these genes contribute to bacterial diversity by expressing functional proteins in the cell remains unknown.

Proteomic technologies powered by advancements in MS and bioinformatics and coupled with accumulated genome sequence data allow a comprehensive study of the protein constituents of cells and tissues. In particular, integrated multidimensional LC-based protein identification technology is powerful for large-scale and systematic protein identification in very complex biological samples (14. Takahashi N. Kaji H. Yanagida M. Hayano T. Isobe T. Proteomics: Advanced technology for the analysis of cellular function. J. Nutr. 2003; 133: 2090S-2096S). We constructed one such system by combining a fully automated microscale multidimensional LC and a high-resolution hybrid MS coupled with a data analysis system (15. Isobe T. Yamauchi Y. Taoka M. Takahashi N. Automated two-dimensional LC-MS/MS for large-scale protein analysis. in: Proteins and Proteomics. Cold Spring Harbor Press, Cold Spring Harbor, NY. 2002: 869-876) and used the system for large-scale identification of proteins expressed in Caenorhabditis elegans (16. Mawuenyega K.G. Kaji H. Yamauchi Y. Shinkawa T. Saito H. Taoka M. Takahashi N. Isobe T. Large-scale identification of Caenorhabditis elegans proteins by multidimensional liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2003; 2: 23-35) and for development of a novel strategy to identify N-glycoproteins (17. Kaji H. Saito H. Yamauchi Y. Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003; 21: 667-672). Here, we applied this technology to the comprehensive analysis of the proteome of E. coli strain K12 (JM109) and identified a protein subset corresponding to ∼35% of the total ORFs predicted in the genome. Based on this analysis as well as the analysis of gene expression by DNA microarrays, we propose that most of the recently acquired foreign genes do not express protein products and may therefore be either pseudogenes or genes that serve to generate functional RNAs.

E. coli K12 JM109 (endA1, gyrA96, thi, hsdR17, supE44, relA1, Δ(lac-proAB), recA1, F′[traD36, proAB+, lacIq, lacZΔM15]) cells were grown in shaking culture at 37 °C in Luria-Bertani medium containing 1% tryptone (Becton Dickinson, Sparks, MD), 0.5% yeast extract (Becton Dickinson), and 1% NaCl. Cells in mixed late logarithmic and early stationary phase were harvested from the growing culture (OD600 = 1.5) by centrifugation at 10,000 × g for 10 min at room temperature. The precipitate was solubilized in 6 M guanidinium hydrochloride and S-alkylated with iodoacetamide as described (15, 16). The S-carbamoylmethylated proteins were dialyzed against 10 mM ammonium bicarbonate (pH 8.0) to remove excess reagents and digested overnight at 37 °C with sequence-grade modified trypsin (Promega, Madison, WI) at an enzyme-substrate ratio of 1:100 (w/w).
The digest was acidified to pH 2.0 with 1 M HCl, and the resulting precipitate was removed by centrifugation. The supernatant was adjusted to pH 8.0 with aqueous ammonia (8 M) and was analyzed immediately for protein identification.

The tryptic digest was analyzed by an automated microscale 2DLC-MS/MS system as described (15, 16). Briefly, the peptide mixture was separated by a combination of first-dimensional anion-exchange LC on a bioassist-Q column (2-mm ID × 35 mm, 10-μm particles; TOSOH, Tokyo, Japan) and second-dimensional reversed-phase LC on a Mightysil-C18 column (320-μm ID × 100 mm, 3-μm particles; Kanto Chemicals, Tokyo, Japan), with the two dimensions synchronized by a computer program. The system was also equipped with a small “trap” precolumn packed with Mightysil-C18 (1-mm ID × 5 mm), inserted between the two analytical columns through a six-way column-switching valve, to remove salts from the anion-exchange LC eluate. The eluted peptides were sprayed directly into a Q-TOF hybrid mass spectrometer (Q-Tof2; Micromass UK Ltd., Manchester, United Kingdom). The peptides were detected in MS mode to select precursor ions for data-dependent CID tandem mass spectrometric (MS/MS) analysis; every 4 s, the four most intense signals were selected and subjected to MS/MS analysis. The total analysis time for a single 2DLC-MS/MS run was 16 h.
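The data-dependent selection step described above (pick the four largest signals from each survey scan) can be illustrated with a minimal sketch. This is not the instrument vendor's acquisition logic: the function name, the peak representation, and the example intensities are all hypothetical, and real acquisition software typically applies further rules (for example charge-state filters or exclusion lists) that the text does not describe.

```python
from typing import List, Tuple

# A peak is an (m/z, intensity) pair; this representation is an assumption
# made for illustration only.
Peak = Tuple[float, float]


def select_precursors(survey_scan: List[Peak], top_n: int = 4) -> List[Peak]:
    """Return the top_n most intense peaks of one MS survey scan."""
    ranked = sorted(survey_scan, key=lambda peak: peak[1], reverse=True)
    return ranked[:top_n]


# Hypothetical survey scan with six peaks.
scan = [
    (445.12, 1200.0),
    (512.30, 8800.0),
    (623.84, 300.0),
    (701.41, 5400.0),
    (815.90, 9100.0),
    (930.55, 2500.0),
]

# The four most intense peaks would be passed on to MS/MS in this cycle.
selected = select_precursors(scan)
```

Because only a handful of precursors is fragmented per cycle, peptides that co-elute with more intense species can be missed in any one run, which is one reason repeated runs of the same sample tend to add new identifications.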
The large volume of MS/MS data generated was converted to text files listing the mass values and intensities of fragment ions and was processed by the Mascot algorithm (Matrix Science Ltd., London, United Kingdom) for peptide assignment with reference to the E. coli sequence database m52p (11) maintained by the University of Wisconsin-Madison genome project (www.genome.wisc.edu/pub/analysis/m52p.fap). The parameters for the database search were as previously described (16). We first screened the candidate peptides for probability-based Mowse scores that exceeded their thresholds (p < 0.05) and then applied stricter criteria for final assignment (16, 18. Natsume T. Yamauchi Y. Nakayama H. Shinkawa T. Yanagida M. Takahashi N. Isobe T. A direct nanoflow liquid chromatography-tandem mass spectrometry system for interaction proteomics. Anal. Chem. 2002; 74: 4725-4733): i) when the match score exceeded the threshold by at least 10, identifications were accepted without further consideration; ii) when the score was less than 10 above the threshold, or when an identification was based on a single matched MS/MS spectrum, we manually inspected the raw data for confirmation prior to acceptance.

Protein annotations, such as cellular role and subcellular localization, were obtained from the knowledge databases EcoCyc (19. Karp P.D. Riley M. Saier M. Paulsen I.T. Collado-Vides J. Paley S.M. Pellegrini-Toole A. Bonavides C. Gama-Castro S. The EcoCyc database. Nucleic Acids Res. 2002; 30: 56-58) (biocyc.org/ecocyc), COGs (20. Tatusov R.L. Natale D.A. Garkavtsev I.V. Tatusova T.A. Shankavaram U.T. Rao B.S. Kiryutin B. Galperin M.Y. Fedorova N.D. Koonin E.V. The COG database: New developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001; 29: 22-28) (www.ncbi.nlm.nih.gov/COG/), PEC (Ver2.26; shigen.lab.nig.ac.jp/ecoli/pec/index.jsp), and m52orfs (11) (www.genome.wisc.edu/pub/analysis/m52orfs.txt). The transmembrane segments and signal peptides of proteins were predicted by the SOSUI program (21. Hirokawa T. Boon-Chieng S. Mitaku S. SOSUI: Classification and secondary structure prediction system for membrane proteins. Bioinformatics. 1998; 14: 378-379) maintained at Tokyo University of Agriculture and Technology (sosui.proteome.bio.tuat.ac.jp/sosuiframe0.html).

E. coli strain K12 (JM109) was grown as described above, and total RNA (400 μg) was isolated from 50 ml of cells at OD600 = 1.5 by a hot phenol purification protocol.
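Returning briefly to the peptide assignment criteria given in the data analysis above, the two-tier rule can be written as a small decision function. This is only a sketch under stated assumptions: the function name and return labels are hypothetical, the per-peptide threshold is taken as an input (the text says it corresponds to p < 0.05), and a score exactly 10 above the threshold is treated here as meeting the margin.

```python
def triage_identification(
    score: float,
    threshold: float,
    n_spectra: int,
    margin: float = 10.0,
) -> str:
    """Classify a candidate identification by the two-tier rule.

    score      -- probability-based Mowse score of the match
    threshold  -- significance threshold for that match (p < 0.05)
    n_spectra  -- number of MS/MS spectra matched to the protein
    margin     -- extra score above threshold needed for automatic acceptance
    """
    # Initial screen: the score must exceed its threshold at all.
    if score <= threshold:
        return "reject"
    # Tier i: clear margin above threshold and more than one spectrum.
    if score - threshold >= margin and n_spectra > 1:
        return "accept"
    # Tier ii: marginal score or single-spectrum hit goes to manual review.
    return "manual_inspection"


# Hypothetical examples with a threshold of 35.
examples = {
    "strong, 3 spectra": triage_identification(50.0, 35.0, 3),
    "marginal, 3 spectra": triage_identification(40.0, 35.0, 3),
    "strong, 1 spectrum": triage_identification(60.0, 35.0, 1),
    "below threshold": triage_identification(30.0, 35.0, 2),
}
```

Manual inspection in the original workflow is a human judgment on the raw spectra; the function only decides which candidates are routed to that step.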
RNA (10 μg) was labeled with the Bioarray terminal labeling kit (Affymetrix, Santa Clara, CA) for microarray analysis. The analysis was performed on Affymetrix E. coli Antisense GeneChip arrays as described (22. Soupene E. van Heeswijk W.C. Plumbridge J. Stewart V. Bertenthal D. Lee H. Prasad G. Paliy O. Charernnoppakul P. Kustu S. Physiological studies of Escherichia coli strain MG1655: Growth defects and apparent cross-regulation of gene expression. J. Bacteriol. 2003; 185: 5611-5626). Raw data files were analyzed by the statistical algorithm (23. Liu W.M. Mei R. Di X. Ryder T.B. Hubbell E. Dee S. Webster T.A. Harrington C.A. Ho M.H. Baid J. Smeekens S.P. Analysis of high density expression microarrays with signed-rank call algorithms. Bioinformatics. 2002; 18: 1593-1599) in the Microarray Analysis Suite 5.1 (Affymetrix) using the default parameters and were exported as text files for further sorting in Excel 2000 (Microsoft, Redmond, WA). In the absolute “present” analysis of mRNA, we set a stringent threshold of 0.03 for the detection p value (23). Under these conditions, we typically obtained a “present call” for between 2,600 and 3,000 genes. The analysis was repeated twice, and the reproducible signals were assigned as the “present” RNAs.

We used laboratory strain K12 (JM109) (24. Stevenson G. Neal B. Liu D. Hobbs M. Packer N.H. Batley M. Redmond J.W. Lindquist L. Reeves P. Structure of the O antigen of Escherichia coli K-12 and the sequence of its rfb gene cluster. J. Bacteriol. 1994; 176: 4144-4156) as a source of E. coli because much information, including its genome sequence (11), has accumulated in a variety of databases. To achieve a comprehensive analysis of the proteome, whole bacterial cells collected from a mixed late logarithmic and early stationary phase culture were dissolved in 6 M guanidinium hydrochloride, S-carbamoylmethylated under reducing conditions, and digested with trypsin. The resulting peptide mixture was then analyzed directly by an integrated multidimensional protein identification system (15-17), and spectral data were automatically processed to search the E. coli sequence database m52p (11) for protein identification. After removing redundant peptide assignments derived from any single protein, we identified ∼850 proteins in a single analysis, with an average of about four peptides assigned per protein.
In LC-based protein identification technologies, however, multiple measurements of the same preparation generally increase the number of proteins identified, because the complexity of the sample peptide mixture often exceeds the separation capacity of the LC-MS system and because the selection of a peptide for MS/MS analysis is data dependent and somewhat irregular (16, 25. Davis M.T. Beierle J. Bures E.T. McGinley M.D. Mort J. Robinson J.H. Spahr C.S. Yu W. Luethy R. Patterson S.D. Automated LC-LC-MS-MS platform using binary ion-exchange and gradient reversed-phase chromatography for improved proteomic analyses. J. Chromatogr. B. Biomed Sci. Appl. 2001; 752: 281-291). Thus, we analyzed the same peptide preparation repeatedly under the same conditions to maximize coverage of the E. coli proteome (Fig. 1). After repeating the analysis 10 times, a composite proteome of 1,480 proteins was obtained (Supplemental Table I), assigned using more than 58,700 peptides derived from ∼162,000 MS/MS spectra. The identified proteins corresponded to ∼35% of the total 4,289 ORFs predicted in the E. coli genome (11).

This protein subset contained a wide range of proteins with respect to physicochemical characteristics such as pI and molecular mass (Mr). The most acidic protein identified was the msyB gene product (pI 3.42), while the most basic was the rpmH gene product (pI 13.1).
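As a side note on the composite proteome and the ∼35% coverage figure reported above, the bookkeeping amounts to a running set union across repeated runs followed by a simple ratio. The sketch below is illustrative only: the function names are hypothetical and the three toy runs are invented to show the mechanics, not taken from the study's data. Only the totals 1,480 and 4,289 come from the text.

```python
from typing import Iterable, List, Set


def cumulative_identifications(runs: Iterable[Set[str]]) -> List[int]:
    """Return the size of the composite protein set after each run."""
    seen: Set[str] = set()
    totals: List[int] = []
    for run in runs:
        seen |= run  # proteins already seen are not counted twice
        totals.append(len(seen))
    return totals


def coverage_percent(identified: int, predicted: int) -> float:
    """Percentage of predicted ORFs with an identified protein product."""
    return 100.0 * identified / predicted


# Toy runs (invented membership, gene names used only as labels).
toy_runs = [
    {"tufA", "fusA", "rpoA"},
    {"tufA", "infB"},
    {"fusA", "infB", "dnaA"},
]
totals = cumulative_identifications(toy_runs)

# Coverage from the totals reported in the text.
coverage = coverage_percent(1480, 4289)
```

With the reported totals, the ratio works out to roughly 34.5%, consistent with the ∼35% quoted in the text.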
The smallest protein identified was the product of rpsV (Mr = 5.1 kDa), and the largest was the product of b2520 (Mr = 182 kDa). A 2D visualization of the pI and Mr of the 1,480 proteins and of the E. coli proteome predicted from the ORFs (Fig. 2) suggests that our analysis covered > 99% of the bacterial proteome (with respect to Mr and pI). We also note that this protein subset contained not only abundant cellular proteins such as the ribosomal proteins and the elongation factors Tu and G, which exist at ∼10^4 to 10^5 copies per cell (26. Pederson S. Bloch P.L. Reeh S. Neidhardt F.C. Patterns of protein synthesis in E. coli: A catalog of the amount of 140 individual proteins at different growth rates. Cell. 1978; 14: 179-190, 27. Howe J.G. Hershey W.J.B. Initiation factor and ribosome levels are coordinately controlled in Escherichia coli growing at different rates. J. Biol. Chem. 1983; 258: 1954-1959, 28. Bremer H. Dennis P.P. Modulation of chemical composition and other parameters of the cell growth rate. in: Escherichia coli and Salmonella, Cellular and Molecular Biology. American Society for Microbiology, Washington, D.C. 1996: 1533-1569), but also very minor protein components such as the subunits of DNA polymerase III (dnaE, dnaX, dnaN, and holD), which are present at not more than 100 copies per cell (29. Meeser, W., and Weigel, C. (1996) Initiation of chromosome replication, in Escherichia coli and Salmonella, Cellular and Molecular Biology (Neidhardt, F., Curtiss, I. R., Ingraham, J., Lin, E., Low, J. K. B., Magasanik, B., Reznikoff, W., Riley, M., Schaechter, M., and Umbarger, H., eds) Vol. 2, 2nd Ed., pp. 1579–1601, American Society for Microbiology, Washington, D.C.). Thus, the analysis appears to have covered E. coli proteins whose cellular abundance ranges from 100 to 100,000 copies per cell (Table I).

Table I. Copy numbers of E. coli proteins identified by LC-based identification technology

Protein                          Gene         Copy no./cell (×10^3)   Ref.
Elongation factor-Tu             tufA, tufB   140                     27
Ribosomal proteins               –            26                      27
Elongation factor-G              fusA         21                      27
RNA polymerase α                 rpoA         9.6                     27
Initiation factor 2              infB         7.8                     27
Initiation factor 1              infA         6.5                     27
Initiation factor 3              infC         5.2                     27
Elongation factor-Ts             tsf          4.7                     27
RNA polymerase β                 rpoB         3.6                     27
Phe tRNA synthetase α            pheS         3.1                     26
Ile tRNA synthetase              ileS         2.6                     26
Phe tRNA synthetase β            pheT         2.6                     26
Glu tRNA synthetase              gltX         2.3                     26
Gly tRNA synthetase β            glyS         2.3                     26
Gln tRNA synthetase              glnS         2.1                     26
Lys tRNA synthetase              lysS         2.1                     26
Arg tRNA synthetase              argS         1.6                     26
Thr tRNA synthetase              thrS         1.6                     26
Val tRNA synthetase              valS         1.6                     26
Leu tRNA synthetase              leuS         1.3                     26
Replication initiation protein   dnaA         1                       28