Pyrrolysine and Selenocysteine Use Dissimilar Decoding Strategies
2005; Elsevier BV; Volume: 280; Issue: 21; Language: English
10.1074/jbc.m501458200
ISSN 1083-351X
Authors: Yan Zhang, Pavel V. Baranov, John F. Atkins, Vadim N. Gladyshev
Topic(s): Metalloenzymes and iron-sulfur proteins
Abstract: Selenocysteine (Sec) and pyrrolysine (Pyl) are known as the 21st and 22nd amino acids in proteins. Both are encoded by codons that normally function as stop signals. Sec specification by UGA codons requires the presence of a cis-acting selenocysteine insertion sequence (SECIS) element. Similarly, it is thought that Pyl is inserted by UAG codons with the help of a putative pyrrolysine insertion sequence (PYLIS) element. Herein, we analyzed the occurrence of Pyl-utilizing organisms, Pyl-associated genes, and Pyl-containing proteins. The Pyl trait is restricted to several microbes, and only one organism has both Pyl and Sec. We found that methanogenic archaea that utilize Pyl have few genes that contain in-frame UAG codons, and many of these are followed by nearby UAA or UGA codons. In addition, unambiguous UAG stop signals could not be identified. This bias was not observed in Sec-utilizing organisms or in non-Pyl-utilizing archaea, nor was it observed for the other stop codons. These observations, as well as analyses of the coding potential of UAG codons, overlapping genes, and release factor sequences, suggest that UAG is not a typical stop signal in Pyl-utilizing archaea. On the other hand, searches for conserved Pyl-containing proteins revealed only four protein families, including methylamine methyltransferases and transposases. Only methylamine methyltransferases matched the Pyl trait and had conserved Pyl, suggesting that this amino acid is used primarily by these enzymes. These findings are best explained by a model wherein UAG codons may have ambiguous meaning and Pyl insertion can effectively compete with translation termination for UAG codons, obviating the need for a specific PYLIS structure. Thus, Sec and Pyl follow dissimilar decoding and evolutionary strategies.
Abbreviations: Pyl, pyrrolysine; Sec, selenocysteine; MtmB, monomethylamine methyltransferase; MtbB, dimethylamine methyltransferase; MttB, trimethylamine methyltransferase; pylT, tRNAPyl gene; PylS, pyrrolysyl-tRNA synthetase; SECIS, selenocysteine insertion sequence; PYLIS, pyrrolysine insertion sequence; ORF, open reading frame; UTR, untranslated region; RF1, class I release factor; RF2, release factor 2; selA, Sec synthase gene; SelB, Sec-specific elongation factor; EFSec, eukaryotic Sec-specific elongation factor; selC, tRNASec gene; SelD, selenophosphate synthetase; SBP2, SECIS-binding protein 2; nt, nucleotide(s).

Pyrrolysine (Pyl) has recently been identified in the active site of monomethylamine methyltransferase (MtmB) from Methanosarcina barkeri, and sequences encoding Pyl-containing homologs of this protein were found in several other methanogenic archaea, including Methanosarcina acetivorans, Methanosarcina mazei, and Methanosarcina thermophila (1: Hao et al., Science 2002, 296: 1462-1466; 2: Srinivasan et al., Science 2002, 296: 1459-1462; 3: Krzycki, Curr. Opin. Chem. Biol. 2004, 8: 484-491). Methylamine methyltransferase genes from these organisms contain in-frame UAG codons, which do not halt translation but encode Pyl. Following this discovery, additional Pyl-containing methyltransferases have been identified in Methanosarcina, and to date three classes of Pyl-containing methylamine methyltransferase genes are known: monomethylamine methyltransferase (mtmB), dimethylamine methyltransferase (mtbB), and trimethylamine methyltransferase (mttB) (1: Hao et al., 2002; 2: Srinivasan et al., 2002). Some Methanosarcina contain several paralogs of each methyltransferase family. Using this information, various genome sequences were scanned for genes encoding homologous Pyl-containing proteins. This search identified an mttB homolog in a Gram-positive bacterium, Desulfitobacterium hafniense (2: Srinivasan et al., 2002). More recently, an Antarctic archaeon, Methanococcoides burtonii, has also been reported to utilize Pyl (4: Goodchild et al., Mol. Microbiol. 2004, 53: 309-321). In contrast, no Pyl-containing methyltransferases have been reported in eukaryotes. It is also not known whether the utilization of Pyl is restricted to methyltransferases or whether other Pyl-containing proteins exist.

Although the mechanism of Pyl biosynthesis and incorporation into protein is not fully understood, the presence of a Methanosarcina tRNAPyl gene (pylT) with the CUA anticodon and of a class II aminoacyl-tRNA synthetase gene (pylS) argued for cotranslational incorporation of Pyl (2: Srinivasan et al., 2002). A recent study suggested that pylT and pylS are the only foreign genes necessary for translating UAG as Pyl in Escherichia coli when the cells are supplemented with exogenous Pyl (5: Blight et al., Nature 2004, 431: 333-335). In addition, it was reported that PylS could activate and ligate Pyl directly onto tRNAPyl (5: Blight et al., 2004; 6: Polycarpo et al., Proc. Natl. Acad. Sci. 2004, 101: 12450-12454) and that tRNAPyl is directly recognized by the standard elongation factor EF-Tu (7: Théobald-Dietrich et al., Nucleic Acids Res. 2004, 32: 1091-1096). Analysis of the genomic context of pylT and pylS identified pylB, pylC, and pylD, which were suggested to participate in Pyl biosynthesis or insertion into protein (2: Srinivasan et al., 2002). The pylT, -S, -B, -C, and -D genes constitute a Pyl gene cluster (or Pyl operon), and the pylT and pylS genes are considered the Pyl utilization signature.

Because Pyl is inserted in response to a codon that in most organisms functions as a terminator, there are three distinct possibilities for how Pyl insertion can be achieved: (i) redefinition of a subset of UAG stop codons by a cis-acting mRNA signal to encode Pyl; (ii) reassignment of all UAG codons to encode Pyl; and (iii) ambiguous meaning of UAG codons, e.g. a competition between read-through and termination such that a fraction of ribosomes translating the UAG codon incorporate Pyl, whereas the rest support termination (8: Atkins and Gesteland, Science 2002, 296: 1409-1410). However, the attention of researchers has previously focused on the first possibility, because of the analogy between Pyl and selenocysteine (Sec) (9: Namy et al., Mol. Cell 2004, 13: 157-168). Both Pyl and Sec are encoded by “termination” codons and are the only known additions to the pool of 20 universal, directly encoded amino acids.
Therefore, Sec and Pyl are known as the 21st and 22nd amino acids. The mechanism of Sec insertion is known in much detail (10: Böck et al., Mol. Microbiol. 1991, 5: 515-520; 11: Low and Berry, Trends Biochem. Sci. 1996, 21: 203-208; 12: Rother et al., Biofactors 2001, 14: 75-83; 13: Hatfield and Gladyshev, Mol. Cell. Biol. 2002, 22: 3565-3576). Incorporation of Sec requires the presence of a selenocysteine insertion sequence (SECIS) element, a hairpin structure residing in the 3′-untranslated regions (3′-UTRs) of selenoprotein mRNAs in eukaryota and archaea, or immediately downstream of Sec UGA codons in eubacteria (13: Hatfield and Gladyshev, 2002; 14: Böck et al., Trends Biochem. Sci. 1991, 16: 463-467; 15: Berry et al., EMBO J. 1993, 12: 3315-3322). SECIS is essential for Sec insertion, whereas in its absence UGA serves as a terminator (16: Fourmy et al., J. Mol. Biol. 2002, 324: 137-150). Several attempts have been made to search for analogous stem-loop structures in mRNAs encoding Pyl-containing proteins. A putative secondary structure was predicted 5–6 nucleotides downstream of the Pyl-encoding UAG codon in mtmB mRNAs and designated the pyrrolysine insertion sequence (PYLIS) element (9: Namy et al., Mol. Cell 2004, 13: 157-168; 17: Ibba and Söll, Genes Dev. 2004, 18: 731-738). This predicted structure has not been tested experimentally for functional relevance.

Identification of genes encoding Sec- and Pyl-containing proteins in genomic sequences is challenging, because standard annotation tools interpret UGA and UAG as stop signals. For example, most methylamine methyltransferases in Methanosarcina are incorrectly annotated. At present, no tools are available for prediction of Pyl-containing proteins, and previous in silico approaches were limited to manual analyses and BLAST searches (2: Srinivasan et al., Science 2002, 296: 1459-1462). In the case of Sec, tools have been developed and successfully used to identify selenoprotein genes by searching for SECIS elements (18: Kryukov et al., J. Biol. Chem. 1999, 274: 33888-33897; 19: Lescure et al., J. Biol. Chem. 1999, 274: 38147-38154; 20: Kryukov et al., Science 2003, 300: 1439-1443) and Sec/Cys pairs in homologous sequences (21: Kryukov and Gladyshev, EMBO Rep. 2004, 5: 538-543; 22: Castellano et al., EMBO Rep. 2004, 5: 71-77).

In this study, we used bioinformatics approaches to analyze Pyl-utilizing organisms and Pyl-containing proteins, and to examine possible mechanisms of Pyl insertion. Our data suggest that indiscriminate Pyl insertion at UAG may be tolerated in Pyl-utilizing archaea and that Pyl decoding processes are different from those of Sec.
Sequence Databases and Resources

A total of 260 completely sequenced prokaryotic genomes were downloaded from the NCBI ftp server (ftp.ncbi.nih.gov/genomes/Bacteria). To analyze incompletely sequenced genomes, we used partial genomic sequences (contigs) from the NCBI database of microbial genomes as well as a non-redundant nucleotide database. Both web-based and local BLAST programs (23: Altschul et al., J. Mol. Biol. 1990, 215: 403-410) were used for sequence analysis (available at ftp.ncbi.nih.gov/blast and www.ncbi.nlm.nih.gov/BLAST).

Identification of Pyl Gene Cluster Homologs and Known Pyl-containing Proteins

pylT and pylS sequences from M. barkeri (accession number AY064401) were used as queries to search genomic databases for possible homologs with an e value below 0.01. Candidate tRNAPyl sequences were further analyzed to identify structural features associated with known tRNAPyl, such as a 6-bp acceptor stem and a base between the D and acceptor stems (2: Srinivasan et al., Science 2002, 296: 1459-1462). Other genes in the Pyl gene cluster (pylB, -C, and -D) were similarly analyzed by comparative sequence analyses. We further examined whether these genes were organized in clusters. The tblastn program with default parameters was used to search for Pyl-containing methylamine methyltransferases in different organisms. Open reading frames (ORFs) and conservation of UAG-flanking regions were then examined manually. Multiple alignments and phylogenetic trees were generated with ClustalW (24: Thompson et al., Nucleic Acids Res. 1994, 22: 4673-4680).
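The e value cutoff used in the homolog searches above can be illustrated with a short filter over tabular BLAST output. This is only a sketch: the text does not say which output format was parsed, so the 12-column tab-separated layout below (the standard columns of BLAST+ `-outfmt 6`) is an assumption, and the query and contig names in the example are hypothetical.

```python
import csv
import io

# Assumed column order: the 12 standard columns of BLAST+ "-outfmt 6".
# The original analysis may have parsed a different report format.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(tabular_text, max_evalue=0.01):
    """Return hits (as dicts) whose e value is strictly below max_evalue."""
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        if hit["evalue"] < max_evalue:
            hits.append(hit)
    return hits


# Hypothetical two-hit report: one strong hit, one above the cutoff.
example = (
    "pylS_query\tcontig_A\t62.1\t410\t150\t3\t1\t410\t1200\t2429\t1e-120\t430\n"
    "pylS_query\tcontig_B\t28.0\t90\t60\t2\t5\t94\t300\t569\t0.5\t31.2\n"
)
kept = filter_hits(example)
print([h["sseqid"] for h in kept])
```

Hits that pass such a filter would still need the manual checks described above (tRNA structural features, gene clustering, conservation of UAG-flanking regions).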
Analysis of Candidate PYLIS Elements in Methylamine Methyltransferase Genes

Sequences either downstream of in-frame UAG codons or in the putative 3′-UTR of methylamine methyltransferase mRNAs were analyzed manually to search for possible conserved structures and sequence features within these structures. RNA secondary structures were predicted with RNAfold 1.4, which is a part of the Vienna RNA package (available at www.tbi.univie.ac.at/~ivo/RNA/) (25: Hofacker et al., Monatsh. Chem. 1994, 125: 167-188).

Analyses of UAG Codon Function

To characterize functions of UAG codons, a homology-based approach was developed and used to analyze UAG-flanking regions in four Pyl-utilizing organisms, M. acetivorans, M. mazei, M. burtonii, and D. hafniense. This procedure was implemented using simple Perl scripts (available upon request). First, genes terminating with UAG were extracted from the original annotation files and extended until the next non-UAG stop signal (UAA/UGA). ORFs translated from the elongated genes were analyzed by tblastn against non-redundant and microbial genome databases. We also screened for conservation of UAG codons in nucleotide sequences and of UAG-flanking regions in protein sequences. This procedure assigned each UAG codon to one of three categories as follows:

(i) A UAG was interpreted as a terminator if an elongated sequence was sufficiently long (>30 nucleotides), and all of its homologs had a true stop signal (either a non-UAG terminator in Pyl-utilizing organisms or any termination signal in other organisms) that corresponded to the UAG codon.

(ii) A UAG was interpreted as a candidate Pyl codon if an elongation was >30 nt, and >50% of homologs extended beyond the UAG and terminated near the termination site of the elongated sequence. All identified sequences were then analyzed for conservation of UAG in Pyl-utilizing organisms with blastn and blastp.

(iii) A UAG was not assigned a function if neither of the two conditions above was satisfied (for example, if we observed short elongations beyond UAG codons, lack of sequence similarity between homologs in regions flanking UAG, or a small number of homologs extending beyond the UAG).

A non-Pyl-utilizing archaeon, Methanococcus jannaschii, was also analyzed using the same approach. It served as the control in searches involving Pyl-utilizing archaea.

Analysis of Overlaps between Elongated UAG-containing Genes and Downstream Genes

Overlaps between genes are common in prokaryotic genomes (26: Fukuda et al., Gene (Amst.) 2003, 323: 181-187). To examine how extensions of UAG-containing genes relate to the extent of the overlap, we analyzed overlapping genes before and after sequence elongation downstream of predicted stop codons in M. acetivorans and M. mazei. A simple Perl script was developed for this analysis (available upon request). We first identified overlapping genes in the original genome annotations, determined the number of overlaps in each genome, and measured overlap lengths. The longest overlap in a genome was defined as an overlap threshold. We then extended genes terminated at UAG until the next non-UAG stop signal using the approach described above and repeated the overlap analysis procedure. We reasoned that if no significant increase in the number of genes whose overlap was longer than the threshold were observed, the situation would be consistent with the use of UAG as either a terminator or a Pyl codon. However, if the sequence extension procedure generated many genes with large overlaps with the downstream genes, the situation would be consistent with the use of the UAG codon as a terminator. In addition, the gene overlaps involving genes terminated at UAA and UGA codons were analyzed using the same strategy (i.e. before and after extension). These served as controls.
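The extension, classification, and overlap steps described above can be sketched in a few lines. This is not the authors' Perl code, which is not reproduced in the text. The function names, the forward-strand-only coordinates, the definition of extension length (nucleotides between the UAG and the next UAA/UGA), and the encoding of homolog evidence as simple labels are all assumptions made for illustration.

```python
def extend_past_uag(dna, orf_start, uag_pos):
    """Extend an ORF ending at an in-frame TAG to the next in-frame TAA/TGA.

    dna is a forward-strand DNA string; orf_start and uag_pos are 0-based
    indices of the first codon and of the TAG. Returns
    (new_stop_pos, extension_nt), or None if no TAA/TGA follows in frame.
    Further TAG codons inside the extension are read through.
    """
    if (uag_pos - orf_start) % 3 != 0 or dna[uag_pos:uag_pos + 3] != "TAG":
        raise ValueError("uag_pos must point at an in-frame TAG")
    pos = uag_pos + 3
    while pos + 3 <= len(dna):
        if dna[pos:pos + 3] in ("TAA", "TGA"):
            return pos, pos - (uag_pos + 3)
        pos += 3
    return None


def classify_uag(extension_nt, homolog_fates, min_extension=30):
    """Assign a UAG to one of the three categories (i)-(iii).

    homolog_fates holds one label per homolog (hypothetical encoding):
    'stop' if the homolog has a true stop aligned with the UAG, 'extends'
    if it continues past the UAG and ends near the new stop.
    """
    if extension_nt > min_extension and homolog_fates:
        if all(fate == "stop" for fate in homolog_fates):
            return "terminator"
        extending = sum(fate == "extends" for fate in homolog_fates)
        if extending / len(homolog_fates) > 0.5:
            return "candidate Pyl codon"
    return "unassigned"


def long_overlaps(original, extended):
    """Count adjacent gene pairs in `extended` whose overlap exceeds the
    longest overlap found in `original`. Genes are (start, end) tuples
    with end exclusive, all on one strand in this sketch."""
    def overlaps(genes):
        genes = sorted(genes)
        return [max(0, a[1] - b[0]) for a, b in zip(genes, genes[1:])]
    threshold = max(overlaps(original), default=0)
    return sum(length > threshold for length in overlaps(extended))


# Toy ORF: ATG, two sense codons, TAG, eleven sense codons, then TAA.
dna = "ATG" + "GCT" * 2 + "TAG" + "GCA" * 11 + "TAA"
print(extend_past_uag(dna, 0, 9))
print(classify_uag(33, ["extends", "extends", "stop"]))
```

A real implementation would also need to handle genes on the reverse strand and to derive the homolog labels from tblastn alignments, which is where most of the judgment in the published procedure lies.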
Identification of Genes Associated with Pyl Utilization

All predicted ORFs in M. acetivorans and M. mazei genomes were searched for exclusive occurrence in genomes that utilize Pyl. The tblastn program was used to search these sequences against 260 completely sequenced prokaryotic genomes, the non-redundant nucleotide database, and unfinished microbial contigs with an e value below 0.05. A simple script was developed to parse the tblastn output and examine presence/absence of homologs in analyzed genomes. A pairwise alignment tool, bl2seq, was then used with an e value cutoff set to 0.001 to cluster protein sequences into different families. The occurrence of these proteins in D. hafniense was then analyzed.

Available completely and incompletely sequenced prokaryotic genomes were screened for tRNAPyl (pylT) and pyrrolysyl-tRNA synthetase (pylS) sequences, and their patterns of occurrence were compared with those of other Pyl genes (pylB, -C, and -D). We found that the products of pylB (biotin synthase homolog) and pylC (carbamoyl-phosphate synthetase homolog) have close homologs in a wide variety of organisms. In contrast, pylS, pylT, and pylD (nucleoside-diphosphate sugar epimerase homolog) are specific for methanogenic archaea and D. hafniense (Fig. 1). In D. hafniense, pylSn and pylSc encode the N- and C-terminal parts of PylS (2: Srinivasan et al., Science 2002, 296: 1459-1462), and small overlaps occur between pylT and pylSc (4 nt) and between the pylB and pylC genes (52 nt). pylS, pylT, and pylD always cluster with pylB and pylC, and the overall Pyl gene cluster has identical mutual organization of these sequences, except that D. hafniense pylSn is located at the end of the cluster (Fig. 1). Thus, the five Pyl genes define the Pyl gene cluster, but only pylT and pylS (and perhaps pylD) sequences can be used as the signature for the Pyl trait.
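The presence/absence screen described under "Identification of Genes Associated with Pyl Utilization" reduces to a set comparison once the tblastn output has been parsed. The sketch below assumes the parsed result is a mapping from each query ORF to the set of genomes with a hit below the e value cutoff; the function name, ORF identifiers, and genome labels are hypothetical.

```python
def exclusive_orfs(hits_by_orf, pyl_genomes):
    """Return ORFs whose homologs occur only in Pyl-utilizing genomes.

    hits_by_orf maps an ORF identifier to the set of genome names in which
    a homolog was found; ORFs with no hits at all are not reported.
    """
    pyl_genomes = set(pyl_genomes)
    return sorted(
        orf for orf, genomes in hits_by_orf.items()
        if genomes and set(genomes) <= pyl_genomes
    )


# Hypothetical parsed hits for three query ORFs.
pyl_genomes = {"M. acetivorans", "M. mazei", "M. barkeri", "M. burtonii"}
hits = {
    "orf_a": {"M. mazei", "M. barkeri"},   # only Pyl-utilizing genomes
    "orf_b": {"M. mazei", "E. coli"},      # also found outside the group
    "orf_c": set(),                        # no homologs detected
}
print(exclusive_orfs(hits, pyl_genomes))
```

Candidates returned by such a screen would then be clustered into families and checked for occurrence in D. hafniense, as described above.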
A search of completely and incompletely sequenced prokaryotic genomes for Pyl genes revealed only six organisms that could utilize Pyl, including four members of the genus Methanosarcina, M. burtonii, and D. hafniense. Methanosarcina species and M. burtonii belong to Methanosarcinales, suggesting that Pyl is encoded by a UAG codon in a restricted group of phylogenetically related organisms that occupy a specific environmental niche. High conservation of the Pyl gene cluster and the small number of organisms that utilize Pyl suggest its relatively recent origin.

In the six Pyl-utilizing organisms, a total of 29 Pyl-containing methylamine methyltransferase genes were identified (Table I). They are distributed in three enzyme families that do not share significant sequence similarity. Fig. 2 shows the occurrence of these genes in genomes and contigs. Only mtmB genes cluster with the Pyl operon genes (in three Methanosarcina organisms). In M. mazei, two distant duplicate mtmB genes are present. In M. barkeri, two duplicate mtmB1 genes cluster together and are on the opposite strand relative to the Pyl cluster.

Table I. Distribution of Pyl-containing methylamine methyltransferase genes in Pyl-utilizing organisms

| Domain of life | Organism | MMA methyltransferase (mtmB) | DMA methyltransferase (mtbB) | TMA methyltransferase (mttB) |
|---|---|---|---|---|
| Archaea | M. acetivorans | mtmB1, mtmB2 | mtbB1, mtbB2, mtbB3 | mttB1, mttB2 |
| | M. barkeri | mtmB1 (ref. 2), mtmB2 | mtbB1, mtbB2, mtbB3 | mttB |
| | M. mazei | mtmB (ref. 2) | mtbB1, mtbB2, mtbB3 | mttB1, mttB2 |
| | M. thermophila | | | mttB |
| | M. burtonii | mtmB1, mtmB2 | mtbB1, mtbB2 | mttB1, mttB2 |
| Bacteria | D. hafniense | | | mttB |

Ref. 2: Srinivasan et al., Science 2002, 296: 1459-1462.

Further analyses of the three methylamine methyltransferase protein families revealed conservation of Pyl in MtmB and MtbB (i.e. no MtmB or MtbB homologs were detected in which Pyl is replaced with other residues or in which the Pyl-encoding UAG codon is replaced with a non-UAG stop signal). On the other hand, multiple MttB homologs were detected in which Pyl is not conserved and is replaced with various amino acids (Fig. 3). This situation is in contrast to Sec, which is highly conserved. In addition, most selenoproteins have homologs in which Sec is replaced with cysteine (Cys). In fact, the Sec/Cys pair in homologous sequences is a feature that is used for identification of selenoproteins in genomic databases (21: Kryukov and Gladyshev, EMBO Rep. 2004, 5: 538-543; 22: Castellano et al., EMBO Rep. 2004, 5: 71-77).

Phylogenetic analyses of the three methylamine methyltransferase families as well as of the pylT and pylS genes typically placed D. hafniense genes as outliers (Fig. 4). In Pyl-utilizing archaea, all mtmB, mtbB, and mttB genes encode Pyl-containing proteins. Conversely, D. hafniense possessed mttB genes encoding proteins with and without Pyl. MttB homologs that did not have Pyl were broadly distributed in other bacteria. Although the use of Pyl appears to be prevalent in methanogenic archaea, from our data it could not be established with certainty whether the Pyl trait evolved in these organisms or in bacteria. Both Pyl operon sequences and mttB have optimal codon usage in organisms in which they are present (data not shown), arguing against a recent (traceable) lateral transfer of the Pyl trait between methanogenic archaea and D. hafniense.

By analogy to SECIS, a stem-loop structure in selenoprotein mRNAs that reprograms a small fraction of UGA codons in a genome to serve in Sec insertion (11: Low and Berry, Trends Biochem. Sci. 1996, 21: 203-208; 13: Hatfield and Gladyshev, Mol. Cell. Biol. 2002, 22: 3565-3576), the occurrence of putative PYLIS structures was proposed (9: Namy et al., Mol. Cell 2004, 13: 157-168; 17: Ibba and Söll, Genes Dev. 2004, 18: 731-738). However, we found that the RNA structures downstream of UAG codons in several mtbB and mttB mRNAs that were recently suggested as putative PYLIS elements (17: Ibba and Söll, 2004) are dissimilar and do not occur in mtmB. Manual analyses of sequences downstream of UAG codons, as well as sequences in UTRs, revealed no obvious common structure shared by members of all three methylamine methyltransferase families. A possibility remains that structures downstream of UAG codons inhibit termination.

Because UGA has a dual function in Sec-utilizing organisms (Sec insertion and translation termination), there is a possibility that UAG, in a similar fashion, also serves two functions (Pyl insertion and translation termination) in Pyl-utilizing organisms. If PYLIS is absent, what might be a mechanism for discriminating between the two functions of UAG codons? An alternative possibility would be a reassignment of all UAG codons from stop to Pyl. To address these possibilities, we analyzed the distribution of the three stop codons in Pyl-utilizing organisms (Table II). In D. hafniense, the percentage of the genes predicted to terminate at UAG is 22.5%, which is similar to the proportion of genes predicted to terminate at UGA (28.4%).
The extensive utilization of UAG in this bacterium suggests that this codon likely has a dual role in protein synthesis.

Table II. Distribution of annotated UAA, UAG, and UGA codons in the genomes of Pyl-utilizing organisms

| Domain | Organism | Genome size (nt) | Annotated genes | Stop codon UAG | Stop codon UAA | Stop codon UGA |
|---|---|---|---|---|---|---|
| Archaea | M. acetivorans | 5,751,492 | 4,540 | 224 (4.9%) | 2,217 (48.8%) | 2,099 (46.3%) |
| | M. mazei | 4,096,345 | 3,371 | 126 (3.7%) | 1,800 (53.4%) | 1,445 (42.9%) |
| | M. burtonii | ~2.6 m (unfinished) | 2,782 | 131 (4.7%) | 1,264 (45.4%) | 1,383 (49.9%) |
| Bacterium | D. hafniense | ~6.1 m (unfinished) | 4,999 | 1,127 (22.5%) | 2,427 (48.5%) | 1,422 (28.4%) |

In contrast, the proportion of genes that are predicted to terminate at UAG in Pyl-containing archaea is below 5% (Table II). [...] (>30 nt) beyond the UAG, and all of its homologs in other organisms have true stop signals that correspond to the UAG codon (that is, only the sequence upstream of UAG is conserved, whereas sequence similarity is absent downstream of UAG), the UAG should be a terminator (Fig. 5A). Using this strategy, we could reliably identify UGA stop signals in Sec-utilizing organisms and distinguish them from UGA codons for Sec (data not shown). Conversely, if the extension is long (>30 nt), and most candidate homologs extend beyond the UAG to end near (or after) the site corresponding to the non-UAG stop codon in the elongated sequence, the UAG codon in the sequence of interest is considered a Pyl codon candidate.

In addition to testing UAG function, this strategy could also be used for identification, in Pyl-encoding organisms, of candidate Pyl-containing proteins. To avoid the possibility of dealing with a sequencing error or a pseudogene, we required the presence of sequences encoding a candidate Pyl-containing protein in two or more genomes of Pyl-utilizing organisms (Fig. 5B).
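The stop-codon tallies in Table II amount to counting the final codon of every annotated gene and expressing each count as a share of all genes. A minimal sketch follows, assuming gene sequences are available as coding-strand DNA strings that include the stop codon; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def stop_codon_usage(gene_seqs):
    """Tally the terminal codon of each gene.

    gene_seqs are coding-strand DNA strings that include the stop codon.
    Returns {codon: (count, percent)} with codons written as RNA (UAG etc.)
    and percent rounded to one decimal place.
    """
    counts = Counter(seq[-3:].upper().replace("T", "U") for seq in gene_seqs)
    total = sum(counts.values())
    return {codon: (n, round(100 * n / total, 1)) for codon, n in counts.items()}


# Toy annotation with four genes.
genes = ["ATGAAATAG", "ATGTAA", "ATGCCCTAA", "ATGTGA"]
print(stop_codon_usage(genes))

# The same arithmetic applied to the M. mazei row of Table II.
for codon, n in [("UAG", 126), ("UAA", 1800), ("UGA", 1445)]:
    print(codon, round(100 * n / 3371, 1))
```

Counts obtained this way reflect the annotation only; as the text notes, annotation tools treat every UAG as a stop, so some annotated UAG terminators in Pyl-utilizing organisms may in fact be read through.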
In this case, methylamine methyltransferases served as true positives, because they can be extended beyond their Pyl UAG codons, share homology with other proteins in sequences downstream of their UAG codons, and occur in at least four Pyl-utilizing organisms as Pyl-containing forms (Fig. 3). In other situations (see Fig. 5C for specific examples), we could not distinguish between Pyl-encoding functions and stop signals. However, if the UAG was followed by a nearby stop codon, either Pyl insertion or translation termination could presumably be tolerated. Surprisingly