A Novel Family of Sequence-specific Endoribonucleases Associated with the Clustered Regularly Interspaced Short Palindromic Repeats
2008; Elsevier BV; Volume: 283; Issue: 29; Language: English
DOI: 10.1074/jbc.m803225200
ISSN: 1083-351X
Authors: Natalia Beloglazova, Greg Brown, Matthew D. Zimmerman, Michael Proudfoot, Kira S. Makarova, M. Kudritska, Samvel Kochinyan, Shuren Wang, M. Chruszcz, W. Minor, Eugene V. Koonin, A.M. Edwards, Alexei Savchenko, Alexander F. Yakunin
Topic(s): Bacteriophages and microbial interactions
Abstract: Clustered regularly interspaced short palindromic repeats (CRISPRs) together with the associated CAS proteins protect microbial cells from invasion by foreign genetic elements using presently unknown molecular mechanisms. All CRISPR systems contain proteins of the CAS2 family, suggesting that these uncharacterized proteins play a central role in this process. Here we show that the CAS2 proteins represent a novel family of endoribonucleases. Six purified CAS2 proteins from diverse organisms cleaved single-stranded RNAs preferentially within U-rich regions. A representative CAS2 enzyme, SSO1404 from Sulfolobus solfataricus, cleaved the phosphodiester linkage on the 3′-side and generated 5′-phosphate- and 3′-hydroxyl-terminated oligonucleotides. The crystal structure of SSO1404 was solved at 1.6 Å resolution, revealing the first ribonuclease with a ferredoxin-like fold. Mutagenesis of SSO1404 identified six residues (Tyr-9, Asp-10, Arg-17, Arg-19, Arg-31, and Phe-37) that are important for enzymatic activity and suggested that Asp-10 might be the principal catalytic residue. Thus, CAS2 proteins are sequence-specific endoribonucleases, and we propose that their role in the CRISPR-mediated anti-phage defense might involve degradation of phage or cellular mRNAs.

The abbreviations used are: CRISPR, clustered regularly interspaced short palindromic repeats; CAS, CRISPR-associated; PAA, polyacrylamide; PAGE, polyacrylamide gel electrophoresis; PNK, polynucleotide kinase; ssRNA, single-stranded RNA; TA, toxin-antitoxin; nt, nucleotide; r.m.s.d., root mean square deviation; MES, 4-morpholineethanesulfonic acid; CAPS, 3-(cyclohexylamino)propanesulfonic acid.

Numerous prokaryotic genomes contain structures known as clustered regularly interspaced short palindromic repeats (CRISPRs), composed of 25-50-bp direct repeats separated by intervening sequence spacers (or inserts) of similar length (1 Jansen R. Embden J.D. Gaastra W. Schouls L.M. Mol. Microbiol. 2002; 43: 1565-1575, 3 Sorek R. Kunin V. Hugenholtz P. Nat. Rev. Microbiol. 2008; 6: 181-186). CRISPRs represent the most widely distributed family of repeats in prokaryotes and are found in a great variety of bacteria and archaea, including many pathogens (such as Mycobacterium tuberculosis, Streptococcus, Yersinia, Neisseria, and Corynebacterium) (4 Groenen P.M.
Bunschoten A.E. van Soolingen D. van Embden J.D. Mol. Microbiol. 1993; 10: 1057-1065, 5 Mojica F.J. Diez-Villasenor C. Garcia-Martinez J. Soria E. J. Mol. Evol. 2005; 60: 174-182, 6 Masepohl B. Gorlitz K. Bohme H. Biochim. Biophys. Acta. 1996; 1307: 26-30, 7 Hoe N. Nakashima K. Grigsby D. Pan X. Dou S.J. Naidich S. Garcia M. Kahn E. Bergmire-Sweat D. Musser J.M. Emerg. Infect. Dis. 1999; 5: 254-263, 8 Ishino Y. Shinagawa H. Makino K. Amemura M. Nakata A. J. Bacteriol. 1987; 169: 5429-5433). Most organisms possess multiple CRISPR clusters (2 Mojica F.J. Diez-Villasenor C. Soria E. Juez G. Mol. Microbiol. 2000; 36: 244-246, 3 Sorek R. Kunin V. Hugenholtz P. Nat. Rev. Microbiol. 2008; 6: 181-186, 4 Groenen P.M. Bunschoten A.E. van Soolingen D. van Embden J.D. Mol. Microbiol. 1993; 10: 1057-1065, 5 Mojica F.J. Diez-Villasenor C. Garcia-Martinez J. Soria E. J. Mol. Evol. 2005; 60: 174-182, 6 Masepohl B. Gorlitz K. Bohme H. Biochim. Biophys. Acta. 1996; 1307: 26-30, 7 Hoe N. Nakashima K. Grigsby D. Pan X. Dou S.J. Naidich S. Garcia M. Kahn E. Bergmire-Sweat D. Musser J.M. Emerg. Infect. Dis. 1999; 5: 254-263, 8 Ishino Y. Shinagawa H. Makino K. Amemura M. Nakata A. J. Bacteriol. 1987; 169: 5429-5433, 9 Haft D.H. Selengut J. Mongodin E.F. Nelson K.E. Plos Comput. Biol. 2005; 1: e60, 10 Grissa I. Vergnaud G. Pourcel C. BMC Bioinformatics. 2007; 8: 172, 11 Lillestol R.K. Redder P. Garrett R.A. Brugger K. Archaea. 2006; 2: 59-72, 12 Pourcel C. Salvignol G. Vergnaud G. Microbiology. 2005; 151: 653-663, 13 Bolotin A. Quinquis B. Sorokin A. Ehrlich S.D. Microbiology. 2005; 151: 2551-2561, 14 Horvath P. Romero D.A. Coute-Monvoisin A.C. Richards M. Deveau H. Moineau S. Boyaval P. Fremaux C. Barrangou R. J. Bacteriol. 2008; 190: 1401-1412, 15 Barrangou R. Fremaux C. Deveau H. Richards M. Boyaval P. Moineau S. Romero D.A. Horvath P. Science. 2007; 315: 1709-1712, 16 Deveau H. Barrangou R. Garneau J.E. Labonte J. Fremaux C. Boyaval P. Romero D.A. Horvath P. Moineau S. J. Bacteriol. 2008; 190: 1390-1400, 17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7, 18 Tatusov R.L. Fedorova N.D. Jackson J.D. Jacobs A.R. Kiryutin B. Koonin E.V. Krylov D.M. Mazumder R. Mekhedov S.L. Nikolskaya A.N. Rao B.S. Smirnov S. Sverdlov A.V. Vasudevan S. Wolf Y.I. Yin J.J. Natale D.A. BMC Bioinformatics. 2003; 4: 41, 19 Katz M.E. Wright C.L. Gartside T.S. Cheetham B.F. Doidge C.V. Moses E.K. Rood J.I. J. Bacteriol. 1994; 176: 2663-2669, 20 Zhang R.G. Skarina T. Katz J.E. Beasley S. Khachatryan A. Vyas S. Arrowsmith C.H. Clarke S. Edwards A. Joachimiak A. Savchenko A. Structure (Lond.). 2001; 9: 1095-1106) that form tandem arrays containing from 4 to over 100 units (a repeat and a spacer) (1 Jansen R. Embden J.D. Gaastra W. Schouls L.M. Mol. Microbiol. 2002; 43: 1565-1575, 9 Haft D.H. Selengut J. Mongodin E.F. Nelson K.E. Plos Comput. Biol. 2005; 1: e60, 10 Grissa I. Vergnaud G. Pourcel C. BMC Bioinformatics. 2007; 8: 172, 11 Lillestol R.K. Redder P. Garrett R.A. Brugger K. Archaea. 2006; 2: 59-72). Sequence analysis of the CRISPR spacers in various organisms revealed that some of the spacer sequences are homologous (up to 100% sequence identity) to sequences in phage genomes, plasmids, or in the chromosome and have either sense or antisense orientations (5 Mojica F.J. Diez-Villasenor C. Garcia-Martinez J. Soria E. J. Mol. Evol. 2005; 60: 174-182, 12 Pourcel C. Salvignol G. Vergnaud G. Microbiology. 2005; 151: 653-663, 13 Bolotin A. Quinquis B. Sorokin A. Ehrlich S.D. Microbiology. 2005; 151: 2551-2561, 14 Horvath P. Romero D.A. Coute-Monvoisin A.C. Richards M. Deveau H. Moineau S. Boyaval P. Fremaux C. Barrangou R. J. Bacteriol. 2008; 190: 1401-1412). Remarkably, phages and plasmids fail to infect the specific strains carrying the cognate spacers, implying a relationship between CRISPRs and immunity against specific foreign DNA. Recent experiments on Streptococcus thermophilus have directly demonstrated that, after a phage challenge, this bacterium adds new CRISPR spacers derived from the phage genome (15 Barrangou R. Fremaux C. Deveau H. Richards M. Boyaval P. Moineau S. Romero D.A. Horvath P. Science. 2007; 315: 1709-1712, 16 Deveau H. Barrangou R. Garneau J.E. Labonte J. Fremaux C. Boyaval P. Romero D.A. Horvath P. Moineau S. J. Bacteriol. 2008; 190: 1390-1400).
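
The spacer-to-phage matches described above, in either sense or antisense orientation, can be illustrated with a minimal Python sketch. It is not the method used in the cited studies: it finds exact matches only (the 100% identity case), and the function names and toy sequences are hypothetical, made up for illustration.

```python
# Illustrative sketch only: exact-match search for a CRISPR spacer in a target
# DNA sequence, in both orientations. Names and sequences are hypothetical.

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna.upper()))


def find_spacer(spacer, genome):
    """List (orientation, 0-based start) for every exact spacer match."""
    genome = genome.upper()
    hits = []
    queries = (
        ("sense", spacer.upper()),
        ("antisense", reverse_complement(spacer)),
    )
    for orientation, query in queries:
        start = genome.find(query)
        while start != -1:
            hits.append((orientation, start))
            start = genome.find(query, start + 1)  # step by one to catch overlaps
    return hits


# Made-up toy sequences, not taken from any real genome.
toy_spacer = "ACGTTG"
toy_genome = "GG" + toy_spacer + "CC" + reverse_complement(toy_spacer) + "AA"
print(find_spacer(toy_spacer, toy_genome))
```

A spacer that is its own reverse complement would be reported once per orientation at the same position; a real analysis would also need to tolerate mismatches, which this sketch does not attempt.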
Removal or addition of the phage-specific spacers correlated with cell resistance to the respective phage, implying that resistance specificity is determined by the sequence identity between the phage genome and spacers (15 Barrangou R. Fremaux C. Deveau H. Richards M. Boyaval P. Moineau S. Romero D.A. Horvath P. Science. 2007; 315: 1709-1712).

On the chromosome, CRISPR loci are flanked by a large number of cas (CRISPR-associated) genes encoding uncharacterized proteins. A comprehensive bioinformatic analysis of the CAS system in sequenced genomes resulted in a refined classification with 25 gene families and at least nine types of the cas operon organization (9 Haft D.H. Selengut J. Mongodin E.F. Nelson K.E. Plos Comput. Biol. 2005; 1: e60, 17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7). Eight CAS protein families have been predicted to possess nuclease activity; nine families have been characterized as putative RNA-binding proteins (RAMP-domain proteins), and two families have been predicted to possess helicase and DNA/RNA polymerase activity (17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7). This analysis, combined with the data on the homology of some CRISPR spacer sequences to phage genes, led to the hypothesis that the CRISPRs and cas-encoded proteins comprise a system of defense against invading phages and plasmids and that this system might operate analogously to the eukaryotic RNA interference systems (17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7). The wide distribution of the CRISPR-CAS system among archaea and bacteria, its apparent importance for immunity of prokaryotes against infectious agents, and the predicted novel mechanism have recently made this system a subject of intense interest (3 Sorek R. Kunin V. Hugenholtz P. Nat. Rev. Microbiol. 2008; 6: 181-186).

Two cas genes (cas1 and cas2) are always located near a CRISPR locus and are found only in species containing CRISPRs, suggesting that these proteins play a central role in the CRISPR system (1 Jansen R. Embden J.D. Gaastra W. Schouls L.M. Mol. Microbiol. 2002; 43: 1565-1575, 17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7). The members of the CAS2 superfamily are small, uncharacterized proteins (80-120 residues), which belong to the COG1343 and COG3512 groups of the COG protein classification system (18 Tatusov R.L. Fedorova N.D. Jackson J.D. Jacobs A.R. Kiryutin B. Koonin E.V. Krylov D.M. Mazumder R. Mekhedov S.L. Nikolskaya A.N. Rao B.S. Smirnov S. Sverdlov A.V. Vasudevan S. Wolf Y.I. Yin J.J. Natale D.A. BMC Bioinformatics. 2003; 4: 41). CAS2 proteins contain several conserved sequence motifs, in particular an N-terminal motif that consists of a string of hydrophobic residues (a predicted β-strand) and typically ends with an aspartate (17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7). The CAS2 protein sequences show some similarity to the sequences of the VapD family of uncharacterized proteins that are functionally linked to the VapBC toxin-antitoxin (TA) operon (19 Katz M.E. Wright C.L. Gartside T.S. Cheetham B.F. Doidge C.V. Moses E.K. Rood J.I. J. Bacteriol. 1994; 176: 2663-2669).
Based on the pattern of conserved amino acid residues, in particular the presence of a conserved aspartate after a predicted β-strand, and some functional clues on the TA systems, it has been hypothesized that both CAS2 and VapD might possess RNase activity (17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct. 2006; 1: 7).

Here we report for the first time the results of biochemical and structural characterization of a family of CRISPR-associated enzymes, the CAS2 family proteins from five prokaryotes. We show that CAS2 proteins are endoribonucleases that are specific to single-stranded (ss)RNAs and preferentially cleave them within U-rich regions. The crystal structure of a representative CAS2 protein, SSO1404 from Sulfolobus solfataricus, was solved at 1.6 Å resolution and revealed a ferredoxin-like fold with the double split β-α-β motif, as well as the putative active site.

Protein Overexpression, Purification, and Site-directed Mutagenesis—The cloning of the genes encoding SSO1404 and other CAS2 proteins (SSO8090, TM1796, AF1876, MTH1083, and NE0845) into the modified pET15b was carried out as described previously (20 Zhang R.G. Skarina T. Katz J.E. Beasley S. Khachatryan A. Vyas S. Arrowsmith C.H. Clarke S. Edwards A. Joachimiak A. Savchenko A. Structure (Lond.). 2001; 9: 1095-1106). The proteins were expressed as a fusion with an N-terminal His6 tag in Escherichia coli strain BL21 (DE3) and purified to more than 95% homogeneity using metal-chelate affinity chromatography on nickel affinity resin and gel filtration on a Superdex 200 26/60 column (Amersham Biosciences) as described before (20 Zhang R.G. Skarina T. Katz J.E. Beasley S. Khachatryan A. Vyas S. Arrowsmith C.H. Clarke S. Edwards A. Joachimiak A. Savchenko A. Structure (Lond.). 2001; 9: 1095-1106, 21 Proudfoot M. Sanders S.A. Singer A. Zhang R. Brown G. Binkowski A. Xu L. Lukin J.A. Murzin A.G. Joachimiak A. Arrowsmith C.H. Edwards A.M. Savchenko A.V. Yakunin A.F. J. Mol. Biol. 2008; 375: 301-315). Site-directed mutagenesis of SSO1404 was performed as described previously (21 Proudfoot M. Sanders S.A. Singer A. Zhang R. Brown G. Binkowski A. Xu L. Lukin J.A. Murzin A.G. Joachimiak A. Arrowsmith C.H. Edwards A.M. Savchenko A.V. Yakunin A.F. J. Mol. Biol. 2008; 375: 301-315) using a protocol based on the QuikChange site-directed mutagenesis kit (Stratagene).

Preparation of RNA Substrates—The short RNA substrates (Table 1) were purchased from IDT. The oligonucleotides were 5′-end-labeled with [γ-32P]ATP (6,000 Ci/mmol; Amersham Biosciences) and T4 polynucleotide kinase (PNK) (Fermentas) and then purified by denaturing PAGE (15% polyacrylamide, 8 M urea gel). The labeled oligonucleotides were eluted from the gel, precipitated with 2% LiClO4 in acetone, washed with acetone, dried, and dissolved in diethyl pyrocarbonate-treated Milli-Q water. The long RNA substrates were synthesized using the Ambion T7 RNA polymerase MAXIscript transcription kit. For the synthesis of the mouse β-actin mRNA (304 nt) fragment, pTRI-actin-mouse DNA (MAXIscript kit, Ambion) was used as a template. To produce long CRISPR RNA substrates, a DNA template was prepared by cloning the 1,292-bp-long 5′-fragment of the S. solfataricus CRISPR cluster-2 DNA into the pBluescript SK+ vector (Stratagene). During the transcription reaction (MAXIscript kit, Ambion), this DNA template (linearized by EcoRV) directed the synthesis of a 270-nt-long RNA substrate containing the CRISPR cluster-2 upstream region, repeat-1, spacer-1, repeat-2, and a 27-nt fragment of spacer-2.
Long RNA substrates were uniformly labeled during transcription using [α-32P]UTP (3,000 Ci/mmol). The labeled RNA substrates were purified by denaturing 8% PAA, 8 M urea gel, eluted, ethanol-precipitated, and stored at -70 °C. Double-stranded RNA substrates were prepared by the incubation of equimolar amounts (10 μM) of the 5′-32P-labeled sense and unlabeled antisense RNA oligoribonucleotides (RNA5, RNA25, and RNA26) in 50 mM Tris-HCl (pH 7.0), 100 mM KCl at 90 °C for 1 min and slow cooling at room temperature for 1.5 h. The formation of RNA duplexes was verified by electrophoresis in 10% nondenaturing PAA gels.

TABLE 1. Nucleotide sequences of ssRNA substrates and SSO1404 cleavage sites

Substrate        Nucleotide sequence (5′ → 3′) and cleavage sites (a)
RNA1 (24 nt)     UUUCAAU ↓ UCCUUUUAGGAUUAAUC
RNA2 (25 nt)     CUUUCAA ↓ U ↓ UCCUUUUAGGAUUAAUC
RNA3 (38 nt)     UUGAAGAU ↓ AG ↓ AG ↓ UU ↓ AAAUGAACUUUUGAUGACACAAAA
RNA4 (41 nt)     GUCUACA ↓ C ↓ C ↓ UCUUGCGUCUAUUAUGAUAUCAUAUUUGUCAA
RNA5 (39 nt)     AAAUACG ↓ U ↓ U ↓ UUCUCCAUUGUCAUAUUGCGCAUAAGUUGA
RNA6 (39 nt)     CUAUUCU ↓ CCAAAU ↓ AAAAU ↓ U ↓ AU ↓ AUAACGCAUUAACAUCAA
RNA7 (36 nt)     GUUUUUGUACU ↓ CUCAAGA ↓ UUUAAGUAACUGUACAAC
RNA8 (30 nt)     GAGCUACCAGCU ↓ ACCCCGUAUGUCAGAGAG
RNA9 (30 nt)     CGUUCCUU ↓ UUUUCAAGGUAAUCUUUGAAAG
RNA10 (30 nt)    AAGUCCGUAAGCACCAG ↓ UUCCAAUCGUCAU
RNA11 (35 nt)    AUCA ↓ U ↓ UUUCAUGUCACUUUUAGUGACAUGUUUUAC
RNA12 (32 nt)    CAUUU ↓ UCUU ↓ AACACUG ↓ UACUUAACUUUUACUA
RNA13 (32 nt)    AUCA ↓ U ↓ UUUGUUAAGUACAGUCUUAAGU ↓ U ↓ UUCC
RNA14 (36 nt)    GUUCUCUCAAUG ↓ UCACGU ↓ CUCAUCACCUCUUCUUCG
RNA15 (37 nt)    AGAAAGU ↓ AGACGACAAGUUCGUUGAUUGUCUCGUCGC
RNA16 (36 nt)    ACGAGGAGUU ↓ U ↓ UACAUGUUUUAUGAGUCUAAUUUAG
RNA17 (36 nt)    GUUUAUAA ↓ UGGGA ↓ UA ↓ U ↓ UGGAGAAAAUGGAGAUAACA
RNA18 (16 nt)    AAAUACGUU ↓ UUCUCCA
RNA19 (16 nt)    CAUUUUCU ↓ UAACACUG
RNA20 (16 nt)    CGUUCCUU ↓ UUUUCAAG
RNA21 (13 nt)    UACGUU ↓ UUCUCCA
RNA22 (10 nt)    UACGUU ↓ UUCU
RNA23 (8 nt)     CGUUUUCU
RNA24 (6 nt)     GUUUUC
RNA25 (30 nt)    UUUCAAUU ↓ C ↓ CUUUUAGGAUUAAUCUUGAAG
RNA26 (40 nt)    UUUCAA ↓ U ↓ UCCUUUUAGGAUUAAUCU ↓ UGAAGAUAGAGUUAA

(a) Cleavage sites are shown by arrows (boldface arrows indicate the preferred cleavage sites). The substrates correspond to CRISPR DNA sequences from several organisms. S. solfataricus CRISPR cluster-2 is as follows: RNA1, repeat-1; RNA2, repeat-2; RNA3, spacer-1; RNA4, spacer-2; RNA5, spacer-3; RNA18, RNA21, RNA22, RNA23, and RNA24, 5′- and/or 3′-truncated RNA5; RNA25, RNA1 containing a 6-nt fragment of the spacer-1; RNA26, RNA1 containing a 16-nt fragment of RNA1. A. fulgidus CRISPR cluster-2 is as follows: RNA6, spacer-1. S. thermophilus LMG18311 CRISPR cluster-1 is as follows: RNA7, repeat-1; RNA8, spacer-1; RNA9, spacer-2; RNA10, spacer-3; RNA20, the 5′-fragment (16 nt) of RNA9. T. maritima CRISPR cluster-8 is as follows: RNA14, spacer-5; RNA15, spacer-6. M. jannaschii CRISPR cluster-6 is as follows: RNA16, spacer-11; RNA17, spacer-10. Scrambled RNA substrates are as follows: RNA11, RNA12, and RNA13; RNA19, 5′-fragment of RNA12.

Enzymatic Assays—The reaction mixture for RNase assays (10 μl) contained 0.1 μM [32P]RNA, 50 mM Tris-HCl (pH 8.5), 100 mM KCl, 5 mM MgCl2, 1 mM dithiothreitol, and 0.01-0.1 μg of enzyme. The pH dependence of SSO1404 was characterized using three buffers: MES-K (pH 5.5 to 6.5), Tris-HCl (pH 7.0 to 9.0), and CAPS-K (pH 9.4 to 11.0). The reaction mixture was incubated at 37 °C for the indicated period of time and quenched by the addition of an equal volume of formamide loading buffer (80% formamide, 0.025% bromphenol blue, 0.025% xylene cyanol, and 10 mM EDTA (pH 8.0)). The reaction products were resolved by electrophoresis in 15% PAA, 8 M urea gels using TBE (10 mM Tris borate (pH 8.3) and 2 mM EDTA) as a running buffer. As nucleotide size markers, an imidazole ladder or a G-ladder produced by partial RNA cleavage by 2 M imidazole or RNase T1, respectively, was used (22 Mironova N.L. Pyshnyi D.V. Shtadler D.V. Fedorova A.A. Vlassov V.V. Zenkova M.A. Nucleic Acids Res. 2007; 35: 2356-2367, 23 Nishikawa S. Morioka H. Kim H.J. Fuchimura K. Tanaka T. Uesugi S. Hakoshima T. Tomita K. Ohtsuka E. Ikehara M. Biochemistry. 1987; 26: 8620-8624). For the analysis of the RNA product 5′-end, after the RNase reaction, RNA products were precipitated by 2% LiClO4, washed with acetone, dried, dissolved in Milli-Q water, and phosphorylated with [γ-32P]ATP and T4 PNK using conditions for the forward or phosphate exchange reaction according to the manufacturer's protocol (Fermentas). After the PNK reaction, RNA products were analyzed using denaturing 15% PAA, 8 M urea gels as described above for RNase assays.
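
Because the short substrates are 5′-end-labeled, each cleavage site in Table 1 should correspond to a labeled product whose length equals the number of nucleotides 5′ of the arrow. The Python sketch below shows one way to read the arrow notation as transcribed in this text. The function names are hypothetical, and the boldface (preferred-site) information in the original table is not captured.

```python
# Illustrative sketch only: parse Table 1 style strings such as
# "AAAUACG ↓ U ↓ U ↓ UUC..." where each arrow marks a cleavage site.
# Function names are hypothetical and not part of the original work.


def parse_cleavage(annotated):
    """Return (sequence, cut_positions).

    A cut position k means cleavage between nucleotides k and k + 1
    (1-based), so k is also the length of the 5'-terminal fragment.
    """
    sequence = ""
    cuts = []
    for token in annotated.split():
        if token == "↓":
            cuts.append(len(sequence))  # nucleotides read so far
        else:
            sequence += token
    return sequence, cuts


def flanking_nucleotides(annotated):
    """Return the (5' side, 3' side) nucleotide pair at each cleavage site."""
    sequence, cuts = parse_cleavage(annotated)
    return [(sequence[k - 1], sequence[k]) for k in cuts]


# RNA5 as given in Table 1 of this text.
rna5 = "AAAUACG ↓ U ↓ U ↓ UUCUCCAUUGUCAUAUUGCGCAUAAGUUGA"
seq, cut_positions = parse_cleavage(rna5)
print(len(seq), cut_positions, flanking_nucleotides(rna5))
```

Tallying the flanking pairs over all substrates is one simple way to examine the preference for U-rich regions, under the assumption that the arrows in the transcribed table are positioned correctly.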
The reaction mixtures for DNase assays (40 μl) contained 50 mM HEPES-K buffer (pH 7.5), 100 mM KCl, 5 mM MgCl2, 1 mM dithiothreitol, 0.3 μg of λ DNA (double-stranded DNA) or 0.75 μg of M13 DNA (ssDNA), and 1-4 μg of enzyme. After 1 h of incubation at 37 °C, the reactions were quenched by the addition of 6× DNA loading dye (Fermentas) and analyzed on EtBr-stained 1% agarose gels.

Protein Crystallization and Structure Determination—SSO1404 crystals were grown using the hanging drop vapor diffusion method with the drops containing a mixture of 2 μl of 10 mg/ml purified selenomethionine-incorporated SSO1404 protein and 2 μl of reservoir buffer (0.2 M NaI, 20% w/v PEG 3350, and 2% v/v isopropyl alcohol). For diffraction studies, the crystals were stabilized with the crystallization buffer supplemented with 20% ethylene glycol as a cryoprotectant and flash-frozen in liquid nitrogen. A single crystal of selenomethionine-incorporated SSO1404 was used to collect diffraction data at beamline 19-BM of the Structural Biology Center of the Advanced Photon Source (24 Rosenbaum G. Alkire R.W. Evans G. Rotella F.J. Lazarski K. Zhang R.G. Ginell S.L. Duke N. Naday I. Lazarz J. Molitsky M.J. Keefe L. Gonczy J. Rock L. Sanishvili R. Walsh M.A. Westbrook E. Joachimiak A. J. Synchrotron Radiat. 2006; 13: 30-45) and was maintained at a temperature of 100 K. A single-wavelength anomalous diffraction dataset was collected at a wavelength of 0.9794 Å. Crystallographic data collection and model refinement statistics are summarized in Table 2. Reflection data were collected, indexed, integrated, and scaled with HKL-3000 (25 Minor W. Cymborowski M. Otwinowski Z. Chruszcz M. Acta Crystallogr. Sect. D. Biol. Crystallogr. 2006; 62: 859-866).

TABLE 2. Crystallographic data collection and refinement statistics

Data collection                    SSO1404 (Protein Data Bank 2i8e)
Resolution range (Å)               50-1.59 (1.65-1.59)
Wavelength (Å)                     0.9794
Space group                        P62
Unit cell parameters               a = b = 64.3 Å, c = 39.5 Å; α = β = 90°, γ = 120°
No. of reflections                 12,619 (1219)
Rmerge                             0.052 (0.309)
Rmerge,a (a)                       0.038 (0.293)
Mean I/σI                          63.8 (8.2)
Completeness (%)                   99.5 (96.2)
Redundancy                         11.7 (10.3)
Wilson B factor (Å²)               27.3

Refinement
R (%)                              18.8
Rfree (%)                          22.7
Protein atoms/AU                   782
Heterogen (iodide) atoms/AU        4
Solvent atoms/AU                   75
Mean B factor (Å²)                 20.9
r.m.s.d. bond lengths (Å)          0.016
r.m.s.d. angles (°)                1.69

Ramachandran plot
Favored regions (%)                100.0
Allowed regions (%)                0.0
Outliers (%)                       0.0

(a) Rmerge,a is calculated for unmerged anomalous pairs.

A two-site selenium substructure was determined; the structure was phased by single-wavelength anomalous diffraction, and an initial model was built. All structure solution and initial model building were performed with HKL-3000, which is integrated with SHELXD, SHELXE, MLPHARE, DM, O, COOT, SOLVE, RESOLVE, and ARP/wARP (25 Minor W. Cymborowski M. Otwinowski Z. Chruszcz M. Acta Crystallogr. Sect. D. Biol. Crystallogr. 2006; 62: 859-866, 26 Jones T.A. Zou J.Y. Cowan S.W. Kjeldgaard M. Acta Crystallogr. Sect. A. 1991; 47: 110-119, 27 Terwilliger T.C. Berendzen J. Acta Crystallogr. Sect. D. Biol. Crystallogr. 1999; 55: 849-861, 28 Terwilliger T.C. Acta Crystallogr. Sect. D. Biol. Crystallogr. 2002; 58: 1937-1940, 29 Perrakis A. Morris R. Lamzin V.S. Nat. Struct. Biol. 1999; 6: 458-463, 30 Emsley P. Cowtan K. Acta Crystallogr. Sect. D. Biol. Crystallogr. 2004; 60: 2126-2132, 31 Morris R.J. Perrakis A. Lamzin V.S. Methods Enzymol. 2003; 374: 229-244, 32 Schneider T.R. Sheldrick G.M. Acta Crystallogr. Sect. D. Biol. Crystallogr. 2002; 58: 1772-1779). The initial model was improved by iterative cycles of manual rebuilding in COOT, followed by maximum likelihood refinement with REFMAC5 (33 Murshudov G.N. Vagin A.A. Dodson E.J. Acta Crystallogr. Sect. D. Biol. Crystallogr. 1997; 53: 240-255). In later stages of refinement, a multigroup TLS model generated by the TLSMD web server was used to further improve the model. The final model was validated using MolProbity (34 Lovell S.C. Davis I.W. Arendall III, W.B. de Bakker P.I. Word J.M. Prisant M.G. Richardson J.S. Richardson D.C. Proteins. 2003; 50: 437-450), SFCHECK (35 Vaguine A.A. Richelle J. Wodak S.J. Acta Crystallogr. Sect. D Biol. Crystallogr. 1999; 55: 191-205), and PROCHECK. The atomic coordinates and structure factors for SSO1404 have been deposited in the Protein Data Bank with the accession code 2i8e.

Enzymatic Activity of CAS2 Proteins—To characterize the biochemical activity of CAS2 proteins, we cloned and purified six members of this family from different organisms as follows: SSO1404 and SSO8090 from S. solfataricus, AF1876 from Archaeoglobus fulgidus, TM1796 from Thermotoga maritima, MTH1083 from Methanobacterium thermoautotrophicum, and NE0845 from Nitrosomonas europaea. Given the prediction that CAS2 proteins might possess nuclease activity (17 Makarova K.S. Grishin N.V. Shabalina S.A. Wolf Y.I. Koonin E.V. Biol. Direct.
2006; 1: 7), the purified proteins were tested for the presence of DNase or RNase activity against single-stranded (ss) DNA (M13 DNA), double-stranded (ds) DNA (λ DNA), or ssRNA as follows: RNA5, RNA6, and RNA14 (Table 1), which are identical in sequence to the sense strands of the CRISPR repeats or spacers from S. solfataricus, A. fulgidus, or T. maritima (36 or 39 nt). No nuclease activity was found against either of the DNA substrates, but all proteins degraded the ssRNAs (Fig. 1). With the tested ssRNA substrates, the CAS2 proteins generated a limited number (one to five) of products of various lengths (7-29 nt), indicating that they cleave ssRNAs endonucleolytically. A similar but not identical pattern of products was observed, and most cleavage sites contained one or two U residues (Fig. 1). This observation suggests that the CAS2 ribonucleases recognize similar RNA sequences but also display some difference in substrate preference. SSO1404 also showed detectable cleavage of a long model RNA substrate (the 304-nt transcript of the 5′-fragment of the mouse β-actin gene) and generated several products of various lengths (18-200 nt), but the activity was lower than that against short oligoribonucleotides (data not shown). Thus, the CAS2 family proteins are ssRNA-specific endoribonucleases.

Reaction Requirements and RNA Cleavage Products of SSO1404—SSO1404 exhibited RNase activity over a broad pH range (7.0-10.0) with maximum activity at pH 8.5-9.0 (Fig. 2A). Very little cleavage of RNA5 by SSO1404 (between U20 and C21) was observed in the absence of b