Article, open access, peer reviewed

Evidence for a major role of antisense RNAs in cyanobacterial gene regulation

2009; Springer Nature; Volume: 5; Issue: 1; Language: English

10.1038/msb.2009.63

ISSN

1744-4292

Authors

Jens Georg, Björn Voß, Ingeborg Scholz, Jan Mitschke, Annegret Wilde, Wolfgang R. Hess

Topic(s)

Genomics and Phylogenetic Studies

Abstract

Article, 15 September 2009, Open Access

Author Information

Jens Georg (1), Björn Voß (1), Ingeborg Scholz (1), Jan Mitschke (1), Annegret Wilde (2) and Wolfgang R Hess (1, *)

1 Faculty of Biology and Freiburg Initiative in Systems Biology, University of Freiburg, Freiburg, Germany
2 Justus-Liebig University Giessen, Institute of Microbiology and Molecular Biology, Heinrich-Buff-Ring, Giessen, Germany

*Corresponding author. Faculty of Biology and Freiburg Initiative in Systems Biology, University of Freiburg, Schänzlestr. 1, Freiburg 79104, Germany. Tel.: +49 761 2032796; Fax: +49 761 2036996; E-mail: [email protected]

Molecular Systems Biology (2009) 5:305
https://doi.org/10.1038/msb.2009.63

Information on the numbers and functions of naturally occurring antisense RNAs (asRNAs) in eubacteria has thus far remained incomplete. Here, we screened the model cyanobacterium Synechocystis sp. PCC 6803 for asRNAs using four different methods. In the final data set, the number of known noncoding RNAs rose from 6 earlier identified to 60 and of asRNAs from 1 to 73 (28 were verified using at least three methods). Among these, there are many asRNAs to housekeeping, regulatory or metabolic genes, as well as to genes encoding electron transport proteins. Transferring cultures to high light, carbon-limited conditions or darkness influenced the expression levels of several asRNAs, suggesting their functional relevance. Examples include the asRNA to rpl1, which accumulates in a light-dependent manner and may be required for processing the L11 r-operon and the SyR7 noncoding RNA, which is antisense to the murF 5′ UTR, possibly modulating murein biosynthesis.
Extrapolated to the whole genome, ∼10% of all genes in Synechocystis are influenced by asRNAs. Thus, chromosomally encoded asRNAs may have an important function in eubacterial regulatory networks.

Synopsis

In addition to regulatory proteins, bacteria, as well as eukaryotes, possess a significant number of regulatory RNAs. In bacteria, the majority of regulatory RNAs appear to be encoded at genomic locations far away from their target genes and exhibit only partial base complementarity to their mRNA targets. However, a small number of regulatory RNAs are transcribed from the reverse complementary strand of an annotated gene and hence fully or partially overlap their potential targets (cis-encoded regulatory RNAs, asRNAs). It was known early on that such asRNAs control phage development and plasmid replication in bacteria (Wagner and Simons, 1994), yet recent work has advanced much further on trans-encoded regulatory RNAs, so that information on the numbers, functions and systemic relevance of chromosomally encoded asRNAs has lagged behind.

There are three main technical problems in dealing with antisense transcription in bacteria: (i) the general lack of robust algorithms to predict asRNAs; (ii) the high risk of measuring experimental artifacts generated during cDNA synthesis in microarray analyses (Perocchi et al, 2007); (iii) a low level of transcription reported to occur virtually throughout the entire genome (Selinger et al, 2000), making it difficult to differentiate asRNAs with a regulatory function from transcriptional noise. Here, we tried to overcome all three obstacles: (i) we rigorously interrogated all predictions made in a computational approach using tiled microarrays; (ii) to avoid unintended second strand synthesis, we labeled RNA samples directly before their hybridization on the microarray; and (iii) we focused predominantly on very highly expressed asRNAs.

A tiling microarray was developed, covering all genes and intergenic regions for which a terminator, and thus a candidate asRNA or ncRNA, was computationally predicted. The arrays were hybridized in quadruplicate with pooled RNA from nine different conditions, so that transcripts induced only under specific conditions would also be detected. As a positive control, the asRNA IsrR (Duehring et al, 2006) was detected as one contiguous segment of the array (Figure 1). In the 20 kb genomic region that also gives rise to the IsrR/isiA transcript pair, two further asRNAs were detected. The affected genes (as_sll1586 and as_ndhH) code for an unknown protein and NADH dehydrogenase subunit 7, respectively (Figure 1). Of the 646 transcripts above the expression threshold of +1.0, 432 corresponded to mono-, di-, and polycistronic mRNAs, 60 originated from intergenic regions and were considered ncRNAs, and 73 at least partially overlapped sense transcripts and were therefore designated asRNAs.

Earlier mathematical modeling of sRNA-based gene regulation suggested a particular niche for regulatory RNA in allowing cells to transition quickly yet reliably between distinct states, consistent with the widespread appearance of bacterial sRNAs in stress regulatory networks (Mehta et al, 2008). To derive functional and quantitative data efficiently, we constructed a second microarray for measuring changes in the expression levels of mRNAs together with their cognate asRNAs, and used the asRNA/mRNA expression ratio as a proxy for the possible impact of the putative riboregulator. In detail, we show that transfer of cultures to stress conditions that are highly relevant for a photosynthetic organism causes distinct and characteristic changes in this ratio.

For six selected asRNA/mRNA pairs and for the SyR7 ncRNA, we further confirmed the changes in expression levels by Northern blot hybridization (Figure 6). The ncRNA SyR7 overlaps with the 5′ UTR of the murF gene over its full length.
The level of SyR7 was more than 20 times higher than that of the murF mRNA under three different conditions. However, the SyR7/murF ratio declined dramatically to ∼1 on a shift to HL (Figure 6A). The enzyme encoded by murF is required for murein biosynthesis. Therefore, we assume that the translation of murF is controlled by SyR7 and that under HL de novo synthesis of MurF is required for accelerated cell wall biosynthesis. Similar characteristic changes were also obtained for the other asRNA/mRNA pairs studied in more detail (Figure 6B and C).

These selected examples show that a multitude of asRNA functions and mechanisms appear possible. It is well established that asRNAs and their cis-targets can form RNA–RNA duplexes, which are degraded by dsRNA-specific RNases (Hernandez et al, 2005; Duehring et al, 2006; Darfeuille et al, 2007; Kawano et al, 2007; Fozo et al, 2008). Hence, antisense transcription is a powerful natural tool for repressing gene expression. A growing number of examples support the idea that bacterial asRNAs serve as novel types of transcriptional terminators, such as the 427 nt asRNA RNAβ in Vibrio anguillarum (Stork et al, 2007). Another possible level of regulation is represented by asRNAs that directly modulate transcriptional activity. There is strong evidence to suggest that divergently located promoters can interfere with each other (Prescott and Proudfoot, 2002), and the length of the transcripts generated from the divergently located promoter (Sneppen et al, 2005) is one important factor for this interaction. Here, we observed ∼180 nt as the average ncRNA length, whereas the lengths of the asRNAs ranged from 65 nt to 700 nt, with many asRNAs longer than 300 nt, lending support to the idea that some of them may have a function in transcriptional interference.
An example of the transcriptional interference mechanism is a 1000 nt long asRNA involved in the sulfur-dependent expression of the ubiG operon in Clostridium acetobutylicum (Andre et al, 2008).

Extrapolating to the whole genome, we estimated the total number of chromosomally encoded asRNAs in Synechocystis to be at least 300. Chromosomally encoded cis-asRNAs are much more frequent than originally thought and seem to outnumber intergenic ncRNAs. Antisense RNAs may affect 8–10% of all genes in Synechocystis, a number that lies within the range reported for eukaryotic genomes. It is very likely that chromosomally encoded asRNAs constitute an important component of another, not yet fully appreciated, level of gene regulation in bacteria.

Introduction

Bacteria, as well as eukaryotes, possess a significant number of regulatory RNAs. Eubacterial regulatory RNAs mainly control mRNA translation or decay, but some also bind proteins and thereby modify protein function (for reviews see Gottesman, 2004; Urban and Vogel, 2007). The majority of eubacterial regulatory RNAs are encoded at genomic locations far away from their target genes and exhibit only partial base complementarity to their mRNA targets. However, a small number of regulatory RNAs are transcribed from the reverse complementary strand of an annotated gene and hence fully or partially overlap with their potential targets (cis-encoded regulatory RNAs). It was known early on that such natural antisense RNAs (asRNAs) control phage development and plasmid replication in bacteria (Wagner and Simons, 1994), yet recent work has made much more progress on trans-encoded regulatory RNAs.

In several eukaryotic model organisms, it was found that the main transcriptional output from their genomes is noncoding RNA (ncRNA).
Sense/antisense transcript pairs occur frequently in mammalian genomes (Katayama et al, 2005) and asRNAs were found opposite 1555 genes during high-resolution transcript screening of the yeast genome (David et al, 2006). It is now estimated that asRNAs or overlapping transcripts from adjacent transcriptional units exist for ∼22–26% of annotated genes in the human genome (Yelin et al, 2003; Chen et al, 2004; Zhang et al, 2006), for 14.9–29% of mouse genes (Okazaki et al, 2002; Kiyosawa et al, 2003; Katayama et al, 2005; Zhang et al, 2006), 15.4–16.8% of Drosophila genes (Zhang et al, 2006), and 8.9% of Arabidopsis thaliana genes (Jen et al, 2005; Wang et al, 2005).

Despite the earlier reported examples of antisense transcripts in prokaryotes, experimental evidence for a more general role of chromosomally encoded asRNAs in eubacteria has remained scarce. A study using a tiled microarray and a protocol optimized for the detection of sRNAs reported two asRNAs to transposase genes and three ncRNAs overlapping a substantial part of an mRNA or of another ncRNA in Caulobacter (Landt et al, 2008). On the other hand, Selinger et al (2000) found a very high number of potential asRNAs in Escherichia coli by using Affymetrix microarrays with an inverted probe set capable of detecting antisense transcription. Although not corroborated by independent experiments, this array detected antisense transcription for ∼3000–4000 genes, suggesting that there is a low level of transcription virtually throughout the E. coli genome (Selinger et al, 2000). More recently, evidence for 127 putative asRNAs in Vibrio cholerae was obtained through parallel sequencing (Liu et al, 2009), but these asRNAs were not studied further.

There is only one publication describing the biocomputational prediction of asRNAs in bacteria (Yachie et al, 2006). On the basis of a combination of promoter and rho-independent terminator prediction, 87 ncRNA and 46 asRNA candidates were predicted for E. coli.
Of these, eight ncRNAs and four asRNAs could be verified experimentally.

In cyanobacteria, evidence from earlier work indicated a function of chromosomal cis-encoded asRNAs in the regulation of gene expression. The asRNA IsrR in Synechocystis sp. PCC 6803 (hereafter Synechocystis) regulates the accumulation of the isiA mRNA, and thereby controls the amount of IsiA protein and, ultimately, of protein–chromophore light-harvesting complexes in cyanobacterial cells under iron limitation and redox stress (Duehring et al, 2006). A transcript complementary to the mRNA of the transcription factor furA was found in the filamentous cyanobacterium Anabaena PCC 7120. The furA asRNA originates by read-through from the adjacent gene alr1690, which encodes a putative cell wall protein (Hernandez et al, 2005), and covers furA over its full length. Interrupting read-through from alr1690 resulted in increased expression of FurA, indicating that the asRNA contributes to determining the cellular level of the protein. Other, less characterized, examples of asRNAs in cyanobacteria include a cis-encoded asRNA starting from the 3′ end of the gas vesicle gene gvpB and ending within the gvpA gene of the filamentous Calothrix PCC 7601 (Csiszar et al, 1987), and 24 asRNAs found by microarray hybridization in the marine unicellular Prochlorococcus MED4 (Steglich et al, 2008). In addition, a growing number of publications hint at the impact of regulatory RNA in cyanobacteria without providing molecular details (Nakamura et al, 2007; Dienst et al, 2008; Voss et al, 2009).

Here, a computational search was implemented for the 3.6 Mb genome of Synechocystis to find such RNAs. To test the existence of predicted candidates efficiently, a tiling microarray was designed, in which all genome regions containing predicted regulatory RNAs were covered, together with a control set of the same size.
Focusing on high-scoring as well as on randomly selected candidates for asRNAs, 28 asRNAs were verified independently by 5′ RACE (rapid amplification of cDNA ends) and Northern blot analysis (Table I). Among the targets possibly influenced by these asRNAs are mRNAs for ribosomal proteins, mRNAs for enzymes of primary metabolism as well as for proteins that are involved in signal transduction and electron transfer.

Table 1. Top-scoring antisense RNAs from the prediction and microarray analysis

| Segment start | Segment stop | Mean asRNA | Mean mRNA | S | Annotation/reference | TSS | Northern (nt) | FC Dark | FC HL | FC −CO2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2269032 | 2269133 | 7.835 | 0.971 | − | Internal as_slr0320 (t), cf. Figure 5 | c2269144 | 100 | − | − | − |
| 1518029 | 1518214 | 7.003 | 5.878 | + | Internal as_isiA (IsrR; Duehring et al, 2006) (t) | 1518034 | 177 | − | − | − |
| 89735 | 89820 | 6.302 | 1.557 | + | Internal as_sll1049, cf. Figure 5 | neg. | 90 | − | − | − |
| 2706752 | 2706984 | 5.14 | 1.51 | − | as_rlpA (slr0423) (5′overlap), cf. Figure 5 | c2706939 | 160 | − | −3.04±0.15 | − |
| 2859925 | 2860056 | 4.334 | 2.28 | − | as_ndhF1 (slr0844) 2 segments (3′overlap), cf. Figure 5 | c2860111 and c2860313 | 700 | +3.21±0.27 | − | − |
| 2859617 | 2859910 | 2.017 | | | | | | | | |
| 924638 | 924780 | 4.201 | 4.365 | + | as_rpl1 (sll1744) (5′overlap) (t), cf. Figure 8 | 924448 | 150 | − | −3.84±0.23 | − |
| 166860 | 167110 | 4.191 | 2.21 | + | Internal as_sll0217 (t), cf. Figure 5 | 166849 | 250 | − | − | − |
| 1504253 | 1504333 | 4.01 | 0.447 | + | Internal as_sll1586, cf. Figure 1 | 1504239 | 90 | − | − | − |
| 2163153 | 2163249 | 3.787 | −0.308 | − | Internal as_slr0408 | c2163253 | 130 | −2.37±0.15 | − | −2.06±0.15 |
| 2823667 | 2824068 | 3.699 | 0.497 | − | as_slr0580 (5′overlap), cf. Figure 5 | c2823987 | 600 | − | − | +1.95±0.12 |
| 136748 | 136889 | 3.597 | 2.527 | − | Internal as_infB (slr0744) | c136871 | 160 | − | − | − |
| 1992384 | 1992520 | 3.542 | 1.003 | − | Internal as_pknA (slr1697), cf. Figure 5 | c1992722 | 65 | − | −2.0±1.26 | − |
| 3465167 | 3465285 | 3.257 | 0.376 | + | as_ppx (sll1546) (5′overlap), cf. Figure 5 | neg. | 250 | − | −1.89±0.27 | +1.9±0.05 |
| 3565927 | 3566261 | 3.17 | 4.747 | − | Internal as_lepA (slr0604), cf. Figure 7 | c3566241 | 380 | − | − | − |
| 3439416 | 3439581 | 2.99 | −0.224 | + | Internal as_sll0723, cf. Figure 5 | 3439412 | 170 | − | − | − |
| 865872 | 865924 | 2.907 | 2.099 | + | hik3 (as_sll1124) (5′overlap) | 865932 | 700 | − | − | − |
| 1510816 | 1511161 | 2.844 | 2.78 | − | as_ndhH (slr0261) (3′overlap), cf. Figure 1 | c1511138 | 220 | − | − | − |
| 1283031 | 1283175 | 2.705 | 0.015 | + | as_sppA (sll1703) (5′overlap) | 1283002 | >500 | +1.90±0.21 | − | − |
| 2512367 | 2512605 | 2.594 | 0.146 | + | as_rfbA (sll0207) (3′overlap) (t), cf. Figure 5 | 2512327 | 550 | − | − | − |
| 1143957 | 1144210 | 2.253 | 0.506 | − | as_slr0882 (3′overlap), cf. Figure 6 | c1144439 | 450 | − | +2.54±0.37 | − |
| 3198747 | 3198980 | 2.179 | 1.656 | − | as_hemE (slr0536) (5′overlap) | c3198959 | 500 | − | − | − |
| 695587 | 695865 | 2.162 | 0.759 | + | Internal as_sll1289, cf. Figure 6 | 695567 | 250 | − | − | − |
| 1768770 | 1769050 | 1.853 | 0.518 | + | as_ribA (sll1894) (3′overlap) | neg. | >1000 | − | −1.86±0.44 | − |
| 198819 | 199396 | 1.54 | 0.816 | − | Internal as_slr1102, cf. Figure 5 | neg. | 400 | − | − | − |
| 819499 | 819977 | 1.494 | 5.452 | + | as_tktA (sll1070) (5′overlap), cf. Figure 7 | 819725 | 200 | − | −2.08±0.03 | − |
| 2216528 | 2217002 | 1.27 | 0.342 | + | as_sll1864 (3′overlap)+orf | 2215955 | >650 | − | −4.33±0.46 | +2.4±0.08 |
| 3207282 | 3207602 | 1.22 | −0.164 | + | as_sll0503 (3′overlap), cf. Figure 7 | 3207223 | 380 | − | − | +2.72±0.83 |
| 2422099 | 2422678 | 1.1 | 0.169 | + | Internal as_sarA (sll0750), cf. Figure 5 | 2422099 and 2422045 | 350 | − | − | − |

The start and stop positions of hybridizing segments within the Synechocystis chromosome are based on tiling microarray data and are shown together with the average expression (mean) for the asRNA and for the respective mRNA, calculated from the hybridization of RNA pooled from nine different conditions in quadruplicate. The orientation of the asRNA locus in the genome is given (S), as is the annotation, including the classification as internal, 5′ or 3′ overlapping asRNA. ‘t’ indicates that this asRNA was predicted based on a possible terminator structure. Precise 5′ ends of asRNAs were determined by 5′ RACE analysis (TSS). Sizes of the major asRNA bands in Northern hybridizations are given in nt. Moreover, fold changes (FC) from the expression arrays are indicated for three different conditions (if no FC: −).
See Supplementary Table S1 for the complete list of asRNAs.

Figure 1. Example of the verification of microarray-detected asRNAs in a 20 kb region of the Synechocystis genome, from coordinates 1 500 000 to 1 520 000. (A) Individual probes are indicated by dots; sets of probes with similar absolute expression levels were joined into contiguous segments, separated from each other and from regions not covered by the array by vertical lines (for the full data set see Supplementary information ‘Segmentation2500_final.pdf’). Annotated protein-coding genes are represented by blue boxes. At least three clearly detectable asRNAs (segments in red) originate in this region: IsrR (Duehring et al, 2006), an ∼90 nt asRNA to sll1586 and an asRNA to ndhH (slr0261). (B) Northern blot hybridizations based on high-resolution polyacrylamide gels and agarose gels. For each asRNA, the hybridization (H), the corresponding lane in the RNA electrophoresis (R) and a molecular mass marker (M) are shown. As an additional experimental control, 5′ ends of the two new asRNAs were mapped by 5′ RACE to positions 1504239 (as_sll1586) and c1511138 (as_ndhH), providing a third line of evidence for the existence of these asRNAs (see also Table I).

Results

Large-scale analysis using a tiling microarray

A tiling microarray was developed, covering all genes and intergenic regions for which a terminator, and thus a candidate asRNA or ncRNA, was predicted. As a control set, probes were designed for genes and intergenic regions without a prediction, covering approximately the same total size. The resulting 102 739 probes amount to an accumulated length of 1 441 146 nt in tiled probes in both orientations, which represents ∼40% of the chromosome.
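
The coverage figure quoted above can be checked with a line of arithmetic. The following is a minimal sketch, assuming that the accumulated probe length is compared against the rounded 3.6 Mb genome size given in the Introduction rather than an exact chromosome length; the variable names are illustrative only.

```python
# Rough check of the array coverage quoted in the text.
# Assumption: the accumulated tiled length is divided by the rounded
# 3.6 Mb genome size from the Introduction, not an exact chromosome length.
TILED_LENGTH_NT = 1_441_146   # accumulated length of tiled probes (from the text)
GENOME_SIZE_NT = 3_600_000    # rounded genome size (from the text)

coverage = TILED_LENGTH_NT / GENOME_SIZE_NT
print(f"Tiled fraction of the genome: {coverage:.1%}")  # prints 40.0%
```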

The arrays were hybridized in quadruplicate with pooled RNA from nine different conditions, such as exponential and stationary growth phase and different stress conditions (high light (HL), low light, 12 h incubation in the dark, iron and nitrogen depletion, heat and cold stress), to detect transcripts that are induced only under specific conditions. To avoid labeling artifacts from reverse transcription and second strand synthesis during cDNA synthesis (Perocchi et al, 2007), we labeled the RNA directly for microarray hybridization. Two additional microarrays were hybridized with genomic DNA and used for the normalization of signal intensities from individual probes as described by Huber et al (2006). The mapping of transcribed segments was carried out according to Huber et al (2006), yielding ∼2500 transcript segments with arbitrary expression values from −5 to +10 (see Supplementary information ‘Segmentation2500_final.pdf’). As evidence for low-level transcription of virtually every part of a bacterial genome has been provided (Selinger et al, 2000), we established a robust threshold at +1.0, leaving 646 transcript segments for closer inspection.

As a positive control, IsrR (Duehring et al, 2006) was detected as one contiguous segment of the array (Figure 1 and Table I). The mapped 5′ end of IsrR is located 5 nt from the 5′ end of the transcript segment identified in the microarray, whereas its 3′ end is located 4 nt before the end of the last responding probe. These numbers yield a segment length of 186 nt compared with the fine-mapped asRNA length of 177 nt (Duehring et al, 2006), which is excellent agreement for the chosen tiling factor. In the 20 kb genomic region that also gives rise to the IsrR/isiA transcript pair, two further asRNAs were detected. The affected genes (as_sll1586 and as_ndhH) code for an unknown protein and NADH dehydrogenase subunit 7, respectively (Figure 1).
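
The threshold step and the IsrR length comparison described above can be retraced with the coordinates listed in Table I. The following is a minimal sketch, not the authors' pipeline: the `Segment` class, the `above_threshold` helper and the `hypothetical_low` entry are illustrative assumptions, and "above the threshold" is read here as a mean expression of at least +1.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A transcribed array segment (coordinates 1-based, inclusive)."""
    name: str
    start: int
    stop: int
    mean: float  # mean normalized expression of the segment

    @property
    def length(self) -> int:
        return self.stop - self.start + 1


THRESHOLD = 1.0  # expression cut-off stated in the text


def above_threshold(segments, threshold=THRESHOLD):
    """Keep segments whose mean expression reaches the cut-off.

    Assumption: "at least +1" is interpreted as mean >= threshold.
    """
    return [s for s in segments if s.mean >= threshold]


# Two rows taken from Table I plus one made-up low-expression segment.
segments = [
    Segment("as_isiA (IsrR)", 1518029, 1518214, 7.003),
    Segment("as_sll1586", 1504253, 1504333, 4.010),
    Segment("hypothetical_low", 1000, 1100, 0.4),  # illustrative only
]
kept = above_threshold(segments)

# IsrR lies on the forward strand: TSS from 5' RACE, length from fine mapping.
isrr = segments[0]
ISRR_TSS = 1518034
ISRR_LENGTH = 177

five_prime_offset = isrr.start and ISRR_TSS - isrr.start   # nt between segment start and TSS
isrr_three_prime_end = ISRR_TSS + ISRR_LENGTH - 1           # last transcribed position
three_prime_offset = isrr.stop - isrr_three_prime_end       # nt between 3' end and segment stop

print([s.name for s in kept])
print(isrr.length, five_prime_offset, three_prime_offset)   # 186 5 4
```

With these inputs the segment length (186 nt) equals the fine-mapped transcript length (177 nt) plus the 5 nt and 4 nt overhangs reported in the text.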

Of the 646 transcript segments above the expression threshold of +1.0, 432 corresponded to mono-, di-, and multicistronic mRNAs, 60 originated from intergenic regions and were considered ncRNAs, and 73 at least partially overlapped sense transcripts and were therefore designated asRNAs (see Supplementary Table S1 for details). We also detected transcripts that likely represent short mRNAs (labeled ‘new ORF’ in Supplementary Table S1); these are not included in the numbers of candidate asRNAs and ncRNAs, nor are the segments representing putative 5′ and 3′ UTRs (Figure 2). In all, 28 asRNA candidates (Table I) and seven putative ncRNAs (Table II) were chosen for further analysis by Northern blot hybridization and 5′ RACE. Furthermore, we determined the distribution of medium-level-expressed segments (expression value from +0.99 to 0.0). This group contains 542 segments, among them 389 mRNA segments, 51 UTRs, 84 putative asRNAs and 18 putative ncRNAs (Figure 2).

Figure 2. Composition of the population of high- and medium-scoring transcript segments. Distribution of the 646 segments with a mean expression value in the top third group of expression signals and 542 medium-scoring segments among different classes of RNA molecules. For details of the annotation of these segments see Supplementary Table S1.

Table 2. Selected new or confirmed ncRNAs

| Segment start | Segment stop | TSS | Annotation | Prediction | Mean | Strand | FC Dark | FC HL | FC −CO2 | References |
|---|---|---|---|---|---|---|---|---|---|---|
| 2960896 | 2960952 | 2960898 | Yfr1 | t, c | 9.4183 | + | − | − | − | Voss et al (2007) |
| 1832218 | 1832334 | 1832234 | SRP RNA ffs | None | 9.3047 | + | − | − | − | RFAM |
| 3138669 | 3138773 | c3138743 | SyR5 | t | 8.7715 | − | − | − | − | This study |
| 2730501 | 2730626 | 2730523 | Yfr2b | c | 8.3092 | + | +2.0±0.02 | − | − | Voss et al (2009) |
| 1671897 | 1672056 | 1671919 | SyR1 | t, c | 7.1891 | + | − | +8.8±0.59 | +3.8±0.1 | Voss et al (2009) |
| 1816523 | 1816625 | c1816602 | SyR6 | t | 6.9568 | − | − | +3.0±0.2 | − | This study |
| 1518643 | 1518856 | c1518816 and c1518836 | 5′UTR isiA (sll0247) and ncRNA | None | 6.6058 | − | − | − | − | Duehring et al (2006) and this study |
| 1431936 | 1431981 | 1431853 | SyR2 | t, c | 5.9175 | + | −2.6±1.6 | −7.0±0.2 | −3.9±0.25 | Voss et al (2009) |
| 2512366 | 2512425 | c2512423 | SyR9 | None | 5.5599 | − | − | −1.9±0.17 | − | This study |
| 1748948 | 1749130 | c1749138 | SyR7 | t | 5.3804 | − | − | −7.66±1.85 | − | This study |
| 106687 | 106838 | c106808 | SyR8 | t | 3.9673 | − | − | − | − | This study |
| 727707 | 728258 | c728041 | SyR4 | t, c | 1.1912 | − | − | − | +1.9±0.22 | This study |
| 727492 | 728273 | 727885 and 728053 | SyR3 | t | 0.4828 | + | − | − | − | This study |

The start and stop positions of hybridizing segments within the Synechocystis chromosome are based on tiling microarray data; the TSS was determined by 5′ RACE or taken from the references. The list has been sorted according to the average expression signals in the tiling microarray experiment (mean). Prediction: ‘t’ indicates prediction based on a possible terminator; ‘c’ indicates prediction in a comparative analysis (Voss et al, 2009). The fold changes (FC) under three conditions were calculated from the expression microarray. See supplementary Table ‘Synarray.xls’ for the complete overview.

Synechocystis transcript expression levels

The 15 most highly accumulating mRNAs (see Supplementary Table S1) in our tiling microarray originate from an intron-located endonuclease gene (slr0915), the photosynthetic genes psaAB (slr1834/slr1835), psbD2 (slr0927), psbD (sll0849), psbT (smr0001), and rbcL (slr0009), the cell division cycle gene slr0374, the groESL operon (slr2075_slr2076), the genes slr0742, sll0524, sll0623, and slr1667, the RNA-binding protein A gene rbpA (sll0517), the molybdopterin biosynthesis gene moeA (slr0900), as well as the iron-stress-induced protein A gene isiA (sll0247). We found 14 ncRNAs and 4 asRNAs within the same range of expression levels. These asRNAs are opposite to isiA, slr0320, sll1121, and sll1049 (Supplementary Table S1). Finding stress-induced genes such as isiA among the top-expressed genes is not an artifact, but a consequence of hybridizing pooled RNA samples from cultures grown under nine different conditions.

Assessing the reliability of the prediction strategy

The transcription of many bacterial genes, and thus also of ncRNAs and asRNAs, finishes at a rho-independent terminator, which can be computationally predicted (see Materials and methods). Our terminator prediction identified 713 putative transcripts within all non-annotated sequences (intergenic and antisense). Assuming an average transcript length of 300 nt, ∼20% were completely intergenic (ncRNA candidates), whereas ∼80% were antisense to an annotated gene. The iron-stress-regulated asRNA IsrR (Duehring et al, 2006), as well as the small ncRNAs Yfr1 (Axmann et al, 2005; Voss et al, 2007), SyR1, and SyR2 (Voss et al, 2009), were among the predicted transcripts, indicating the reliability of this procedure.

To evaluate the performance of the prediction strategy further, we compared its outcome against the results from the tiling microarrays.
As the segmentation procedure could be erroneous in itself, we took the following approach: for each predicted terminator, we computed the mean normalized expression of probes within four 100 nt long segments, starting from the 5′ end of the terminator. For each expression cut-off from 0 to 9, we computed the number of terminators passing it. Two background sets (one antisense-only, and one freely distributed) of randomly chosen segments of size 100 nt were handled the same way. Altogether, the analyses showed that there is a clear tendency of regions close to predicted terminators to have a higher mean expression. This is even more pronounced in the antisense-only analyses (Supplementary Figure S1).

In absolute numbers, 11 out of 73 asRNAs and 27 out of 60 intergenic ncRNAs with a microarray expression level of at least +1 have been predicted here, based on the presence of a rho-independent terminator (Table II; Supplementary Table S1), including five ncRNAs reported earlier in a comparative genomics study (Voss et al, 2009). Examples of false negatives include SyR9; the 5′ UTR of the isiA gene, which accumulates in large quantities as an ∼160 nt small RNA (Duehring et al, 2006); and ffs, the ncRNA of the signal recognition particle (Table II). If all 60 segments identified in the array were real ncRNAs, the true-positive rate of the terminator-based prediction for this class of RNA molecules would be ∼45%. The higher true-positive rate for ncRNAs than for asRNAs (11 of 73, ∼15%) is reflected in their better terminator scores. In Figure 3, the free energy of the stem-loop (ΔGS) and the hybridization energy of the DNA/RNA-hybrid (ΔGH) in the transcribing RNA polymerase holoenzyme are plotted against each other for all predicted termi
