Article Open access Peer-reviewed

Unraveling the features of somatic transposition in the Drosophila intestine

2021; Springer Nature; Volume: 40; Issue: 9; Language: English

10.15252/embj.2020106388

ISSN

1460-2075

Authors

Katarzyna Siudeja, Marius van den Beek, Nick Riddiford, Benjamin Boumard, Annabelle Wurmser, Marine Stefanutti, Sonia Lameiras, Allison J. Bardin

Topic(s)

CRISPR and Genetic Engineering

Summary

Article, 26 February 2021. Open Access. Source Data. Transparent process.

Unraveling the features of somatic transposition in the Drosophila intestine

Katarzyna Siudeja*,1,2,†, Marius van den Beek1,2,†, Nick Riddiford1,2, Benjamin Boumard1,2, Annabelle Wurmser1,2, Marine Stefanutti1,2, Sonia Lameiras3 and Allison J Bardin*,1,2

1 Institut Curie, CNRS, UMR 3215, INSERM U934, Stem Cells and Tissue Homeostasis Group, PSL Research University, Paris, France
2 Sorbonne Universités, UPMC Univ Paris 6, Paris, France
3 ICGex Next-Generation Sequencing Platform, Institut Curie, PSL Research University, Paris, France
† These authors contributed equally to this work
* Corresponding author. Tel: +33 1 56 24 65 62; E-mail: [email protected]
* Corresponding author. Tel: +33 1 56 24 65 80; E-mail: [email protected]

ORCID: Katarzyna Siudeja, orcid.org/0000-0002-2522-7776; Marius van den Beek, orcid.org/0000-0002-9676-7032; Nick Riddiford, orcid.org/0000-0002-4739-4233; Allison J Bardin, orcid.org/0000-0002-0200-4465

The EMBO Journal (2021) 40: e106388. https://doi.org/10.15252/embj.2020106388

Abstract

Transposable elements (TEs) play a significant role in evolution, contributing to genetic variation. However, TE mobilization in somatic cells is not well understood. Here, we address the prevalence of transposition in a somatic tissue, exploiting the Drosophila midgut as a model. Using whole-genome sequencing of in vivo clonally expanded gut tissue, we have mapped hundreds of high-confidence somatic TE integration sites genome-wide.
We show that somatic retrotransposon insertions are associated with inactivation of the tumor suppressor Notch, likely contributing to neoplasia formation. Moreover, applying Oxford Nanopore long-read sequencing technology, we provide evidence for tissue-specific differences in retrotransposition. Comparing somatic TE insertional activity with transcriptomic and small RNA sequencing data, we demonstrate that transposon mobility cannot be simply predicted by whole tissue TE expression levels or by small RNA pathway activity. Finally, we reveal that somatic TE insertions in the adult fly intestine are enriched in genic regions and in transcriptionally active chromatin. Together, our findings provide clear evidence of ongoing somatic transposition in Drosophila and delineate previously unknown features underlying somatic TE mobility in vivo.

SYNOPSIS

Technical challenges with detecting somatic transposable element (TE) insertions have hampered determination of the extent of somatic transposition in adult tissues. The current study provides genomic evidence for endogenous retrotransposition events in the Drosophila intestine and uncovers underlying features of somatic TE mobility in vivo.

A combination of short-read and long-read DNA sequencing reveals tissue-specific differences in genome-wide retrotransposon mobility.
Somatic TE insertions in the gut are enriched in genic regions, as well as in active, enhancer-like chromatin.
TE insertions in the tumor suppressor gene Notch in intestinal stem cells may contribute to the formation of gut neoplasia.

Introduction

Transposable elements (TEs) are DNA sequences that shape evolution through their capacity to amplify and mobilize, thereby altering the structural and regulatory landscape of the genome. Numerous mechanisms restrict the mobility of TEs and therefore their mutagenic potential.
In germline and somatic cells, TE silencing is achieved by chromatin modifications and small RNA-directed degradation of TE transcripts (Molaro & Malik, 2016; Deniz et al, 2019; Cosby et al, 2019). The escape of TEs from silencing allows their propagation in the genome.

While de novo TE insertions in the germline are relatively easy to detect, as they result in heritable genomic changes that can be identified through sequencing, TE mobility in somatic cells is more difficult to study. Indeed, the heterogeneity of transposition events within somatic tissues imposes technical challenges, as rare TE insertion events affecting a subpopulation of cells often fall below the limits of detection. Thus, the degree to which TEs evade silencing and contribute to somatic genome alteration is much less well understood in developing and adult tissues.

Nonetheless, evidence for active somatic transposition has recently been mounting. Reporters of transposon activity suggested TE mobility in neuronal lineages in human, mouse, and Drosophila (Muotri et al, 2005; Coufal et al, 2009; Li et al, 2013; Macia et al, 2017; Chang et al, 2019). Additionally, recent use of an engineered gypsy retrotransposon trapping cassette in flies suggested that somatic transposition could also occur in non-neuronal tissues such as the fat body (Jones et al, 2016; Wood et al, 2016) or the intestine (Sousa-Victor et al, 2017). Interestingly, increased TE expression in many organisms has been linked to normal tissue aging as well as pathologic conditions of neurodegeneration. Evidence suggests that TE transcription may be linked to disease pathology; however, it remains unknown to what extent TE insertional activity contributes to these phenotypes (Dubnau, 2018; Burns, 2020).
Nevertheless, gypsy retrotransposon reporter activity was shown to increase in the aging Drosophila brain, fat body, and gut (Li et al, 2013; Jones et al, 2016; Wood et al, 2016; Sousa-Victor et al, 2017; Chang et al, 2019), correlating in some cases with increased DNA damage, and suggesting that TE insertional activity could indeed play a role in age-related deterioration of somatic tissues. However, a major drawback of using engineered reporters is that reporter cassettes could be inactivated by means other than a TE insertion. In addition, the available transgenic lines only report a limited number of TE families. Finally, results obtained with engineered reporters may not necessarily reflect the activity of endogenous elements encoded in the genome.

Genomic sequencing has provided some direct evidence for endogenous somatic retrotransposition, though it has almost exclusively focused on the retrotransposition of LINE1 (L1) elements in human cancers (Lee et al, 2012; Solyom et al, 2012; Tubio et al, 2014; Rodić et al, 2015; Doucet-O'Hare et al, 2016; Tang et al, 2017; Rodriguez-Martin et al, 2020) or in human and rodent neuronal tissues (Baillie et al, 2011; Evrony et al, 2012; Upton et al, 2015). However, the first reports of high L1 transposition frequencies in mammalian brains were later shown to be overestimated due to artifacts of sequencing methodology and data analysis (Evrony et al, 2016). Similarly, in Drosophila, endogenous somatic TE mobility remains controversial, as sequencing performed on populations of adult fly neurons failed to identify true insertions among multiple technical artifacts (Perrat et al, 2013; Treiber & Waddell, 2017). Thus, the true extent to which diverse classes of TEs affect genomes of somatic tissues remains to be addressed. Moreover, due to the low numbers of somatic insertions recovered thus far from non-cancerous conditions, integration site enrichments of TEs in normal tissues in vivo are not well understood.
Finally, a genetically amenable model system to reliably study somatic transposition is currently lacking. We have previously established the Drosophila midgut as a model system to address the prevalence of somatic mutation in an adult self-renewing tissue (Siudeja et al, 2015). The fly midgut is maintained by a population of intestinal stem cells (ISCs) that divide to self-renew and give rise to two differentiated cell types: absorptive enterocytes (ECs) and secretory enteroendocrine cells (EEs) (Micchelli & Perrimon, 2006; Ohlstein & Spradling, 2006). Our previous study demonstrated that ISCs acquire genetic mutations including deletions and complex rearrangements, which have important physiological impact on the tissue (Siudeja et al, 2015). Here, we make use of the fly intestine to demonstrate the contribution of TEs to the somatic genetic variation of an adult tissue. Using whole-genome sequencing of clonally expanded gut neoplasia, we reveal ongoing somatic retrotransposition in the fly midgut. We identify de novo TE insertions in the tumor suppressor gene Notch, likely contributing to its inactivation and neoplasia formation. Additionally, we apply Oxford Nanopore long-read sequencing of non-clonal healthy adult tissues to provide evidence of tissue-specific differences in retrotransposition. Based on hundreds of high-confidence de novo transposition events, we uncover non-random distribution of somatic TE insertion sites in the gut tissue. Transposition occurs throughout the genome and somatic insertions are enriched in genic regions as well as active, enhancer-like chromatin. Overall, by providing direct DNA sequencing-based evidence for de novo somatic TE insertions, we uncover novel features of their in vivo biology. 
Results

Somatic TE insertions in the Notch gene identified in spontaneous intestinal neoplasia

We have previously shown that somatic mutations occur frequently in intestinal stem cells (ISCs) and that the spontaneous inactivation of the tumor suppressor Notch in male adult ISCs drives the clonal expansion of mutant cells and formation of gut neoplasia (Siudeja et al, 2015). Since Notch is located on the X chromosome and as such is present in a single copy in males, a single "hit" can lead to its inactivation (Fig 1A). In contrast, females, harboring two copies of Notch, do not, or only very rarely, develop similar spontaneous Notch inactivation events. Male neoplasia can be easily distinguished by the clonal accumulation of two intestinal cell types: ISCs expressing Delta and enteroendocrine cells (EEs) marked by Prospero. Our initial sequencing analysis of clonal neoplasia isolated from ProsGAL4 UAS-2xGFP (hereafter abbreviated as Pros > 2xGFP) male flies revealed inactivation of Notch by large deletions or complex genomic rearrangements (Siudeja et al, 2015).

In order to expand this analysis and better characterize distinct types of somatic mutations that impact adult ISCs, we generated a large dataset of whole-genome paired-end Illumina sequencing of an additional 30 clonal neoplasia from the same genetic background, as well as four clonal neoplasia from DeltaGAL4 UAS-nlsGFP male flies (hereafter abbreviated as Delta > nlsGFP), for a total of 37 clonal samples and matched control head DNA sequenced with an average of 47x coverage (Fig 1A and Table EV1). In our analysis, we compared clonal gut samples to their respective head controls to identify somatic events arising only in the gut samples. These data are also analyzed by a companion paper that addresses structural variation in the same model system (see Materials and Methods and also preprint: Riddiford et al, 2020).
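The matched-control comparison described above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the authors' pipeline: the `Call` record, the `somatic_calls` function, the 50 bp matching window, and the coordinates are all hypothetical.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """A candidate TE insertion call (hypothetical record layout)."""
    chrom: str
    pos: int
    family: str


def somatic_calls(sample_calls, control_calls, window=50):
    """Keep calls that have no same-family control call within `window` bp.

    `window` is an assumed tolerance for breakpoint imprecision.
    """
    somatic = []
    for call in sample_calls:
        seen_in_control = any(
            c.chrom == call.chrom
            and c.family == call.family
            and abs(c.pos - call.pos) <= window
            for c in control_calls
        )
        if not seen_in_control:
            somatic.append(call)
    return somatic


# Made-up coordinates: one gut-specific call and one shared with the head.
gut = [Call("X", 3_100_000, "rover"), Call("2L", 500_200, "copia")]
head = [Call("2L", 500_210, "copia")]
gut_specific = somatic_calls(gut, head)
```

In this toy input, only the rover call survives, because the copia call has a head counterpart 10 bp away.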
As expected, a majority of clonal samples showed evidence for inactivation of the Notch pathway by somatic deletions or complex rearrangements (for details see preprint: Riddiford et al, 2020). Interestingly, four samples (P15, P47, P51, and D5) did not harbor any other mutation that could explain the clonal expansion, but showed evidence of somatic TE sequence inserted in Notch (Fig 1B and C). Due to very limited sample material, we could not perform simultaneous RNA expression analysis in order to directly demonstrate the effect of TE inserts on Notch expression. However, as we did not detect evidence of other genetic alteration of Notch or Notch pathway components, we concluded that the TE insertions were most likely causative of the clonal expansion and Notch mutant phenotype. Strikingly, in sample P15, we observed two integrations within Notch (Fig 1C), with one of the two events having more sequencing reads supporting the insertion than the other, suggesting that the first insertion inactivated Notch, while the second one occurred later during the clonal expansion. All candidate insertions were supported both by clipped reads mapping partially to a TE and partially to Notch, and by paired-end reads in which one mate is anchored in a TE and the other maps to Notch (Fig 1B). Among the five candidate insertions identified, three were within the UTR regions of the gene and two TE integrations were in intronic sequences (Fig 1C). For all cases described, no read evidence was found for an insertion in the matched head DNA controls. Thus, TE insertions appeared specific for the clonal gut DNA, suggesting they occurred in somatic gut tissue (Fig 1B).

Figure 1. Somatic TE insertions in Notch in spontaneous male neoplasia
A. The fly intestine is maintained by the Intestinal Stem Cells (ISCs). In male flies, carrying one X chromosome, the tumor suppressor gene Notch is present in a single copy.
Inactivation of Notch in a stem cell (in green) leads to a clonal expansion of the mutant cell and neoplasia. The neoplastic gut region was microdissected together with the head of the same fly. DNA isolated from both tissues was subjected to whole-genome paired-end sequencing.
B. An Integrative Genomics Viewer (IGV) screenshot of the Notch de novo TE insertion site from sample P47 (clonal neoplasia) and its head control, sample P48. Bars represent sequencing reads. Reads supporting the TE insertion are colored according to homology to a specific TE insertion sequence. Multiple colors at a putative insert site frequently indicate homology to different reference copies of the same TE family. Two types of supporting reads can be seen: soft-clipped reads, which span the insertion site and map partially to the reference genome and partially to the TE, and mate-pair support reads, which flank the insertion site and map to the reference genome but have mates (not seen) mapping to a TE.
C. The Notch locus and the identified somatic TE insertion sites indicated with vertical arrows. Black bars represent exons. Insertions in Notch were identified in three out of 33 clonal samples from the Pros > 2xGFP genetic background and in one out of 4 Dl > nlsGFP samples.
D. PCR validation of four somatic, neoplasia-specific TE insertions. Primers were designed to target regions flanking the insertion sites. Yellow arrowheads indicate PCR products containing an insertion amplified in the clonal DNA but not in the neighboring gut tissue (non-clonal), head or thorax of the same fly. A short wild-type amplicon was detectable in all samples. A thorax DNA sample was not available for sample P15.
E. Sanger sequencing of the TE insertion breakpoints in the 3'UTR of Notch from sample P47. The rover LTR element was inserted in a reverse orientation to Notch. The 5' LTR sequence was truncated by 13 bp. Vertical dashed lines indicate insertion breakpoints. LTR: long terminal repeat.
Source Data for Figure 1D [embj2020106388-sup-0011-SDataFig1D.png]

To validate the Notch TE insertions, we designed primer pairs flanking the identified insertion sites and performed a full-length or one-sided PCR amplification using the original genomic DNA as a template (Fig 1D and Appendix Fig S1). All five candidate Notch TE insertions were PCR validated (4 full-length and 1 one-sided validation). All insertions were amplified only from the clonal neoplastic DNA and not from the DNA of matched control tissues from the same fly, confirming that these were true neoplasia-specific somatic TE insertions. Finally, all insertions were partially or fully sequence-verified by Sanger sequencing (Fig 1E and Appendix Fig S1). Altogether, these data revealed that TEs actively transpose in the adult midgut. Importantly, TEs can insert into the Notch tumor suppressor gene in stem cells, likely driving neoplastic growth in male flies.

Retrotransposition occurs genome-wide in the fly midgut

Having identified that TEs are mobile in the fly midgut and likely inactivate the Notch locus, we then aimed to address the prevalence of somatic transposition on a genome-wide scale. To precisely map somatic TE insertions from our short-read sequencing data, we developed a dedicated pipeline (Fig 2A, details in Materials and Methods) and applied it to neoplastic and matched control samples.

Figure 2. Retrotransposition occurs genome-wide in the fly midgut
A. The bioinformatic pipeline used to identify somatic TE insertions in short-read sequencing datasets. Two types of supporting reads are identified genome-wide: mate support reads, where one of the paired-end reads is mapped to the reference genome, while the other mate (not shown) is associated with a TE, and clipped reads, which span the insertion site and map partially to the reference genome and partially to a TE.
Isolated reads were then clustered and assembled to map individual insertion sites. Only insertions with a valid target site duplication (TSD) were retained, sample-specific calls were filtered, and manual validation of each candidate insertion was performed on IGV.
B. The frequency of gut-specific somatic insertion sites in the Pros > 2xGFP and Delta > nlsGFP genetic backgrounds.
C. The distribution of TE classes active in the two genetic backgrounds studied. TEs were categorized in four main classes: LTR (long terminal repeat retrotransposons), LINE-like (non-LTR retrotransposons), TIR (terminal inverted repeat DNA transposons), and FB (foldback element).
D. TSD length distribution for somatic insertions of the most frequent TE families. Insertions from both genotypes were pooled.
E. The target site motif found around (± 10 bp) rover LTR insertion sites recovered from the clonal gut samples. The E-value was calculated with MEME (Bailey et al, 2009); an E-value < 0.05 is considered statistically significant.
Data information: In (B and D), red lines represent median values. Source data are available online for this figure.

Source Data for Figure 2 [embj2020106388-sup-0012-SDataFig2.xlsx]

For further analysis, we retained only insertions bearing a target site duplication (TSD) as a footprint of transposition-dependent events. TSDs are short, identical, duplicated sequences generated on both sides of a TE insertion as a consequence of a staggered endonuclease cut of the target DNA (Feng et al, 1996). We identified a total of 674 (median of 15 per clonal genome) somatic insertions with TSDs from the Pros > 2xGFP background and 97 (median of 23 per clonal genome) integrations in the Delta > nlsGFP samples, all of which were private to gut clonally amplified samples and not present in the matched control DNA, or any of the controls (Fig 2B, Table EV2).
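To make the TSD criterion concrete, the sketch below shows how a TSD appears in short-read data: reads crossing the two junctions of an insertion overlap on the reference by exactly the duplicated bases. It is a minimal illustration, not the authors' implementation; the function name, the 0-based coordinate convention, the `max_len` cutoff of 30 bp, and the toy sequence are assumptions.

```python
def tsd_from_breakpoints(reference, right_clip_end, left_clip_start, max_len=30):
    """Infer a target site duplication from two clipped-read breakpoints.

    reference       : reference sequence of the locus (0-based string)
    right_clip_end  : exclusive end of the aligned part of reads whose right
                      end is soft-clipped into the TE (genome | TE junction)
    left_clip_start : first aligned base of reads whose left end is
                      soft-clipped into the TE (TE | genome junction)
    max_len         : assumed upper bound on a plausible TSD length

    Returns the duplicated reference bases, or None when the two aligned
    segments do not overlap (no TSD) or overlap implausibly far.
    """
    tsd_len = right_clip_end - left_clip_start
    if tsd_len <= 0 or tsd_len > max_len:
        return None
    return reference[left_clip_start:right_clip_end]


# Toy locus: the bases at positions 8-12 ("TTAAG") are duplicated on
# both sides of the hypothetical insertion.
locus = "ACGTACGTTTAAGGCCGATC"
tsd = tsd_from_breakpoints(locus, right_clip_end=13, left_clip_start=8)
no_tsd = tsd_from_breakpoints(locus, right_clip_end=8, left_clip_start=8)
```

With these toy coordinates, `tsd` holds the 5 bp duplication and `no_tsd` is `None`, which is the kind of candidate that a TSD filter would discard.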
In both genetic backgrounds, a great majority of identified insertions were retrotransposons (Fig 2C), suggesting that this TE class is the most mobile in the gut tissue. In the Pros > 2xGFP background, the most abundant were insertions of rover elements (487 insertions), followed by copia (102 insertions), diver (7 insertions), blood (5 insertions), roo (4 insertions), and sporadic insertions of other LTR TE families (Fig 2C). Among non-LTR retroelements, we identified de novo integrations of LINE-like retrotransposons, including 32 de novo insertions of I-elements. Insertions of terminal inverted repeat (TIR) DNA elements and foldback elements were infrequent (Fig 2C). In the Delta > nlsGFP background, we mapped 16 insertions of copia elements and 10 roo integrations, followed by the 297 element (9 insertions), opus (8 insertions), mdg1 and Tabor (7 insertions each), and other LTR TE families (Fig 2C). Rare integrations of LINE-like elements and DNA TIR class TEs were also found. Although we observed varying levels of transposition, there were no striking differences in the types of mobile TEs between samples of the same genetic background, suggesting that active TEs did not differ greatly between individuals (Fig EV1). In contrast, differences in mobile TE families were evident between the two genotypes, suggesting that the repertoire of somatically mobile TEs likely depends on the genetic background. However, we cannot exclude that some observed differences in mobility may have resulted from the differences in cell-type-specific clone composition between the two genotypes, with either an enrichment of enteroendocrine cells (ProsGal4-driven GFP) or intestinal stem cells (DeltaGal4-driven GFP).

Figure EV1.
Class distribution of mobile TE families in all clonal samples sequenced
TEs were categorized in four main classes: LTR (long terminal repeat retrotransposons, in red), LINE-like (non-LTR retrotransposons, in blue), TIR (terminal inverted repeat DNA transposons, in green), and FB (foldback element, in orange). For each class, the one or two most active families are highlighted with dark colors.

To further confirm whether the identified TE insertions were indeed true transposition events, we analyzed somatic TSDs for all TE families which produced at least six de novo insertions and compared these with known germline TSDs. Most LTR elements generated short TSDs with a median length below 10 base pairs (5 bp for rover, copia, roo, 297 and diver; and 3 bp for blood), consistent with TSD lengths reported previously for germline insertions of LTR elements (Dunsmuir et al, 1980; Linheiro & Bergman, 2012) (Fig 2D). Three LTR elements, opus, Tabor and mdg1, produced unexpectedly long TSDs with a median of 23, 26, and 25 bp, respectively, in contrast to 4 bp reported previously (Linheiro & Bergman, 2012). However, with relatively low numbers of somatic insertions of these TE families, it is difficult to conclude whether this discrepancy with previously published reports could be biologically relevant. TSDs generated by LINE-like elements were, in general, less strictly defined but centered above 10 bp (median of 12, 25, and 11 bp for I-, F-, and Doc-elements, respectively, Fig 2D), in agreement with previous reports (Bucheton et al, 1984; Sang et al, 1984; Driver et al, 1989; Berezikov et al, 2000). Finally, we searched for target site motifs of the most represented TEs. A highly significant (AT)-rich target site sequence motif around insertion sites was identified for the rover LTR element, reflecting non-random integration (Fig 2E).
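The per-family TSD summary described above (families with at least six de novo insertions, compared by median TSD length) can be sketched as follows. This is an illustrative reconstruction under assumptions: the function name, the input layout of (family, TSD sequence) pairs, and the toy sequences are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict
from statistics import median


def median_tsd_by_family(insertions, min_insertions=6):
    """Median TSD length per TE family, for sufficiently sampled families.

    insertions     : iterable of (family, tsd_sequence) pairs
    min_insertions : minimum number of insertions a family needs to be
                     summarized (six, following the text above)
    """
    lengths = defaultdict(list)
    for family, tsd in insertions:
        lengths[family].append(len(tsd))
    return {
        family: median(values)
        for family, values in lengths.items()
        if len(values) >= min_insertions
    }


# Made-up TSD sequences: six rover insertions and two opus insertions.
toy = [
    ("rover", "TTAAG"), ("rover", "ATATA"), ("rover", "TATAT"),
    ("rover", "AATT"), ("rover", "TTATA"), ("rover", "ATTTAA"),
    ("opus", "ACGTACGTACGTACGTACGTACG"), ("opus", "ACGT"),
]
summary = median_tsd_by_family(toy)
```

Here only rover passes the six-insertion threshold, so the under-sampled opus family is left out of the summary rather than being reported from two observations.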
Although there are no previous reports about target site preferences of rover elements, TEs from closely related classes (such as 297 or 17.6) show similar (AT)-rich target motifs (Whalen & Grigliatti, 1998; Bowen & McDonald, 2001; Linheiro & Bergman, 2012). The second most mobile element in our datasets, copia, did not show target site preference, which is consistent with previous reports from germline analyses (Dunsmuir et al, 1980). Altogether, our data show that genome-wide somatic TE integration sites have similar characteristics to germline insertions. This lends further support to the detected TE insertions in the gut being true somatic transposition events, rather than random DNA integrations or products of chimeric reads.

Notably, using our detection criteria, we identified only rare somatic TE insertions in the head samples of both genotypes sequenced (median of 2 insertions/sample in Pros > 2xGFP heads and 2.5 insertions/sample in Dl > nlsGFP heads, Table EV3 and Appendix Fig S2). However, the frequency of transposition between gut and head samples cannot be directly compared in this assay. Indeed, the head is a heterogeneous cell population, and therefore, somatic transposition in a few cells of the head would be below the detection level in our analyses. In contrast, the intestinal neoplasia are clonal expansions of single ISC genomes, increasing the likelihood of detecting TE insertions. Accordingly, the rare somatic insertions identified in head samples had only a few clipped and mate-pair supporting reads, reflecting that these were likely rare events present in limited numbers of cells (Appendix Fig S2C). This difficulty in detecting TE insertions in non-clonal fly head DNA is also in agreement with recently published data (Treiber & Waddell, 2017).
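The argument that clonality drives detectability can be illustrated with a rough calculation. The sketch below assumes that supporting reads follow a Poisson distribution; the 47x value is the average coverage quoted earlier, while the three-read threshold, the heterozygous factor of 0.5, and the function name are assumptions made for illustration and are not the study's detection criteria.

```python
import math


def detection_probability(coverage, cell_fraction, min_reads=3, allele_copies=0.5):
    """Probability of seeing at least `min_reads` insertion-supporting reads.

    coverage      : average sequencing depth at the locus
    cell_fraction : fraction of sampled cells carrying the insertion
    min_reads     : assumed minimum number of supporting reads for a call
    allele_copies : fraction of alleles carrying the insertion in a mutant
                    cell (0.5 for a heterozygous autosomal insertion)

    Supporting reads are modeled as Poisson with mean
    coverage * cell_fraction * allele_copies (a simplifying assumption).
    """
    lam = coverage * cell_fraction * allele_copies
    p_below = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(min_reads)
    )
    return 1.0 - p_below


# Fully clonal sample versus an insertion present in 0.1% of cells.
p_clonal = detection_probability(47, 1.0)
p_rare = detection_probability(47, 0.001)
```

Under these assumptions, a clonal insertion is detected almost always, whereas one carried by a small minority of cells almost never reaches the read threshold, which matches the reasoning given for the head samples.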
Because single-cell insertions are unlikely to be detectable in our assay, we believe that the identified head insertions probably occurred during brain development, leading to a small clone of cells harboring the TE insert, rather than in the adult fly brain, which is post-mitotic. Alternatively, they could represent rare but recurrent insertions arising independently in multiple cells of the adult fly brain.

Overall, we conclude that somatic retrotransposition in the fly midgut is not limited to the Notch locus, but occurs genome-wide. LTR elements are the most active, while LINE-like retrotransposons mobilize less frequently. Although TE families identified as the most mobile can differ between fly strains, our data suggest that retrotransposons are frequently active in gut tissue.

TE insertions arise before and after the clonal expansion

To better understand when somatic transposition occurs in the fly gut, we then used allele frequencies to estimate the timing of genome-wide de novo integrations identified in clonal samples relative to the event inactivating Notch and initiating the clonal expansion (Fig 3A). The allele frequency is the ratio of sequencing reads supporting and opposing any given insertion. Assuming the observed allele frequency represents the true allele frequency in the cell population, it can be used as an estimate of transposition timing. A TE inser

Reference(s)