Advances in methods for tRNA sequencing and quantification
2023; Elsevier BV; Volume: 40; Issue: 3; Language: English
10.1016/j.tig.2023.11.001
ISSN 1362-4555
Authors: Nigam H. Padhiar, Upendra Katneni, Anton A. Komar, Yuri Motorin, Chava Kimchi‐Sarfaty
Topic(s): Genomics and Phylogenetic Studies
Abstract: Key challenges in tRNA next-generation sequencing (NGS) library preparations include efficient adapter ligation and overcoming extensive reverse transcriptase-blocking modifications and a highly stable RNA secondary structure. Computational challenges include accounting for the potential modifications in each tRNA and uniquely mapping each read to a reference, which can comprise hundreds of highly similar tRNA genes. Several recent applications forgo tRNA-seq in favor of hybridization-based approaches, which continue to be developed and improved. Validation approaches remain an issue among tRNA-seq studies: it is difficult to directly compare tRNA NGS datasets given their varied library preparation approaches, and many non-NGS-based methods of corroboration are relatively low throughput. In the past decade, tRNA sequencing (tRNA-seq) has attracted considerable attention as an important tool for the development of novel approaches to quantify highly modified tRNA species and to propel tRNA research aimed at understanding cellular physiology and disease and at developing tRNA-based therapeutics. Many methods are available to quantify tRNA abundance while accounting for modifications and tRNA charging/acylation. Advances in both library preparation methods and bioinformatic workflows have enabled developments in next-generation sequencing (NGS) workflows. Other approaches forgo NGS applications in favor of hybridization-based approaches. In this review we provide a brief comparative overview of various tRNA quantification approaches, focusing on the advantages and disadvantages of these methods, which together facilitate reliable tRNA quantification.
tRNAs are implicated in many processes, including translation [1. Doherty J. Guo M. Transfer RNA. In: Bradshaw R.A. Stahl P. Encyclopedia of Cell Biology. Academic Press, 2016: 309-340], amino acid metabolism [2. Dong J. et al. Uncharged tRNA activates GCN2 by displacing the protein kinase moiety from a bipartite tRNA-binding domain. Mol. Cell. 2000; 6: 269-279, 3. Pavlova N.N. et al. Translation in amino-acid-poor environments is limited by tRNA. eLife. 2020; 9: e62307], metabolite synthesis [4. Katz A. et al. Non-canonical roles of tRNAs and tRNA mimics in bacterial cell biology. Mol. Microbiol. 2016; 101: 545-558], priming the reverse transcription (RT) of viral RNAs [5. Marquet R. et al. tRNAs as primer of reverse transcriptases. Biochimie. 1995; 77: 113-124], and tRNA cleavage leading to the production of regulatory noncoding RNAs termed tRNA-derived small RNAs (tsRNAs) [6. Akiyama Y.
Ivanov P. tRNA-derived RNAs: biogenesis and roles in translational control. Wiley Interdiscip. Rev. RNA. 2023; 14: e1805]. Given the multifaceted involvement of tRNAs in cellular physiology, dysregulation of their abundance and function may lead to various human diseases, including different types of cancer [7. Orellana E.A. et al. tRNA dysregulation and disease. Nat. Rev. Genet. 2022; 23: 651-664], leukodystrophy [8. Schaffer A.E. et al. tRNA metabolism and neurodevelopmental disorders. Ann. Rev. Genomics Hum. Genet. 2019; 20: 359-387], cell-cycle arrest [9. Aharon-Hefetz N. et al. Manipulation of the human tRNA pool reveals distinct tRNA sets that act in cellular proliferation or cell cycle arrest. eLife. 2020; 9: e58461], and a wide array of diseases related to cellular energetics [7. Orellana E.A. et al. tRNA dysregulation and disease. Nat. Rev. Genet. 2022; 23: 651-664], as well as to diseases of other species, such as that caused by the deadly rice fungus Magnaporthe oryzae [10. Li G. et al. Unconventional secretion of Magnaporthe oryzae effectors in rice cells is regulated by tRNA modification and codon usage control. Nat. Microbiol. 2023; 8: 1706-1716]. Thus, methods allowing reliable and accurate tRNA quantification are becoming immensely important for understanding both the normal functioning of the cell and various pathophysiological conditions. Tools for the quantification of tRNAs can be broadly divided into NGS-based approaches and non-NGS approaches (mostly hybridization). Regardless of the approach, accurate assessment of tRNA levels faces many challenges, given the distinctive, highly conserved secondary structure and the chemical modifications of tRNA molecules.
tRNAs are heavily modified, and feature several base and ribose methylations, as well as pseudouridine (ψ) and other complex modifications [11.Boccaletto P. et al.MODOMICS: a database of RNA modification pathways. 2021 update.Nucleic Acids Res. 2021; 50: D231-D235Crossref Scopus (341) Google Scholar]. These RNA modifications can lead to premature RT stops in the process of complementary DNA (cDNA) synthesis [12.Pinkard O. et al.Quantitative tRNA-sequencing uncovers metazoan tissue-specific tRNA regulation.Nat. Commun. 2020; 11: 4104Crossref PubMed Scopus (73) Google Scholar,13.Motorin Y. et al.Identification of modified residues in RNAs by reverse transcription-based methods.Methods Enzymol. 2007; 425: 21-53Crossref PubMed Scopus (173) Google Scholar] in NGS applications and can also hinder hybridization in hybridization-based approaches as a result of impaired Watson–Crick base-pairing with oligonucleotide probes [14.Orioli A. tRNA biology in the omics era: stress signalling dynamics and cancer progression.BioEssays. 2017; 391600158Crossref PubMed Scopus (27) Google Scholar,15.Hiley S.L. et al.Detection and discovery of RNA modifications using microarrays.Nucleic Acids Res. 2005; 33e2Crossref Scopus (41) Google Scholar]. In addition, owing to the variable nature of modifications and the overall diversity of highly similar but still unique tRNA species, which can exceed 400 unique molecules in humans [16.Behrens A. et al.High-resolution quantitative profiling of tRNA abundance and modification status in eukaryotes by mim-tRNAseq.Mol. Cell. 2021; 81: 1802-1815Abstract Full Text Full Text PDF PubMed Scopus (89) Google Scholar], special considerations must be taken into account during quantification to ensure accuracy in differentiating between isodecoders or in mapping to the genome. In the following we review select protocols in the tRNA quantification space with a particular focus on NGS approaches, and discuss the associated challenges. 
These include demethylase-thermostable group II intron RT tRNA sequencing (DM-TGIRT-seq) [17.Zheng G. et al.Efficient and quantitative high-throughput tRNA sequencing.Nat. Methods. 2015; 12: 835-837Crossref PubMed Scopus (351) Google Scholar], AlkB-facilitated RNA methylation sequencing (ARM-seq) [18.Cozen A.E. et al.ARM-seq: AlkB-facilitated RNA methylation sequencing reveals a complex landscape of modified tRNA fragments.Nat. Methods. 2015; 12: 879-884Crossref PubMed Scopus (294) Google Scholar], Hydro-tRNAseq [19.Gogakos T. et al.Characterizing expression and processing of precursor and mature human tRNAs by Hydro-tRNAseq and PAR-CLIP.Cell Rep. 2017; 20: 1463-1475Abstract Full Text Full Text PDF PubMed Scopus (134) Google Scholar], Y-shaped adapter-ligated mature tRNA sequencing (YAMAT-seq) [20.Shigematsu M. et al.YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.Nucleic Acids Res. 2017; 45e70PubMed Google Scholar], long hairpin oligonucleotide-based tRNA high-throughput sequencing (LOTTE-tRNAseq) [21.Erber L. et al.LOTTE-seq (long hairpin oligonucleotide based tRNA high-throughput sequencing): specific selection of tRNAs with 3′-CCA end for high-throughput sequencing.RNA Biol. 2020; 17: 23-32Crossref PubMed Scopus (15) Google Scholar], quantitative mature tRNA sequencing (QuantM-tRNAseq) [12.Pinkard O. et al.Quantitative tRNA-sequencing uncovers metazoan tissue-specific tRNA regulation.Nat. Commun. 2020; 11: 4104Crossref PubMed Scopus (73) Google Scholar], modification-induced misincorporation tRNA sequencing (mim-tRNAseq) [16.Behrens A. et al.High-resolution quantitative profiling of tRNA abundance and modification status in eukaryotes by mim-tRNAseq.Mol. Cell. 2021; 81: 1802-1815Abstract Full Text Full Text PDF PubMed Scopus (89) Google Scholar], Nano-tRNAseq [22.Lucas M.C. et al.Quantitative analysis of tRNA abundance and modifications by nanopore RNA sequencing.Nat. Biotechnol. 2023; (Published online April 6, 2023. 
https://doi.org/10.1038/s41587-023-01743-6)], and adapter-ligated libraries of tRNA-derived sequences (ALL-tRNAseq) [23. Scheepbouwer C. et al. ALL-tRNAseq enables robust tRNA profiling in tissue samples. Genes Dev. 2023; 37: 243-257]. In the following we present an overview of library preparation strategies in tRNA-seq; these are summarized in Figure 1 (Key figure). The first essential step for tRNA quantification analysis is the isolation of RNA. Depending on the properties of the biological material, total RNA can be extracted by general protocols such as phenol extraction, which is used for bacteria, or TRIzol™ extraction, which is common for higher eukaryotic cells in culture [24. Rio D.C. et al. Purification of RNA using TRIzol (TRI Reagent). Cold Spring Harbor Protoc. 2010; 2010 (pdb.prot5439)]. Difficult-to-disrupt cells such as Saccharomyces cerevisiae require harsher procedures such as acid phenol extraction [25. Green M.R. Sambrook J. Total RNA extraction from Saccharomyces cerevisiae using hot acid phenol. Cold Spring Harbor Protoc. 2021; 2021 (pdb.prot101691)]. For many bacterial cells, simple TRIzol extraction often results in a highly enriched tRNA fraction [26. Galvanin A. et al. Mapping and quantification of tRNA 2′-O-methylation by RiboMethSeq. In: Wajapeyee N. Gupta R. Epitranscriptomics: Methods and Protocols. Springer, 2019: 273-295]. Few of the protocols discussed in this review specifically target tRNA in the RNA isolation step. DM-TGIRT-seq, ARM-seq, and ALL-tRNAseq utilize the mirVana™ miRNA isolation kit, which is based on an efficient glass fiber filter (GFF) approach. Other protocols initially rely on common TRIzol™ or TRIsure™ extraction and later employ specific tRNA isolation techniques.
An appropriate library preparation protocol plays an essential role in preparing input RNA for tRNA quantification by deep sequencing. In some protocols the input RNA is treated before proceeding with adapter ligation. Hydro-tRNAseq uniquely employs limited alkaline hydrolysis of the purified tRNA to isolate fragments of 19–35 nt that have a less complex secondary structure and fewer modifications that could induce RT stops [19. Gogakos T. et al. Characterizing expression and processing of precursor and mature human tRNAs by Hydro-tRNAseq and PAR-CLIP. Cell Rep. 2017; 20: 1463-1475]. However, fragments carrying RT-hindering modifications can be overlooked in this approach owing to the generation of short abortive cDNAs (see later) [14. Orioli A. tRNA biology in the omics era: stress signalling dynamics and cancer progression. BioEssays. 2017; 39: 1600158, 16. Behrens A. et al. High-resolution quantitative profiling of tRNA abundance and modification status in eukaryotes by mim-tRNAseq. Mol. Cell. 2021; 81: 1802-1815]. DM-TGIRT-seq, ALL-tRNAseq, and ARM-seq all incorporate demethylation of input tRNA as a key step: the first two use a combination of wild-type Escherichia coli alkylation B (wtAlkB) and D135S mutant AlkB, whereas ARM-seq uses only wtAlkB. The D135S mutant AlkB is able to demethylate N2,N2-dimethylguanosine (m2,2G), which is not an ideal substrate for wtAlkB [27. Wang Y. et al. A high-throughput screening method for evolving a demethylase enzyme with improved and new functionalities. Nucleic Acids Res. 2021; 49: e30].
Altogether, these demethylation treatments improve RT read-through when a highly processive enzyme such as MarathonRT or TGIRT is not used, but they eliminate the potential to detect the removed tRNA modifications through misincorporation or RT-stop analysis unless an untreated sample is also sequenced for comparison (as the authors of DM-TGIRT-seq and ARM-seq did). These protocols can also introduce biased representation because only a subset of such modified residues is effectively removed [28. Clark W.C. et al. tRNA base methylation identification and quantification via high-throughput sequencing. RNA. 2016; 22: 1771-1784]. As a result of their maturation pathways, tRNAs mostly have 5′-P and 3′-OH extremities [29. Berg M.D. Brandl C.J. Transfer RNAs: diversity in form and function. RNA Biol. 2021; 18: 316-339] that are directly compatible with ligation of the adapters required for sequencing. It is important to note that efficient ligation to 3′-OH extremities requires preliminary deacylation of tRNAs because an amino acid attached to the tRNA 3′-CCA end compromises adapter ligation [30. Czech A. Deep sequencing of tRNA's 3′-termini sheds light on CCA-tail integrity and maturation. RNA. 2020; 26: 199-208]. The adapter ligation protocols in our cohort can be split into four broad categories, as follows: strategy A, separate ligation of 3′ and 5′ adapters; strategy B, 3′ adapter ligation followed by RT and then ligation of a single-stranded (ss)DNA oligonucleotide to the 3′ extremity of the cDNA; strategy C, 3′ adapter ligation followed by circularization of the cDNA; and strategy D, incorporation of both 5′ and 3′ adapters (and priming sites) into a double-stranded oligonucleotide (Figure 2).
ALL-tRNAseq, Hydro-tRNAseq, and ARM-seq employ strategy A; LOTTE-seq uses strategy B; DM-TGIRT-seq, mim-tRNAseq, and QuantM-tRNAseq use strategy C (although DM-TGIRT-seq does not use ligation enzymes, its overall protocol best fits with this strategy); YAMAT-seq uses strategy D; and Nano-tRNAseq uses a combination of strategies B and D because it includes a second primer on a double-stranded oligonucleotide and a final adapter ligation after the RT step. Within these broad categorizations, there are differences in the choice of ligation enzyme, adapter construction, and RT enzyme, the details of which can be found in Figure 1, Figure 2. In the vast majority of available protocols, the 3′ DNA adapter is 5′-pre-adenylated (avoiding the necessity of ATP for ligation) and 3′-blocked to avoid the formation of concatemers. Key enzymes for 3′ adapter ligation include T4 RNA ligase 2 (RNL2), truncated T4 RNA ligase 2 (RNL2tr), and truncated K227Q mutant T4 RNA ligase 2 (RNL2trkq) – the truncated form has increased affinity for joining the 5′ ends of adenylated adapters to the 3′ ends of RNA, and the mutated form further reduces unwanted ligation side products [31.Viollet S. et al.T4 RNA ligase 2 truncated active site mutants: improved tools for RNA analysis.BMC Biotechnol. 2011; 11: 72Crossref PubMed Scopus (46) Google Scholar]. LOTTE-seq and Nano-tRNAseq are the only protocols to make use of DNA ligase (DNL), and the first uses it for ligation of a DNA hairpin adapter to tRNA, whereas the second uses it to anneal a double-stranded (ds)DNA oligonucleotide containing Oxford Nanopore Technology (ONT) RT adapters to a RNA–DNA hybrid Y-shaped splint adapter (which itself was originally ligated with RNL2). DNL is more specific than RNL, thus reducing the production of undesirable ligation products [32.Kershaw C.J. O'Keefe R.T. Splint ligation of RNA with T4 DNA ligase.in: Conn G.L. Recombinant and In Vitro RNA Synthesis: Methods and Protocols. 
Humana Press, 2013: 257-269Crossref Scopus (33) Google Scholar]. A commonality in the protocols using double-stranded adapter ligation strategies (all of which are grouped in different strategies) – YAMAT-seq, QuantM-tRNAseq, Nano-tRNAseq, and LOTTE-seq – is the use of ribonucleotide and/or deoxyribonucleotide overhangs to better hybridize with mature tRNA species. ALL-tRNAseq, while not using double-stranded ligation, uniquely employs a randomized 4 nt 5′ end for its 3′ adapter, which could aid in ligation to tRNA fragments (and not exclusively to mature 3′-CCA tRNAs). Adapter ligation to tRNA 5′ ends, which is only carried out as a separate step for protocols that employ strategy A, is typically conducted by T4 RNA ligase 1 (RNL1). RNL1 is well suited to ligating ssRNA molecules, which is the structure of the 5′ adapter [33.Marintcheva B. Viral tools for in vitro manipulations of nucleic acids: molecular cloning.in: Harnessing the Power of Viruses. Academic Press, 2018: 27-67Crossref Google Scholar]. Similarly to its 3′ adapter, ALL-tRNAseq also has a randomized tetranucleotide at the 3′ end of its 5′ adapter, for the same purpose as above. CircLigase circularization (strategy C) and second adapter ligation after RT (strategy B) are alike in that the 3′ end of cDNA is only ligated to an adapter after RT, which has the benefit of enabling incomplete transcription products (as generated by RT stops) to be included in the final library preparation. Notably, mim-tRNAseq and QuantM-tRNAseq both use CircLigase I, whereas DM-TGIRT-seq uses CircLigase II. In at least one study, CircLigase I was found to have higher circularization efficiency than CircLigase II [34.Heyer E.E. et al.An optimized kit-free method for making strand-specific deep sequencing libraries from RNA fragments.Nucleic Acids Res. 2015; 43e2Crossref Scopus (37) Google Scholar]. 
A different study compared circularization strategies with standard TruSeqTM small RNA preparation, and found that higher numbers of unique reads were observed – this study employed only CircLigase II [35.Chu Y. et al.Intramolecular circularization increases efficiency of RNA sequencing and enables CLIP-Seq of nuclear RNA from human cells.Nucleic Acids Res. 2015; 43e75Crossref Scopus (12) Google Scholar]. Since the RT step is almost inevitable in any standard protocol for sequencing library preparation, the presence of modified nucleotides in tRNAs is a well-recognized source of strong bias towards over-representation of less extensively modified tRNA species [22.Lucas M.C. et al.Quantitative analysis of tRNA abundance and modifications by nanopore RNA sequencing.Nat. Biotechnol. 2023; (Published online April 6, 2023. https://doi.org/10.1038/s41587-023-01743-6)Crossref PubMed Scopus (28) Google Scholar,36.Warren J.M. et al.Combining tRNA sequencing methods to characterize plant tRNA expression and post-transcriptional modification.RNA Biol. 2021; 18: 64-78Crossref PubMed Scopus (17) Google Scholar]. Moreover, although some tRNA modifications are silent in the RT reaction and only mildly affect cDNA synthesis [such as 5-methylcytosine (m5C), 7-methylguanosine (m7G), ψ, and others that do not alter Watson–Crick base-pairing], other tRNA modifications either pause or simply arrest RT primer extension. These are particularly frequent in eukaryotic tRNAs [N1-methyladenosine (m1A), m2,2G, 3-methylcytosine (m3C)] or are common to all living species [m1G, N6-isopentenyl adenosine (i6A)/2-methyl-thio-N6-isopentenyladenosine (ms2i6A), etc.] [14.Orioli A. tRNA biology in the omics era: stress signalling dynamics and cancer progression.BioEssays. 2017; 391600158Crossref PubMed Scopus (27) Google Scholar,37.Motorin Y. Marchand V. Analysis of RNA modifications by second- and third-generation deep sequencing: 2020 update.Genes (Basel). 
2021; 12: 278Crossref PubMed Scopus (35) Google Scholar,38.Motorin Y. Helm M. Methods for RNA modification mapping using deep sequencing: established and new emerging technologies.Genes (Basel). 2019; 10: 35Crossref PubMed Scopus (79) Google Scholar]. When such nucleotides are encountered by the RT enzyme, the cDNA extension is either aborted or the enzyme passes through with possible incorporation of a mismatched nucleotide in the cDNA. This RT signature depends not only on the properties of the enzyme [39.Werner S. et al.Machine learning of reverse transcription signatures of variegated polymerases allows mapping and discrimination of methylated purines in limited transcriptomes.Nucleic Acids Res. 2020; 48: 3734-3746Crossref PubMed Scopus (6) Google Scholar] but also on the nature of the RT-arresting nucleotide, its sequence context, and the composition of the reaction buffer (namely Mg2+/Mn2+ ions) [14.Orioli A. tRNA biology in the omics era: stress signalling dynamics and cancer progression.BioEssays. 2017; 391600158Crossref PubMed Scopus (27) Google Scholar,40.Kristen M. et al.Manganese ions individually alter the reverse transcription signature of modified ribonucleosides.Genes. 2020; 11: 950Crossref PubMed Scopus (13) Google Scholar,41.Khoddami V. et al.Transcriptome-wide profiling of multiple RNA modifications simultaneously at single-base resolution.Proc. Natl. Acad. Sci. U. S. A. 2019; 116: 6784-6789Crossref PubMed Scopus (140) Google Scholar]. The TGIRT enzyme, when included in sequencing library preparation, can facilitate readthrough of tRNA modifications including a subset of Watson–Crick base-pairing modifications [17.Zheng G. et al.Efficient and quantitative high-throughput tRNA sequencing.Nat. Methods. 2015; 12: 835-837Crossref PubMed Scopus (351) Google Scholar,42.Qin Y. et al.High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases.RNA. 
2016; 22: 111-128]. In mim-tRNAseq, modifications to the TGIRT reaction conditions were made to further improve its efficiency and reduce the number of premature RT stops [16. Behrens A. et al. High-resolution quantitative profiling of tRNA abundance and modification status in eukaryotes by mim-tRNAseq. Mol. Cell. 2021; 81: 1802-1815]. In the recently developed ALL-tRNAseq method, the highly processive group II intron maturase MarathonRT was used, which was touted to outperform TGIRT for the sequencing of long and structured RNAs. In general, tRNA-seq bioinformatic workflows begin with raw FASTQ files, proceed with some preprocessing steps, include curation of a reference, and finally align reads to references. A detailed overview of these steps in each protocol is given in Table 1.

Table 1. Different computational methods in tRNA-seq (a)

| Step | DM-TGIRT-seq | ARM-seq | YAMAT-seq | Hydro-tRNAseq | LOTTE-seq | QuantM-tRNAseq | mim-tRNAseq | Nano-tRNAseq | ALL-tRNAseq |
|---|---|---|---|---|---|---|---|---|---|
| Preprocessing – trimming tool | Trimmomatic | 'Seqprep' custom script | Not found | Cutadapt | BBDuk | Cutadapt | Cutadapt | N/A (b) | sRNAbench |
| Preprocessing – additional steps (post-trimming) | Minimum read length of 15 nt | Minimum read length of 15 nt | Information not found | N/A | 50–100 nt read length | N/A | Minimum read length of 15 nt | Adapters added to the reference | N/A |
| Reference curation – database | GtRNAdb tRNAs | tRNAscan-SE output and whole genomes | GtRNAdb genomic tRNAs and whole genome | GtRNAdb tRNAs | tRNAscan-SE output (pre-tRNAs and mature), tRNA-masked whole genome | GtRNAdb | GtRNAdb, MODOMICS modification indexing | GtRNAdb | GtRNAdb, mitotRNAdb |
| Reference curation – additional steps | 3′-CCA added, identical isodecoders consolidated (462 remaining) | 3′-CCA and 5′-G added, and introns removed | 3′-CCA added (total of 632 genes) in conjunction with whole GRCh37 assembly | GtRNAdb tRNAs were used, supplemented by custom annotation and curation. Both genomic and mature tRNAs were used | 3′-CCA tails added | Identical tRNA sequences consolidated | Clustering is applied with a similarity threshold to group reference tRNAs together. A centroid cluster sequence is obtained | Initial set of tRNAs is reduced to 42 sequences corresponding to individual isoacceptors | N/A |
| Alignment – aligner | Bowtie2 | Bowtie2 | SHRiMP2 | BWA | Segemehl | Bowtie2 | GSNAP | BWA | sRNAbench (Bowtie) |
| Alignment – additional settings/steps | One mismatch allowed | Default mismatch tolerance used | 10% mismatch rate | Hierarchical mapping scheme, with each round of mapping increasing error tolerance (up to two mismatches). These mismatches were used to identify modified residues | 80% accuracy threshold for first-round mapping to pre-tRNA and tRNA-masked genome. This was followed by remapping with 85% accuracy to a mature tRNA genome | Min-score value of G,1,8 and seed mismatches set at 1. Isodecoders were quantified by identifying reads with MAPQ >10 | Alignment to centroid cluster sequence with –SNP-tolerance flag enabled and 0.1 mismatches per read (outside pre-indexed sites). Remapping after SNP index is updated with new sites from first-round mapping | bwa mem -W13 -k6 -xont2d -T20 was used for alignment, which implies minimum initial seed length (k) of 6 and seeded bases discarded if not at least 13 nt in length | Second-round mapping with a Smith–Waterman alignment with a higher mismatch tolerance was used to recover additional potentially modified tRNA reads |
| Anticodon-level resolution? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Isodecoder resolution? | No | Isodecoder data are presented, but there is no dedicated protocol for isodecoder resolution | No | Isodecoder data are presented, but there is no dedicated protocol for isodecoder resolution | No | Yes | Yes | No | No |
| % Mapped reads (of total reads) | 76.73–82.06% | 10.1–13.2% | 92.6–95.1% | 16.8–60% | 97% | 90% | 87.3–90.6% | 58.62% | 62% |
| % Uniquely mapped reads | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | ~40% of mapped | ~96.8–100% of mapped | 93% of mapped | Unspecified |

(a) Preprocessing, reference curation, and alignment steps of bioinformatics pipelines corresponding to tRNA-seq protocols. In some cases, incomplete or missing data were noted. Protocols are ordered from left to right in the order that they were published.
(b) Abbreviation: N/A, not available.

Raw FASTQ reads typically undergo some type of adapter trimming before moving forward in a tRNA-seq analysis pipeline, although the specific tools and criteria used can vary. In summary, the tools used were Trimmomatic [43. Bolger A.M. et al. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014; 30: 2114-2120] (DM-TGIRT-seq), cutadapt [44. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J.
2011; 17: 10] (Hydro-tRNAseq, QuantM-tRNAseq, mim-tRNAseq), BBDuk (LOTTE-seq), a custom script entitled Seqprep (ARM-seq), and sRNAbench (ALL-tRNAseq). YAMAT-seq and Nano-tRNAseq did not specify an adapter trimming or preprocessing protocol. Two publications, which focused on the release of 'fastp' [45. Chen S. et al. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018; 34: i884-i890] and 'Atria' [46. Chuan J. et al. Atria: an ultra-fast and accurate trimmer for adapter and quality trimming. Gigabyte. 2021; 2021: gigabyte31], compared two of the aforementioned packages, Trimmomatic and cutadapt (which is used within a wrapper script entitled 'Trim Galore'), alongside many other packages not mentioned here. Although cutadapt was found to be faster in both studies, it was reported to have a higher mismatch rate in the fastp study and a much lower positive predictive value (PPV) in the Atria study: 40.05% compared with 99.29%. Perhaps more important than the choice of tool is how it is applied and whether that application is justified. Some studies have pointed out that adapter trimming is not necessary in many cases and should be applied with caution, although these do not specifically mention small-RNA-seq data (which are often considered to include tRNA-seq data). However, studies that do focus on small RNA adapter trimming emphasize its importance, and some recommend a lower limit of inclusion for post-trimmed reads of ~15 nt [47. Bezuglov V. et al. Approaches for sRNA analysis of human RNA-seq data: comparison, benchmarking. Int. J. Mol. Sci. 2023; 24: 4195, 48. Zhong X. et al. Accurate adapter information is crucial for reproducibility and reusability in small RNA seq studies. Noncoding RNA. 2019; 5: 49]. Indeed, this is the lower limit followed by two of four
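
The ~15 nt lower limit for post-trimmed reads discussed above can be illustrated with a simple length filter. The following is a minimal Python sketch, not the implementation used by any of the reviewed pipelines: the function names, read names, and example sequences are hypothetical, and only the 15 nt threshold is taken from the text. It assumes plain four-line FASTQ records that have already been adapter-trimmed.

```python
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)

MIN_LENGTH = 15  # nt; the ~15 nt lower limit mentioned in the text


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        header = next(stripped, None)
        if header is None:
            return  # end of input
        if not header:
            continue  # tolerate blank lines between records
        seq = next(stripped, None)
        plus = next(stripped, None)
        qual = next(stripped, None)
        if qual is None or not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header, seq, qual


def filter_by_length(
    records: Iterable[FastqRecord], min_length: int = MIN_LENGTH
) -> Tuple[List[FastqRecord], int]:
    """Keep reads of at least min_length nt; also count the discarded reads."""
    kept: List[FastqRecord] = []
    dropped = 0
    for record in records:
        if len(record[1]) >= min_length:
            kept.append(record)
        else:
            dropped += 1
    return kept, dropped


def make_record(name: str, seq: str) -> List[str]:
    """Build a toy FASTQ record with a constant placeholder quality string."""
    return [f"@{name}", seq, "+", "I" * len(seq)]


# Hypothetical trimmed reads (illustrative sequences, not real data).
example_lines = (
    make_record("full_length", "GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTCCA")
    + make_record("fragment_15nt", "GGTTCAGTGGTAGAA")
    + make_record("too_short", "TCGCCTCCA")
)

kept, dropped = filter_by_length(parse_fastq(example_lines))
print([header for header, _, _ in kept], dropped)
```

Reporting the number of discarded reads alongside the retained ones is a deliberate choice in this sketch: as noted above, how trimming and filtering are applied matters as much as the tool, and the discarded fraction is one quantity that makes pipelines easier to compare.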