Preprint Open access Peer reviewed

BlobTools: Interrogation of genome assemblies

2017; Faculty of 1000; Volume: 6; Language: English

10.12688/f1000research.12232.1

ISSN

2046-1402

Authors

Dominik R. Laetsch, Mark Blaxter

Topic(s)

Protist diversity and phylogeny

Abstract

The goal of many genome sequencing projects is to provide a complete representation of a target genome (or genomes) as underpinning data for further analyses. However, it can be problematic to identify which sequences in an assembly truly derive from the target genome(s) and which are derived from associated microbiome or contaminant organisms. We present BlobTools, a modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets. Using guanine+cytosine content of sequences, read coverage in sequencing libraries and taxonomy of sequence similarity matches, BlobTools can assist in primary partitioning of data, leading to improved assemblies, and screening of final assemblies for potential contaminants. Through simulated paired-end read datasets containing a mixture of metazoan and bacterial taxa, we illustrate the main BlobTools workflow and suggest useful parameters for taxonomic partitioning of low-complexity metagenome assemblies.
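The three signals named in the abstract (GC content, read coverage and taxonomic assignment) can be combined in a simple filter. The following is a minimal sketch of that idea only, not BlobTools code or its API: the `Contig` class, the `gc_fraction` and `partition` functions, and the threshold values are all hypothetical, and coverage and taxon labels are assumed to have been computed beforehand by a read mapper and a similarity search.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    """Hypothetical record for one assembled sequence."""
    name: str
    sequence: str
    coverage: float  # mean read depth, assumed precomputed by a read mapper
    taxon: str       # best-hit taxonomic label, assumed precomputed


def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences made only of ambiguous bases (e.g. N) get a GC fraction of 0.
    return gc / total if total else 0.0


def partition(contigs, target_taxon, gc_range, min_coverage):
    """Split contig names into (keep, reject) lists.

    A contig is kept only if its taxon matches the target, its GC fraction
    lies within gc_range (inclusive), and its coverage is at least
    min_coverage. All thresholds are illustrative assumptions.
    """
    low, high = gc_range
    keep, reject = [], []
    for contig in contigs:
        in_gc_window = low <= gc_fraction(contig.sequence) <= high
        if (contig.taxon == target_taxon
                and in_gc_window
                and contig.coverage >= min_coverage):
            keep.append(contig.name)
        else:
            reject.append(contig.name)
    return keep, reject


# Toy data: one target-like contig, one bacterial-like contig,
# and one low-coverage contig.
contigs = [
    Contig("c1", "ATATGCATAT", 50.0, "Nematoda"),
    Contig("c2", "GCGCGCGCGA", 300.0, "Proteobacteria"),
    Contig("c3", "ATGCNNATGC", 5.0, "Nematoda"),
]
keep, reject = partition(contigs, "Nematoda", (0.1, 0.6), 10.0)
print(keep, reject)
```

In practice the thresholds would be chosen by inspecting a plot of GC content against coverage rather than fixed in advance, which is the kind of visual inspection the abstract describes.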

Reference(s)