Article Open access Peer-reviewed

cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing

2016; BioMed Central; Volume: 11; Issue: 1; Language: English

10.1186/s13029-016-0058-6

ISSN

1751-0473

Authors

Toshiki Takeuchi, Atsuo Yamada, Takashi Aoki, Kunihiro Nishimura

Topic(s)

RNA and protein synthesis mechanisms

Abstract

Next-generation sequencing can determine DNA bases, and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format or its compressed binary version (BAM). SAMtools is a typical tool for handling files in the SAM/BAM format. It provides various functions, including variant detection, alignment visualization, indexing, extraction of parts of the data and loci, and file format conversion. It is written in C and executes quickly. However, SAMtools requires additional implementation work to run in parallel, for example with OpenMP (Open Multi-Processing) libraries. As next-generation sequencing data accumulate, a simple parallelization program that supports cloud and PC cluster environments is required. We have developed cljam using the Clojure programming language, which simplifies parallel programming, to handle SAM/BAM data. Cljam can run on any platform with a Java runtime environment (e.g., Windows, Linux, Mac OS X) and Clojure. Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.
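The idea of processing SAM records in parallel can be illustrated with a minimal sketch. This is not cljam's API and is not written in Clojure: it is a hypothetical Python example that assumes a small in-memory SAM text, splits the alignment records into chunks, counts mapped reads per reference sequence in each chunk concurrently, and merges the partial counts. The names `SAM_TEXT`, `count_chunk`, and `count_mapped_reads` are invented for illustration. It relies only on two facts of the SAM format: header lines begin with `@`, and FLAG bit `0x4` marks an unmapped read.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy SAM content: three header lines, then four alignment records.
# Mandatory columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
SAM_TEXT = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr2\tLN:1000",
    "r1\t0\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t16\tchr1\t20\t60\t5M\t*\t0\t0\tTTGCA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGCC\tIIIII",   # FLAG 4: unmapped
    "r4\t0\tchr2\t5\t60\t5M\t*\t0\t0\tCCATG\tIIIII",
])


def count_chunk(lines):
    """Count mapped reads per reference name in one chunk of SAM records."""
    counts = Counter()
    for line in lines:
        fields = line.split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4:  # skip reads flagged as unmapped
            continue
        counts[rname] += 1
    return counts


def count_mapped_reads(sam_text, n_chunks=2):
    """Split records into chunks, count each chunk concurrently, then merge."""
    records = [l for l in sam_text.splitlines() if l and not l.startswith("@")]
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for partial in pool.map(count_chunk, chunks):
            total.update(partial)  # Counter.update adds the partial counts
    return dict(total)


result = count_mapped_reads(SAM_TEXT)
print(result)
```

The sketch shows only the split-and-merge structure. Threads in CPython do not speed up CPU-bound parsing, so a real workload would need process-based workers or a runtime with true parallel threads, such as the Java runtime that cljam targets.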

Reference(s)