Fast-bonito: A faster deep learning based basecaller for nanopore sequencing
2021; Elsevier BV; Volume: 1; Language: English
10.1016/j.ailsci.2021.100011
ISSN: 2667-3185
Authors: Zhimeng Xu, Yuting Mai, Denghui Liu, Wenjun He, Xinyuan Lin, Chi Xu, Lei Zhang, Xin Meng, Joseph Mafofo, Walid Abbas Zaher, Ashish Koshy, Yi Li, Nan Qiao
Topic(s): Advanced biosensing and bioanalysis techniques
Abstract: Nanopore sequencing from Oxford Nanopore Technologies (ONT) is a promising third-generation sequencing (TGS) technology that generates longer reads than next-generation sequencing (NGS) technology. A basecaller is software that translates the raw electrical current signals into nucleotide sequences, and its accuracy is crucial to downstream analysis. Bonito is a deep learning-based basecaller recently developed by ONT. Its neural network architecture consists of a single convolutional layer followed by three stacked bidirectional gated recurrent unit (GRU) layers. Although Bonito achieves state-of-the-art basecalling accuracy, it is too slow to be used in production. We therefore developed Fast-Bonito, using neural architecture search (NAS) to find a new neural network backbone and training it from scratch with several advanced deep learning training techniques. The resulting Fast-Bonito model balances speed and accuracy. Fast-Bonito was 153.8% faster than the original Bonito on an NVIDIA V100 GPU. When running on a HUAWEI Ascend 910 NPU, Fast-Bonito was 565% faster than the original Bonito. The accuracy of Fast-Bonito was also slightly higher than that of Bonito. We have made Fast-Bonito open source, hoping it will boost the adoption of TGS in both academia and industry.
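The "% faster" figures in the abstract can be turned into speedup multipliers with a small calculation. The sketch below assumes that "N% faster" means throughput is (1 + N/100) times the baseline, which is the common reading but is not stated in the abstract; the function names are illustrative and do not come from the Fast-Bonito code base.

```python
def speedup_factor(percent_faster: float) -> float:
    """Convert 'N% faster' into a throughput multiplier over the baseline.

    Assumption: 'N% faster' means throughput = baseline * (1 + N/100).
    """
    return 1.0 + percent_faster / 100.0


def relative_runtime(percent_faster: float) -> float:
    """Fraction of the baseline runtime needed for the same workload."""
    return 1.0 / speedup_factor(percent_faster)


# Figures quoted in the abstract (percent faster than the original Bonito).
reported = {"NVIDIA V100 GPU": 153.8, "HUAWEI Ascend 910 NPU": 565.0}

for device, pct in reported.items():
    factor = speedup_factor(pct)
    runtime = relative_runtime(pct)
    print(f"{device}: {factor:.3f}x throughput, {runtime:.1%} of baseline runtime")
```

Under this reading, the V100 figure corresponds to roughly 2.5 times Bonito's throughput and the Ascend 910 figure to roughly 6.7 times.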