Book chapter; National production; Peer-reviewed

CellHeap: A Workflow for Optimizing COVID-19 Single-Cell RNA-Seq Data Processing in the Santos Dumont Supercomputer

2021; Springer Science+Business Media; Language: English

10.1007/978-3-030-91814-9_4

ISSN

1611-3349

Authors

Vanessa S. Silva, Maiana O. C. Costa, Maria Clicia Castro, Helena S. I. L. Silva, Maria Emília M. T. Walter, Alba Cristina Magalhães Alves de Melo, Kary Ocaña, Marcelo Trindade dos Santos, Marisa Nicolás, Anna Cristina Calçada Carvalho, Andrea Henriques‐Pons, Fabrício Alves Barbosa da Silva

Topic(s)

Cancer Genomics and Diagnostics

Abstract

Currently, several hundred terabytes of COVID-19 single-cell RNA-seq (scRNA-seq) data are available in public repositories. These data span multiple tissues, comorbidities, and conditions. We expect this trend to continue, and it is realistic to predict that the volume of COVID-19 scRNA-seq data will grow to several petabytes in the coming years. However, thoughtful analysis of these data requires large-scale computing infrastructure and software systems optimized for such platforms in order to generate biological knowledge. This paper presents CellHeap, a portable and robust workflow for customizable scRNA-seq analyses, with quality control throughout the execution steps, that can be deployed on supercomputers. Furthermore, we present the deployment of CellHeap on the Santos Dumont supercomputer for analyzing COVID-19 scRNA-seq datasets and discuss a case study that processed dozens of terabytes of raw COVID-19 scRNA-seq data.
