
BioinfoPortal: A scientific gateway for integrating bioinformatics applications on the Brazilian national high-performance computing network
2020; Elsevier BV; Volume: 107; Language: English
DOI: 10.1016/j.future.2020.01.030
ISSN: 1872-7115
Authors: Kary Ocaña, Marcelo Galheigo, Carla Osthoff, Luiz Gadelha, Fábio Porto, Antônio Tadeu A. Gomes, Daniel de Oliveira, Ana Tereza Ribeiro de Vasconcelos
Topic(s): Distributed and Parallel Computing Systems
Abstract: Science gateways have gained increasing attention from diverse communities in recent years. They are software solutions that integrate reusable data and specialized techniques via Web servers while hiding the complexity of the underlying high-performance computing resources. Several projects and initiatives have been started worldwide to develop frameworks that support a broad range of key scientific domains. The biological sciences are undergoing a revolution as novel technologies, such as next-generation sequencing, allow data generation at exascale dimensions. Bioinformatics covers a wide range of important applications in health, diversity, and life sciences, drawing on an understanding of the high-performance computing culture to accelerate the transition of computational simulations of biological systems at all scales. This article introduces the BioinfoPortal gateway, its architecture and functionalities, and its integration with the CSGrid middleware, which is used to manage the high-performance computing environment of the Brazilian National High-Performance Computing System (SINAPAD), including the Santos Dumont supercomputer. We discuss the challenges of integrating BioinfoPortal with the CSGrid framework, covering the general process of installation, configuration, and deployment. Finally, we present the findings of a performance analysis of high-performance computing applications, showing how machine learning was applied to optimize BioinfoPortal by recommending predictive models for the efficient allocation of resources, achieving over 75% performance efficiency.
• Features and functionalities of the BioinfoPortal architecture integrated in SINAPAD.
• SINAPAD optimizes computational resource usage of the Santos Dumont supercomputer.
• Performance analyses of HPC applications can lead to optimizations in BioinfoPortal.
• A comprehensive report of HPC gateways for bioinformatics and computer scientists.