Article Open access Peer reviewed

Rapid and sensitive detection of genome contamination at scale with FCS-GX

2024; BioMed Central; Volume: 25; Issue: 1; Language: English

10.1186/s13059-024-03198-7

ISSN

1474-760X

Authors

Alexander Astashyn, Eric S. Tvedte, Deacon Sweeney, Victor Sapojnikov, Nathan Bouk, Victor Joukov, Eyal Mozes, Pooja K Strope, Pape M. Sylla, Lukas Wagner, Shelby Bidwell, Larissa C. Brown, Karen Clark, Emily W. Davis, Brian Smith-White, Wratko Hlavina, Kim D. Pruitt, Valérie Schneider, Terence D. Murphy

Topic(s)

Molecular Biology Techniques and Applications

Abstract

Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1-10 min. Testing FCS-GX on artificially fragmented genomes demonstrates high sensitivity and specificity for diverse contaminant species. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination, comprising 0.16% of total bases, with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/ or https://doi.org/10.5281/zenodo.10651084.
