Article; Open access; Peer reviewed

Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm

2003; Oxford University Press; Volume: 19; Issue: 11; Language: English

10.1093/bioinformatics/btg168

ISSN

1367-4811

Authors

Hongyu Zhang

Topic(s)

Glycosylation and Glycoproteins Research

Abstract

Motivation: The popular BLAST algorithm is based on a local similarity search strategy, so its high-scoring segment pairs (HSPs) carry no global alignment information. When scientists use BLAST to search for a target protein or DNA sequence in a huge database such as the human genome map, repeated fragments, homologues, or pseudogenes in the genome often fill the BLAST result with redundant HSPs. A computational strategy is therefore needed to alleviate this problem.

Results: In the gene discovery group of Celera Genomics, I developed a two-step method, a BLAST step followed by an LIS step, to align thousands of cDNA and protein sequences to the human genome map. The LIS step is based on a mature computational algorithm, the Longest Increasing Subsequence (LIS) algorithm. The idea is to use the LIS algorithm to find the longest series of consecutive HSPs in the BLAST output. This BLAST+LIS strategy can be used as an independent alignment tool or as a complement to other alignment programs such as Sim4 and GeneWise. It can also serve as a general-purpose BLAST result processor in all sorts of BLAST searches. Two examples from Celera are shown in this paper.

Contact: me@hongyu.org

* Present address: Ceres Inc., 3007 Malibu Canyon Road, Malibu, CA 90265, USA.
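
The LIS step described in the Results can be illustrated with a minimal sketch. This is not the author's implementation: the `HSP` record, its field names, and the function `longest_colinear_chain` are hypothetical, and the sketch assumes all HSPs lie on the same strand and ignores scores and overlaps between neighbouring HSPs. It sorts HSPs by query start and then finds the longest subsequence whose subject (genome) start positions strictly increase, using the standard O(n log n) patience-sorting formulation of LIS.

```python
from bisect import bisect_left
from typing import List, NamedTuple


class HSP(NamedTuple):
    """Hypothetical HSP record: coordinates on query and subject (genome)."""
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def longest_colinear_chain(hsps: List[HSP]) -> List[HSP]:
    """Return the longest series of HSPs increasing in both query and subject start.

    Assumption: same-strand HSPs only; scores and overlaps are not considered.
    """
    # Sort by query start; break ties by descending subject start so that two
    # HSPs sharing a query start can never both enter the same chain.
    ordered = sorted(hsps, key=lambda h: (h.q_start, -h.s_start))

    tails: List[int] = []      # tails[k]: smallest s_start ending a chain of length k+1
    tail_idx: List[int] = []   # index in `ordered` of the HSP stored in tails[k]
    prev = [-1] * len(ordered) # back-pointers for reconstructing the chain

    for i, h in enumerate(ordered):
        k = bisect_left(tails, h.s_start)  # strict increase: replace the first tail >= s_start
        if k == len(tails):
            tails.append(h.s_start)
            tail_idx.append(i)
        else:
            tails[k] = h.s_start
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk the back-pointers from the end of the longest chain.
    chain: List[HSP] = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return chain


if __name__ == "__main__":
    # Made-up coordinates: three colinear HSPs plus two hits elsewhere in the genome.
    example = [
        HSP(1, 100, 1000, 1100),
        HSP(120, 200, 5000, 5080),
        HSP(110, 210, 1200, 1300),
        HSP(220, 300, 1400, 1480),
        HSP(310, 400, 300, 390),
    ]
    for hsp in longest_colinear_chain(example):
        print(hsp)
```

In this made-up example, the hits at subject positions 5000 and 300 break the increasing order and are dropped, which is the kind of redundant-HSP filtering the abstract motivates. A production version would presumably also need to handle reverse-strand hits and weigh HSP scores, which this sketch does not attempt.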
