Peer-Reviewed Article

Normalized Feature Vectors: A Novel Alignment-Free Sequence Comparison Method Based on the Numbers of Adjacent Amino Acids

2013; Institute of Electrical and Electronics Engineers; Volume: 10; Issue: 2; Language: English

10.1109/tcbb.2013.10

ISSN

2374-0043

Authors

De-Shuang Huang, Hongjie Yu

Topic(s)

RNA and protein synthesis mechanisms

Abstract

Based on all kinds of adjacent amino acids (AAA), we map each protein primary sequence into a 400 by (L-1) matrix M, where L is the sequence length. We then derive a normalized 400-tuple mathematical descriptor D, extracted from the primary protein sequence via singular value decomposition (SVD) of the matrix. The resulting 400-D normalized feature vectors (NFVs) facilitate quantitative analysis of protein sequences. Using this normalized representation, we analyze the similarity of different sequences on two data sets: 1) ND5 sequences from nine species and 2) transferrin sequences of 24 vertebrates. We also compare the results of this study with those from related works. These two experiments illustrate that the proposed NFV-AAA approach performs well in sequence similarity analysis.
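The abstract does not spell out how M is filled or how D is read off the SVD, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes M is a 0/1 indicator matrix whose column j marks the adjacent pair at positions j and j+1, and that D holds the singular values of M kept in pair order and scaled to unit length. Under that assumption each column contains a single 1, so M Mᵀ is diagonal with the pair counts on its diagonal, and the singular values are simply the square roots of those counts; no numerical SVD routine is needed. The names `pair_matrix`, `nfv`, and `distance` are hypothetical, and the Euclidean distance is an assumed choice of similarity measure.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def pair_matrix(seq):
    """Assumed 400 x (L-1) indicator matrix: column j marks the pair seq[j:j+2]."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    cols = len(seq) - 1
    m = [[0] * cols for _ in range(400)]
    for j in range(cols):
        # Non-standard residue codes (e.g. X, B) raise KeyError here.
        m[PAIR_INDEX[seq[j:j + 2]]][j] = 1
    return m


def nfv(seq):
    """Unit-length 400-D vector built from the singular values of the matrix.

    Because every column holds exactly one 1, M M^T is diagonal and its
    diagonal entries are the pair counts, so the singular values are the
    square roots of the row sums. Keeping them in pair order instead of
    sorting them is an assumption of this sketch.
    """
    m = pair_matrix(seq)
    sigma = [sqrt(sum(row)) for row in m]
    norm = sqrt(sum(s * s for s in sigma))
    return [s / norm for s in sigma]


def distance(a, b):
    """Euclidean distance between two feature vectors (assumed measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Calling `distance(nfv(seq_a), nfv(seq_b))` for every pair of sequences in a data set would give a pairwise distance matrix of the kind a similarity analysis typically starts from. Because the vectors have a fixed length of 400 regardless of L, sequences of different lengths can be compared without alignment.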

Reference(s)