PROKARYOTE PHYLOGENY WITHOUT SEQUENCE ALIGNMENT: FROM AVOIDANCE SIGNATURE TO COMPOSITION DISTANCE
2004; Imperial College Press; Volume: 02; Issue: 01; Language: English
DOI: 10.1142/s0219720004000442
ISSN: 1757-6334
Topic(s): Machine Learning in Bioinformatics
Abstract: This is a review of a new and essentially simple method of inferring phylogenetic relationships from complete genome data without using sequence alignment. The method is based on counting the frequency of appearance of oligopeptides of a fixed length K (up to K=6) in the collection of protein sequences of a species. It requires no fine adjustment and no selection of genes. Applied to prokaryotic genomes, it has led to results comparable with the bacteriologists' systematics as reflected in the latest (2002) outline of Bergey's Manual of Systematic Bacteriology. The method has also been used to compare chloroplast genomes and to infer the phylogeny of coronaviruses, including human SARS-CoV. A key point in our approach is the subtraction of a random background from the original counts, using a Markov model of order K-2, in order to highlight the shaping role of natural selection. The implications of the subtraction procedure are specifically analyzed, and further development of the new approach is indicated.
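The abstract describes the pipeline only in words: count K-strings across a species' proteins, subtract a background predicted by a Markov model of order K-2, then compare species. A minimal Python sketch of that idea follows. The abstract gives no formulas, so several details here are assumptions rather than the authors' stated procedure: the background estimate f0(a1...aK) = f(a1...aK-1) * f(a2...aK) / f(a2...aK-1), the relative-difference component (f - f0) / f0, and the cosine-based distance (1 - C) / 2. The function names `kmer_freqs`, `composition_vector`, and `distance` are hypothetical, and K-strings that never occur in the data are omitted for brevity.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(proteins, k):
    """Relative frequencies of all overlapping k-strings in a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def composition_vector(proteins, k):
    """Sparse vector of background-subtracted K-string frequencies.

    ASSUMPTION: the background f0 is the order-(K-2) Markov estimate built
    from the (K-1)- and (K-2)-string frequencies; each component is the
    relative difference (f - f0) / f0. Only observed K-strings are kept
    (a simplification of this sketch).
    """
    if k < 3:
        raise ValueError("k must be at least 3 for an order-(k-2) Markov background")
    f_k = kmer_freqs(proteins, k)
    f_k1 = kmer_freqs(proteins, k - 1)
    f_k2 = kmer_freqs(proteins, k - 2)
    vec = {}
    for s, f in f_k.items():
        # Every substring of an observed K-string is itself observed,
        # so these lookups cannot fail and f0 is strictly positive.
        f0 = f_k1[s[:-1]] * f_k1[s[1:]] / f_k2[s[1:-1]]
        vec[s] = (f - f0) / f0
    return vec


def distance(a, b):
    """ASSUMED distance (1 - cosine similarity) / 2 between two sparse vectors."""
    dot = sum(v * b[s] for s, v in a.items() if s in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return (1 - dot / (norm_a * norm_b)) / 2


# Toy input: two made-up "proteomes" of one short sequence each.
species_a = ["MKVLAAGIVGLMKVLA"]
species_b = ["MKVLSAGIVGLMRVLA"]
vec_a = composition_vector(species_a, 3)
vec_b = composition_vector(species_b, 3)
print(distance(vec_a, vec_b))
```

With real data, each species would contribute its full set of protein sequences, and the pairwise distances would form a matrix that can be passed to a standard tree-building method. The sparse dictionary representation is chosen because, at K=6, only a small fraction of the 20^6 possible oligopeptides occurs in any one genome.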