DNA Composition, Codon Usage and Exon Prediction
1997; Elsevier BV; Language: English
10.1016/b978-012101625-8/50005-0
Topic(s): Genomics and Phylogenetic Studies
Abstract: This chapter reviews sequence-based measures indicative of protein-coding function in genomic DNA. A coding statistic can be defined as a function that, given a DNA sequence, computes a real number related to the likelihood that the sequence codes for a protein. Model-dependent coding statistics are likely to capture more of the specific features of coding DNA because they depend on more parameters, and it is suggested that they may therefore be more powerful in discriminating coding from noncoding DNA. A DNA sequence can be partitioned into consecutive nonoverlapping codons in three different ways, depending on the nucleotide at which the grouping into codons starts. Amino acid usage and codon preference are found to carry a lot of information about coding function, although neither measure appears to be as discriminant as codon usage. The distribution of base frequencies at the three codon positions can be assumed to describe statistically a prototypical codon. Measures based on base compositional bias between codon positions are also elaborated.
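The ideas in the abstract can be illustrated with a minimal Python sketch. It shows the three ways of partitioning a sequence into codons, the base frequencies at each codon position, and one possible codon-usage statistic. The function names, the 0-based frame offsets, and the choice of a uniform 1/64 background for the log-likelihood ratio are assumptions made here for illustration; they are not taken from the chapter, and `coding_freq` stands for a codon frequency table that would have to be estimated from known coding sequences.

```python
from collections import Counter
from math import log


def codons_in_frame(seq, frame):
    """Split seq into consecutive non-overlapping codons.

    frame is the 0-based offset (0, 1 or 2) of the first codon;
    a trailing incomplete codon is dropped.
    """
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def position_base_frequencies(codons):
    """Relative frequency of A, C, G, T at codon positions 1, 2 and 3."""
    if not codons:
        raise ValueError("need at least one codon")
    freqs = []
    for pos in range(3):
        counts = Counter(codon[pos] for codon in codons)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs


def codon_usage_score(codons, coding_freq):
    """Illustrative coding statistic (an assumption, not the chapter's formula).

    Sums log(P_coding(codon) / (1/64)) over the codons, i.e. a
    log-likelihood ratio against a uniform background. Every codon must
    have a nonzero entry in coding_freq.
    """
    return sum(log(coding_freq[codon] * 64) for codon in codons)


if __name__ == "__main__":
    seq = "ATGGCCA"
    for frame in range(3):
        print(frame, codons_in_frame(seq, frame))
```

Scoring the same window in all three frames and comparing the results is one way such a statistic could indicate which frame, if any, is likely to be coding.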