The Effects of Evolutionary Distance on TWINSCAN, an Algorithm for Pair-wise Comparative Gene Prediction
2003; Cold Spring Harbor Laboratory Press; Volume: 68; Language: English
10.1101/sqb.2003.68.125
ISSN 1943-4456
Authors: M. Wang, Jeremy Buhler, Michael R. Brent
Topic(s): Machine Learning in Bioinformatics
Abstract: Although the human genome sequence is finished, complete delineation of all human protein-coding genes remains a distant prospect. There are currently only about 13,000 genes (loci) for which at least one complete open reading frame is known with high confidence (http://mgc.nci.nih.gov/, http://www.ncbi.nlm.nih.gov/LocusLink/RSstatistics.html) (Pruitt and Maglott 2001; Strausberg et al. 2002), out of an estimated total of at least 20,000 (Roest Crollius et al. 2000; Waterston et al. 2002). Thus, we are in urgent need of improved techniques for delineating complete gene structures. One way in which such improvements have come about in the last few years is through comparison of the human genome to other sequenced vertebrate genomes. New gene modeling programs were developed to exploit information in alignments between the mouse and human genomes (Bafna and Huson 2000; Korf et al. 2001; Alexandersson et al. 2003; Flicek et al. 2003; Parra et al. 2003), and these are now being used to obtain cDNA sequence via hypothesis-driven RT-PCR and sequencing experiments (Guigó et al. 2003; Wu et al. 2004). One of the first comparative gene modeling programs to achieve substantial improvements over the previous state of the art was TWINSCAN, which can annotate a target genome by exploiting alignments from an informant genome even if the informant sequences are unassembled whole-genome shotgun reads (Flicek et al. 2003)...