Article Open access Peer-reviewed

The CATH database: an extended protein family resource for structural and functional genomics

2003; Oxford University Press; Volume: 31; Issue: 1; Language: English

10.1093/nar/gkg062

ISSN

1362-4962

Authors

Frances M. G. Pearl, C. F. Bennett, James E. Bray, Andrew Harrison, Nigel Martin, Adrian J. Shepherd, Ian Sillitoe, Janet M. Thornton, Christine Orengo

Topic(s)

Genomics and Phylogenetic Studies

Abstract

The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath_new) currently contains 34 287 domain structures classified into 1383 superfamilies and 3285 sequence families. Each structural family is expanded with domain sequence relatives recruited from GenBank using a variety of efficient sequence search protocols and reliable thresholds. This extended resource, known as the CATH-protein family database (CATH-PFDB), contains a total of 310 000 domain sequences classified into 26 812 sequence families. New sequence search protocols have been designed, based on these intermediate sequence libraries, to allow more regular updating of the classification.
