Artigo Acesso aberto Revisado por pares

‘Genome design’ model and multicellular complexity: golden middle

2006; Oxford University Press; Volume: 34; Issue: 20 Linguagem: Inglês

10.1093/nar/gkl773

ISSN

1362-4962

Autores

Alexander E. Vinogradov,

Tópico(s)

Genomics and Phylogenetic Studies

Resumo

Human tissue-specific genes were reported to be longer than housekeeping genes (both in coding and intronic parts). The competing neutralist and adaptationist models were proposed to explain this observation. Here I show that in human genome the longest are genes with the intermediate expression pattern. From the standpoint of information theory, the regulation of such genes should be most complex. In the genomewide context, they are found here to have the higher informational load on all available levels: from participation in protein interaction networks, pathways and modules reflected in Gene Ontology categories through transcription factor regulatory sets and protein functional domains to amino acid tuples (words) in encoded proteins and nucleotide tuples in introns and promoter regions. Thus, the intermediately expressed genes have the higher functional and regulatory complexity that is reflected in their greater length (which is consistent with the 'genome design' model). The dichotomy of housekeeping versus tissue-specific entities is more pronounced on the modular level than on the molecular level. There are much lesser intermediate-specific modules (modules overrepresented in the intermediately expressed genes) than housekeeping or tissue-specific modules (normalized to gene number). The dichotomy of housekeeping versus tissue-specific genes and modules in multicellular organisms is probably caused by the burden of regulatory complexity acted on the intermediately expressed genes.

Referência(s)