Article; Open access; National production; Peer reviewed

The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

2001; National Academy of Sciences; Volume: 98; Issue: 21; Language: English

10.1073/pnas.201182798

ISSN

1091-6490

Authors

Anamaria A. Camargo, Helena P. B. Samaia, Emmanuel Dias‐Neto, Daniel Simão, Italo A. Migotto, Marcelo R. S. Briones, Fernando Ferreira Costa, María Aparecida Nagai, Sérgio Verjovski-Almeida, Marco A. Zago, Luís Eduardo Coelho Andrade, Helaine Carrer, Hamza El‐Dorry, Enilza Maria Espreáfico, Angelita Habr‐Gama, Daniel Giannella‐Neto, Gustavo H. Goldman, Arthur Gruber, Christine Hackel, Edna Teruko Kimura, Rui M. B. Maciel, Suely Kazue Nagahashi Marie, Elizabeth A. L. Martins, Marina P. Nóbrega, Maria Luisa Paçó‐Larson, Maria Inês de Moura Campos Pardini, Gonçalo G. Pereira, João Bosco Pesquero, Vanderlei Rodrigues, Sílvia Regina Rogatto, Ismael D. C. G. da Silva, Mari Cleide Sogayar, María de Fátima Sonati, Eloíza H. Tajara, Sandro Roberto Valentini, Fernando Alberto, M.E.J. Amaral, Ivy Aneas, Liliane A. T. Arnaldi, Ângela Maria de Assis, Mário Henrique Bengtson, Nádia Aparecida Bérgamo, Vanessa Bombonato, Maria E. R. de Camargo, Renata de Azevedo Canevari, Dirce Maria Carraro, Janete M. Cerutti, Maria Lúcia Corrêa‐Giannella, Rosana F. R. Corrêa, María Costa, Cyntia Curcio, Paula de Oliveira Montandon Hokama, Ari J. S. Ferreira, Gilberto K. Furuzawa, Tsieko Gushiken, Paulo Lee Ho, Elza Kimura, José Eduardo Krieger, Luciana C. C. Leite, Paromita Majumder, Mozart Marins, Everaldo R. Marques, Analy Salles de Azevedo Melo, Mônica Barbosa de Melo, Carlos Alberto Mestriner, Elisabete Miracca, Daniela C. Miranda, Ana L. T. O. Nascimento, Francisco G. Nóbrega, Elida P.B. Ojopi, J. R. C. Pandolfi, Luciana Gilbert Pessoa, Aline C. Prevedel, Paula Rahal, Cláudia Aparecida Rainho, Eduardo M. Reis, Marcelo Lima Ribeiro, Nancy da Rós, Renata Guerra de Sá, Magaly M. Sales, Simone Cristina Sant'anna, Mariana Lopes dos Santos, Aline Maria da Silva, Neusa P. da Silva, Wilson A. Silva, Rosana Antunes da Silveira, Josane F. Sousa, Daniella Stecconi, Fernando Tsukumo, Valéria Valente, Fernando Augusto Soares, Eloísa S. Moreira, Diana Noronha Nunes, Ricardo G. Correa, Heloisa Zalcberg, Alex F. Carvalho, Luiz F. L. Reis, Helena Brentani, Andrew J.G. Simpson, Sandro J. de Souza

Topic(s)

Molecular Biology Techniques and Applications

Abstract

Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, offers a direct route to transcript finishing that may be a useful alternative to full-length cDNA cloning.

Reference(s)