Drosophila Arginase Is Produced from a Nonvital Gene That Contains the elav Locus within Its Third Intron
2000; Elsevier BV; Volume: 275; Issue: 40; Language: English
10.1074/jbc.m001346200
ISSN 1083-351X
Topic(s): Amino Acid Enzymes and Metabolism
Abstract
A Drosophila gene encoding a 351-amino acid-long predicted arginase (40% identity with vertebrate arginases) is reported. Interestingly, the third intron of the arginase gene includes the elav locus, whose coding sequence is on the complementary DNA strand to that of the arginase. Terrestrial vertebrates produce two arginases from duplicated genes. One form, present mainly in the liver, is a key enzyme of the urea cycle and eliminates excess ammonia through the excretion of urea. The function of the extrahepatic arginase, more ubiquitous, is not well understood. In macrophages, arginase competes with nitric-oxide synthase, which converts arginine into nitric oxide. Most organisms, including insects, produce only one type of arginase, whose function is not centered on ammonia detoxification. A Drosophila cDNA encoding a predicted arginase was isolated. It produces a 1.3-kilobase transcript present with highest levels toward the end of embryogenesis and thereafter. During embryogenesis, the arginase transcripts localize to the fat body. The first mutant allele of the Drosophila arginase gene was identified. It is predicted to produce a 199-amino acid-long C-terminally truncated protein, likely to be inactive. Preliminary characterization of the mutation shows that this recessive allele causes a developmental delay but does not affect viability.
The abbreviations used are: elav, embryonic lethal abnormal visual system; kb, kilobase(s); A-I, arginase I (hepatic form); A-II, arginase II (nonhepatic form); PCR, polymerase chain reaction; ORF, open reading frame; EST, expressed sequence tag.
Large scale eucaryotic genome sequencing and the generation of expressed sequence tags, including those from the yeast Saccharomyces cerevisiae (1, Mewes H.W. Albermann K. Bahr M. Frishman D. Gleissner A. Hani J. Heumann K. Kleine K. Maierl A. Oliver S.G. Pfeiffer F. Zollner A. Nature. 1997; 387 (suppl.): 7-65), the nematode Caenorhabditis elegans (2, C. elegans Genome Sequencing Consortium Trends Genet. 1999; 15: 51-58; 3, C. elegans Genome Sequencing Consortium Science. 1998; 282: 2012-2018), and the fruit fly Drosophila melanogaster (4, Adams M.D. Celniker S.E. Holt R.A. Evans C.A. Gocayne J.D. Amanatides P.G. Scherer S.E. Li P.W. Hoskins R.A. Galle R.F. George R.A. Lewis S.E. Richards S. Ashburner M. Henderson S.N. et al. Science. 2000; 287: 2185-2195; 5, Ashburner M. Misra S. Roote J. Lewis S.E. Blazej R. Davis T. Doyle C. Galle R. 
George R. Harris N. Hartzell G. Harvey D. Hong L. Houston K. Hoskins R. Johnson G. Martin C. Moshrefi A. Palazzolo M. Reese M.G. Spradling A. Tsang G. Wan K. Whitelaw K. Kimmel B. Celniker S. Rubin G.M. Genetics. 1999; 153: 179-219; 6, Rubin G.M. Hong L. Brokstein P. Evans-Holm M. Frise E. Stapleton M. Harvey D.A. Science. 2000; 287: 2222-2224), are providing a great deal of information about gene organization and genome evolution. Twenty-three years ago, with the discovery of splicing, the basic idea of a gene as a "block" of DNA was challenged. Today, exciting new observations are being made. What initially seemed peculiar oddities, such as operons in nematodes and genes nested within genes in fruit flies, are turning out to be more general phenomena. About 25% of C. elegans genes are in operons (for a review see Ref. 7, Blumenthal T. Bioessays. 1998; 20: 480-487), and one estimate suggests that 7% of D. melanogaster genes may be nested within others (5). In this paper, I describe a Drosophila gene coding for an arginase that contains within its third intron the entire elav locus, a well characterized 13-kb-long gene that encodes an RNA-binding protein specifically present in all neurons (for a review see Ref. 8, Yao K.-M. Samson M.-L. Reeves R. White K. J. Neurobiol. 1993; 24: 723-739).
The first identified so-called nested gene (Pcp) encodes a Drosophila pupal cuticle protein and maps within the intron of the adenosine 3 gene encoding guanine-adenosine ribosyl transferase (9, Henikoff S. 
Keene M.A. Fechtel K. Fristrom J.W. Cell. 1986; 44: 33-42). Since then, more than 30 nested genes have been identified in Drosophila (10, FlyBase Consortium Nucleic Acids Res. 1999; 27: 85-88). The recent sequencing of 2.7 megabases of DNA in the Adh region revealed 17 new ones (5). Nested genes are not specific to Drosophila, and examples are known in humans (11, Xu G.F. O'Connell P. Viskochil D. Cawthon R. Robertson M. Culver M. Dunn D. Stevens J. Gesteland R. White R. Cell. 1990; 62: 599-608; 12, Levinson B. Kenwrick S. Lakich D. Hammonds G.J. Gitschier J. Genomics. 1990; 7: 1-11) and in mice (13, Cawthon R.M. O'Connell P. Buchberg A.M. Viskochil D. Weiss R.B. Culver M. Stevens J. Jenkins N.A. Copeland N.G. White R. Genomics. 1990; 7: 555-565). Among the 49 identified Drosophila nested genes, 37 are transcribed in the direction opposite to the direction of transcription of the gene into which they are inserted (5).
The gene that is described in this paper encodes a predicted polypeptide that shares 40–41% amino acid identity with human arginases (for a review see Ref. 14, Jenkinson C.P. 
Grody W.W. Cederbaum S.D. Comp. Biochem. Physiol. B. Biochem. Mol. Biol. 1996; 114: 107-132). It will be referred to as the arginase gene, or arg. Arginases convert arginine into urea and ornithine. Sequence analysis of arginase and arginase-like sequences suggests that arginase was probably present in the primordial ancestor, before the divergence of eucaryotes and procaryotes, and that a gene duplication occurred before the divergence of mammals and amphibians from their most recent common ancestor (14; 15, Patterton D. Shi Y.B. J. Biol. Chem. 1994; 269: 25328-25334; 16, Ouzounis C.A. Kyrpides N.C. J. Mol. Evol. 1994; 39: 101-104).
In terrestrial vertebrates (ureotelic), two forms of the enzyme are produced: arginase I (A-I), a cytosolic form specific for liver and red cells, and arginase II (A-II), a ubiquitous mitochondrial form found in particular in kidneys and brain. A-I is a key enzyme for the hydrolysis of arginine in the urea cycle, producing urea that enables the elimination of excess ammonia. The A-I form contributes 98% of the arginase activity in liver and is absent in clinical arginemia. Patients with arginemia have low levels of arginase activity in red cells, accumulate arginine in their blood and spinal fluid, and have higher than normal blood levels of ammonia. Accumulation of ammonia has very toxic effects in animals, and although patients with moderate decreases of A-I levels respond well to dietary treatments, severe A-I deficiencies cause severe mental and psychomotor retardation, mental disorders, epileptic seizures, coma, and early death. The effect of alteration in the activity of extrahepatic A-II has not been described. 
It has been proposed to be involved in the synthesis of polyamines, amino acids, and neurotransmitters and competes with nitric-oxide synthase in macrophages, but its role remains poorly defined (for reviews see Ref. 14 and Ref. 17, Boucher J.L. Moali C. Tenu J.P. Cell Mol. Life Sci. 1999; 55: 1015-1028). It has been suggested that the mitochondrial A-II is the surviving form of the ancestral arginase, because the cytosolic A-I is restricted to a subset of more recently evolved species (14).
Aside from ureotelic organisms, most others produce only one form of arginase. In particular, in S. cerevisiae, arginase is encoded by the CAR1 gene, which has been extensively characterized (18, Jauniaux J.C. Dubois E. Vissers S. Crabeel M. Wiame J.M. EMBO J. 1982; 1: 1125-1131; 19, Green S.M. Eisenstein E. McPhie P. Hensley P. J. Biol. Chem. 1990; 265: 1601-1607). CAR1 encodes a cytosolic arginase that converts arginine to urea and ornithine. Although S. cerevisiae produces all the enzymes required to complete the urea cycle, compartmental separation of metabolic pathways and sophisticated regulation lead to the absence of a functional urea cycle. Sequencing of the S. cerevisiae genome did not reveal any additional arginase-related sequences (1). In C. elegans, whose entire genome has also been sequenced (2, C. elegans Genome Sequencing Consortium Trends Genet. 
1999; 15: 51-58; 3), only one gene showing significant relationship with known arginase genes has been identified (20, Perozich J. Hempel J. Morris S.M.J. Biochim. Biophys. Acta. 1998; 15: 23-37). No biological information is available yet on the function of this gene.
Animals have evolved pathways adapted to their lifestyle for the excretion of urea, uric acid, or ammonia as the major nitrogenous end product. The excretion of urea produced through the urea cycle is specific for ureotelic organisms. Insects (uricotelic) convert most of their excess ammonia to uric acid, an oxidized purine, via an arginase-independent pathway. Thus, the arginase function in uricotelic organisms, and more generally in organisms that do not have a functional urea cycle, is not centered on ammonia detoxification and remains poorly understood. Analysis of the arginase gene of Drosophila, amenable to molecular and genetic analysis, should give insights into these other aspects of arginase activity.
A Drosophila head cDNA library in the λgt11 vector (a gift from Paul Salvaterra; Ref. 21, Itoh N. Salvaterra P. Itakura K. Drosophila Information Service. 1985; 61: 89) was screened with a genomic probe corresponding to a mixture of two radiolabeled BamHI restriction fragments (kb 8–18 on the scale in Fig. 1) to obtain elav cDNA clones. Among the cDNAs isolated in this screen, one clone (cDNA-30h) originated from a transcription unit different from elav. Its analysis is reported in this paper. cDNA-30h phage lysates were prepared on plates and used to generate PCR products corresponding to the cDNA insert. 
Amplification mixtures were as per the Taq DNA polymerase manufacturer's specifications (Promega), with 2 μl of lysate per 25-μl reaction, using a pair of primers hybridizing to regions flanking the cDNA insert in λgt11, respectively gt11F (5′-GGTGGCGACGAGTCCTGGAGC) and gt11R (5′-GACACCAGACCAACTGGTAAT). Amplification yield was not affected by the lysate titer in the 10⁸–10¹¹ plaque-forming units/ml range. Taq polymerase was added to the PCR mixture after the initial hot start step (95 °C for 1 min and 80 °C for 15 min). Amplification was performed with 30 cycles of 95 °C for 1 min, 55 °C for 1.5 min, and 72 °C for 1.5 min. After purification on the Wizard PCR preps purification system (Promega), the PCR product was sequenced using gt11 primers and subsequently primers hybridizing in the cDNA sequence.
When cDNA-30h was originally characterized, the genomic sequence upstream of the BamHI restriction site at position 1 (see Fig. 1) had not yet been determined. To verify the predicted structure of cDNA-30h and determine the distance between 3′ elav genomic sequences and the genomic region corresponding to cDNA-30h, genomic PCRs (with DNA from the wild type strain CantonS) were performed, using oligonucleotides ARG2 (5′-GTCGCGCTCGCTGCCTAG) and ARG3 (5′-GATAGGCACTCCAGCGAAC), as shown in Fig. 1. Amplification mixtures included 50 ng of genomic DNA in a 25-μl reaction. The enzyme was added to the PCR mixture after the initial hot start step (95 °C for 1 min). Amplification was performed with 30 cycles of 95 °C for 1 min, 55 °C for 1.5 min, and 72 °C for 1.5 min.
Sequence determination of phage and genomic PCR products was performed by the University of Nebraska Medical Center/Eppley molecular biology core facility on Applied Biosystems and Li-Cor sequencers. The sequence of the Drosophila arginase is available from GenBank™ under accession number AF228517.
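As a rough numerical summary of the two PCR protocols described above, the following Python sketch tabulates the length and GC fraction of the four primers quoted in the text and adds up the nominal hold times of each cycling program. It is only an illustration: the names `PRIMERS`, `Step`, `gc_fraction`, and `program_minutes` are hypothetical helpers introduced here, not part of the original study, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass

# Primer sequences exactly as quoted in the text (5' to 3').
PRIMERS = {
    "gt11F": "GGTGGCGACGAGTCCTGGAGC",
    "gt11R": "GACACCAGACCAACTGGTAAT",
    "ARG2": "GTCGCGCTCGCTGCCTAG",
    "ARG3": "GATAGGCACTCCAGCGAAC",
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


@dataclass(frozen=True)
class Step:
    """One hold step of a thermocycler program (illustrative type)."""
    temp_c: float
    minutes: float


def program_minutes(hot_start, cycle, n_cycles):
    """Nominal hold time in minutes; ramping between steps is not counted."""
    start = sum(step.minutes for step in hot_start)
    per_cycle = sum(step.minutes for step in cycle)
    return start + n_cycles * per_cycle


# Cycling steps shared by both protocols: 95 °C 1 min, 55 °C 1.5 min, 72 °C 1.5 min.
CYCLE = [Step(95, 1.0), Step(55, 1.5), Step(72, 1.5)]
# Hot start steps as stated for the phage-lysate PCR and the genomic PCR.
PHAGE_HOT_START = [Step(95, 1.0), Step(80, 15.0)]
GENOMIC_HOT_START = [Step(95, 1.0)]

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
    print("phage PCR:", program_minutes(PHAGE_HOT_START, CYCLE, 30), "min")
    print("genomic PCR:", program_minutes(GENOMIC_HOT_START, CYCLE, 30), "min")
```

Because only hold times are summed, an actual run on a thermocycler would take longer than the totals printed here.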
Sequence analysis and comparisons were performed with the Wisconsin Package (version 10.0; Genetics Computer Group, Madison, WI). Protein sequences were obtained from the following Swiss Protein Data Base accession numbers: P78540 (A-II Homo sapiens), O08691 (A-II Mus musculus), Q91553 (A-II Xenopus laevis), P05089 (A-I H. sapiens), Q61176 (A-I M. musculus), X69820 (A-I X. laevis), P00812 (S. cerevisiae), and GenBank™ accession numbers: AF228517 (D. melanogaster) and U56959 (C. elegans). The Wisconsin Package was used to generate sequence alignments.
Total CantonS RNA was prepared from frozen tissues by the guanidine-HCl method (22, Cox R.A. Methods Enzymol. 1968; 12: 120-129) for developmental Northern blot analysis. Total RNA from arg and elav mutant stocks was purified from whole flies with TRIzol reagent (Life Technologies, Inc.). Northern blot analysis was performed as described in Ref. 8.
In situ hybridization was performed with digoxigenin-substituted dUTP, essentially as in Ref. 23 (Tautz D. Pfeifle C. Chromosoma. 1989; 98: 81-85), using cDNA-30h as a template for transcription.
All crosses were maintained on standard cornmeal medium at 25 °C. Stock elav⁴ w/FM6, w/Dp(1;Y)y⁺ sc was obtained from the Bloomington Stock Center. It carries a double mutation (elav⁴) affecting both elav and arg; FM6 is an X chromosome balancer, and Dp(1;Y)y⁺ sc is a Y chromosome carrying a translocation of the distal region of the X chromosome including elav. Mutant arg stocks (elav⁴/elav⁴/Y; 350-83-1/350-83-1) were built with standard genetic techniques by combining the double mutation elav⁴ with a transposon (350-83-1; Ref. 24, Samson M.-L. Genetics. 1998; 150: 723-733) carrying a functional elav gene but no arg coding sequence. 
elavᵉ⁵ is a loss-of-function elav allele (8). Transposon 353-66-2, similar to 350-83-1, carries a functional elav gene but no arg coding sequence (24).
Developmental delay of the flies carrying the arginase mutation was shown by crossing females elav⁴/elav⁴; 350-83-1/350-83-1 with males y w sn. Recombination occurs in the resulting y w sn/elav⁴; 350-83-1/+ daughters, providing an opportunity to separate elav⁴ (hence arg⁻) from other possible alterations on the X chromosome and thus homogenizing the genetic background. These females were crossed with y w/Y; Tf(2)DmORF3/T(2;3)ap Xa males, where Tf(2)DmORF3 is a transposon providing elav⁺ function (25, Yao K.-M. White K. Mol. Cell. Biol. 1991; 11: 2994-3000) but no arg function and T(2;3)ap Xa is a balancer. The parents were removed from the vials after 4 days. The latter cross yields hemizygous arg⁻ and arg⁺ males, homozygous arg⁺/arg⁺ and heterozygous arg⁻/arg⁺ females, all segregant brothers and sisters. The yellow (y) and elav genes are very closely spaced on the X chromosome (about 150 kb apart, based upon data from Ref. 6) and are at the same coordinates (1–0.0) on the genetic map (10). They recombine at a frequency well below 1%. 
Thus, in the vast majority of cases, y⁺ male progeny are elav⁴ (arg⁻), whereas y male progeny are elav⁺ (arg⁺). Heterozygous arg⁺/arg⁻ females were identified as y⁺ females and homozygous arg⁺/arg⁺ females as y females.
A head cDNA library was screened with a genomic probe corresponding to the presumptive 3′-untranslated region of the elav gene (kb 8–18 on the scale in Fig. 1). Sequencing of the cDNA clones recovered in the screen identified a new transcription unit. The cDNA corresponding to the new transcription unit (cDNA-30h) is 1170 base pairs long. It contains a 349-amino acid-long open reading frame (nucleotides 2–1051), which begins with a tryptophan, indicating that the 5′ region of the mRNA is truncated in cDNA-30h. Searches of protein sequence data bases with the ORF present in cDNA-30h revealed its similarity with arginases.
The arginase sequences of ureotelic (H. sapiens, M. musculus, and X. laevis) and nonureotelic (S. cerevisiae and C. elegans) organisms were compared with the Drosophila putative arginase sequence. Arginases contain about 300 amino acids. They include three conserved regions (arginase signatures 1, 2, and 3 from the N terminus to the C terminus) with charged residues involved in the binding of two manganese ions to histidines and aspartic acids in the motifs (26, Kanyo Z.F. Scolnick L.R. Ash D.E. Christianson D.W. Nature. 1996; 383: 554-557). Comparison of protein sequences highlights the presence of these three arginase family signatures in the Drosophila sequence (Fig. 2). The motifs are present in all the compared sequences, and all the invariant residues of the signatures are present. However, some positions do not match the defined signature sequences. First, in arginase signature 1, Cys is found in C. elegans where Leu, Ile, Val, Met, or Thr is expected, and Gly is found in D. melanogaster where Ser, Thr, Ala, or Val is expected. Second, in arginase signature 3, Ser is found in D. 
melanogaster and Val is found in C. elegans where Pro, Ala, or Gln is expected. Extensive alignments of arginase sequences from different species highlight invariant residues (14, 20). Sequences from 19 arginases from vertebrates and procaryotic/eucaryotic unicellular organisms share 37 invariant amino acids (20), with 36 of them present in the D. melanogaster arginase and 27 of them present in the C. elegans arginase-related sequence (Fig. 2).
Pairwise comparisons (Fig. 3) of vertebrate A-II show that their sequences are 71–85% identical. A-I arginases are 64–87% identical. However, comparisons between the A-I and A-II arginases show a lower degree of similarity, with 54–67% identity. This is consistent with the model of an arginase gene duplication giving rise to the genes for A-I and A-II (15, 20). In contrast, the three proteins from invertebrates similarly resemble hepatic and nonhepatic vertebrate arginases, the identity levels with vertebrate arginases being, respectively, 39–40% for D. melanogaster arginase, 40–45% for S. cerevisiae, and 26–31% for C. elegans. This is consistent with a duplication of the arginase gene giving rise to the genes producing hepatic and extrahepatic arginases after divergence between the vertebrates and the invertebrates (15; 20, Perozich J. Hempel J. 
Morris S.M.J. Biochim. Biophys. Acta. 1998; 15: 23-37).
Figure 2. Protein sequence alignments between the virtual translation of cDNA-30h (D. melanogaster arginase) and representative arginase sequences from the data base. H.s., H. sapiens; M.m., M. musculus; X.l., X. laevis; D.m., D. melanogaster; S.c., S. cerevisiae; C.e., C. elegans. The predicted N terminus of the D. melanogaster arginase ORF includes two additional amino acids (M and W) upstream of the cDNA-30h ORF (see text), shown in parentheses. Below the alignment, a consensus sequence (plurality 7) is shown, where residues present in all nine compared sequences are in bold type. The three black boxes frame the arginase family signatures. Residues are circled at the positions (in the D. melanogaster and C. elegans sequences) that do not match the signature consensus. The 37 positions marked with asterisks are strictly conserved in 19 arginases from procaryotes and eucaryotes (20). Respectively 1 and 10 of these positions, highlighted with small shadowed caps, differ in D. melanogaster and C. elegans. The vertical arrow separates the N-terminal region of the Drosophila arginase (encoded by sequences downstream of the elav transcription unit) from its C-terminal region (encoded by sequences upstream of the elav transcription unit).
Figure 3. Matrix of the percentages of amino acid identity between the arginases aligned in Fig. 2. Percentages were calculated with the GAP program of GCG, as specified under "Experimental Procedures." Species name symbols are as defined in Fig. 2.
Alignment of cDNA-30h with genomic sequence shows that it derives from regions that flank the elav gene (Fig. 1). 
Based upon the polarity of the arg ORF in cDNA-30h, complementary strands of DNA are transcribed to give rise to the elav and arg transcripts. Limited sequence information was available when the cDNA was first identified (kb 0–18 on scale in Fig. 1; GenBank™ accession number AF047180; Ref. 27, Robinow S. Campos A.R. Yao K.-M. White K. Science. 1988; 242: 1570-1572; M.-L. Samson, unpublished observations), and the data suggested that the 3′ end of cDNA-30h derived from a region of DNA upstream of the elav gene. Genomic PCR of D. melanogaster DNA using primers ARG2 (designed according to the cDNA-30h sequence) and ARG3 (designed according to the genomic sequence, see Fig. 1) was performed to demonstrate that elav is within an arg intron. A 1-kb-long PCR product was obtained, and sequencing of its ends proved its specificity (not shown). Subsequent genomic sequencing of the entire region (6), with data from cosmid 171D11 (accession AL009147) and from cosmid 65F1 (accession AL022139) around position kb 14 of the scale in Fig. 1, is in agreement with these data.
The cDNA-30h clone is truncated at its 5′ end, because the ORF that it contains begins with a tryptophan at nucleotide 2 of the sequence. Four arg expressed sequence tags (ESTs) from Drosophila adult head tissues have been reported (4). ESTs are sequences of the 5′ or 3′ end of a cDNA that are generated to rapidly identify expressed genes in the genome. 
None of these ESTs overlaps the junction between exons 3 and 4, but they provide useful information about the 5′ and 3′ ends of the arg RNA. Two of the reported 5′ ESTs (GH02581 and GH02569) begin 99 nucleotides upstream of the first nucleotide of cDNA-30h, where their sequence is colinear with genomic DNA sequence. They are likely to define the actual 5′ end of the transcript that generated cDNA-30h. Their structure was taken into account to predict the true N terminus of the arg ORF, which includes two amino acids (methionine initiation codon, then tryptophan) upstream of the cDNA-30h ORF, as shown in Fig. 1. The ORF in cDNA-30h is followed by 119 nucleotides, ending with a stretch of eight A residues, likely corresponding to the complete 3′-untranslated region of the mRNA. This is consistent with the structure of the 481-base pair-long arg 3′ EST (GH02581), which identifies the same 3′ end for the arg RNA.
The expression of arg was examined during the course of development using single-stranded RNA probes on Northern transfers. A single 1.3-kb-long arg transcript was identified (Fig. 4) whose size is consistent with cDNA-30h sequencing and EST data. arg transcripts start accumulating during the last stages of embryogenesis (16-h-old embryos), quickly reach a plateau at 20 h of development, and are thereafter found at relatively high levels. Not surprisingly, the arg transcript is present, but not significantly enriched, in adult head RNA extracts (Fig. 4, lanes 10 and 11), consistent with the fact that cDNA-30h and the four arginase ESTs derive from head cDNA libraries.
The arg and elav mRNAs arise from transcription of the same genomic region in opposite directions. Based upon data from cDNA-30h, genomic DNA, and the arg ESTs, the structure of the mature arg mRNA is well defined. In contrast, multiple developmentally regulated elav mRNAs are produced from the elav gene (8, Yao K.-M. Samson M.-L. Reeves R. White K. J. Neurobiol. 
1993; 24: 723-739; 28, Campos A.R. Rosen D.R. Robinow S.N. White K. EMBO J. 1987; 6: 425-431), and their precise structure has not been determined. Thus, it is unclear how much overlap may exist between mature transcripts of arg and elav. An attempt was made to re-evaluate the proposal (based upon Northern blot analy