Crystal structure of an eIF4G‐like protein from Danio rerio
2010; Wiley; Volume: 78; Issue: 7; Language: English
10.1002/prot.22703
ISSN 1097-0134
Authors: Euiyoung Bae, E. Bitto, C.A. Bingman, Jason G. McCoy, G.E. Wesenberg, G.N. Phillips
Topic(s): RNA modifications and cancer
Abstract
The gene LOC791917 of Danio rerio (zebrafish) encodes a protein annotated in the UniProt knowledgebase1 as the “middle domain of eukaryotic initiation factor 4G domain containing protein b” (MIF4Gdb). Its molecular weight is 25.8 kDa, and it comprises 222 amino acid residues. BLAST searches revealed homologues of D. rerio MIF4Gdb in many eukaryotes, including humans.2 The homologues and MIF4Gdb were identified as members of the Pfam family MIF4G (PF02854), which is named after the middle domain of eukaryotic initiation factor 4G (eIF4G).3-5 eIF4G is a component of the eukaryotic translation initiation complex and contains binding sites for other initiation factors, suggesting a critical role in translational initiation.6 The MIF4G domain also occurs in several other proteins involved in RNA metabolism, including the nonsense-mediated mRNA decay 2 protein (NMD2/UPF2) and the nuclear cap-binding protein 80-kDa subunit (CBP80).5 Sequence and structure analyses of the MIF4G domains in many proteins indicate that the domain assumes an all-helical fold and has tandemly repeated motifs.5, 7 The zebrafish protein described here has homology to domains of other proteins variously referred to as NIC-containing proteins (NMD2, eIF4G, CBP80). The biological function of D. rerio MIF4Gdb has not yet been experimentally characterized, and the annotation is based on amino acid sequence comparison. D. rerio MIF4Gdb did not share more than 25% sequence identity with any protein of known three-dimensional structure and was therefore selected as a target for structure determination by the Center for Eukaryotic Structural Genomics (CESG). Here, we report the crystal structure of D. rerio MIF4Gdb (UniGene code Dr.79360, UniProt code Q5EAQ1, CESG target number GO.79294).
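As a quick plausibility check on the two figures quoted above (25.8 kDa and 222 residues), the average mass per residue can be computed directly. The sketch below is illustrative only; the function name is hypothetical, and the comparison with the often-quoted rule of thumb of roughly 110 Da per residue is an outside assumption, not a value from this report.

```python
def mean_residue_mass(mass_kda, n_residues):
    """Return the average mass per residue in daltons.

    mass_kda   -- molecular weight in kilodaltons (25.8 in the text)
    n_residues -- number of amino acid residues (222 in the text)
    """
    return mass_kda * 1000.0 / n_residues


avg = mean_residue_mass(25.8, 222)
# Rule-of-thumb value (about 110 Da per residue) is an assumption for comparison.
print(f"{avg:.1f} Da per residue")
```

The result, a little above the rule-of-thumb value, is consistent with an ordinary polypeptide of this length.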
The gene coding for MIF4Gdb was selected as part of a group of target genes chosen because they encode proteins that are as dissimilar as possible both to structures previously deposited in the Protein Data Bank and to targets that CESG had previously selected. It was assigned the project database identifier GO.79294. Complete, detailed protocols for the production of this protein can be found in PepcDB.8 Briefly, the gene was cloned into pVP33K, the first production Flexi® Vector9 used on our project, and selenomethionyl protein was purified following the standard CESG pipeline protocols for cloning,10 protein expression,11 protein purification,12 and overall information management.13 Initial crystallization screens were conducted at 4°C and 20°C in Corning 3775 plates, using a local screen called UW-192. Crystal growth was monitored using Bruker Nonius Crystal Farms at 4°C and 20°C, and scored using Crystal Farm Navigator (Nexus Biosystems). A Tecan Genesis RSP 150 robot assembled precipitant solutions for optimization experiments. Diffraction-quality crystals were grown in hanging drop batch experiments at 20°C from 10 mg/mL protein solution in buffer (50 mM NaCl, 3 mM NaN3, 0.3 mM tris(2-carboxyethyl)phosphine (TCEP), 5 mM Bis-Tris pH 7.0) mixed with an equal amount of reservoir solution containing 7% (w/v) poly(ethylene glycol) (PEG) 4K, 0.4 M NaCl, and 100 mM MES/acetate pH 5.5. The crystals were cryoprotected in 15% (w/v) ethylene glycol, 10% (w/v) PEG 4K, and 100 mM MES/acetate pH 5.5, and were flash-frozen in liquid nitrogen. Diffraction data were collected at 100 K at the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory.
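The crystallization condition above mixes protein solution and reservoir solution in equal amounts, so the starting composition of the drop is the average of the two. The following sketch works through that arithmetic. It assumes that "equal amount" means equal volume, that volumes are additive, and that no equilibration has yet taken place; the function and dictionary names are hypothetical.

```python
def mix_equal_volumes(solution_a, solution_b):
    """Average two recipes component by component (1:1 volume mix).

    A component absent from one solution is treated as 0 in that solution.
    """
    names = sorted(set(solution_a) | set(solution_b))
    return {
        name: (solution_a.get(name, 0.0) + solution_b.get(name, 0.0)) / 2.0
        for name in names
    }


# Concentrations as stated in the text (0.4 M NaCl written as 400 mM).
protein_solution = {
    "protein (mg/mL)": 10.0,
    "NaCl (mM)": 50.0,
    "NaN3 (mM)": 3.0,
    "TCEP (mM)": 0.3,
    "Bis-Tris (mM)": 5.0,
}
reservoir_solution = {
    "PEG 4K (% w/v)": 7.0,
    "NaCl (mM)": 400.0,
    "MES/acetate (mM)": 100.0,
}

drop = mix_equal_volumes(protein_solution, reservoir_solution)
for name, value in drop.items():
    print(f"{name}: {value:g}")
```

Under these assumptions the drop starts at 5 mg/mL protein, 225 mM NaCl, and 3.5% (w/v) PEG 4K.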
The diffraction images were processed with HKL2000.14 The selenium substructure of the crystal was determined by using SHELXD15 and HySS from PHENIX,16, 17 and the selenium positions were used for single-wavelength anomalous diffraction phasing in autoSHARP.18 The initial model was built by the automatic tracing procedure of ARP/wARP,19 and the structure was completed using alternating cycles of manual building in COOT20 and refinement in REFMAC5.21 The stereochemical quality of the final model was assessed using MolProbity.22 PyMol was used to generate figures.23 The final coordinates were deposited in the RCSB Protein Data Bank24 with accession number 2I2O.

The crystal structure of MIF4Gdb from D. rerio was determined to a resolution of 1.92 Å using single-wavelength anomalous diffraction. Data collection and refinement statistics are summarized in Table I. The asymmetric unit of the structure contains two MIF4Gdb chains (residues 7–217 for chain A; residues 8–217 for chain B). Several N- and C-terminal residues were not included in the model due to insufficient electron density. The structure of the MIF4Gdb monomer reveals a crescent-shaped molecule consisting entirely of helices (13 α- and two 3₁₀-helices) and connecting loops (see Fig. 1). Except for the two terminal ones (h1 and h15), the 13 helices (h2–h14) are arranged as four huntingtin-elongation-A subunit-TOR (target of rapamycin)-like (HEAT-like) motifs containing armadillo repeats.25 Each HEAT-like motif consists mainly of two antiparallel α-helices (termed A and B) that are held together by hydrophobic interactions along their adjacent sides. The eight longer α-helices (h3, h5, h6, h8, h10, h12–h14) serve as the main constituents of the four HEAT-like motifs, forming the characteristic antiparallel α-helical pairs, and the five shorter helices (h2, h4, h7, h9, and h11) are located either within a motif or between motifs, mediating turns between adjacent helices.
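The residue ranges reported above for chains A and B can be turned into a simple measure of how much of each chain was modeled. This is a minimal sketch, assuming the residue numbering is contiguous within each range and refers to the 222-residue sequence mentioned earlier; the function name is hypothetical.

```python
def modeled_fraction(first, last, total):
    """Return (number of modeled residues, fraction of the full sequence)."""
    n_modeled = last - first + 1  # inclusive range
    return n_modeled, n_modeled / total


TOTAL_RESIDUES = 222  # sequence length stated in the text
MODELED_RANGES = {"A": (7, 217), "B": (8, 217)}  # ranges stated in the text

for chain, (first, last) in MODELED_RANGES.items():
    n_modeled, fraction = modeled_fraction(first, last, TOTAL_RESIDUES)
    print(f"chain {chain}: {n_modeled} residues modeled ({fraction:.1%})")
```

On these assumptions, roughly 95% of each chain is present in the model, with the remainder being the disordered termini.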
Each subunit has a nickel (or perhaps zinc) atom bound on the concave side of the crescent-shaped dimer. Whether there is a physiological role for the bound metal ion is unknown.

Figure 1. Crystal structure of D. rerio MIF4Gdb. (A) Ribbon diagram of D. rerio MIF4Gdb structure. A rainbow gradient is used to color each polypeptide chain from its N-terminus (blue) to C-terminus (red). Pairs of α-helices can be seen to form an extended sheet. The C-termini form dimer contacts and the N-terminal regions are available for interaction with other proteins, in keeping with other homologs. (B) Representative electron density of the MIF4Gdb structure. Amino acid residues are colored by atom types (carbon: green; oxygen: red; nitrogen: blue), and the 2Fobs − Fcalc map is contoured at 1.5σ.

The consecutive HEAT-like motifs are stacked on each other and the polypeptide chain forms a right-handed solenoid. The stacking of the four HEAT-like motifs is parallel; that is, the helices of the same type (A or B) in the motifs are located side by side. This parallel arrangement of the four HEAT-like motifs generates a double layer of α-helices in which the four A helices form one face and the four B helices form the other. The structure has an extended hydrophobic core, which is essentially the area between the two layers formed by the A and B helices. The hydrophobic core is stabilized by salt bridges and van der Waals interactions between the conserved nonpolar residues. Tandem arrays of HEAT-like motifs are found in a wide variety of proteins, where they serve as scaffolding modules for the assembly of large multiprotein complexes.25 They include huntingtin protein, protein phosphatase 2A (pp2A), importin β, elongation factor 3, and many others.
The closest structural neighbors of the MIF4Gdb monomer identified by a vector alignment search tool (VAST) search are the middle segment of eukaryotic initiation factor 4GI (eIF4GI) from Saccharomyces cerevisiae (PDB ID: 2VSX) and the middle domain of human eIF4GII (PDB ID: 1HU3).26, 27 The MIF4Gdb structure described here superposes onto the homologous chain of the yeast structure with a VAST score of 12.4, an RMSD of 2.5 Å, and 20.4% sequence identity over 167 aligned residues. The overlay with the human protein is comparable, with a score of 12.5, an RMSD of 2.4 Å, and 24.7% identity over 154 amino acids. The next closest structure in the list is pp2A (PDB ID: 3GFA), with only 8.9% sequence identity. The comparison of these structures reveals the same overall fold, with slight differences in the orientation of the N- and C-terminal helices. All three appear to be dimers in the physiological state. The dimeric interfaces among these three most similar proteins are also conserved, based on analysis with the protein interfaces, surfaces and assemblies (PISA) server,28 giving Q scores of 0.371 and 0.370 for 1HU3 and 2VSX, respectively, against our structure. The yeast eIF4GI is one of the two isoforms (eIF4GI and eIF4GII) of the translation initiation factor eIF4G, a modular adaptor protein that recruits the components necessary for the initiation of protein synthesis in eukaryotes.29, 30 In the yeast complex, the helical domains of eIF4G serve to orient the DEAD-box sequence motifs of an RNA helicase in such a way that they become active.
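The RMSD and percent-identity figures quoted for the VAST superpositions above can be read more concretely with a short calculation. The sketch below is not VAST's algorithm: it assumes the two coordinate sets are already superposed and paired atom by atom, and the toy coordinates are invented purely for illustration. The second helper converts the reported percent identity into an approximate count of identical aligned residues. All names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superposed 3D points."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def identical_residues(percent_identity, aligned_length):
    """Approximate number of identical positions in an alignment."""
    return round(percent_identity / 100.0 * aligned_length)


# Invented toy coordinates (in Å): every atom is displaced by 2 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
toy_rmsd = rmsd(toy_a, toy_b)

# Percent identity and aligned lengths as reported in the text.
yeast_identical = identical_residues(20.4, 167)
human_identical = identical_residues(24.7, 154)
print(toy_rmsd, yeast_identical, human_identical)
```

Read this way, the reported identities correspond to only a few dozen identical positions in each alignment, which is why the structural comparison carries more weight than the sequence comparison here.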
Some eIF4G proteins also bind other eukaryotic initiation factors and picornaviral internal ribosome entry site (IRES) elements.31 Previous structural and mutational studies identified several residues of eIF4G involved in eIF4A and IRES binding.27, 32 Although some of these residues are also present in MIF4Gdb, it is not straightforward to predict an eIF4G-like function for MIF4Gdb on this basis, as the ligand-binding residues are poorly conserved even among the eukaryotic eIF4G homologues.32 In a more recent study, a human protein with 72% sequence identity to the zebrafish MIF4Gdb was shown to be involved in stem-loop-mediated rather than poly(A)-mediated translation,33 which is common for histone mRNAs. This human protein has been named SLIP1, for stem loop binding protein (SLBP) interacting protein 1. Given the sequence identity and the known role of similar proteins, it seems quite possible that the protein whose structure is described here also acts as a scaffold that helps assemble components of the translation machinery.

We thank all the members of the CESG. Data were collected at the SER-CAT 22-ID beamline at the APS, Argonne National Laboratory. We thank John Chrzas and John Gonczy for support during our data collection. Supporting institutions may be found at www.ser-cat.org/members.html. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. W-31-109-Eng-38.
Reference(s)