Regulation of multispanning membrane protein topology via post-translational annealing
2015; eLife Sciences Publications Ltd; Volume: 4; Language: English
10.7554/elife.08697
ISSN: 2050-084X
Authors: Reid C. Van Lehn, Bin Zhang, Thomas F. Miller
Topic(s): Protein Structure and Dynamics
Abstract

The canonical mechanism for multispanning membrane protein topogenesis suggests that protein topology is established during cotranslational membrane integration. However, this mechanism is inconsistent with the behavior of EmrE, a dual-topology protein for which the mutation of positively charged loop residues, even close to the C-terminus, leads to dramatic shifts in its topology. We use coarse-grained simulations to investigate the Sec-facilitated membrane integration of EmrE and its mutants on realistic biological timescales. This work reveals a mechanism for regulating membrane-protein topogenesis, in which initially misintegrated configurations of the proteins undergo post-translational annealing to reach fully integrated multispanning topologies. The energetic barriers associated with this post-translational annealing process enforce kinetic pathways that dictate the topology of the fully integrated proteins. The proposed mechanism agrees well with the experimentally observed features of EmrE topogenesis and provides a range of experimentally testable predictions regarding the effect of translocon mutations on membrane protein topogenesis. https://doi.org/10.7554/eLife.08697.001

eLife digest

Proteins are long chains of smaller molecules called amino acids, and are built inside cells by a molecular machine called the ribosome. Many important proteins must be inserted into the membrane that surrounds each cell in order to carry out their role. As these proteins are being built by the ribosome, they thread their way into a membrane-spanning channel (called the translocon) from the inner side of the membrane.
Short segments of these integral membrane proteins (called transmembrane domains) then become embedded in the membrane, while other parts of the protein remain on either side of the membrane. For a membrane protein to work properly, the end of each of its transmembrane domains must be on the correct side of the membrane (i.e., the protein must obtain the correct ‘topology’). The conventional model for this process suggests that topology is fixed when the first transmembrane domain of a protein is initially integrated into the membrane, while the ribosome is still building the protein. This model can explain most integral membrane proteins, which only have a single topology. However, it cannot explain the family of membrane proteins that have an almost equal chance of adopting one of two different topologies (so-called ‘dual-topology proteins’). Van Lehn et al. have now used computer modeling to simulate how a bacterial protein called EmrE (which is a dual-topology protein) integrates into the membrane via the translocon. The results reveal that a few transmembrane domains in EmrE do not fully integrate into the membrane while the ribosome is building the protein. Instead, these transmembrane domains slowly integrate after the ribosome has finished its job. These findings contradict the conventional model and suggest that some membrane proteins only become fully integrated after the protein-building process is complete. The next step in this work is to experimentally test predictions from the computer simulations. https://doi.org/10.7554/eLife.08697.002

Introduction

Integral membrane proteins (IMPs) are central to cellular functions that include signal transduction, transport across the cell membrane, and energy conversion. Performing these roles requires integration of the IMPs into the membrane with the correct topology (i.e., the correct orientation of the fully integrated IMP relative to the membrane).
In most cases, membrane integration proceeds via the Sec translocon, a conserved protein-conducting channel located in the endoplasmic reticulum membrane in eukaryotes or in the plasma membrane in bacteria (White and von Heijne, 2004). During this process, the ribosome or other molecular motor docks to the cytoplasmic opening of the translocon, feeding the nascent protein into the translocon channel (Shao and Hegde, 2011); conformational changes in the lateral gate (LG) helices of the translocon then allow sufficiently hydrophobic segments of the nascent protein to integrate as transmembrane domains (TMD) (Hessa et al., 2005; Egea and Stroud, 2010; Zhang and Miller, 2010; Gogala et al., 2014). The orientation of a single TMD relative to the membrane is determined by factors that include the hydrophobicity of the TMD and the charge and length of the soluble loops that flank the TMD (Goder and Spiess, 2001, 2003; Devaraneni et al., 2011). However, the extent to which these factors influence the topology of multispanning IMPs is less clear. The conventional model of multispanning IMP topogenesis assumes that a single dominant topology is established via the successive integration of TMDs that thread back-and-forth across the membrane in alternating orientations (Blobel, 1980; Wessels and Spiess, 1988; Sadlish et al., 2005). In this cotranslational model, the dominant IMP topology is determined by the orientation of the N-terminal TMD and is primarily dictated by the features of that leading TMD (Hartmann et al., 1989; Borel and Simon, 1996; Dale et al., 2000). However, the cotranslational model is challenged by dual-topology proteins, which exhibit both possible orientations of the fully integrated IMP with respect to the membrane in approximately 1:1 stoichiometry (Rapp et al., 2006, 2007). 
The most thoroughly studied dual-topology protein is the bacterial multidrug transporter EmrE (Chen et al., 2007), which can be biased in favor of a single dominant topology by introducing positive charges to any of its soluble loops (Rapp et al., 2006, 2007; Seppälä et al., 2010). The dominant topology of each EmrE mutant retains the loop with the additional positive charges in the cytoplasm (Seppälä et al., 2010), apparently satisfying the empirical trend known as the ‘positive-inside’ rule, which notes that the combined charge of the cytoplasmic loops (i.e., K+R bias) of an IMP correlates with its dominant topology (von Heijne, 1986). Surprisingly, adding charges to even C-terminal loops can influence the dominant topology of EmrE, suggesting that such mutations have a long-range effect on the orientation of previously translated TMDs. This finding is inconsistent with the cotranslational model and raises interesting questions about IMP topogenesis. At what point is IMP topology established with respect to ribosomal translation? Are TMD orientations locked in during the period in which the nascent IMP is attached to the ribosome (i.e., cotranslationally), or do TMD orientations remain subject to change even upon completion of ribosomal translation (i.e., post-translationally)? In this work, we simulate the topogenesis of EmrE and its mutants to address limitations in the cotranslational model of IMP topogenesis by understanding when IMP topology is established (co- or post-translationally) and how topology is regulated. We use a coarse-grained (CG) model that enables access to a timescale of minutes while retaining sufficient chemical accuracy to capture the forces that drive membrane integration (Zhang and Miller, 2012a). The distribution of topologies predicted by the simulations is in good agreement with previous experimental findings (Rapp et al., 2007; Seppälä et al., 2010).
The simulation results show that TMDs in the dual-topology mutants do not completely integrate by the end of translation; instead, the slow post-translational flipping of loops across the membrane allows misintegrated TMDs to reorient and insert into the membrane. The fully integrated topology is determined by the position of the loop that undergoes flipping most slowly. This work elucidates the mechanism by which dual-topology protein topology is established, reconciles dominant protein topologies with the positive-inside rule, and predicts the role that the translocon plays in mediating multispanning IMP topogenesis. Other examples of post-translational topological changes in diverse multispanning IMP systems suggest that this mechanism may have generality beyond EmrE (Lu et al., 2000; Lambert and Prange, 2001; Kanki et al., 2002; Skach, 2009; Öjemalm et al., 2012; Bowie, 2013; Virkki et al., 2014).

Coarse-grained model

The cotranslational integration and topogenesis of EmrE and its mutants are simulated using a recently developed CG model (Zhang and Miller, 2012a), which we employ essentially unchanged from its original introduction. Figure 1 illustrates the CG representation of a nascent protein and the protocol for simulating membrane integration. The ribosome, translocon, and nascent protein are all composed of CG beads. Each bead has a diameter of σ = 0.8 nm to represent approximately three amino-acid residues. This bead diameter is similar to the Kuhn length of polypeptides (Staple et al., 2008) so that the nascent protein can be treated as a freely jointed chain. The surrounding solvent and lipid bilayer are included implicitly, a technique that is used in other CG models of the translocon (Rychkova and Warshel, 2013). The time-evolution of nascent protein configurations is calculated using Brownian dynamics with a 100 ns timestep.
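To make the Brownian dynamics update concrete, the following is a minimal sketch of one overdamped integration step for beads in a 2D plane. It is an illustration only and not the authors' implementation: the function name `brownian_step`, the diffusion coefficient, the thermal energy, and the force values are all assumptions chosen for the example, since the force field and parameters are specified in Appendix 1 and are not reproduced here.

```python
import math
import random


def brownian_step(positions, forces, diffusion, kT, dt, rng):
    """Advance 2D bead positions by one overdamped (Brownian) step.

    positions, forces: lists of (x, y) tuples, one per bead.
    diffusion: bead diffusion coefficient (hypothetical value, length^2/time).
    kT: thermal energy, in the same units as force * length.
    dt: timestep (the text uses 100 ns; units here are up to the caller).
    rng: a random.Random instance supplying the Gaussian noise.
    """
    noise_scale = math.sqrt(2.0 * diffusion * dt)  # std. dev. of the random kick
    mobility = diffusion / kT                      # Einstein relation
    new_positions = []
    for (x, y), (fx, fy) in zip(positions, forces):
        new_x = x + mobility * fx * dt + noise_scale * rng.gauss(0.0, 1.0)
        new_y = y + mobility * fy * dt + noise_scale * rng.gauss(0.0, 1.0)
        new_positions.append((new_x, new_y))
    return new_positions


if __name__ == "__main__":
    rng = random.Random(0)
    beads = [(0.0, 0.0), (0.8, 0.0)]    # two beads one diameter apart (nm)
    no_force = [(0.0, 0.0), (0.0, 0.0)]
    # Diffusion coefficient and kT below are placeholders, not model parameters.
    print(brownian_step(beads, no_force, diffusion=1e-3, kT=1.0, dt=100.0, rng=rng))
```

Each coordinate receives a deterministic drift proportional to the force and a Gaussian displacement with variance 2·D·dt, which is the standard first-order scheme for overdamped dynamics.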
The kinetics of the LG are modeled as stochastic transitions between a closed conformation, which prevents the nascent protein from exiting from the channel interior to the membrane, and an open conformation, which removes the barrier to membrane entry. All bead positions are projected onto the plane that passes along the translocon channel axis between the helices forming the LG. This off-lattice 2D approximation reflects the cylindrical geometry of the channel and is inspired by previous models of biopolymer translocation through nanopores (Huopaniemi et al., 2006). Beads representing the ribosome enclosure and translocon are placed to approximate their structures (Van den Berg et al., 2004; Frauenfeld et al., 2011). Two negative charges are placed on a bead at the cytosolic end of the translocon LG, whereas two positive charges are placed on a bead at the periplasmic end of the translocon LG. This charge distribution reflects the position of conserved charged residues (White and von Heijne, 2004) near the translocon LG that have been previously shown to affect single-spanning protein topogenesis (Goder et al., 2004). Full details of the model are provided in Appendix 1.

Figure 1 (with 2 supplements). Schematic of Sec-mediated cotranslational integration of EmrE and corresponding simulation representation. (A) At top, an illustration of the structural motifs in EmrE, including indication of the charged residues in the soluble loops with black circles and the transmembrane domain (TMD)/loop numbering scheme that is employed in the text; below, the corresponding sequence of coarse-grained (CG) beads that represent the EmrE amino-acid sequence. TMDs and loops are assigned based on the hydropathy plot and consensus topology prediction shown in Figure 1—figure supplement 1.
(B) At top, a schematic illustration of the sequential integration of TMDs to obtain a multispanning Nperi/Cperi topology, in which both the N- and C-terminal loops are positioned in the periplasm, according to the cotranslational model; below, representative simulation snapshots of EmrE as the nascent protein grows during translation, integrates into the membrane, and exits the channel in the Nperi/Cperi multispanning topology. The nascent protein is colored according to the legend at top, the ribosome is brown, and the translocon is green with translocon charges labeled explicitly. https://doi.org/10.7554/eLife.08697.003

The CG model is well-suited to simulating the kinetics of cotranslational IMP integration, a process that is challenging for atomistic models (Zhang and Miller, 2010; Gumbart et al., 2011; Zhang and Miller, 2012b; Rychkova and Warshel, 2013) due to the large system size (>100,000 atoms) and the long timescale (minutes) of translation. We note that the model does not include nascent protein secondary/tertiary structure, charged lipids, protein chaperones, or an electrostatic potential across the membrane. However, the model does include explicit LG/translation dynamics, electrostatic interactions with the translocon, water/bilayer transfer free energies, and a direct mapping between the nascent protein sequence and the CG representation. The model thus captures the major physicochemical features of the translocon-membrane system (White and von Heijne, 2004). Moreover, the model has been shown to accurately predict features of single-spanning IMP integration and topogenesis (Zhang and Miller, 2012a), including the sigmoidal dependence of stop-transfer efficiency on TMD hydrophobicity (Hessa et al., 2005), the inversion of signal-anchor orientation during translation (Goder and Spiess, 2003), and the effect of translation rates and sequence features on signal-anchor orientation (Goder and Spiess, 2003).
In particular, the model has been shown (Zhang and Miller, 2012a) to correctly describe integration processes that are governed either by thermodynamics (Hessa et al., 2005) or kinetics (Goder and Spiess, 2003), and it has provided a means of understanding the competition between such effects. The model has also been shown to correctly predict the dominant topology for a three-TMD multispanning IMP with a strong positive-inside bias (Zhang and Miller, 2012a). The strong agreement between simulation and experimental results presented in this work further indicates that IMP topological determinants are captured at this CG resolution.

EmrE protein

The EmrE amino-acid sequence includes four hydrophobic domains and five hydrophilic loops, according to both the hydropathy plot and consensus topology prediction shown in Figure 1—figure supplement 1. The hydropathy plot was calculated using the Wimley–White hydrophobicity scale (Wimley et al., 1996). The black line in the hydropathy plot indicates the water–octanol transfer free energy per residue and the overlaid red line shows a moving average using a 7-residue window. The consensus topology prediction was generated by the TOPCONS 1.0 server (Bernsel et al., 2009) and agrees with previous representations of EmrE structural elements (Seppälä et al., 2010). Shaded regions in the hydropathy plot indicate the predicted TMDs and loops. In the CG model, each TMD is represented by four CG beads and each soluble loop is represented by five CG beads, as seen in Figure 1A. The CG beads assume one of four types as determined by the associated amino-acid residues in the nascent protein; these CG bead-types include V (moderately hydrophobic), L (very hydrophobic), Q (neutral-hydrophilic), and K (positively charged). Among these types, the CG beads vary with respect to their charge and their water/membrane transfer free energies (Appendix table 1).
In the hydropathy profile, the N-terminal TMD (TMD1) is less hydrophobic than the other three TMDs, so its beads are assigned the V bead type. All other TMD beads are assigned the L bead type. Beads in each soluble loop are assigned to either the K or Q bead type, depending on the location of positive charges in the amino-acid sequence; positive charges are highlighted in red in the EmrE wild-type amino-acid sequence in Figure 1—figure supplement 1. Each K bead type is assigned a +2 charge, following previous work (Zhang and Miller, 2012a). Negative charges are excluded from the CG representation of EmrE, because EmrE exhibits a small number of such charges (Figure 1—figure supplement 1) and because the experimentally studied EmrE mutations focus only on the addition/removal of positively charged residues (Seppälä et al., 2010). Nonetheless, the effect of negatively charged residues in the CG simulation was explicitly tested in Figure 5—figure supplement 1 and was found to be minor. Similarly, the results of the simulations are robust with respect to changes in the modeling of TMD1 hydrophobicity (Figure 5—figure supplement 1) and loop length (Figure 3—figure supplement 3). Using the CG model, we consider a series of EmrE mutants from Rapp et al. (2007) and Seppälä et al. (2010). We include EmrE mutants with single charge mutations—K3, T28R, A52K, L85R, and R111—from Seppälä et al. (2010) and EmrE mutants with single dominant topologies—EmrE(Ncyto) and EmrE(Nperi)—from Rapp et al. (2007). We also consider a series of mutants in which the protein has either zero positive charge or positive charges in only a single loop—nEmrE, nK3, nT28R1, nT28R2, nT28R, nA52K, nL85R, and nR111—from Seppälä et al. (2010). This list includes all 16 of the EmrE and nEmrE mutants with single added charges studied experimentally by Seppälä et al. (2010); mutants with added C-terminal His residues or an extra TMD are not considered. 
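The bead-assignment rules described above (five-bead loops, four-bead TMDs, V-type beads for TMD1, L-type beads for the other TMDs, and Q- or K-type loop beads) can be sketched as a short sequence builder. This is a hedged illustration: the function names and the way charged bead positions are passed in are assumptions, and the charge placements used in the example are hypothetical rather than the actual entries of Appendix table 2.

```python
N_TMDS = 4            # four hydrophobic domains
BEADS_PER_TMD = 4     # each TMD is four CG beads
BEADS_PER_LOOP = 5    # each soluble loop is five CG beads
K_BEAD_CHARGE = 2     # each K-type bead carries a +2 charge


def build_cg_sequence(charged_loop_beads):
    """Return the CG bead string for a four-TMD, five-loop protein.

    charged_loop_beads: dict mapping a loop number (1-5) to the set of bead
    indices (0-4) within that loop that should be K-type (hypothetical input
    format). All other loop beads are Q-type.
    """
    beads = []
    for loop in range(1, N_TMDS + 2):
        charged = charged_loop_beads.get(loop, set())
        beads.extend("K" if i in charged else "Q" for i in range(BEADS_PER_LOOP))
        if loop <= N_TMDS:
            # TMD1 is less hydrophobic (V-type); the rest are L-type.
            tmd_type = "V" if loop == 1 else "L"
            beads.extend(tmd_type * BEADS_PER_TMD)
    return "".join(beads)


def total_charge(sequence):
    """Net positive charge carried by the K-type beads of a CG sequence."""
    return K_BEAD_CHARGE * sequence.count("K")


if __name__ == "__main__":
    uncharged = build_cg_sequence({})
    print(uncharged, len(uncharged), total_charge(uncharged))
    # Hypothetical single-charge variant: one K bead in the middle of loop 1.
    variant = build_cg_sequence({1: {2}})
    print(variant, total_charge(variant))
```

Under this layout a charge mutation amounts to swapping a single Q bead for a K bead, which mirrors how the text describes switching between Q-type and K-type beads at the appropriate point in the sequence.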
Finally, we include a ‘cotranslationally-biased’, or CB, mutant that has elongated, 10-bead hydrophilic loops and two positive charges in the first, third, and fifth loops to create a strong K+R bias that favors a Ncyto/Ccyto topology (i.e., with both the N-terminal and C-terminal loops in the cytoplasm) according to the positive-inside rule (von Heijne, 1986; Rapp et al., 2006); this protein is expected to be strongly biased towards membrane integration via the cotranslational mechanism, providing a useful comparison with the other EmrE mutants. The CG representation of each mutant is listed in Appendix table 2; for each mutant, charge mutations are reflected by changing between Q-type and K-type beads at the appropriate point in the sequence. Despite its simplicity, we emphasize that the CG representation captures the major features of EmrE and its mutants, including the number of TMDs/loops and the distribution of charges.

Simulation protocol

As illustrated in Figure 1B, the dynamics of the ribosome/nascent protein/translocon complex is directly simulated using the CG model. Each CG trajectory is initiated with a short nascent protein attached to the ribosome exit channel; as a function of time, the nascent protein grows in length (while remaining attached to the ribosome) until it completes translation and is released from the ribosome. The dynamics of the nascent protein continue to be simulated until the protein reaches a fully integrated topology. Simulations are initialized from equilibrated configurations of the nascent protein, initially composed of 9 CG beads, with the C-terminus attached to the ribosome exit channel (Figure 1—figure supplement 2). Translation is performed by adding a new CG bead to the C-terminus of the nascent protein and attaching it to the ribosome exit channel; the previous C-terminus is released from the exit channel.
The simulation is then continued for 125 ms before the next bead is added, a simulation time which corresponds to a translation rate of 24 residues/s (Bilgin et al., 1992). At the end of translation, the C-terminus is released from the ribosome exit channel and simulations are continued until all beads in the TMDs are at least 4.5σ from the origin and integrated with either a Ncyto/Ccyto or Nperi/Cperi topology. The ribosome remains bound to the translocon for the duration of all simulations (Potter and Nicchitta, 2002; Schaltetzky and Rapoport, 2006). The distance threshold ensures that the final configuration of the protein has exited from both the ribosome and translocon channel. The trajectory termination criteria are designed to examine the effects of the Sec-facilitated membrane integration process on EmrE topogenesis. Specifically, it is assumed that upon reaching configurations in which all of the TMDs are integrated into the membrane, the protein topology remains irreversibly fixed for all subsequent times; physical processes that may lead to this irreversibility include the dimerization of EmrE proteins to form functional channels in the membrane (Lloris-Garcerá et al., 2012) or the degradation of undimerized EmrE proteins prior to topological inversion (Woodall et al., 2015). Given the symmetry of the membrane-protein interactions in the absence of the translocon, if the CG trajectories were allowed to run for infinitely long times to reach full equilibration after diffusing away from the translocon, the relative probability of the Ncyto/Ccyto and Nperi/Cperi topologies would be equal, regardless of the protein sequence. The employed trajectory termination criteria thus isolate the role of the non-equilibrium integration process in determining IMP topology. 
A demonstration of the robustness of the reported results to the cutoff values employed in the trajectory termination criteria is provided in the Robustness checks for the trajectory termination criteria section of the ‘Materials and methods’. The integration and orientation of a TMD is interpreted from the positions of hydrophobic beads in each TMD and the third bead in each hydrophilic loop. The coordinate system is defined with the x-axis perpendicular to the bilayer (Figure 1—figure supplement 2). The origin is placed at the center of the channel such that negative x-values indicate cytoplasmic positions. A TMD is considered integrated if −2σ ≤ x ≤ 2σ for all four hydrophobic beads, corresponding to positions within the implicit bilayer, and if all y-positions are outside of the translocon interior. A loop is considered to be in the cytoplasm if the position of the reference bead satisfies x < −σ and in the periplasm if x > σ. The Ncyto/Ccyto topology is reached if the first, third, and fifth loops are positioned in the cytoplasm and the second and fourth loops are positioned in the periplasm. The Nperi/Cperi topology has the opposite loop positions, as shown in Figure 1B. For each mutant, 250 independent trajectories are performed for a total of 4000 CG trajectories and nearly 6000 min of aggregate simulation time. Error bars measure the standard error between 2 blocks of 125 simulated trajectories. Complete system configurations are saved every 50 ms while loop positions and TMD orientations are saved every 1 ms.

Results

Simulations match experimental observations of topology

For all 16 of the EmrE and nEmrE mutants with single added charges studied by Seppälä et al. (2010), Figure 2 compares the experimentally observed IMP topologies with the prediction from the CG model. Specifically, the figure compares the fraction of fully integrated proteins that adopt the Ncyto/Ccyto topology, with the remainder in the Nperi/Cperi topology.
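The geometric criteria used in the simulation protocol to assign TMD integration and loop positions can be expressed compactly in code. The sketch below works in units of σ and uses illustrative function names; the `outside_channel` flag stands in for the y-position check against the translocon interior, whose exact geometry is not given in this section and is therefore treated as an input.

```python
def loop_side(x):
    """Classify a loop from the x-position (in units of sigma) of its reference bead."""
    if x < -1.0:
        return "cyto"
    if x > 1.0:
        return "peri"
    return None  # reference bead is within the bilayer region


def tmd_integrated(bead_xs, outside_channel=True):
    """A TMD is integrated if all hydrophobic beads satisfy -2 <= x <= 2 (sigma)
    and lie outside the translocon interior (passed in as a flag here)."""
    return outside_channel and all(-2.0 <= x <= 2.0 for x in bead_xs)


def classify_topology(loop_xs):
    """Return the multispanning topology implied by five loop reference positions."""
    sides = [loop_side(x) for x in loop_xs]
    if sides == ["cyto", "peri", "cyto", "peri", "cyto"]:
        return "Ncyto/Ccyto"
    if sides == ["peri", "cyto", "peri", "cyto", "peri"]:
        return "Nperi/Cperi"
    return None  # not (yet) a fully alternating topology


if __name__ == "__main__":
    print(classify_topology([-3.0, 2.5, -4.0, 3.0, -2.0]))  # Ncyto/Ccyto
    print(classify_topology([-3.0, -2.5, -4.0, 3.0, -2.0])) # None
    print(tmd_integrated([-1.5, -0.5, 0.5, 1.5]))           # True
```

A trajectory that returns `None` from `classify_topology` corresponds to a configuration in which at least one loop has not yet reached the side required for either fully integrated topology.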
The top and bottom rows show variants of EmrE and nEmrE respectively. Each mutant differs only in the number and location of charges in the hydrophilic loops. A schematic of each mutant drawn with the dominant topology predicted from simulations is included; positive charges are indicated as filled-in circles with additional charges relative to EmrE (top row) or nEmrE (bottom row) highlighted in red. The topologies determined experimentally in Seppälä et al. (2010) are expressed as the fraction of Ncyto/Ccyto topologies by dividing the cell activity of each protein coexpressed with the EmrE(Nperi) mutant by the total growth of the protein coexpressed with either the EmrE(Nperi) or EmrE(Ncyto) mutant (Seppälä et al., 2010), as described in the Experimental interpretation of EmrE topology section of the ‘Materials and methods’.

Figure 2 (with 2 supplements). Topologies determined from simulations (blue) and compared to the experiments of Seppälä et al. (2010) (red), reporting the fraction of fully integrated integral membrane protein (IMP) configurations in the Ncyto/Ccyto topology. Error bars indicate the standard error measured from independent blocks of simulations or taken from Seppälä et al. (2010). The dominant topology for each mutant is indicated schematically with additional positive charges relative to EmrE (top) or nEmrE (bottom) drawn as red dots. https://doi.org/10.7554/eLife.08697.006

It is clear from Figure 2 that the simulations are in excellent qualitative agreement with the experiments by correctly predicting the near 1:1 stoichiometry of wild-type EmrE and identifying the dominant topology for nearly all of the proteins considered.
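The simulated topology fractions and their block-based error bars can be estimated along the following lines. This is a sketch under stated assumptions: the text specifies only that the standard error is measured between 2 blocks of 125 trajectories, so the estimator used here (sample standard deviation of the block means divided by the square root of the number of blocks) is one common choice and may differ from the authors' exact formula. The function name and input format are illustrative.

```python
import math


def block_standard_error(outcomes, n_blocks=2):
    """Estimate the Ncyto/Ccyto fraction and its block standard error.

    outcomes: list of booleans, True if a trajectory ended in the
    Ncyto/Ccyto topology and False if it ended Nperi/Cperi.
    Returns (mean of block fractions, standard error of that mean).
    """
    block_size = len(outcomes) // n_blocks
    block_means = [
        sum(outcomes[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]
    grand_mean = sum(block_means) / n_blocks
    # Unbiased sample variance of the block means.
    variance = sum((m - grand_mean) ** 2 for m in block_means) / (n_blocks - 1)
    return grand_mean, math.sqrt(variance / n_blocks)


if __name__ == "__main__":
    # Toy outcomes for illustration only; these are not simulation results.
    toy = [True] * 100 + [False] * 25 + [True] * 75 + [False] * 50
    print(block_standard_error(toy))
```

With only two blocks the estimate reduces to half the absolute difference between the two block fractions, so the error bars should be read as a rough indication of sampling variability.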
Figure 2—figure supplement 2 illustrates that the distribution of topologies determined experimentally and the distribution of topologies measured from the simulations are linearly correlated (Pearson correlation coefficient, r = 0.92); points lying in the two shaded quadrants of the graph correspond to proteins for which the simulations and experiments predict consistent topologies. All mutants, with the exception of A52K, have the same dominant topology in the simulations as in the experiments within the statistical error. The agreement between simulations and experiments suggests that the CG model correctly reproduces the essential molecular features of topogenesis; in the following, we analyze the ensembles of CG trajectories that give rise to these computed IMP topologies.

Dual-topology proteins exhibit slow post-translational integration

To investigate the molecular processes that govern the establishment of EmrE topology, we first examine the kinetics by which fully integrated topologies are reached. As a function of time, Figure 3A shows the fraction of CG trajectories in which the studied protein reaches a fully integrated topology for several EmrE mutants and the CB mutant. A time of 0 s corresponds to the end of translation and negative values of time correspond to the period that precedes the end of ribosomal translation in which the nascent protein is still attached to the ribosome. Over 90% of the CB mutant trajectories reach the Ncyto/Ccyto topology within 3 s following the completion of translation and thus rapidly integrate as expected for the cotranslational model (Blobel, 1980; Wessels and Spiess, 1988; Sadlish et al., 2005); mechanistic features of individual TMD integration steps are discussed in the Cotranslational integration pathways section of the ‘Materials and methods’.
In contrast, all variants of EmrE reach a fully integrated topology much more slowly, requiring hundreds of seconds for some CG trajectories to fully integrate (see also Figure 3—figure supplement 2).

Figure 3 (with 3 supplements). Kinetics of EmrE topogenesis. (A) Fraction of CG trajectories in which all TMDs are fully integrated in a multispanning topology, plotted as a function of time for several mutants. (B) Fraction of CG trajectories in which each TMD is integrated, plotted as a function of time for the cotranslationally-biased (CB) mutant (top) and EmrE (bottom). The snapshots show an example of a simulation in which TMD4 of EmrE does not integrate during translation. In both panels, 0 s corresponds to the end of translation and negative values of time correspond to the period that precedes the end of ribosomal translation. https://doi.org/10.7554/eLife.08697.009

The slow post-translational integration of the dual-topology EmrE mutants is due to the fact that a significant fraction of trajectories exhibit configurations in which some TMDs are not fully integrated at the end of translation. As a function of time, Figure 3B shows the fraction of CG trajectories in which each TMD is integrated for both the CB mutant (top) and EmrE (bottom). TMDs in the CB mutant integrate sequentially with near 100% efficiency during translation, which is consistent with the standard cotranslational model of topogenesis (cf. Figure 1) and explains the rapid timescale for fully integrating into a multispanning topology shown in Figure 3A. In contrast, the TMDs of EmrE exhibit only partial integration, even at long times following the completion of translation. Snapshots of a typical misintegrated TMD in EmrE are shown in Figure 3B.
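The kinetic curves described for Figure 3A, the fraction of trajectories that have reached a fully integrated topology by a given time, can be computed from per-trajectory completion times as sketched below. The function names and the input format (one completion time per trajectory, in seconds relative to the end of translation) are assumptions made for illustration, and the numbers in the usage example are arbitrary toy values rather than simulation output.

```python
def fraction_integrated(completion_times, t):
    """Fraction of trajectories fully integrated at or before time t.

    completion_times: one value per trajectory, in seconds relative to the
    end of translation (0 s), at which the trajectory first reached a
    fully integrated multispanning topology.
    """
    done = sum(1 for tc in completion_times if tc <= t)
    return done / len(completion_times)


def integration_curve(completion_times, time_grid):
    """Evaluate the cumulative integrated fraction on a grid of times."""
    return [(t, fraction_integrated(completion_times, t)) for t in time_grid]


if __name__ == "__main__":
    # Arbitrary toy completion times (s); not taken from the simulations.
    toy_times = [0.5, 1.0, 2.5, 40.0]
    for t, frac in integration_curve(toy_times, [0.0, 1.0, 3.0, 100.0]):
        print(f"t = {t:6.1f} s  fraction integrated = {frac:.2f}")
```

A rapidly integrating protein such as the CB mutant would give a curve that saturates within a few seconds of the end of translation, whereas a long tail of late completion times produces the slow rise described for the EmrE variants.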
Various experiments have indicated that such configurations with misintegrated TMDs arise due to frustration from charges placed in consecutive loops (Gafvelin and von Heijne, 1994), the strong orientational preference of a neighboring TMD (Öjemalm et al., 2012), or the weak stop-transfer efficiency of marginally hydrophobic TMDs (Moss et al., 1998). Consistent with these experimental observations, the simulations in Figure 3B find that the weakly hydrophobic TMD1 of EmrE integrates the least efficiently, followed by TMD4 which is flanked by two charged loops.

The proposed mechanism

Kinetic annealing of the end-of-translation ensemble

Analysis of the simulated CG trajectories reveals a straightforward molecular mechanism by which the multispanning topology of EmrE and its mutants is established. This mechanism, which we refer to as kinetic annealing of the end-of-translation (EOT) ensemble, is illustrated in Figure 4 and involves two basic steps. In the first step, the cotranslational integration (or misintegration) of each TMD leads to an ensemble of IMP configurations associated with the time at which ribosomal translation completes; we call this set of configurations the EOT ensemble. In the second step of the proposed mechanism, configurations in the EOT ensemble anneal toward a fully integrated topology as a function of time as loops post-translationally flip across the membrane. The rate at which the soluble loops undergo post-translational flipping is a key determinant of the fully integrated topology. We will show that this mechanism explains the unexpected elements of EmrE topogenesis observed experimentally, including