Capturing protein communities by structural proteomics in a thermophilic eukaryote
2017; Springer Nature; Volume: 13; Issue: 7; Language: English
10.15252/msb.20167412
ISSN 1744-4292
Authors: Panagiotis L. Kastritis, Francis J. O'Reilly, Thomas Bock, Yuanyue Li, Zbigniew Rogon, Katarzyna Buczak, Natalie Romanov, Matthew J. Betts, Khanh Huy Bui, Wim J. H. Hagen, Marco L. Hennrich, Marie-Therese Mackmull, Juri Rappsilber, Robert B. Russell, Peer Bork, Martin Beck, Anne-Claude Gavin
Topic(s): Advanced Proteomics Techniques and Applications
Summary: Article. 25 July 2017. Open Access. Transparent process.

Author Information
Panagiotis L Kastritis1,‡, Francis J O'Reilly1,2,‡, Thomas Bock1,‡, Yuanyue Li1, Matt Z Rogon1, Katarzyna Buczak1, Natalie Romanov1, Matthew J Betts3, Khanh Huy Bui1,4, Wim J Hagen1, Marco L Hennrich1, Marie-Therese Mackmull1, Juri Rappsilber2,5, Robert B Russell3, Peer Bork1, Martin Beck*,1 and Anne-Claude Gavin*,1
1 European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
2 Chair of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, Berlin, Germany
3 Cell Networks, Bioquant & Biochemie Zentrum Heidelberg, Heidelberg University, Heidelberg, Germany
4 Department of Anatomy and Cell Biology, McGill University, Montreal, QC, Canada
5 Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
‡ These authors contributed equally to this work
* Corresponding author. Tel: +49 6221 387 8267; E-mail: [email protected]
* Corresponding author. Tel: +49 6221 387 8816; E-mail: [email protected]
ORCID: Panagiotis L Kastritis, orcid.org/0000-0002-1463-8422; Thomas Bock, orcid.org/0000-0002-9314-5318; Yuanyue Li, orcid.org/0000-0001-5971-0355; Natalie Romanov, orcid.org/0000-0003-3749-2654; Khanh Huy Bui, orcid.org/0000-0001-8231-3016; Marco L Hennrich, orcid.org/0000-0003-2853-9748; Marie-Therese Mackmull, orcid.org/0000-0003-2928-1144; Juri Rappsilber, orcid.org/0000-0001-5999-1310; Martin Beck, orcid.org/0000-0002-7397-1321; Anne-Claude Gavin, orcid.org/0000-0003-4917-2340

Molecular Systems Biology (2017) 13: 936. https://doi.org/10.15252/msb.20167412

Abstract

The arrangement of proteins into complexes is a key organizational principle for many cellular functions. Although the topology of many complexes has been systematically analyzed in isolation, their molecular sociology in situ remains elusive. Here, we show that crude cellular extracts of a eukaryotic thermophile, Chaetomium thermophilum, retain basic principles of cellular organization. Using a structural proteomics approach, we simultaneously characterized the abundance, interactions, and structure of a third of the C. thermophilum proteome within these extracts. We identified 27 distinct protein communities that include 108 interconnected complexes, which dynamically associate with each other and functionally benefit from being in close proximity in the cell. Furthermore, we investigated the structure of fatty acid synthase within these extracts by cryoEM, which revealed multiple, flexible states of the enzyme in adaptation to its association with other complexes, thus exemplifying the need for in situ studies. As the components of the captured protein communities are known, at both the protein and complex levels, this study constitutes another step forward toward a molecular understanding of subcellular organization.
Synopsis

An integrative structural systems biology approach is presented to systematically characterize native protein communities of dynamically associated protein complexes. Cryo-electron microscopy detects a metabolon involved in fatty acid synthesis in unprecedented molecular detail.

In addition to the grouping of proteins into complexes, intracellular function requires a further layer of organization that involves multiple spatially and temporally interacting macromolecular complexes, or protein communities. However, experimental approaches to capture this higher-order proteome organization are still missing. Here, we show that crude cellular fractions from a thermophilic eukaryote retain basic principles of proteome organization and can be exploited to capture protein communities through integrative structural biology approaches. We report a compendium of 27 protein communities and have experimentally characterized and structurally analyzed one of these, which comprises enzymes involved in fatty acid metabolism. From the crude extracts, we obtained a cryo-EM structure of fungal fatty acid synthase that reveals a thus far uncharacterized catalytic intermediate. We demonstrate the feasibility of high-resolution cryoEM without the need to obtain biochemically highly homogeneous samples.

Introduction

As the molecular machines of the cell, protein complexes are the cornerstones of most biological processes, and are the smallest, basic functional and structural units of proteome organization (Duve, 1975; Gavin et al, 2002; Krogan et al, 2006). Many individual studies and extensive proteome-wide screens in a variety of organisms have identified comprehensive repertoires of protein complexes and have provided insights into their molecular composition and anatomy (Gavin et al, 2002; Krogan et al, 2006; Kuhner et al, 2009; Amlacher et al, 2011; Havugimana et al, 2012; Lapinaite et al, 2013; von Appen et al, 2015; Hoffmann et al, 2015; Wan et al, 2015; Yan et al, 2015).
These studies relied on extensive biochemical purification, often including multiple sequential steps or dimensions, and so inherently selected for the most biophysically stable assemblies. However, protein complexes—as an organizational principle—cannot account alone for the complex integration of the many cellular processes in situ. Additional layers of functional organization, beyond free diffusion and random collision of functional biomolecules within organelles, are required to ensure, for example, the efficient transfer of substrates along enzymatic pathways (dubbed metabolons; Srere, 1987; Wu & Minteer, 2015; Wan et al, 2015; Wheeldon et al, 2016), the effective transduction of signals (Wu, 2013), and the synthesis of proteins according to the local cellular needs (Gupta et al, 2016). This requires spatially and temporally synchronized sets of protein complexes—protein communities (Barabasi & Oltvai, 2004; Menche et al, 2015)—which we define as higher-order, often dynamically associated, assemblies of multiple macromolecular complexes that benefit from their close proximity to each other in the cell. To date, protein communities have not been properly conceptualized because experimental frameworks to capture this higher-order proteome organization are missing. We used cell fractions from a thermophilic eukaryote, Chaetomium thermophilum (Amlacher et al, 2011), to delineate and characterize protein communities in crude extracts that retain aspects of cellular complexity. Our experimental design, in particular our choice of a thermophilic organism to minimize the disassembly of protein–protein interactions and the respective fractionation conditions, favors the identification of especially higher molecular weight species. To cope with the complexity of such samples, we combined quantitative mass spectrometry (MS) with electron microscopy (EM) and computational modeling approaches. 
We computed a network capturing various communities and demonstrate its usefulness for further analysis. We used cross-linking mass spectrometry (XL-MS) and EM to validate our approach, which shows that crude cellular extracts retain the basic principles of proteome organization. They are amenable to high-resolution cryoEM analyses of the sociology of protein complexes within their higher-order assemblies. As the proteins can be readily identified within these extracts, our methodological framework complements the emerging single-cell structural biology approaches that provide high-resolution snapshots of subcellular features (Beck & Baumeister, 2016; Mahamid et al, 2016) but are currently unable to pinpoint the underlying biomolecular entities.

Results

Cellular fractions serve as a proxy for the cellular environment and retain basic principles of cellular organization

Many fundamental components of the cell were first structurally investigated from thermophilic archaea because protein interactions in thermophiles have higher stability compared to their mesophilic counterparts. We chose to study the thermophilic eukaryote Chaetomium thermophilum, a promising model organism for structurally investigating eukaryotic cell biology, because its protein communities may be more robust than those from other model systems. Large-scale analyses based on extensive, multi-dimensional fractionation have been applied to characterize protein complexes from various organisms and cell lines. These have all demonstrated that protein complexes, as biochemically highly stable entities, are a ubiquitous organizational principle (Wan et al, 2015). Our goal here was to capture more transient, higher-order associations and to characterize the functional organization of a eukaryotic proteome under conditions that mimic the native, cellular state.

To achieve this, we obtained simple and crude cellular fractions (simplified cell lysates) from the thermophilic fungus C. thermophilum by single-step analytical size exclusion chromatography (SEC; Fig 1). The chromatographic method used here achieves relatively high resolution compared with gel filtration methods commonly used on a preparative scale (Kristensen et al, 2012), and the resulting 30 fractions span molecular weights ranging from ~0.2 to ~5 MDa. We first analyzed these fractions in biological triplicate by label-free quantitative liquid chromatography–mass spectrometry (LC-MS/MS) to characterize co-eluting proteins, complexes, and communities. We identified 1,176 proteins across all fractions that were present in at least two of the triplicates (Dataset EV1, Appendix Fig S1A), which account for 27.4% of the expressed proteome of C. thermophilum (Bock et al, 2014). For comparison, in human HeLa and U2OS cell lines, 19 and 29% of the proteome elutes in these high molecular weight SEC fractions, respectively (Kristensen et al, 2012; Kirkwood et al, 2013). Of these 1,176 proteins, 97% have a molecular weight < 200 kDa as a monomer but were still reproducibly identified in fractions corresponding to larger molecular masses, suggesting that most are engaged in large macromolecular assemblies.

Figure 1. Overview of integrative structural network biology of native cell extracts in a thermophilic eukaryote
We combined computational modeling approaches adapted from network biology (molecular profiling) with molecular biophysics, electron microscopy (EM; structural profiling), and quantitative and cross-linking mass spectrometry (interface profiling) to systematically chart and characterize the organization of protein complexes into functional, local communities. Large-scale electron microscopy and cross-linking mass spectrometry are used as validation tools.

Next, we determined an experimental elution profile for each protein by quantifying protein abundance based on iBAQ scoring (Schaab et al, 2012).
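The replicate filter described above (a protein is retained when it is detected in at least two of the three biological replicates) can be sketched as follows. This is only an illustration: the function name, the set-based input format, and the protein identifiers are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter


def reproducible_proteins(replicates, min_replicates=2):
    """Return the proteins detected in at least `min_replicates` replicates.

    `replicates` is a list of sets of protein identifiers, one set per
    biological replicate (assumed input format, for illustration only).
    """
    counts = Counter(p for detected in replicates for p in set(detected))
    return {p for p, n in counts.items() if n >= min_replicates}


# Hypothetical identifiers, not real C. thermophilum accessions.
rep1 = {"P1", "P2", "P3"}
rep2 = {"P2", "P3", "P4"}
rep3 = {"P3", "P5"}

kept = reproducible_proteins([rep1, rep2, rep3])
print(sorted(kept))  # ['P2', 'P3']
```

Applied to the real data, the size of the retained set divided by the size of the expressed proteome would give the coverage figure quoted in the text.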
The abundance of the detected proteins spans five orders of magnitude (Appendix Fig S1B and C), demonstrating that relatively rare complexes are also captured in this process. The elution profiles correlate well across the biological triplicates (squared Pearson coefficient; 0.82 < r² < 0.88; Appendix Fig S1B and C, and Dataset EV1). Similarly, the protein composition of each SEC fraction was generally highly reproducible (Pearson coefficient; 0.61 < r < 0.98; Appendix Fig S1D and Dataset EV1). To further assess the quality and effectiveness of the biochemical separation, we determined whether the observed elution profiles matched the composition, molecular weight, and stoichiometries of well-characterized and conserved protein complexes as contained in the Protein Data Bank (PDB; Berman et al, 2000). We generated 3D interaction models for 378 out of the identified 1,176 C. thermophilum proteins using comparative structural modeling that takes into account species-specific differences (cutoffs: > 30% sequence coverage, > 30% sequence identity; Appendix Figs S2 and S3, Dataset EV2, details in the Materials and Methods). The resulting benchmark set of structurally known protein complexes comprises 34 heteromers (involving 212 proteins) and 166 homomers, the latter mainly consisting of metabolic enzymes (Appendix Fig S2E). As expected, the subunits of the heteromultimeric complexes typically co-eluted in the same biochemical fractions (Fig 2A, Dataset EV2 and Appendix Fig S4), although a considerable number of proteins showed multiple elution peaks, indicating that they are engaged in various complexes (Kuhner et al, 2009). For 102 protein complexes that eluted in a single peak (Dataset EV2), we also compared their predicted molecular weights to those estimated from their retention time (tR) during SEC elution (Fig 2B).
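One way to turn a retention time into an apparent molecular weight is sketched below. It assumes the semi-logarithmic calibration commonly used for SEC columns, in which log10(MW) falls linearly with retention time; the text does not state which calibration was used, and the standards, retention times, and the 250 kDa model weight are hypothetical values chosen for illustration.

```python
import math


def fit_sec_calibration(standards):
    """Least-squares fit of log10(MW) = intercept + slope * retention_time.

    `standards` is a list of (retention_time, molecular_weight_kDa) pairs.
    """
    times = [t for t, _ in standards]
    logs = [math.log10(mw) for _, mw in standards]
    n = len(standards)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_t
    return intercept, slope


def apparent_mw(retention_time, intercept, slope):
    """Apparent molecular weight (kDa) at a given retention time."""
    return 10 ** (intercept + slope * retention_time)


# Hypothetical calibration standards: (retention time in min, MW in kDa).
standards = [(10.0, 1000.0), (15.0, 316.2), (20.0, 100.0)]
intercept, slope = fit_sec_calibration(standards)

# Compare a hypothetical complex's model-derived weight with its apparent weight.
predicted_kda = 250.0
observed_kda = apparent_mw(12.0, intercept, slope)
print(f"apparent MW: {observed_kda:.0f} kDa, ratio to model: {observed_kda / predicted_kda:.1f}")
```

A ratio well above 1 for a complex that elutes in a single peak would flag it as a candidate for a higher-order assembly, which is the kind of discrepancy the comparison in Fig 2B is meant to reveal.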
In 52 well-characterized cases, for example the chaperonin-containing TCP-1 (CCT) complex or the 19S proteasome, we observed a good agreement between the expected and observed tRs, further validating the general efficiency of the cell lysate separation procedure. However, 50 protein complexes eluted at much higher molecular weights than anticipated from their structural models. These shifts are unlikely to be due to non-specific post-lysis protein aggregation, as no visible aggregates were formed under our experimental conditions (EM analysis, see below). They are therefore probably functionally relevant, as we observed that co-eluting complexes share the same functional ontology (independent two-sample t-test P-value = 3.88E-50, Appendix Fig S5) or directly interact (cross-linking experiments, see below), suggesting a functional relationship. This is consistent with the view that protein complexes might self-assemble with higher stoichiometries, contain additional components (that is, RNA, DNA, metabolites, or proteins), and/or form uncharacterized protein communities. An interesting example is the glycolytic enzyme enolase (EC 4.2.1.11), which forms a structurally characterized dimer in vitro (2 × 47.7 = 95.4 kDa; Kuhnel & Luisi, 2001; PDB: 2AL2) but seems to be part of a ~4-MDa assembly in the cellular fractions of C. thermophilum (Fig 2B). This supports previous indications that enolase participates in higher-order multienzyme assemblies, such as the somewhat elusive eukaryotic glycolytic metabolon (Menard et al, 2014). Overall, our operational definition of protein communities using a reproducible and sensitive structural proteomics approach captures important snapshots of the functional organization of cellular proteomes.

Figure 2. Identification of protein complexes and communities in the cellular extracts
(A) Elution of selected protein complexes as a function of their retention times (see Appendix Fig S4 for their corresponding subunit elutions).
(B) Scatter plot indicating discrepancies in the expected and measured molecular weights of 102 protein complexes that elute as a single peak; 50% of protein complexes are observed to have higher molecular weights than structurally characterized, indicating that they are organized in higher-order assemblies.

A compendium of C. thermophilum protein complexes within protein communities

We next used the protein elution profiles in conjunction with known functional associations to systematically define protein communities. Correlations between profiles can indicate membership of the same complex (Havugimana et al, 2012; Kristensen et al, 2012) or of protein communities that perform functions in a spatiotemporal context. For all possible protein pairs in the dataset, we calculated a Pearson correlation coefficient (cross-correlation co-elution (CCC) score) to measure the similarity of their elution profiles (see Materials and Methods for details). Although distinct complexes can share similar and overlapping elution profiles (Havugimana et al, 2012), CCC scores discriminate between random co-eluting and interacting protein pairs (Appendix Fig S6). To improve the assignment of interaction probabilities, we also exploited a set of indirect interactions (e.g. genetic interaction, colocalization) from the STRING database (v.9.1; Franceschini et al, 2013). These are based on orthologs from Saccharomyces cerevisiae (Dataset EV3) and a set of non-redundant structural interfaces that share homology with C. thermophilum predicted interfaces using Mechismo (Betts et al, 2015; Materials and Methods; Dataset EV3). We combined these two datasets with the interaction probabilities derived from the elution profiles.
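The pairwise co-elution scoring described above can be sketched as a plain Pearson correlation between per-fraction abundance profiles. The exact scoring used in the study is given in its Materials and Methods and may differ in detail; the function names, the treatment of flat profiles, and the six-fraction toy profiles below are assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def coelution_scores(profiles):
    """Score every unordered protein pair by the similarity of their profiles.

    `profiles` maps a protein identifier to its abundance in each SEC fraction.
    """
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical six-fraction profiles: A and B peak together, C peaks later.
profiles = {
    "A": [0, 1, 5, 1, 0, 0],
    "B": [0, 2, 10, 2, 0, 0],
    "C": [0, 0, 0, 1, 5, 1],
}
for pair, score in coelution_scores(profiles).items():
    print(pair, round(score, 2))
```

Because Pearson correlation ignores absolute abundance, subunits present at different stoichiometries still score highly if their peaks coincide, which is why the score suits profiles spanning several orders of magnitude.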
We used a random forest classifier trained with randomly sampled sets of true-positive (N = 5,000) and true-negative (N = 5,000) interactions that we extracted from public sources after manual curation (PDB (Berman et al, 2000) and affinity purification–mass spectrometry (AP-MS) data (Benschop et al, 2010); Dataset EV3). We took a minimum interaction probability of 0.85 to construct a protein–protein interaction network (Appendix Figs S7–S10) that contains 679 proteins, 427 of which are not known to be members of protein complexes as their orthologs in yeast are not in any complex defined by PDB (Berman et al, 2000), AP-MS data (Benschop et al, 2010), or the Saccharomyces Genome Database (SGD; www.yeastgenome.org). From this network, we used a clustering method that efficiently discovers densely connected overlapping regions that represent protein complexes and communities (ClusterONE; Nepusz et al, 2012). We systematized the recovery of protein complexes by an exhaustive parameter search and benchmarking (Sardiu et al, 2009) with the set of known structures (from the PDB) and yeast complexes (from AP-MS data; Dataset EV2; Materials and Methods). The optimal set of clustering parameters defines 21 clusters that account for protein complexes and 27 clusters accounting for protein communities that contain 108 interconnected protein complexes as subsets (Fig 3). Importantly, varying the parameters had only marginal impact on the final protein content (Dataset EV3 and Materials and Methods), highlighting the robustness of the protein communities. Overall, the protein communities include 62% of the set of known protein complexes (the set of known PDB and AP-MS data, Dataset EV2) with 90% average coverage of their components (Fig 3 and Dataset EV4). 
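The network construction step, in which only protein pairs with an interaction probability of at least 0.85 are kept, can be sketched as follows. The probabilities would come from the trained classifier; here they are hypothetical values, and the function name and adjacency-dictionary representation are illustrative choices rather than the study's implementation.

```python
def build_network(pair_probabilities, threshold=0.85):
    """Keep pairs whose probability meets the threshold.

    Returns an undirected network as a dict mapping each protein to the set
    of its interaction partners.
    """
    network = {}
    for (a, b), probability in pair_probabilities.items():
        if probability >= threshold:
            network.setdefault(a, set()).add(b)
            network.setdefault(b, set()).add(a)
    return network


# Hypothetical classifier outputs for four protein pairs.
pair_probabilities = {
    ("A", "B"): 0.97,
    ("B", "C"): 0.85,
    ("C", "D"): 0.60,
    ("A", "D"): 0.10,
}
network = build_network(pair_probabilities)
print(len(network), "proteins retained")  # 3 proteins retained
```

A network in this form can then be passed to a clustering step such as the overlapping-cluster detection mentioned in the text.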
Of these communities, a well-known example is the ribosome protein community, which comprises not only the stable 60S and 40S ribosomal complexes but also the translation initiation factor eIF2B, which is only transiently associated with the ribosome (Fig 3, Appendix Fig S8A). Other examples are novel, such as the physical interaction between the Tup1-Cyc8 corepressor and a histone deacetylase complex (community #22), which is consistent with recent functional data demonstrating that these two complexes indeed cooperate to robustly repress transcription in yeast (Fleming et al, 2014). The analysis also captured a lipid anabolism metabolon (community #23), which includes not only the homomultimeric complexes of a cytochrome b reductase (Cbr1, which regulates the catalysis of sterol by biosynthetic enzymes) and a choline-phosphate cytidylyltransferase (Pct1, which is a rate-determining enzyme of the CDP-choline pathway for phosphatidylcholine synthesis), but also several enzymes in the sterol synthesis pathway. The transmembrane protein suppressor of choline sensitivity 2 (Scs2), a known regulator of phospholipid metabolism, is also observed. Its presence may seem peculiar at first; however, it provides physical evidence for a role for this community in validating the interconnectivity of lipid and sterol metabolism in fungi (Parks & Casey, 1995). Such coordinated regulatory effects may functionally optimize membrane plasticity and specificity (Ramgopal & Bloch, 1983). This community presumably localizes at the endoplasmic reticulum (ER)–plasma membrane (PM) interface, as this is thought to be the location of all five predicted transmembrane proteins (Dataset EV4).

Figure 3. Network derived from large-scale fractionation predicts 48 protein complexes and communities
Integration of experimental elution data, known functional associations, and predicted interaction interfaces from homologous proteins allows the creation of a high-quality network with interconnected protein complexes (Appendix Figs S8–S10). Here, known protein complexes are shown in blue and other physically associated proteins in gray, predicted interactions of complexes as gray lines and cross-links as red lines, and cross-links between different subunits of a heteromultimeric complex are represented with red loops (see insert). Communities containing multiple complexes are highlighted in yellow; numbering and naming of complexes and communities are described in the legend of Appendix Fig S9.

The protein communities include associations that have been reported as transient, non-stoichiometric, or of low abundance in other organisms. For example, the 19S regulatory particle of the proteasome was found to be associated with two known components, Ubp6 and Nas6, and the 20S core particle with two mutually exclusive alternative cap proteins, Blm10 and Cdc48 (Kish-Trier & Hill, 2013; Fig 3, Appendix Fig S8A). The protein communities also capture transient interactions between nuclear transport receptors and transport channel nucleoporins (specifically, the interactions between karyopherins and the Nsp1 and Nup159 complexes; Appendix Fig S8B) that have been elusive in standard biochemical experiments (Patel & Rexach, 2008) and that were recently found to have high off-rates (Milles et al, 2015). Elsewhere, RNA polymerase II is found in a community with several splicing complexes: the U2 snRNP, the U4/U6.U5 tri-snRNP, and the smD3 complexes (Appendix Fig S8B, Dataset EV4).
These spliceosomal machineries are known to interact with RNA polymerase II via the carboxy-terminal domain of its largest subunit, ensuring the tight coupling of mRNA transcription and splicing (Martins et al, 2011). We thus consider that our approach successfully identifies higher-order associations of complex core modules. This compendium of C. thermophilum protein communities (Dataset EV4), which are precisely assigned to specific and highly reproducible cellular fractions, represents an important resource for structural biologists (Appendix Fig S10). It not only captures transient associations but also identifies subunits of known complexes that have so far remained elusive. Due to the evolutionary distance between C. thermophilum and most well-established model organisms, subunits of even highly conserved core complexes are not necessarily identified (or unambiguously identified) by sequence alignments. As exemplified in Appendix Fig S8A, the co-elution data can be used to identify such subunits and to assign orthology (details in Dataset EV4) by narrowing down a set of protein complex member candidates