Article Open access Peer-reviewed

High Sensitivity Crosslink Detection Coupled With Integrative Structure Modeling in the Mass Spec Studio

2016; Elsevier BV; Volume: 15; Issue: 9 Language: English

10.1074/mcp.o116.058685

ISSN

1535-9484

Authors

Vladimir Sarpe, Atefeh Rafiei, Morgan Hepburn, Nicholas Ostan, Anthony B. Schryvers, David C. Schriemer

Topic(s)

RNA and protein synthesis mechanisms

Abstract

The Mass Spec Studio package was designed to support the extraction of hydrogen-deuterium exchange and covalent labeling data for a range of mass spectrometry (MS)-based workflows, to integrate with restraint-driven protein modeling activities. In this report, we present an extension of the underlying Studio framework and provide a plug-in for crosslink (XL) detection. To accommodate flexibility in XL methods and applications, while maintaining efficient data processing, the plug-in employs a peptide library reduction strategy via a presearch of the tandem-MS data. We demonstrate that prescoring linear unmodified peptide tags using a probabilistic approach substantially reduces search space by requiring both crosslinked peptides to generate sparse data attributable to their linear forms. The method demonstrates highly sensitive crosslink peptide identification with a low false positive rate. Integration with a Haddock plug-in provides a resource that can combine multiple sources of data for protein modeling activities. We generated a structural model of porcine transferrin bound to TbpB, a membrane-bound receptor essential for iron acquisition in Actinobacillus pleuropneumoniae. Using mutational data and crosslinking restraints, we confirm the mechanism by which TbpB recognizes the iron-loaded form of transferrin, and note the requirement for disparate sources of restraint data for accurate model construction. The software plug-in is freely available at www.msstudio.ca.

Integrative methods in structural biology use data from disparate sources to generate accurate models of large protein structures and assemblies (1. Webb B. Lasker K. Velazquez-Muriel J. Schneidman-Duhovny D. Pellarin R. Bonomi M. Greenberg C. Raveh B. Tjioe E. Russel D. Sali A. Modeling of proteins and their assemblies with the Integrative Modeling Platform. Methods Mol. Biol. 2014; 1091: 277-295). In this way, the reach of classical structure providers such as x-ray crystallography and NMR can be extended. Biophysical data with an underlying spatial component can be combined with "building block" structures in a molecular modeling framework, to generate high-fidelity models of systems of impressive size and complexity (2. Kim S.J. Fernandez-Martinez J. Sampathkumar P. Martel A. Matsui T. Tsuruta H. Weiss T.M. Shi Y.
Markina-Inarrairaegui A. Bonanno J.B. Sauder J.M. Burley S.K. Chait B.T. Almo S.C. Rout M.P. Sali A. Integrative structure-function mapping of the nucleoporin Nup133 suggests a conserved mechanism for membrane anchoring of the nuclear pore complex. Mol. Cell. Proteomics. 2014; 13: 2911-2926, 3. Fernandez-Martinez J. Phillips J. Sekedat M.D. Diaz-Avalos R. Velazquez-Muriel J. Franke J.D. Williams R. Stokes D.L. Chait B.T. Sali A. Rout M.P. Structure-function mapping of a heptameric module in the nuclear pore complex. J. Cell Biol. 2012; 196: 419-434, 4. Alber F. Dokudovskaya S. Veenhoff L.M. Zhang W. Kipper J. Devos D. Suprapto A. Karni-Schmidt O. Williams R. Chait B.T. Sali A. Rout M.P. The molecular architecture of the nuclear pore complex. Nature. 2007; 450: 695-701, 5. Greber B.J. Bieri P. Leibundgut M. Leitner A. Aebersold R. Boehringer D. Ban N. Ribosome. The complete structure of the 55S mammalian mitochondrial ribosome. Science. 2015; 348: 303-308). Mass spectrometry can provide rich sets of data in support of these activities, in the form of topographical footprints (covalent labeling, CL) (6. Kaur P. Tomechko S.E. Kiselar J. Shi W. Deperalta G. Wecksler A.T. Gokulrangan G. Ling V. Chance M.R. Characterizing monoclonal antibody structure by carboxyl group footprinting. MAbs. 2015; 7: 540-552, 7. Zhou Y. Vachet R.W. Covalent labeling with isotopically encoded reagents for faster structural analysis of proteins by mass spectrometry. Anal. Chem. 2013; 85: 9664-9670, 8. Stocks B.B. Rezvanpour A. Shaw G.S. Konermann L. Temporal development of protein structure during S100A11 folding and dimerization probed by oxidative labeling and mass spectrometry. J. Mol. Biol. 2011; 409: 669-679), conformational dynamics (hydrogen/deuterium exchange, HX) (9. Underbakke E.S. Iavarone A.T. Chalmers M.J. Pascal B.D. Novick S. Griffin P.R. Marletta M.A. Nitric oxide-induced conformational changes in soluble guanylate cyclase. Structure. 2014; 22: 602-611, 10. Rand K.D. Zehl M. Jorgensen T.J. Measuring the hydrogen/deuterium exchange of proteins at high spatial resolution by mass spectrometry: overcoming gas-phase hydrogen/deuterium scrambling. Acc. Chem. Res. 2014; 47: 3018-3027) and distance restraints (crosslinking, XL) (11. Sinz A. Arlt C. Chorev D. Sharon M. Chemical cross-linking and native mass spectrometry: A fruitful combination for structural biology. Protein Sci. 2015; 24: 1193-1209, 12. Sinz A. Chemical cross-linking and mass spectrometry to map three-dimensional protein structures and protein-protein interactions. Mass Spectrom. Rev. 2006; 25: 663-682, 13. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J. Struct. Biol. 2011; 173: 530-540). (The abbreviations used are: CL, covalent labeling; BSA, bovine serum albumin; DSS, disuccinimidyl suberate; FDR, false discovery rate; HX, hydrogen-deuterium exchange; MEF, Microsoft extensibility framework; PSM, peptide spectrum match; ppm, parts per million; pTf, porcine transferrin; SEC, size-exclusion chromatography; TbpA, transferrin-binding protein A; TbpB, transferrin-binding protein B; UI, user interface; XIC, extracted ion chromatogram; XL, cross-linking.) We have built informatics routines within the Mass Spec Studio framework to mine restraints from both CL and HX data (14. Rey M. Sarpe V. Burns K.M. Buse J. Baker C.A. van Dijk M. Wordeman L. Bonvin A.M. Schriemer D.C.
Mass spec studio for integrative structural biology. Structure. 2014; 22: 1538-1548), to support such data-driven molecular modeling activities. In this study, we describe a new plug-in built into the Studio for identifying crosslinks from LC-MS/MS data sets.

Advances in instrumentation, methods and cross-linking protocols have generated renewed interest in what is an older technique. However, useful informatics routines are essential for gaining access to quality crosslinking information as site identification is not a trivial problem (15. Leitner A. Walzthoeni T. Kahraman A. Herzog F. Rinner O. Beck M. Aebersold R. Probing native protein structures by chemical cross-linking, mass spectrometry, and bioinformatics. Mol. Cell. Proteomics. 2010; 9: 1634-1649). Some noteworthy tools that have emerged in the last few years include xQuest (16. Rinner O. Seebacher J. Walzthoeni T. Mueller L.N. Beck M. Schmidt A. Mueller M. Aebersold R. Identification of cross-linked peptides from large sequence databases. Nat. Methods. 2008; 5: 315-318), Merox (17. Götze M. Pettelkau J. Fritzsche R. Ihling C.H. Schäfer M. Sinz A. Automated assignment of MS/MS cleavable cross-links in protein 3D-structure analysis. J. Am. Soc. Mass Spectrom. 2015; 26: 83-97), Stavrox (18. Götze M. Pettelkau J. Schaks S. Bosse K. Ihling C.H. Krauth F. Fritzsche R. Kühn U. Sinz A.
StavroX--a software for analyzing crosslinked products in protein interaction studies. J. Am. Soc. Mass Spectrom. 2012; 23: 76-87), Sim-XL (19. Lima D.B. de Lima T.B. Balbuena T.S. Neves-Ferreira A.G. Barbosa V.C. Gozzo F.C. Carvalho P.C. SIM-XL: A powerful and user-friendly tool for peptide cross-linking analysis. J. Proteomics. 2015; 129: 51-55), pLink (20. Gao Q. Xue S. Doneanu C.E. Shaffer S.A. Goodlett D.R. Nelson S.D. Pro-CrossLink. Software tool for protein cross-linking and mass spectrometry. Anal. Chem. 2006; 78: 2145-2149), XlinkX (21. Liu F. Rijkers D.T. Post H. Heck A.J. Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry. Nat. Methods. 2015; 12: 1179-1184), and XiQ (22. Fischer L. Chen Z.A. Rappsilber J. Quantitative cross-linking/mass spectrometry using isotope-labelled cross-linkers. J. Proteomics. 2013; 88: 120-128). The proliferation of such tools is a strong indication that new XL reagents and methods require dynamic software development to accommodate the needs of challenging structural applications. For example, multiplexed structural analysis from whole proteomes (21), a requirement for richer sets of "molecular rulers" for de novo structure determination (23. Politis A. Stengel F. Hall Z. Hernandez H. Leitner A. Walzthoeni T. Robinson C.V. Aebersold R. A mass spectrometry-based hybrid method for structural modeling of protein complexes. Nat. Methods.
2014; 11: 403-406), and integration with other sources of structural data are three application areas that are not yet well served by available tools. It would be useful to develop packages that collect and integrate concepts that have demonstrated some utility in the detection of crosslinks (24. Petrotchenko E.V. Borchers C.H. Application of a fast sorting algorithm to the assignment of mass spectrometric cross-linking data. Proteomics. 2014; 14: 1987-1989), and support the easy addition of new concepts.

In our view, there are a number of features that should be bundled into a single solution. Any source of MS data should be accommodated, regardless of the instrument vendor or style of experiment. Both low and high resolution data-dependent LC-MS/MS data have been collected in crosslinking experiments, on both FT-MS and TOF-based instruments (11, 25. Soderberg C.A. Lambert W. Kjellstrom S. Wiegandt A. Wulff R.P. Mansson C. Rutsdottir G. Emanuelsson C. Detection of crosslinks within and between proteins by LC-MALDI-TOFTOF and the software FINDX to reduce the MSMS-data to acquire for validation. PLoS ONE. 2012; 7: e38927). Many software applications are restricted to the analysis of processed and/or converted data files, which seems to us unnecessary with the increasing willingness of vendors to supply file readers directly or through ProteoWizard (26. Holman J.D. Tabb D.L. Mallick P. Employing ProteoWizard to Convert Raw Mass Spectrometry Data. Curr. Protoc. Bioinformatics. 2014; 46: 11-19). Raw data should be handled natively where possible, as it provides the greatest opportunity for error detection and results validation.
Chromatographic data and the precursor ion profiles should be available to support the identification exercise, or at least the validation exercise. The output should be easily navigated for rapid hit validation, and readily harvested for integration with visualization and modeling activities (14). Further, although isotopic labeling has high value in the validation exercise (16, 27. Petrotchenko E.V. Borchers C.H. ICC-CLASS: isotopically-coded cleavable crosslinking analysis software suite. BMC Bioinformatics. 2010; 11: 64), a very large number of useful reagents are not available in labeled format, and labeling is not a strict requirement for accurate linkage detection (28. Giese S.H. Fischer L. Rappsilber J. A study into the CID behavior of cross-linked peptides. Mol. Cell. Proteomics. 2016; 15: 1094-1104). Finally, and perhaps most importantly, robust probabilistic scoring algorithms should be implemented (29. Trnka M.J. Baker P.R. Robinson P.J. Burlingame A.L. Chalkley R.J. Matching cross-linked peptide spectra: only as good as the worse identification. Mol. Cell. Proteomics. 2014; 13: 420-434), and software design should promote the easy inclusion of alternative methods as they become available. Simple fragment-counting may be useful, but it does not always promote sensitive site identification.
In the wider context of scoring, database reduction strategies are necessary to address the n² time complexity of searching for crosslinked peptides, which becomes particularly acute when using nonspecific crosslinker chemistries and nonspecific digestion enzymes (24, 28). Methods that do not complicate the experimental workflow or compromise the validity of a probabilistic scoring function are needed. Crosslinkers that are cleavable in the gas phase through CID or ETD are emerging and have been used to reduce time complexity (21, 30. Gotze M. Pettelkau J. Fritzsche R. Ihling C.H. Schafer M. Sinz A. Automated assignment of MS/MS cleavable cross-links in protein 3D-structure analysis. J. Am. Soc. Mass Spectrom. 2015; 26: 83-97), but these currently come with platform restrictions and limitations on the choice of reagents. Concepts that treat the second peptide in a linked pair as an open modification are particularly useful in site identification (28, 31. Chu F. Baker P.R. Burlingame A.L. Chalkley R.J. Finding chimeras: a bioinformatics strategy for identification of cross-linked peptides. Mol. Cell. Proteomics.
2010; 9: 25-31, 32. Wang J. Anania V.G. Knott J. Rush J. Lill J.R. Bourne P.E. Bandeira N. Combinatorial approach for large-scale identification of linked peptides from tandem mass spectrometry spectra. Mol. Cell. Proteomics. 2014; 13: 1128-1136, 33. Singh P. Shaffer S.A. Scherl A. Holman C. Pfuetzner R.A. Larson Freeman T.J. Miller S.I. Hernandez P. Appel R.D. Goodlett D.R. Characterization of protein cross-links via mass spectrometry and an open-modification search strategy. Anal. Chem. 2008; 80: 8799-8806). However, integrating and extending these concepts within a wider platform will be required to meet the challenges associated with nonspecific labeling and digestion protocols.

In this contribution we present a crosslinking plug-in for the flexible Mass Spec Studio framework, which combines useful components for each of the feature categories discussed above. We extend the database reduction concept to limit dependences on precursor ion m/z, and provide a collection of tools for rapid results validation using the raw data. Finally, we illustrate how the XL-MS data can be combined in the Studio with other structural data to support modeling activities.

The Mass Spec Studio was designed to capture the growing number of experimental methods for structural mass spectrometry, and the diverse computational strategies for identifying and quantifying structural restraints for modeling purposes (14).
A large number of the core activities associated with mining LC-MS/MS data are shared by most label-detection applications; therefore, we designed a composite application where loosely coupled components allow effective use of shared tools, as well as straightforward development of future extensions and new pluggable content.

The crosslinking software package was built as a plug-in, using an updated version of the Studio framework (v2). As with v1 of the framework, plug-ins are either individual components or collections of components, consisting of object libraries, processing algorithms, user interface (UI) elements and experiment types. During software start-up, each component is dynamically loaded and assembled into the main application. Several enhancements to the v1 framework facilitate per-session component customization, dependency management, and version control. Updated versions of AvalonDock and Prism boost Mass Spec Studio's capacity to conform to the latest software design practices and patterns, for scalable and sustainable .NET applications. Fig. 1 provides a schematic of the Studio framework that we used to develop a robust concept in crosslink detection.

The new dependency management in v2 makes sharing of common tools and components more streamlined, which simplifies the development of new plug-ins. We took advantage of this feature to refactor common elements of our existing packages (HX-MS and CL-MS) into a new "Structural Biology" (StructBio) resource package. The new framework, therefore, supports all three structural mass spectrometry experiment types. As a result, any future development of structural biology applications or upgrades can employ a common, carefully implemented and tested set of tools and resources. Such tools and resources include mass calculators, digestion/fragmentation rules, peptide identification algorithms, and molecular visualization.
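The composition pattern described above (shared components registered once, then assembled for each plug-in at start-up) can be illustrated with a small sketch. This is a language-agnostic toy written in Python, not the Studio's .NET code; `ComponentRegistry`, `MassCalculator`, `DigestionRules` and `XL_PLUGIN_REQUIRES` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, Iterable


class ComponentRegistry:
    """Toy stand-in for a start-up composition container (hypothetical)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}

    def export(self, name: str) -> Callable:
        """Decorator that registers a component factory under a name."""
        def register(factory):
            if name in self._factories:
                raise ValueError(f"duplicate component: {name}")
            self._factories[name] = factory
            return factory
        return register

    def compose(self, required: Iterable[str]) -> Dict[str, object]:
        """Instantiate the requested components, failing early on gaps."""
        required = list(required)
        missing = [n for n in required if n not in self._factories]
        if missing:
            raise LookupError(f"missing components: {missing}")
        return {n: self._factories[n]() for n in required}


registry = ComponentRegistry()


@registry.export("mass_calculator")
class MassCalculator:
    PROTON = 1.007276  # proton mass in Da

    def mz(self, neutral_mass: float, charge: int) -> float:
        return (neutral_mass + charge * self.PROTON) / charge


@registry.export("digestion_rules")
class DigestionRules:
    cleave_after = "KR"  # trypsin-like specificity


# A hypothetical crosslinking plug-in declares the shared tools it depends on.
XL_PLUGIN_REQUIRES = ("mass_calculator", "digestion_rules")
session = registry.compose(XL_PLUGIN_REQUIRES)
```

Declaring dependencies by name is what lets several plug-ins share a single tested implementation of, say, the mass calculator, and a missing dependency surfaces at composition time rather than mid-analysis.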
To develop a crosslink detection concept that supports both computational and visual validation of large data sets, we applied design concepts specifically borrowed from our HX-MS² platform (34. Burns K.M. Sarpe V. Wagenbach M. Wordeman L. Schriemer D.C. HX-MS² for high performance conformational analysis of complex protein states. Protein Sci. 2015; 24: 1313-1324). This primarily involved revising our strategy for peptide library generation and augmenting our scoring metrics for crosslink detection and validation. These two elements are described below.

To generate a searchable library, a database of linear peptides is created, using proteins known to exist in the sample or from a whole proteome when the sample composition cannot be inferred from other data (e.g. a prior data-dependent proteomics experiment). For individual proteins, the sequence information and any associated structures can be fetched automatically from the Protein Data Bank (PDB, www.rcsb.org). The database is expanded according to a flexible selection of fixed or variable modifications, and a suitable crosslinker is selected from a set of available options. New reagents can also be designed and implemented, using a crosslinker design wizard, with heavy and light versions if desired. Library expansion is parameterized by a user-defined range of possible charge states and user-selected mass modifications. This list automatically includes peptides singly labeled with the crosslinker, or so-called dead-ended peptides. Multiple data-dependent LC-MS/MS runs, from any file type, are associated with the library for subsequent processing. An option exists to reformat the data into a highly compressed binary file structure (.mssdata) for efficient storage and later use in other applications, although this is not required. We have implemented a probabilistic scoring strategy within the Mass Spec Studio framework, behind a robust library reduction method.
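The library construction step described above (linear peptides, expanded by charge state and by dead-end labeling) can be sketched roughly as follows. This is a minimal illustration and not the Studio's implementation: it assumes tryptic digestion, DSS as the reagent (hydrolysed dead-end shift of 156.07864 Da), and dead-end labeling of any lysine-containing peptide. The names `digest_trypsin`, `neutral_mass`, `build_library` and `LibraryEntry` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
DSS_DEAD_END = 156.07864  # hydrolysed DSS on one site (assumed reagent)


@dataclass(frozen=True)
class LibraryEntry:
    sequence: str
    label: str   # "linear" or "dead-end"
    charge: int
    mz: float


def digest_trypsin(protein: str, missed_cleavages: int = 1,
                   min_len: int = 5) -> Iterator[str]:
    """Cleave after K/R (not before P), allowing missed cleavages."""
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    for i in range(len(pieces)):
        for j in range(i, min(i + missed_cleavages + 1, len(pieces))):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_len:
                yield peptide


def neutral_mass(sequence: str) -> float:
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def build_library(proteins: Iterable[str], charges=(2, 3, 4),
                  missed_cleavages: int = 1,
                  min_len: int = 5) -> List[LibraryEntry]:
    entries, seen = [], set()
    for protein in proteins:
        for peptide in digest_trypsin(protein, missed_cleavages, min_len):
            if peptide in seen:
                continue
            seen.add(peptide)
            variants = [("linear", 0.0)]
            if "K" in peptide:  # simplification: any lysine can be labeled
                variants.append(("dead-end", DSS_DEAD_END))
            for label, shift in variants:
                mass = neutral_mass(peptide) + shift
                for z in charges:
                    entries.append(
                        LibraryEntry(peptide, label, z, (mass + z * PROTON) / z))
    return entries
```

A call such as `build_library(["GGKAAR"], charges=(1, 2), min_len=1)` yields linear entries for every digestion product and dead-end entries only for the lysine-containing ones. Other fixed or variable modifications would be handled the same way, as additional `(label, shift)` variants.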
For library reduction, we considered a number of concepts to address the underlying n² time complexity associated with investigating all possible pairs of peptides. In one case, the peptide library can be sorted by mass and a smaller list of peptide pairs can be assembled with entries that fall within an acceptable tolerance around the precursor ion mass (an MS-level approach) (24, 35. Choi S. Jeong J. Na S. Lee H.S. Kim H.Y. Lee K.J. Paek E. New algorithm for the identification of intact disulfide linkages based on fragmentation characteristics in tandem mass spectra. J. Proteome Res. 2010; 9: 626-635). A classic binary search could then speed the analysis. In another case, assumptions can be made based on the acquired fragment spectra (an MS/MS-level approach). For example, one method requires that evidence of a dead-ended version of an individual crosslinked peptide should be observable (19). This idea is based on the assumption that crosslinking chemistries will always generate a distribution of all product types, with the actual crosslinking event itself occurring with a relatively low abundance. A third approach extends the idea of MS/MS-level library filtering, by requiring fragment evidence for one peptide in the MS/MS spectrum and treating the other as an open modification constrained by the precursor ion m/z (e.g. 28).

In the Studio, we chose to develop an approach using MS/MS-level filtering, based loosely on the concept of ion tags in xQuest (16). Our approach is outlined in Fig. 2. We assume that an MS/MS spectrum of a crosslinked peptide must possess some minimum unmodified sequence information for each peptide comprising the pair, which can be scored in a probabilistic manner. That is, a population of simple sequence ions will exist for each peptide (possibly few and low intensity), akin to a peptide sequence tag (36. Mann M. Wilm M. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal. Chem. 1994; 66: 4390-4399). Ions from the unmodified portions represent one of the more abundant ion types in CID MS/MS data of crosslinked peptides (16, 37. Yang B. Wu Y.J. Zhu M. Fan S.B. Lin J. Zhang K. Li S. Chi H. Li Y.X. Chen H.F. Luo S.K. Ding Y.H. Wang L.H. Hao Z. Xiu L.Y. Chen S. Ye K. He S.M. Dong M.Q. Identification of cross-linked peptides from complex samples. Nat. Methods. 2012; 9: 904-906). A library of peptides is constructed that consists of all linear peptides having some minimum threshold score, which we base on the E-score concept used in OMSSA (38. Geer L.Y. Markey S.P. Kowalak J.A. Wagner L. Xu M. Maynard D.M. Yang X. Shi W. Bryant S.H. Open mass spectrometry search algorithm. J. Proteome Res.
2004; 3: 958-964),

$$E(y,\mu) = N\left[1 - \left(\sum_{x=0}^{y-1} P(x,\mu)\right)^{N}\right] \qquad \text{(Eq. 1)}$$

where P(x, μ) is a Poisson distribution, x is the number of fragment matches between an experimental MS/MS spectrum and a theoretical spectrum of a library entry, μ is the average number of random matches, y is the total number of successful fragment matches and N is the peptide library size. The E-score provides an opportunity for a probabilistic assessment of the ion series for the unmodified portions of each peptide, based on an underlying Poisson model of noise in the search, which seems sensible to preserve for crosslinked peptide detection. That is, we assume that the noise characteristics of the search space for crosslinked peptides are not significantly different from linear peptide noise, which is well modeled by the Poisson distribution (38). In the initial search for library reduction, we employ a hard filter on fragment ions in the MS/MS spectrum by requiring strong evidence for monoisotopic ions (i.e. M peak >5 times the intensity of an M-1 peak, if detected). The exercise is to float E as high as possible, while preserving a time-manageable list of candidate linear peptides for the next stage of scoring.

Precursor ion mass is only introduced after this initial reduction in search space. It can be difficult for data-dependent experiment types to accurately select a monoisotopic ion, particularly as peptide mass and charge increase. We filter candidates based on the actual isotope profile in the LC-MS data, rather than relying solely on the instrument-generated precursor ion values.
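To make the prescoring step concrete, the sketch below computes an E-score for a linear peptide from its fragment match count and uses it to reduce a library. It rests on assumptions: Eq. 1 is read here as E(y, μ) = N[1 − (Σ_{x=0}^{y−1} P(x, μ))^N], μ is taken as a single user-supplied value, and a peptide is retained when its E-score falls at or below a cutoff. The function names (`tail_probability`, `e_score`, `reduce_library`) are hypothetical and not part of the Studio.

```python
import math
from typing import Dict, List


def poisson_pmf(x: int, mu: float) -> float:
    """P(x, mu): probability of exactly x random fragment matches."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


def tail_probability(y: int, mu: float) -> float:
    """Probability of y or more random matches in a single comparison."""
    cdf = sum(poisson_pmf(x, mu) for x in range(y))
    return max(0.0, 1.0 - cdf)


def e_score(y: int, mu: float, n: int) -> float:
    """E-score under the assumed reading of Eq. 1 for a library of size n."""
    tail = tail_probability(y, mu)
    if tail >= 1.0:   # y == 0: every comparison trivially "matches"
        return float(n)
    if tail <= 0.0:   # rounding floor: indistinguishable from zero
        return 0.0
    # 1 - (1 - tail)**n, evaluated stably for very small tails
    p_any = -math.expm1(n * math.log1p(-tail))
    return n * p_any


def reduce_library(match_counts: Dict[str, int], mu: float,
                   cutoff: float) -> List[str]:
    """Keep peptides whose E-score is at or below the cutoff."""
    n = len(match_counts)
    return [pep for pep, y in match_counts.items()
            if e_score(y, mu, n) <= cutoff]
```

Raising `cutoff` admits more candidate linear peptides into the pairing stage, which corresponds to floating E as high as the time budget allows.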
Bounded by a precursor mass tolerance that we set in the search, we initially accept all possible peptide candidates with an abundant isotope at the triggered mass/charge value, but ultimately require every high probability crosslinked peptide to generate an isotopic distribution that maintains a minimum goodness-of-fit to the actual data. Our acceptance criteria allow us to evaluate candidate crosslink peptides against all possible background peptides that may fall within the ion inclusion window (i.e. crosslinked peptides, linear peptides, dead-end peptides or other modified linear peptides permitted in the library construction phase). A list of possible crosslinked peptides is assembled from the output of the mass filtering, based on linear peptides that may coexist in the MS/MS spectrum. The actual scoring phase implements a strategy adopted from X!link (39. Lee Y.J. Probability-based shotgun cross-linking sites analysis. J. Am. Soc. Mass Spectrom. 2009; 20: 1896-1899)

Reference(s)