Interactive Peptide Spectral Annotator: A Versatile Web-based Tool for Proteomic Applications

Article, Open access, Peer-reviewed

2019; Elsevier BV; Volume: 18; Issue: 8; Language: English

10.1074/mcp.tir118.001209

ISSN

1535-9484

Authors

Dain R. Brademan, Nicholas M. Riley, Nicholas W. Kwiecien, Joshua J. Coon

Topic(s)

Mass Spectrometry Techniques and Applications

Abstract

Here we present IPSA, an innovative web-based spectrum annotator that visualizes and characterizes peptide tandem mass spectra. A tool for the scientific community, IPSA can visualize peptides collected using a wide variety of experimental and instrumental configurations. Annotated spectra are customizable via a selection of interactive features and can be exported as editable scalable vector graphics to aid in the production of publication-quality figures. Single spectra can be analyzed through provided web forms, whereas data for multiple peptide spectral matches can be uploaded using the Proteomics Standards Initiative file formats mzTab, mzIdentML, and mzML. Alternatively, peptide identifications and spectral data can be provided using generic file formats. IPSA supports the annotation of spectra collected using negative-mode ionization and facilitates the characterization of experimental MS/MS performance through the optional export of fragment ion statistics from one to many peptide spectral matches. This resource is freely accessible at http://interactivepeptidespectralannotator.com, and the source code and user guides are available at https://github.com/coongroup/IPSA for private hosting or custom implementations.

The abbreviations used are: MS/MS, tandem mass spectrometry; CSV, comma-separated value; MGF, mascot generic format; IPSA, interactive peptide spectral annotator; PSM, peptide spectral match; m/z, mass to charge ratio; ETD, electron transfer dissociation; EThcD, electron transfer and higher-energy collision dissociation; ETciD, electron transfer and collision-induced dissociation; AI-ETD, activated ion electron transfer dissociation; AI-NETD, activated ion negative electron transfer dissociation; UVPD, ultraviolet photodissociation; SVG, scalable vector graphic; ACN, acetonitrile; TFA, trifluoroacetic acid; PPM, parts per million; PTM, post-translational modification; FDR, false discovery rate; JSON, JavaScript Object Notation.

Tandem mass spectrometry (MS/MS) is the centerpiece of modern proteome analysis. Advances in instrument design and acquisition software have enabled collection of well over 100,000 MS/MS scans in less than an hour of analysis (1. Hebert A.S. Richards A.L. Bailey D.J. Ulbrich A. Coughlin E.E. Westphall M.S. Coon J.J. The One Hour Yeast Proteome. Mol. Cell Proteomics. 2014; 13: 339-347; 2. Richards A.L. Hebert A.S. Ulbrich A. Bailey D.J. Coughlin E.E. Westphall M.S. Coon J.J. One-hour proteome analysis in yeast. Nat. Protoc. 2015; 10: 701-714; 3. Senko M.W. Remes P.M. Canterbury J.D. Mathur R. Song Q. Eliuk S.M. Mullen C. Earley L. Hardman M. Blethrow J.D. Bui H. Specht A. Lange O. Denisov E. Makarov A. Horning S. Zabrouskov V. Novel parallelized quadrupole/linear ion trap/orbitrap tribrid mass spectrometer improving proteome coverage and peptide identification rates. Anal. Chem. 2013; 85: 11710-11714; 4. Richards A.L. Merrill A.E. Coon J.J. Proteome sequencing goes deep. Curr. Opin. Chem. Biol. 2015; 24: 11-17; 5. Hebert A.S. Thöing C. Riley N.M. Kwiecien N.W. Shishkova E. Huguet R. Cardasis H.L. Kuehn A. Eliuk S. Zabrouskov V. Westphall M.S. McAlister G.C. Coon J.J. Improved precursor characterization for data-dependent mass spectrometry. Anal. Chem. 2018; 90: 2333-2340; 6. Scheltema R.A. Hauschild J.-P. Lange O. Hornburg D. Denisov E. Damoc E. Kuehn A. Makarov A. Mann M. The Q exactive HF, a benchtop mass spectrometer with a pre-filter, high-performance quadrupole and an ultra-high-field orbitrap analyzer. Mol. Cell Proteomics. 2014; 13: 3698-3708; 7. Kelstrup C.D. Bekker-Jensen D.B. Arrey T.N. Hogrebe A. Harder A.
Olsen J.V. Performance evaluation of the Q exactive HF-X for shotgun proteomics.J. Proteome Res. 2018; 17: 727-738Crossref PubMed Scopus (159) Google Scholar, 8.Bekker-Jensen D.B. Kelstrup C.D. Batth T.S. Larsen S.C. Haldrup C. Bramsen J.B. Sørensen K.D. Høyer S. Ørntoft T.F. Andersen C.L. Nielsen M.L. Olsen J.V. An optimized shotgun strategy for the rapid generation of comprehensive human proteomes.Cell Syst. 2017; 4: 587-599.e4Abstract Full Text Full Text PDF PubMed Scopus (255) Google Scholar, 9.Shishkova E. Hebert A.S. Coon J.J. Now, more than ever, proteomics needs better chromatography.Cell Syst. 2016; 3: 321-324Abstract Full Text Full Text PDF PubMed Scopus (93) Google Scholar). Researchers have developed a wide variety of search algorithms and related computational tools to rapidly translate this large volume of experimental data to peptide spectral matches (PSMs), where peptide sequences are assigned to spectra to identify the proteins present in a sample (10.Eng J.K. Mccormack A.L. Yates J.R. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.Am. Soc. Mass Spectrom. 1994; 5: 976-989Crossref PubMed Scopus (5420) Google Scholar, 11.Cox J. Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification.Nat. Biotechnol. 2008; 26: 1367-1372Crossref PubMed Scopus (9154) Google Scholar, 12.Perkins D.N. Pappin D.J. C.C. Creasy D.M. Cottrell J.S. Probability-based protein identification by searching sequence databases using mass spectrometry data.Electrophoresis. 1999; 20: 3551-3567Crossref PubMed Scopus (6763) Google Scholar, 13.Taylor J.A. Johnson R.S. Sequence database searches via de Novo peptide sequencing by tandem mass spectrometry.Rapid Commun. Mass Spectrom. 1997; 11: 1067-1075Crossref PubMed Scopus (339) Google Scholar, 14.Ma B. Zhang K. Hendrie C. Liang C. Li M. Doherty-Kirby A. Lajoie G. 
PEAKS: Powerful software for peptide de novo sequencing by tandem mass spectrometry.Rapid Commun. Mass Spectrom. 2003; 17: 2337-2342Crossref PubMed Scopus (966) Google Scholar, 15.Chi H. Sun R.X. Yang B. Song C.Q. Wang L.H. Liu C. Fu Y. Yuan Z.F. Wang H.P. He S.M. Dong M.Q. PNovo: De novo peptide sequencing and identification using HCD spectra.J. Proteome Res. 2010; 9: 2713-2724Crossref PubMed Scopus (127) Google Scholar, 16.Sinitcyn P. Daniel Rudolph J. Cox J. Rudolph J.D. Cox J. Computational methods for understanding mass spectrometry–based shotgun proteomics data.Annu. Rev. Biomed. Data Sci. 2018; 1 (annurev-biodatasci-080917-013516)Crossref Google Scholar). An important component to this process is matching expected product ions to those observed in the experimental spectra. Annotation of spectra in this sense usually involves labeling observed m/z features with matched fragment ion designations (e.g. a/x-, b/y-, or c/z-type product ions) derived from the reported peptide sequence. Expert manual annotation is a valuable but greatly time-consuming process—unfeasible for the large volume of spectra generated in modern proteomic experiments. Proteomic field guidelines have increasingly emphasized the importance of providing access to annotated MS/MS spectra for publication, which allows others to inspect reported PSMs and validate their assignment to a given sequence (17.Bradshaw R.A. Burlingame A.L. Carr S. Aebersold R. Reporting protein identification data: the next generation of guidelines.Mol. Cell Proteomics. 2006; 5: 787-788Abstract Full Text Full Text PDF PubMed Scopus (203) Google Scholar, 18.Jones A.R. Eisenacher M. Mayer G. Kohlbacher O. Siepen J. Hubbard S.J. Selley J.N. Searle B.C. Shofstahl J. Seymour S.L. Julian R. Binz P.-A. Deutsch E.W. Hermjakob H. Reisinger F. Griss J. Vizcaíno J.A. Chambers M. Pizarro A. Creasy D. The mzIdentML data standard for mass spectrometry-based proteomics results.Mol. Cell Proteomics. 
2012; 11 (M111.014381)Abstract Full Text Full Text PDF PubMed Scopus (158) Google Scholar, 19.Seymour S.L. Farrah T. Binz P.A. Chalkley R.J. Cottrell J.S. Searle B.C. Tabb D.L. Vizcaíno J.A. Prieto G. Uszkoreit J. Eisenacher M. Martínez-Bartolomé S. Ghali F. Jones A.R. A standardized framing for reporting protein identifications in mzIdentML 1.2.Proteomics. 2014; 14: 2389-2399Crossref PubMed Scopus (20) Google Scholar, 20.Burlingame A. Carr S.A. Bradshaw R.A. Chalkley R.J. On credibility, clarity, and compliance.Mol. Cell Proteomics. 2015; 14: 1731-1733Abstract Full Text Full Text PDF PubMed Scopus (9) Google Scholar). Many software tools have been created to aid researchers annotating individual PSMs contained in bulk datasets. Most such tools are downloadable and often integrated directly into data-analysis suites, although a handful have been developed as web browser-based platforms (21.Baker P.R. Chalkley R.J. MS-viewer: a web-based spectral viewer for proteomics results.Mol. Cell Proteomics. 2014; 13: 1392-1396Abstract Full Text Full Text PDF PubMed Scopus (104) Google Scholar, 22.Strohalm M. Hassman M. Košata B. Kodíček M. mMass data miner: An open source alternative for mass spectrometric data analysis.Rapid Commun. Mass Spectrom. 2008; 22: 905-908Crossref PubMed Scopus (375) Google Scholar, 23.Colinge J. Masselot A. Carbonell P. Appel R.D. InSilicoSpectro: An open-source proteomics library.J. Proteome Res. 2006; 5: 619-624Crossref PubMed Scopus (24) Google Scholar). Lorikeet (https://uwpr.github.io/Lorikeet/) is a well-established web-based spectral annotator which has been integrated into several online mass spectrometry resources to visualize routine shotgun and cross-linked proteomics data (24.Sharma V. Eng J.K. Maccoss M.J. Riffle M. A mass spectrometry proteomics data management platform.Mol. Cell Proteomics. 2012; 11: 824-831Abstract Full Text Full Text PDF PubMed Scopus (18) Google Scholar, 25.Riffle M. Jaschob D. Zelter A. Davis T.N. 
ProXL (Protein Cross-Linking Database): A platform for analysis, visualization, and sharing of protein cross-linking mass spectrometry data.J. Proteome Res. 2016; 15: 2863-2870Crossref PubMed Scopus (37) Google Scholar, 26.Perez-Riverol Y. Alpi E. Wang R. Hermjakob H. Vizcaíno J.A. Making proteomics data accessible and reusable: Current state of proteomics databases and repositories.Proteomics. 2015; 15: 930-950Crossref PubMed Scopus (128) Google Scholar, 27.Riffle M. Merrihew G.E. Jaschob D. Sharma V. Davis T.N. Noble W.S. MacCoss M.J. Visualization and dissemination of multidimensional proteomics data comparing protein abundance during Caenorhabditis elegans development.J. Am. Soc. Mass Spectrom. 2015; 26: 1827-1836Crossref PubMed Scopus (5) Google Scholar). However, Lorikeet does not render generated annotated spectra in scalable vector graphics (SVG) format, limiting the flexibility of exported visualizations with regards to figure creation. Although powerful for the platforms for which they were designed, many of these tools are inseparable from their respective analytical pipelines; data visualization in MaxQuant is only available following processing with the integrated Andromeda search engine, for example. Their purview is therefore limited, and facile spectral annotation is restricted to only those search algorithms packaged in a pipeline with a developed annotator. This restriction poses a problem for numerous applications, especially for alternative peptide fragmentation methods such as ultraviolet photodissociation (UVPD), collisionally supplemented electron-transfer dissociation (EThcD), or activated-ion electron-transfer dissociation (AI-ETD) (28.Ly T. Julian R.R. Ultraviolet photodissociation: developments towards applications for mass-spectrometry-based proteomics.Angew. Chemie - Int. Ed. 2009; 48: 7130-7137Crossref PubMed Scopus (117) Google Scholar, 29.Yu Q. Wang B. Chen Z. Urabe G. Glover M.S. Shi X. Guo L.W. Kent K.C. Li L. 
Electron-transfer/higher-energy collision dissociation (EThcD)-enabled intact glycopeptide/glycoproteome characterization. J. Am. Soc. Mass Spectrom. 2017; 28: 1751-1764; 30. Ledvina A.R. Beauchene N.A. McAlister G.C. Syka J.E.P. Schwartz J.C. Griep-Raming J. Westphall M.S. Coon J.J. Activated-ion electron transfer dissociation improves the ability of electron transfer dissociation to identify peptides in a complex mixture. Anal. Chem. 2010; 82: 10068-10074). Such methods are often integrated into established analytical pipelines only after the field has adopted them over the course of several years, so flexible annotation tools are largely unavailable in the early stages of method development, arguably when they are needed most. For example, Lorikeet bundles annotation calculations directly with its spectrum viewer, so adding functionality for new technologies requires in-depth knowledge of Lorikeet's architecture. Separating the annotation process from the spectrum renderer instead yields a more stable platform for spectral annotation, because the two components can be maintained and implemented independently.

Here we present the Interactive Peptide Spectral Annotator (IPSA) to provide a standalone web platform for annotation and interpretation of peptide tandem mass spectra independent of instrumental platform, identification pipeline, and peptide fragmentation technique. IPSA provides flexibility to annotate spectra containing any of the six common peptide fragment ion types. Importantly, it can export annotated data in a tabular format, which enables the rapid compilation of fragment ion statistics for individual or multiple peptide tandem mass spectra, a useful capability in a wide range of proteomic experiments.
We have also built in compatibility with spectra collected in negative mode, providing a much-needed resource for the continued development of negative-mode proteomic approaches. Further, IPSA offers a platform for generating and exporting figure-ready annotated spectra in an editable format. In all, IPSA expands spectral annotation capabilities to all types of shotgun proteomic data regardless of how the data were collected or processed.

IPSA is composed of two major components: a client-facing interactive web visualizer and a server-side data processor that handles the processing required for spectral annotation. Client-side visualization software was developed using AngularJS. The D3.js library is used to generate interactive annotated spectra as SVG from the annotated data returned by the server after analysis (31. Bostock M. Ogievetsky V. Heer J. D3 data-driven documents. IEEE Trans. Vis. Comput. Graph. 2011; 17: 2301-2309). Server-side software consists of a set of modular PHP scripts that perform form validation, data processing and annotation, file upload handling, and data export. A MySQL database is incorporated to securely cache parsed peptide identifications and spectral information extracted from uploaded data. MySQL integration facilitates data storage and retrieval when annotation requests are submitted to the server.

Cell pellets of Saccharomyces cerevisiae (strain BY4742) containing ∼1 × 10^8 cells were harvested from liquid culture by centrifugation (3000 × g, 3 min, 4 °C). The supernatant was removed, and the cell pellet was resuspended in 8 M urea, 100 mM Tris (pH 8.0). Methanol was added to 90% by volume, and the mixture was vortexed to lyse the cells and induce protein precipitation. The resulting solution was centrifuged (14,000 × g, 3 min) to form a protein pellet. The supernatant was removed, and the pellet was resuspended in 8 M urea, 100 mM Tris (pH 8.0), 10 mM tris(2-carboxyethyl)phosphine, and 40 mM chloroacetamide. The solution was then diluted to 1.5 M urea with 50 mM Tris. Trypsin (Promega, Madison, WI) was added (1:50 enzyme/protein), and digestion proceeded overnight (22 °C). The resultant peptides were acidified (pH < 2.0) using 0.1% trifluoroacetic acid (TFA) and were desalted using polymeric reverse-phase Strata-X columns. Columns were equilibrated using one bed volume of 100% acetonitrile (ACN), then one bed volume of 0.1% TFA. Peptides were loaded onto the column and washed with two bed volumes of 0.1% TFA. Peptides were eluted with 500 μl of 40% ACN, 0.1% TFA followed by 650 μl of 70% ACN, 0.1% TFA, and were then dried and resuspended in 0.2% formic acid. Peptide concentration was determined using a Pierce quantitative colorimetric peptide assay (Thermo Fisher Scientific, Rockford, IL).

Low-pH reverse-phase liquid chromatography was conducted using a Dionex UltiMate 3000 UPLC as described previously (1. Hebert A.S. Richards A.L. Bailey D.J. Ulbrich A. Coughlin E.E. Westphall M.S. Coon J.J. The One Hour Yeast Proteome. Mol. Cell Proteomics. 2014; 13: 339-347; 2. Richards A.L. Hebert A.S. Ulbrich A. Bailey D.J. Coughlin E.E. Westphall M.S. Coon J.J. One-hour proteome analysis in yeast. Nat. Protoc. 2015; 10: 701-714). Eluting peptides were analyzed using a Q Exactive HF hybrid quadrupole Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) and were fragmented by higher-energy collisional dissociation (HCD) at 25% normalized collision energy. Survey scans were taken at a resolution of 60,000 at 200 m/z, whereas tandem mass spectra were collected using a resolution of 15,000 at 200 m/z.

The resulting tandem mass spectra were searched using the Coon OMSSA Proteomic Analysis Software Suite (v1.4.1) (32. Geer L.Y.
Markey S.P. Kowalak J.A. Wagner L. Xu M. Maynard D.M. Yang X. Shi W. Bryant S.H. Open mass spectrometry search algorithm.J. Proteome Res. 2004; 3: 958-964Crossref PubMed Scopus (1164) Google Scholar, 33.Wenger C.D. Phanstiel D.H. Lee M.V. Bailey D.J. Coon J.J. COMPASS: A suite of pre- and post-search proteomics software tools for OMSSA.Proteomics. 2011; 11: 1064-1074Crossref PubMed Scopus (132) Google Scholar). A precursor mass tolerance of ±150 ppm was used, whereas fragment ions were searched using a mass tolerance of ±0.01 Da. A maximum of 3 missed tryptic cleavages were permitted. Carbamidomethylation of cysteine was set as a fixed modification, whereas oxidation of methionine was set as a variable modification. Data was searched against a canonical and isoform Saccharomyces cerevisiae database (UniProt, June 10, 2016) concatenated with the reverse protein sequence for decoy generation. A 1% FDR threshold was used at the peptide level, using both e-value and precursor mass accuracy to filter results. Additional peptide identifications and spectral data were acquired from the previous work of Riley et al. to demonstrate IPSA's ability to process PSMs fragmented using alternative dissociation techniques. These include ETD; collisionally supplemented ETD (ETcaD and EThcD); AI-ETD; AI-NETD; and AI-ETD with supplemental infrared photon irradiation post-reaction (AI-ETD+) (34.Riley N.M. Westphall M.S. Hebert A.S. Coon J.J. Implementation of activated ion electron transfer dissociation on a quadrupole-orbitrap-linear ion trap hybrid mass spectrometer.Anal. Chem. 2017; 89: 6358-6366Crossref PubMed Scopus (32) Google Scholar, 35.Riley N.M. Rush M.J.P. Rose C.M. Richards A.L. Kwiecien N.W. Bailey D.J. Hebert A.S. Westphall M.S. Coon J.J. The negative mode proteome with activated ion negative electron transfer dissociation (AI-NETD).Mol. Cell Proteomics. 2015; 14: 2644-2660Abstract Full Text Full Text PDF PubMed Scopus (42) Google Scholar). 
IPSA was developed as a versatile web-based spectral analysis tool capable of individual or en masse annotation of PSMs generated from experiments that produce any of the six common peptide fragment ion types (Fig. 1). Single spectra can be annotated by entering peptide and spectral data into an intuitive web form, whereas multiple spectra can be uploaded directly to the website to be individually queried or batch processed. Single annotations are conducted using the metrics provided by the user through the web form and are returned client-side to generate an exportable, annotated spectrum. Exported spectra can easily be shared or integrated into figures. Because the individual interrogation of large numbers of PSMs can quickly become tedious, we added functionality to batch process all uploaded PSMs and export the annotations in a tabular format. This feature permits the rapid characterization of tens of thousands of tandem mass spectra. A single peptide spectrum can be annotated by providing the peptide's sequence, precursor charge, maximum allowed fragment charge, and spectral data to the user interface shown in Fig. 2A. Expected fragmentation patterns and neutral losses can be selected to specify which theoretical peptide fragment ions are generated during data processing (36.Roepstorff P. Fohlman J. Letter to the editors.Biol. Mass Spectrom. 1984; 11: 601Crossref Scopus (2386) Google Scholar, 37.Johnson R.S. Martin S.A. Biemann K. Stults J.T. Watson J.T. Novel fragmentation process of peptides by collision-induced decomposition in a tandem mass spectrometer: differentiation of leucine and isoleucine.Anal. Chem. 1987; 59: 2621-2625Crossref PubMed Scopus (390) Google Scholar). The mass tolerance for matching experimental features to theoretical fragment ions can be set in either ppm or Daltons. A relative intensity, raw intensity, or S/N (if supplied with spectral data) cutoff can be defined to ignore low-abundance or insignificant features during matching. 
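The tolerance and cutoff settings described above can be illustrated with a short sketch. This is not IPSA's server code (which the text describes as PHP); the function names `mass_error`, `within_tolerance`, and `apply_relative_cutoff` are hypothetical, and the sketch only assumes the usual definitions of a ppm error (relative to the theoretical m/z) and a relative-intensity cutoff (percent of the base peak).

```python
def mass_error(observed_mz, theoretical_mz, unit="ppm"):
    """Signed mass error of an observed peak relative to a theoretical fragment."""
    delta = observed_mz - theoretical_mz
    if unit == "ppm":
        # Parts per million, relative to the theoretical m/z.
        return delta / theoretical_mz * 1e6
    if unit == "Da":
        return delta
    raise ValueError(f"unsupported tolerance unit: {unit}")


def within_tolerance(observed_mz, theoretical_mz, tolerance, unit="ppm"):
    """True when the absolute mass error does not exceed the tolerance."""
    return abs(mass_error(observed_mz, theoretical_mz, unit)) <= tolerance


def apply_relative_cutoff(peaks, percent_of_base_peak):
    """Drop (m/z, intensity) peaks below a percentage of the most intense peak."""
    if not peaks:
        return []
    base_intensity = max(intensity for _, intensity in peaks)
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if intensity / base_intensity * 100.0 >= percent_of_base_peak
    ]


if __name__ == "__main__":
    spectrum = [(100.0, 1000.0), (200.0, 40.0), (300.0, 60.0)]
    print(apply_relative_cutoff(spectrum, 5.0))
    print(within_tolerance(500.005, 500.0, 20.0, "ppm"))
```

A ppm tolerance scales with m/z, which suits high-resolution analyzers, whereas a fixed Dalton window is the more common choice for lower-resolution data; offering both, as the form does, covers either case.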
Visualization colors can additionally be customized. A predefined list of common protein post-translational modifications (PTMs) can be queried and selected using a searchable dropdown below the fragmentation options. The available PTMs are filtered to show only those relevant to the entered peptide sequence. If a desired PTM is not included in the predefined modification list, new PTMs can be defined and are stored locally in the user's web browser. The user provides a new modification name, target site, and mass shift to create a custom PTM option.

When the server receives an annotation request, data entered into the user interface are validated and sent for processing. The peptide sequence is parsed and assembled into an intact peptide in silico. Theoretical peptide fragment ions are created from the intact peptide using the fragmentation schema selected by the user. Each fragment is matched to an m/z peak within the specified mass tolerance. When multiple theoretical fragments map to the same experimental feature, only the theoretical fragment with the smallest mass error is reported. Once annotation mapping has been finalized, the annotated spectral data are formatted as JSON and returned to the client for visualization. Immediately upon this return, IPSA generates the interactive annotated spectrum (Fig. 2B). This visualization consists of three portions: a peptide sequence marked with detected fragment ion locations and summary statistics, an interactive annotated spectrum, and an interactive scatterplot of the matched fragment-ion mass errors. The visualization supports many interactive features to facilitate data interpretation. Both axes allow contextual zooming for deeper investigation of congested sections of annotated spectra, whereas tooltips provide exact values for any highlighted plotted experimental features.
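The server-side matching steps described above (generate theoretical fragments, match each to a peak within tolerance, keep the smallest-error assignment per peak, return JSON) can be sketched as follows. This is an illustrative Python sketch, not the PHP implementation: it covers only unmodified b- and y-type ions, uses standard monoisotopic residue masses, and the names `theoretical_fragments` and `annotate` and the JSON field names are assumptions made for this example.

```python
import json

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def theoretical_fragments(sequence, max_charge=1):
    """Return b- and y-type fragment ions (positive mode, no modifications)."""
    fragments = []
    length = len(sequence)
    for i in range(1, length):
        b_mass = sum(RESIDUE_MASS[aa] for aa in sequence[:i])
        y_mass = sum(RESIDUE_MASS[aa] for aa in sequence[length - i:]) + WATER
        for z in range(1, max_charge + 1):
            fragments.append({"label": f"b{i}", "charge": z,
                              "mz": (b_mass + z * PROTON) / z})
            fragments.append({"label": f"y{i}", "charge": z,
                              "mz": (y_mass + z * PROTON) / z})
    return fragments


def annotate(peaks, fragments, tolerance_ppm):
    """Assign each fragment to its nearest peak; keep the best fragment per peak."""
    best = {}  # peak index -> annotation with the smallest absolute error
    for frag in fragments:
        if not peaks:
            break
        idx = min(range(len(peaks)), key=lambda i: abs(peaks[i][0] - frag["mz"]))
        error_ppm = (peaks[idx][0] - frag["mz"]) / frag["mz"] * 1e6
        if abs(error_ppm) > tolerance_ppm:
            continue
        current = best.get(idx)
        if current is None or abs(error_ppm) < abs(current["error_ppm"]):
            best[idx] = {
                "label": frag["label"],
                "charge": frag["charge"],
                "theoretical_mz": frag["mz"],
                "observed_mz": peaks[idx][0],
                "intensity": peaks[idx][1],
                "error_ppm": error_ppm,
            }
    return [best[i] for i in sorted(best)]


if __name__ == "__main__":
    spectrum = [(129.0658, 500.0), (147.1128, 900.0), (300.0, 50.0)]
    annotations = annotate(spectrum, theoretical_fragments("GAK"), 10.0)
    print(json.dumps(annotations, indent=2))
```

Keeping the annotation step as a pure function that returns plain data mirrors the separation the article argues for: the renderer only needs the JSON, so new ion types can be added without touching the visualization.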
Highlighted fragments are mirrored in each section of the visualization to emphasize all aspects of the feature of interest. Additionally, annotation labels can be dragged to clearer locations to declutter busy regions. The generated visualization can be exported as an SVG file for figure creation as it appears on screen or in a tabular format at any time. If many spectra need to be rapidly interrogated, IPSA provides functionality to serially process multiple PSMs by directly uploading files containing peptide identifications and spectral data to the server. Identifications can be provided either in the Proteomics Standards Initiative file formats mzTab or mzIdentML, or in a generic CSV format (18.Jones A.R. Eisenacher M. Mayer G. Kohlbacher O. Siepen J. Hubbard S.J. Selley J.N. Searle B.C. Shofstahl J. Seymour S.L. Julian R. Binz P.-A. Deutsch E.W. Hermjakob H. Reisinger F. Griss J. Vizcaíno J.A. Chambers M. Pizarro A. Creasy D. The mzIdentML data standard for mass spectrometry-based proteomics results.Mol. Cell Proteomics. 2012; 11 (M111.014381)Abstract Full Text Full Text PDF PubMed Scopus (158) Google Scholar, 19.Seymour S.L. Farrah T. Binz P.A. Chalkley R.J. Cottrell J.S. Searle B.C. Tabb D.L. Vizcaíno J.A. Prieto G. Uszkoreit J. Eisenacher M. Martínez-Bartolomé S. Ghali F. Jones A.R. A standardized framing for reporting protein identifications in mzIdentML 1.2.Proteomics. 2014; 14: 2389-2399Crossref PubMed Scopus (20) Google Scholar, 38.Griss J. Jones A.R. Sachsenberg T. Walzer M. Gatto L. Hartler J. Thallinger G.G. Salek R.M. Steinbeck C. Neuhauser N. Cox J. Neumann S. Fan J. Reisinger F. Xu Q.-W. Del Toro N. Pérez-Riverol Y. Ghali F. Bandeira N. Xenarios I. Kohlbacher O. Vizcaíno J.A. Hermjakob H. The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience.Mol. Cell Proteomics. 2014; 13: 2765-2775Abstract Full Text Full Text PDF PubMed Scopus (98) Google Scholar). 
Each row in the generic CSV lists a scan number, peptide sequence, precursor charge, and all PTM names and locations for each peptide identification. We chose this architecture for its simplicity; peptide identifications produced from a wide variety of search algorithms can easily be converted into this format. Spectral data can be uploaded as a Mascot Generic Format (MGF) or mzML file (12.Perkins D.N. Pappin D.J. C.C. Creasy D.M. Cottrell J.S. Probability-based protein identification by searching sequence databases using mass spectrometry data.Electrophoresis. 1999; 20: 3551-3567Crossref PubMed Scopus (6763) Google Scholar, 39.Martens L. Chambers M. Sturm M. Kessner D. Levander F. Shofstahl J. Tang W.H. Römpp A. Neumann S. Pizarro A.D. Montecchi-Palazzi L. Tasman N. Coleman M. Reisinger F. Souda P. Hermjakob H. Binz P.-A. Deutsch E.W. mzML—a community standard for mass spectrometry data.Mol. Cell Proteomics. 2011; 10 (R110.000133)Abstract Full Text Full Text PDF PubMed Scopus (452) Google Scholar). Finally, a modifications file can be uploaded to link peptide modification names to their respective masses. We provide a set of example files on IPSA's file upload page to demonstrate how each of these files should be structured. MGF and mzML files can easily be generated from vendor or open file formats using conversion tools such as MSConvert (40.Chambers M.C. Maclean B. Burke R. Amode D. Ruderman D.L. Neumann S. Gatto L. Fischer B. Pratt B. Egertson J. Hoff K. Kessner D. Tasman N. Shulman N. Frewen B. Baker T.A. Brusniak M.-Y. Paulse C. Creasy D. Flashner L. Kani K. Moulding C. Seymour S.L. Nuwaysir L.M. Lefebvre B. Kuhlmann F. Roark J. Rainer P. Detlev S. Hemenway T. Huhmer A. Langridge J. Connolly B. Chadick T. Holly K. Eckels J. Deutsch E.W. Moritz R.L. Katz J.E. Agus D.B. MacCoss M. Tabb D.L. Mallick P. A cross-platform toolkit for mass spectrometry and proteomics.Nat. Biotechnol. 2012; 30Crossref Scopus (1775) Google Scholar). 
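To make the generic identification format more concrete, the sketch below writes and reads a table with one row per PSM holding the scan number, peptide sequence, precursor charge, and PTM names with locations. The column headers and the `name@position` encoding joined by semicolons are hypothetical choices for this illustration; the example files on IPSA's upload page define the actual layout.

```python
import csv
import io

# Hypothetical column names; consult IPSA's example files for the real headers.
FIELDS = ["scan", "sequence", "charge", "modifications"]


def encode_mods(mods):
    """Encode [(name, position), ...] as 'name@position' items joined by ';'."""
    return ";".join(f"{name}@{position}" for name, position in mods)


def decode_mods(text):
    """Inverse of encode_mods; an empty string means no modifications."""
    if not text:
        return []
    mods = []
    for item in text.split(";"):
        name, position = item.rsplit("@", 1)
        mods.append((name, int(position)))
    return mods


def write_identifications(psms):
    """Serialize PSM dictionaries to CSV text with one row per identification."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for psm in psms:
        writer.writerow({
            "scan": psm["scan"],
            "sequence": psm["sequence"],
            "charge": psm["charge"],
            "modifications": encode_mods(psm["modifications"]),
        })
    return buffer.getvalue()


def read_identifications(text):
    """Parse CSV text produced by write_identifications back into dictionaries."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [
        {
            "scan": int(row["scan"]),
            "sequence": row["sequence"],
            "charge": int(row["charge"]),
            "modifications": decode_mods(row["modifications"]),
        }
        for row in reader
    ]


if __name__ == "__main__":
    psms = [
        {"scan": 1021, "sequence": "ACDMK", "charge": 2,
         "modifications": [("Carbamidomethyl", 2), ("Oxidation", 4)]},
        {"scan": 1033, "sequence": "GAK", "charge": 1, "modifications": []},
    ]
    print(write_identifications(psms))
```

A flat table like this is easy to produce from most search engine outputs, which is the simplicity argument the text makes for the generic format.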
Data parsed from bulk identification and spectral data uploads are stored securely server-side in a MySQL database. On data upload, a unique identifier is assigned to the user's browser which is used to exclusively access the uploader's data. After data extraction, uploaded files are deleted to reduce server footprint. Only one data set can be stored at a time. Proteomic analyses are typically conducted using low-pH separations and positive-mode electrospray ionization to create peptide cations. This tendency leads to a systematic underrepresentation of acidic peptide species, which preferentially ionize as anions (41.McAlister G.C. Russell J.D. Rumachik N.G. Hebert A.S. Syka J.E.P. Geer L.Y. Westphall M.S. Pagliarini D.J. Coon J.J. Analysis of the acidic proteome with negative electron-transfer dissociation mass spectrometry.Anal. Chem. 2012; 84: 2875-2882Crossref PubMed Scopus (47) Google Scholar, 42.Rumachik N.G. McAlister G.C. Russell J.D. Bailey D.J. Wenger C.D. Coon J.J. Characterizing peptide neutral losses induced by negative electron-transfer dissociation (NETD).J. Am. Soc. Mass Spectrom. 2012; 23: 718-727Crossref PubMed Scopus (17) Google Scholar, 43.Madsen J.A. Xu H. Robinson M.R. Horton A.P. Shaw J.B. Giles D.K. Kaoud T.S. Dalby K.N. Trent M.S. Brodbelt J.S. High-throughput database search and large-scale negative polarity liquid chromatography–tandem mass spectrometry with ultraviolet photodissociation for complex proteomic samples.Mol. Cell Proteomics. 2013; 12: 2604-2614Abstract Full Text Full Text PDF PubMed Scopus (28) Google Scholar). High-pH separations using negative-mode ionization can be used to better study these acidic species, but the complexity of tandem mass spectra generated using traditional collision-based activational methods has precluded the widespread adoption of this mode. This spectral complexity arises in part from a multitude of neutral losses originating from precursor and fragment ions (44.Ewing N.P. Cassady C.J. 
Dissociation of multiply charged negative ions for hirudin (54–65), fibrinopeptide B, and insulin A (oxidized).J. Am. Soc. Mass Spectrom. 2001; 12: 105-116Crossref PubMed Scopus (59) Google Scholar). Alternative fragmentation techniques such as UVPD or AI-NETD, producing a/x-, b/y-, c/z-type and a•/x-type product ions respectively, have recently demonstrated their utility in producing informative tandem mass spectra from peptide anions (35.Riley N.M. Rush M.J.P. Rose C.M. Richards A.L. Kwiecien N.W. Bailey D.J. Hebert A.S. Westphall M.S. Coon J.J. The negative mode proteome with activated ion negative electron transfer dissociation (AI-NETD).Mol. Cell Proteomics. 2015; 14: 2644-2660Abstract Full Text Full Text PDF PubMed Scopus (42) Google Scholar, 43.Madsen J.A. Xu H. Robinson M.R. Horton A.P. Shaw J.B. Giles D.K. Kaoud T.S. Dalby K.N. Trent M.S. Brodbelt J.S. High-throughput database search and large-scale negative polarity liquid chromatography–tandem mass spectrometry with ultraviolet photodissociation for complex proteomic samples.Mol. Cell Proteomics. 2013; 12: 2604-2614Abstract Full Text Full Text PDF PubMed Scopus (28) Google Scholar). However, many spectral annotators do not support these data types. IPSA is capable of annotating PSMs collected using negative-mode electrospray ionization. Fig. 3 demonstrates an IPSA-annotated spectrum of the triply deprotonated peptide LIPSDFILAAQSHNPIENK dissociated using AI-NETD (35.Riley N.M. Rush M.J.P. Rose C.M. Richards A.L. Kwiecien N.W. Bailey D.J. Hebert A.S. Westphall M.S. Coon J.J. The negative mode proteome with activated ion negative electron transfer dissociation (AI-NETD).Mol. Cell Proteomics. 2015; 14: 2644-2660Abstract Full Text Full Text PDF PubMed Scopus (42) Google Scholar). Obtaining fragment ion statistics in an automated fashion for an entire mass spectrometry experiment is no trivial task. 
Fragment ion statistics can be greatly informative during method optimization and can be used to monitor MS/MS performance by providing information on what ion types (and in what amounts) are being generated. Additional informative metrics include the sequence coverage of all detected peptide fragments, fragment ion mass errors, and the percent of the total ion current (TIC) that can be explained by annotated fragment ions. IPSA provides a unique utility among web-based spectral annotators to compute and export all detected fragment ions for an uploaded experiment in a tabular format. The server extracts the fragment ion series, mass tolerances, and any intensity threshold from the provided user interface and serially processes every uploaded peptide identification. The annotation results are continuously written to a set of two downloadable CSVs. The first file contains summary statistics for the matched fragment ions for each uploaded PSM: the number of matched fragment ions, the number of unique peptide bonds broken, and the percent of the total ion current explained by matched fragment ions. The second file contains detailed information concerning every detected fragment ion for all uploaded identifications; more specifically, the raw intensity, theoretical m/z, experimental m/z, mass error, percent of base peak, and percent of the total ion current explained are reported.

A series of experiments was previously described by Riley et al. to examine the efficacy of ETD, ETcaD, EThcD, AI-ETD, and AI-ETD+ fragmentation on a liquid chromatography timescale (34. Riley N.M. Westphall M.S. Hebert A.S. Coon J.J. Implementation of activated ion electron transfer dissociation on a quadrupole-orbitrap-linear ion trap hybrid mass spectrometer. Anal. Chem. 2017; 89: 6358-6366). The authors found AI-ETD+ to be the optimal supplemental ETD fragmentation technique.
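The fragment ion statistics described earlier in this section (matched fragment count, unique peptide bonds broken, mass error, percent of base peak, and percent of TIC explained) can be illustrated with a minimal Python sketch. This is not IPSA's server code: the `Fragment` class, the `annotate` function, the ppm-based tolerance, the rule of keeping the most intense peak within tolerance, and the toy m/z values are all assumptions made for illustration, and the sketch assumes a non-empty peak list.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    label: str      # hypothetical ion label, e.g. "b2"
    bond: int       # index of the peptide bond broken to form this ion
    theo_mz: float  # theoretical m/z


def annotate(peaks, fragments, tol_ppm=10.0):
    """Match theoretical fragments to observed (m/z, intensity) peaks.

    Returns one row per matched fragment plus a per-spectrum summary.
    """
    tic = sum(intensity for _, intensity in peaks)
    base_peak = max(intensity for _, intensity in peaks)
    rows, used = [], set()
    for frag in fragments:
        best = None
        for idx, (mz, intensity) in enumerate(peaks):
            error_ppm = (mz - frag.theo_mz) / frag.theo_mz * 1e6
            # Keep the most intense peak inside the tolerance window.
            if abs(error_ppm) <= tol_ppm and (best is None or intensity > peaks[best][1]):
                best = idx
        if best is None:
            continue  # no observed peak explains this fragment
        mz, intensity = peaks[best]
        used.add(best)
        rows.append({
            "label": frag.label,
            "bond": frag.bond,
            "intensity": intensity,
            "theoretical_mz": frag.theo_mz,
            "experimental_mz": mz,
            "mass_error_ppm": (mz - frag.theo_mz) / frag.theo_mz * 1e6,
            "pct_base_peak": 100.0 * intensity / base_peak,
            "pct_tic": 100.0 * intensity / tic,
        })
    # Count each matched peak once, even if two fragments share it.
    matched_intensity = sum(peaks[i][1] for i in used)
    summary = {
        "matched_fragments": len(rows),
        "unique_bonds_broken": len({row["bond"] for row in rows}),
        "pct_tic_explained": 100.0 * matched_intensity / tic,
    }
    return rows, summary


# Toy spectrum and fragment list (made-up values, not real data).
peaks = [(175.119, 200.0), (263.140, 100.0), (300.000, 500.0), (401.200, 200.0)]
fragments = [
    Fragment("y1", 3, 175.119),
    Fragment("b2", 2, 263.1401),
    Fragment("y2", 2, 401.200),
    Fragment("b3", 3, 450.000),
]
rows, summary = annotate(peaks, fragments)
print(summary)
```

In this toy case three of the four theoretical fragments are matched, they cover two distinct peptide bonds, and the matched peaks account for half of the total ion current. The `rows` list corresponds loosely to the detailed per-fragment file and `summary` to the per-PSM summary file.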
Using the authors' reported peptide identifications and spectral data, we created a set of detailed comparisons similar to those made in the referenced manuscript using the ion statistics files directly exported from IPSA (Fig. 4). No further programming was required to extract these data or make this figure, and all data manipulation postexport was performed in a spreadsheet using basic arithmetic functions.

In summary, IPSA is capable of both cleanly annotating peptide spectra collected using a wide variety of dissociation techniques in both positive and negative mode and of exporting the generated annotated spectra in the editable SVG format. Additionally, IPSA allows the bulk analysis of detected fragment ions for any number of uploaded spectra, permitting in turn the deep interrogation of data without requiring programming experience.

Modern MS-based proteomics techniques are widely used to identify and characterize tens of thousands of peptides and proteins originating from a variety of biological samples. The annotation of the tandem mass spectra used to identify these species is an arduous task requiring extensive expertise. Our web-based and open-source peptide spectral annotator, IPSA, provides a resource for generating and investigating annotated spectra for peptide identifications to a wide research community. IPSA generates customizable annotated peptide spectra through a clean and intuitive user interface and lets researchers export publication-ready annotated spectra as vector graphics to aid in figure creation. It can process MS/MS spectra from both anionic and cationic precursors, and it has built-in support to annotate fragment ions generated from a diverse assortment of dissociative techniques. Additionally, IPSA can extract fragment ion statistics from any number of peptide spectra and return results in a tabular format, giving researchers a deeper and more comprehensive view of their peptide analyses.
We chose to develop IPSA as an online platform to reach a wide audience of proteomics researchers: anyone with an Internet connection and a web browser. Web-based software also allowed us to use the flexibility of the well-established JavaScript visualization library D3.js while avoiding software compatibility and version-control issues. Through IPSA, we aim to increase the approachability of spectral annotation to proteomics novices and experts alike.

The IPSA source code is freely available for inspection and download at https://github.com/coongroup/IPSA alongside additional guides regarding software usage. We recommend using an updated web browser to access IPSA at http://interactivepeptidespectralannotator.com, as outdated browsers may not provide support for critical functions. IPSA can be easily installed on a private desktop or server using a prebuilt Docker image and instructions at https://hub.docker.com/r/dbrademan/ipsa, or IPSA's project files can be manually configured to operate on private web servers with full functionality. Additionally, the JavaScript file used to render the interactive visualization, IPSA.js, is configured to be used as an AngularJS directive. This directive can be attached to custom annotation scripts in many website environments, allowing the use of our software beyond the platform described here.

Raw spectral data, peptide identifications, and protein databases have been deposited to the ProteomeXchange Consortium via the PRIDE (45. Vizcaíno J.A. Csordas A. del-Toro N. Dianes J.A. Griss J. Lavidas I. Mayer G. Perez-Riverol Y. Reisinger F. Ternent T. Xu Q.-W. Wang R. Hermjakob H. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 2016; 44: D447-D456) partner repository with the dataset identifier PXD011695.

We thank Kevin Schauer for providing feedback during IPSA's design.
