Review; Open access; Peer-reviewed

Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial

2018; Springer Nature; Volume: 14; Issue: 8; Language: English

10.15252/msb.20178126

ISSN

1744-4292

Authors

Christina Ludwig, Ludovic Gillet, George Rosenberger, Sabine Amon, Ben C. Collins, Ruedi Aebersold

Topic(s)

Metabolomics and Mass Spectrometry Studies

Summary

Review, 13 August 2018, Open Access

Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial

Christina Ludwig*,1,‡, Ludovic Gillet2,‡, George Rosenberger2,3, Sabine Amon2, Ben C Collins2 and Ruedi Aebersold2,4

1 Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich (TUM), Freising, Germany
2 Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
3 Department of Systems Biology, Columbia University, New York, NY, USA
4 Faculty of Science, University of Zurich, Zurich, Switzerland
‡ These authors contributed equally to this work
* Corresponding author. Tel: +49 8161 71 6130; E-mail: [email protected]

ORCID: Christina Ludwig, orcid.org/0000-0002-6131-7322; George Rosenberger, orcid.org/0000-0002-1655-6789; Ben C Collins, orcid.org/0000-0003-0827-3495

Mol Syst Biol (2018) 14: e8126, https://doi.org/10.15252/msb.20178126

Abstract

Many research questions in fields such as personalized medicine, drug screens or systems biology depend on obtaining consistent and quantitatively accurate proteomics data from many samples. SWATH-MS is a specific variant of data-independent acquisition (DIA) methods and is emerging as a technology that combines deep proteome coverage capabilities with quantitative consistency and accuracy. In a SWATH-MS measurement, all ionized peptides of a given sample that fall within a specified mass range are fragmented in a systematic and unbiased fashion using rather large precursor isolation windows.
To analyse SWATH-MS data, a strategy based on peptide-centric scoring has been established, which typically requires prior knowledge about the chromatographic and mass spectrometric behaviour of peptides of interest in the form of spectral libraries and peptide query parameters. This tutorial provides guidelines on how to set up and plan a SWATH-MS experiment, how to perform the mass spectrometric measurement and how to analyse SWATH-MS data using peptide-centric scoring. Furthermore, concepts on how to improve SWATH-MS data acquisition, potential trade-offs of parameter settings and alternative data analysis strategies are discussed.

Introduction

Over the last decades, liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) has become the technology of choice for the high-throughput characterization of proteins and proteomes (Aebersold & Mann, 2016). Recent developments in the field have moved beyond enumerating the proteins, peptides or post-translational modifications detected in one or a few samples towards delivering high-quality, consistent quantification in large-scale projects that comprise hundreds of samples. Especially in areas such as personalized medicine, biomarker research, drug screens, genetic association studies or systems biology, large numbers of individuals, conditions and perturbations need to be investigated to draw meaningful biological conclusions. For this purpose, the large data matrices generated must be as reproducible, complete and accurate as possible.

In order to address these needs, several different proteomic strategies have been developed over recent years. An emerging strategy, and the focus of this tutorial, is sequential window acquisition of all theoretical mass spectra (SWATH-MS), which was described by Gillet et al (2012). For a detailed introduction to the history, the basic principles as well as the general advantages and limitations of SWATH-MS (also summarized in Table 1), we refer to the Appendix.
Briefly, for a SWATH-MS measurement, typically non-labelled protein samples are digested with trypsin and the resulting peptides are analysed by liquid chromatography coupled to a tandem mass spectrometer operating in the so-called data-independent acquisition (DIA) mode. In this mode, all ionized compounds of a given sample that fall within a specified mass range are fragmented in a systematic and unbiased fashion. Figure 1A–C shows the DIA scheme described as the initial implementation of SWATH-MS (Gillet et al, 2012), using 32 consecutive, slightly overlapping precursor isolation windows, with a width of 25 m/z each. Depending on sample complexity, this acquisition scheme will lead to the co-fragmentation of many co-eluting peptides concurrently selected in the precursor ion window and ultimately to highly multiplexed and complex fragment ion spectra (Fig 1D). To deal with this complexity, Gillet et al proposed a novel data analysis strategy based on peptide-centric scoring, which relies on querying chromatographic and mass spectrometric coordinates of the proteins and peptides of interest in the form of so-called peptide query parameters (PQPs). PQPs are typically derived from previously generated spectral libraries.

Table 1. Advantages and limitations of SWATH-MS in comparison with data-dependent (DDA) and targeted (SRM, PRM) proteomics

Figure 1. Principle of sequentially windowed data-independent acquisition in SWATH-MS
(A) SWATH-MS measurements are performed on fast scanning hybrid mass spectrometers, typically employing a quadrupole as first mass analyser and a TOF or Orbitrap as second mass analyser. In SWATH-MS mode, typically a single precursor ion (MS1) spectrum is recorded, followed by a series of fragment ion (MS2) spectra with wide precursor isolation windows (for example 25 m/z).
Through repeated cycling of consecutive precursor isolation windows over a defined mass range, a comprehensive data set is recorded, which includes continuous information on all detectable fragment and precursor ions. Hence, extracted ion chromatograms can be generated on MS2 as well as MS1 level. For the analysis of SWATH-MS data, a peptide-centric scoring strategy can be employed, which requires prior knowledge about the chromatographic and mass spectrometric behaviour of all queried peptides in the form of peptide query parameters (PQPs).
(B) The SWATH-MS data acquisition scheme described by Gillet et al (2012) for a Q-TOF mass spectrometer uses 32 MS2 scans with defined increments of 25 m/z, starting at 400 m/z and ending at 1,200 m/z. One full MS1 scan is recorded at the beginning. By applying an acquisition time of 100 ms per scan, a total cycle time of ~3.3 s is achieved.
(C) The MS1 full scan detects all peptide precursors eluting at a given time point. For example, in the mass range from 925 to 950 m/z, three co-eluting peptide species are detected (green, red and blue).
(D) The corresponding MS2 scan with a precursor isolation window of 925–950 m/z represents a mixed MS2 spectrum with fragments of all three peptide species.

In addition to the described SWATH-MS method, a wealth of other DIA schemes and alternative data analysis strategies has been established. For an overview, see Appendix Fig S1, as well as recent reviews (Chapman et al, 2014; Bilbao et al, 2015). While the term “SWATH” became a registered trademark of SCIEX in the context of Q-TOF instrumentation, the company Biognosys trademarked the name “Hyper Reaction Monitoring” (HRM) for an analogous mode of data acquisition on Orbitrap instrumentation (Box 1).
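
As a rough illustration of the acquisition scheme in Fig 1B, the following Python sketch lays out 32 consecutive 25 m/z windows from 400 to 1,200 m/z and derives the cycle time from one MS1 scan plus 32 MS2 scans at 100 ms each. The function names are hypothetical, the slight window overlap mentioned above is ignored for simplicity, and the 30 s chromatographic peak width in the usage lines is an assumed value, not one taken from this tutorial.

```python
from typing import List, Tuple

Window = Tuple[float, float]


def build_windows(start_mz: float = 400.0, end_mz: float = 1200.0,
                  width: float = 25.0) -> List[Window]:
    """Return consecutive (lower, upper) precursor isolation windows.

    Simplification: windows are non-overlapping here, whereas the
    published scheme uses slightly overlapping windows.
    """
    n_windows = int(round((end_mz - start_mz) / width))
    return [(start_mz + i * width, start_mz + (i + 1) * width)
            for i in range(n_windows)]


def cycle_time_s(n_ms2: int, scan_time_s: float = 0.1, n_ms1: int = 1) -> float:
    """Cycle time = (MS1 scans + MS2 scans) x acquisition time per scan."""
    return (n_ms1 + n_ms2) * scan_time_s


def window_for(precursor_mz: float, windows: List[Window]) -> Window:
    """Find the isolation window in which a precursor is co-fragmented."""
    for lower, upper in windows:
        if lower <= precursor_mz < upper:
            return (lower, upper)
    raise ValueError(f"{precursor_mz} m/z lies outside the acquired mass range")


def points_per_peak(peak_width_s: float, cycle_s: float) -> float:
    """Approximate number of data points sampled across one elution peak."""
    return peak_width_s / cycle_s


windows = build_windows()
cycle = cycle_time_s(len(windows))
print(len(windows))                             # number of MS2 windows
print(round(cycle, 2))                          # cycle time in seconds
print(window_for(930.0, windows))               # window containing 930 m/z
print(round(points_per_peak(30.0, cycle), 1))   # assumed 30 s wide peak
```

With these settings, the sketch reproduces the ~3.3 s cycle time quoted in the figure legend and places a precursor at 930 m/z in the 925–950 m/z window shown in Fig 1C and D. Narrower windows or longer scan times lengthen the cycle and therefore reduce the number of data points recorded across each chromatographic peak.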
Throughout this tutorial, we only use the term “SWATH-MS”, independently of the underlying instrument type, and the generic term “DIA” when referring to the breadth of all data-independent acquisition strategies.

Box 1: Definitions of frequently used terms in the context of SWATH-MS

DIA: Data-Independent Acquisition (DIA). Here, we use DIA as an umbrella term for mass spectrometric acquisition methods that continuously acquire fragment ion (MS2) spectra in an unbiased fashion, without requiring the detection of peptide precursor ions in an MS1 survey scan (as in DDA) or prior knowledge about peptide precursor m/z values (as in SRM and PRM). Specific acquisition methods of the DIA family include for example SWATH-MS, Shotgun-CID, MSE, XDIA, MSX, AIF, FT-ARM and others.

SWATH-MS: Sequential Windowed Acquisition of All Theoretical Fragment Ion Mass Spectra (SWATH-MS). Represents a specific variant of DIA, performed on hybrid full-scan instruments (preferably Q-TOF and Q-Orbitrap). The term SWATH is a registered trademark of SCIEX. In SWATH-MS data acquisition, successive pre-defined ranges of precursor m/z values are isolated and subjected to co-fragmentation (Gillet et al, 2012). Peptide-centric scoring of SWATH-MS data can be performed for example by using prior knowledge in the form of a spectral library.

HRM: Hyper Reaction Monitoring (HRM). Synonym for SWATH-MS. The term has been implemented in the context of data acquisition on Orbitrap mass analysers and is a registered trademark of Biognosys.

Peptide-centric scoring: A data query strategy which starts with a pre-defined list of target peptides and tests whether those peptides are detectable in the data with a certain confidence. It can be applied to individual MS2 spectra or to extracted ion chromatograms (XICs). For a peptide-centric scoring analysis, peptide query parameters need to be readily available. Typically, data acquired in SRM or PRM mode are analysed by peptide-centric scoring, but data acquired by DIA methods, such as SWATH-MS, can also be analysed in this way. An equivalent term also used in the literature is “targeted data extraction”.

Spectrum-centric scoring: A data analysis type which aims at finding the peptide sequence(s) from a user-specified proteome that explain(s) a given MS2 spectrum best. It is typically applied in the context of discovery-driven proteomics with data acquired by data-dependent acquisition (DDA), but data acquired by different DIA methods can also be analysed in this way.

Peptide query parameters (PQPs): Compendium of parameters required for peptide identification by peptide-centric scoring. PQPs are stored in a table format and include (i) optimal (proteotypic) peptides to target for a given protein, (ii) chromatographic elution times of those peptides on the applied chromatography setup, (iii) most intense fragment ions (typically four to six) generated under the applied fragmentation conditions, (iv) charge state(s) of precursors and fragment ions and (v) relative ion intensity of all selected fragments. PQPs can be derived from previous discovery-driven experiments, from which all peptide identifications are summarized in the form of a spectral library.

(Targeted) peptide assay: Synonym for peptide query parameters (PQPs). Term mainly used in the context of SRM, where different tier levels (1–3) of analytical assay validation have been defined (Carr et al, 2014).

Spectral library: Compendium of MS2 spectra confidently assigned to a specific peptide sequence, typically acquired by discovery-driven proteomics using data-dependent acquisition. If several MS2 spectra refer to the same peptide sequence, either the best-scoring spectrum or an average consensus spectrum is reported. Peptide retention time information can also be stored in a spectral library file, and normalized retention times can be generated through retention time re-alignment using reference peptides. Alternatively, spectral libraries can be generated from deconvoluted pseudo-MS/MS spectra directly from DIA data (DIA-Umpire; Tsou et al, 2015).

Transition: Pair of a precursor and one corresponding fragment ion m/z value. Mainly used in the context of SRM.

Targeted proteomics: Umbrella term for mass spectrometric methods that aim at quantifying a list of pre-defined proteins, peptides or PTM-peptides of interest. In the two classical targeted proteomic approaches, SRM and PRM, the data acquisition itself is performed in a targeted fashion. However, data acquired “untargeted”, using for example DIA measurements, can also be analysed using a peptide-centric data analysis strategy, which classifies SWATH-MS as a targeted proteomic approach.

When is SWATH-MS the method of choice for my proteomic study?

The major advantage of SWATH-MS is that it supports quantitative analyses of peptides covering thousands of proteins with a high quantitative consistency and accuracy. It is ideally suited for projects that entail a large number of samples and that require accurate and reproducible quantification for the major fraction of the expressed proteome or peptidome in each sample. Typical projects that require exactly these properties include for example biomarker studies (Liu et al, 2014; Muntel et al, 2015; Kulkarni et al, 2016; Ortea et al, 2016), genetic association studies (Liu et al, 2015; Okada et al, 2016; Williams et al, 2016), clinical drug/perturbation studies (preprint: Litichevskiy et al, 2018; Tan et al, 2017; Keam et al, 2018) or exploratory basic research (Collins et al, 2013; Lambert et al, 2013; Parker et al, 2015a; Schubert et al, 2015b). SWATH-MS is also particularly well suited for studies that need fast analyses using LC gradient lengths below 60 min (Vowinckel et al, 2018).
Proteome coverages of ~50% of the MS-detectable proteome have been achieved in complex mammalian samples in a single-shot analysis (Bruderer et al, 2017; Kelstrup et al, 2018). A current drawback of SWATH-MS compared to the classical targeted proteomic approaches (SRM or PRM) is that peptide quantification with SWATH-MS is still 3- to 10-fold less sensitive (Gillet et al, 2012; Liu et al, 2013; Schmidlin et al, 2016). Hence, targeted data acquisition remains the better option for projects that involve quantification of particularly low-abundant proteins and peptides with maximal accuracy. A further drawback of SWATH-MS in comparison with DDA-based methods is the required upfront effort on experimental or in silico spectral library and PQP generation and optimization (Table 1).

In the context of very large-scale quantitative proteomic analyses, two alternative mass spectrometric strategies are currently used successfully in the field in addition to SWATH-MS. The first is the classical label-free DDA proteomics workflow, where quantification is based on precursor ion (MS1) intensities or spectral counts, and which can possibly be combined with peptide fractionation techniques to improve proteome coverage (Lawrence et al, 2015; Geyer et al, 2016; Frejno et al, 2017). An important improvement for the application of MS1 quantification to DDA data sets was the development of analysis tools that allow the transfer of peptide identifications between samples and thereby improve the completeness of the quantitative data matrix (Prakash et al, 2006; Mueller et al, 2007; Cox et al, 2014). However, even when using these tools, the number of missing values in DDA data sets still remains higher than for data acquired in SWATH-MS mode, especially for peptides and proteins in the low concentration range (Bruderer et al, 2015, 2017; Kelstrup et al, 2018).
While direct comparisons of an optimal label-free MS1/DDA workflow versus a SWATH-MS workflow are challenging, several papers have demonstrated that when the same sample is injected under the same conditions using the same mass spectrometer operated once in DDA and once in SWATH-MS mode, SWATH-MS outperforms DDA in terms of detectable peptides and associated proteins as well as measurement reproducibility (Bruderer et al, 2015; Kelstrup et al, 2018).

A second popular strategy for large-scale quantitative proteomics relies on isobaric labelling, using, for example, tandem mass tags (TMT) (Thompson et al, 2003) or isobaric tags for relative and absolute quantitation (iTRAQ) (Unwin et al, 2005). Frequently, isobaric labelling is followed by an extensive peptide fractionation procedure. The resulting fractions are then analysed individually by DDA mass spectrometry (Chick et al, 2016; Roumeliotis et al, 2017). With state-of-the-art TMT reagents, 10 (McAlister et al, 2012) or 11 samples can be mixed and analysed simultaneously, leading to minimal sample preparation biases and highly consistent and complete data matrices within a set of multiplexed samples. While early implementations of this method suffered from quantitative ratio compression due to interferences in reporter ions from co-eluting and co-fragmenting peptides, this has been addressed to some extent by using optimized data acquisition and analysis methods (Ting et al, 2011; Savitski et al, 2013; McAlister et al, 2014; Ahrne et al, 2016; O'Brien et al, 2018; Sonnett et al, 2018). Labelling samples with isobaric tags can be an optimal workflow for comparative analysis of medium-sized projects; however, if hundreds of samples need to be analysed, the issues of data incompleteness and batch effects can become apparent again across sets of multiplexed samples. To date, studies that directly compare isobaric tagging with SWATH-MS are missing.
The overall intent of this tutorial is to guide readers towards performing their own SWATH-MS measurements. We give guidelines on how to set up and plan a SWATH-MS experiment, how to perform the mass spectrometric measurement using data-independent acquisition and how to analyse SWATH-MS data using peptide-centric scoring. Furthermore, concepts on how to improve SWATH-MS data acquisition, potential trade-offs of parameter settings and alternative data analysis strategies are discussed.

Setting up and planning a SWATH-MS experiment

If a SWATH-MS study progresses towards the analysis of hundreds and eventually thousands of samples, particular attention should be paid to the feasibility of producing comparable data of good quality, both longitudinally on a single instrument and across multiple instruments of the same type, and conceivably across different instrument platforms. Encouraging progress has been made demonstrating the comparability of SWATH-MS data generated between laboratories using standard samples (Collins et al, 2017). Cumulative instrument contamination during a measurement series, caused for example by contaminants such as lipids, polymers or detergents, is a major concern and extra care must be taken to produce samples that are mostly devoid of such contaminants. Contaminants are an especially important issue in the context of SWATH-MS, because we observed faster and more severe instrument performance loss (such as charging effects and sensitivity issues) in SWATH-MS mode than in DDA or PRM mode on the same instrument. One possible explanation for this observation is that an instrument operated in SWATH-MS mode has a substantially higher ion flux in the fragment ion scans (which make up > 90% of the data acquisition time), which in turn means that more sample contaminants and impurities might enter and contaminate the instrument, leading to a faster performance loss.
Therefore, monitoring the performance of the mass spectrometer and maintaining it at an acceptable level are important prerequisites. Recent efforts have been undertaken to develop software tools that enable systematic tracking of instrument performance and that can also be applied to SWATH-MS data (Rudnick et al, 2010; Wang et al, 2014; Bereman et al, 2016; Chiva et al, 2018).

When planning a label-free large-scale proteomic experiment, statistical considerations for the experimental design, such as group size, biological and technical variability or achievable sensitivity and selectivity, should be taken into account (Krzywinski & Altman, 2014b). We suggest that particular attention should be paid to proper randomization and blocking (Krzywinski & Altman, 2014a) of a sufficient number of biological and technical sample replicates (Blainey et al, 2014) to ensure optimal statistical power (Krzywinski & Altman, 2013) during the downstream data analysis process.

Prior knowledge required for peptide-centric scoring and spectral libraries

The underlying concept behind SWATH-MS is that empirically derived prior knowledge of the mass spectrometric and chromatographic behaviour of peptides of interest can be used to selectively extract peptide-specific information from highly convoluted SWATH data in a targeted fashion (Gillet et al, 2012). This required prior knowledge is referred to as “peptide query parameters” (PQPs). It is worth noting that “prior” in this context indicates that PQPs should be available as a prerequisite before data analysis is undertaken, while the actual acquisition of the SWATH-MS data itself does not depend on the availability of PQPs.

What kind of information do PQPs contain?
In detail, the information includes (i) the peptide sequence(s) to monitor for a given protein, (ii) the dominant precursor ion m/z value(s) of the peptide(s) and thus the charge state distribution, (iii) the four to six most intense fragment ion m/z values for the peptide(s) under the applied fragmentation conditions, (iv) information about the expected fragmentation pattern under the applied conditions, i.e. the relative fragment ion intensities, and (v) the expected retention time of the peptide(s) and thus the associated fragment ion signals, ideally normalized to a reference. PQPs can commonly be obtained from a spectral library [or potentially a chromatogram library (Sharma et al, 2014)] and are stored in a table format as shown in Fig 2. Computational pipelines integrating all steps of spectral library generation and PQP extraction have been developed that simplify and standardize this process and are available for example within Skyline (Egertson et al, 2015), PeakView (SCIEX), Spectronaut Pulsar (Biognosys; hereafter referred to as Spectronaut) (Bruderer et al, 2015) and the Trans-Proteomic Pipeline (TPP) (Deutsch et al, 2010). Further tools to prepare and convert [specL (Panse et al, 2015), Fraggle/Tramler/Franklin (Teleman et al, 2017)] or extend [SwathXtend (Wu et al, 2016)] sets of peptide query parameters are also available. Particularly large spectral libraries can be optimized and constrained by using MSPLIT-DIA (Wang et al, 2015).

Figure 2. What are peptide query parameters (PQPs) and where do these parameters come from?
PQPs contain information about the chromatographic and mass spectrometric behaviour of a given peptide, as exemplified here for the peptide AAHTEDINACTLTTSPR. Various different input sample types can be used for the purpose of PQP generation. Typically, those samples are analysed in initial DDA measurements, and the results are summarized in the form of one or several spectral library files.
From the spectral library file(s), the relevant PQPs are extracted by filtering the identified peptide coordinates using the indicated criteria. PQPs contain information about: the underlying protein, peptide sequence, precursor m/z, fragment m/z, precursor and fragment charge, fragment ion type, expected relative fragment ion intensities and normalized retention time (retention time relative to a set of reference peptides, iRT).

For further details on how to create spectral libraries and PQPs from DDA data, we refer to a recent protocol paper (Schubert et al, 2015a). In the following paragraphs, we will discuss the types and sources of information that can be used to generate spectral libraries from which PQPs can be extracted.

An alternative to acquiring DDA runs for spectral library generation is spectrum-centric scoring of the DIA data. While such algorithms were proposed early in the development of DIA acquisition schemes, recently developed algorithms such as DIA-Umpire (Tsou et al, 2015), Group-DIA (Li et al, 2015) and Spectronaut Pulsar (Biognosys) make specific use of the improved data quality of the latest generation of instruments. Generally, these algorithms generate a cumulative spectral library of a related set of samples and achieve coverage similar to that of DDA runs of unfractionated samples.

Representative sample types for spectral library generation

Here, one has to choose between endogenous sources of peptides from the samples of interest, synthetic analogues of those peptides or recombinant full-length proteins (Fig 2). The majority of SWATH-MS studies to date have used a side-by-side characterization of the samples of interest by DDA analysis for generation of spectral libraries.
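
To make the PQP table format summarized in Fig 2 more concrete, the following Python sketch models one table row per fragment ion and keeps only the most intense fragments of each precursor. The class and function names are hypothetical, the example peptide and all numeric values are invented placeholders rather than measured data, and the default of six fragments simply follows the "four to six" guideline given earlier; the sketch does not reflect the internals of any of the tools named above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PQPRow:
    """One row of a PQP table: a single fragment ion of one precursor."""
    protein: str
    peptide: str
    precursor_mz: float
    precursor_charge: int
    fragment_mz: float
    fragment_charge: int
    fragment_type: str      # ion annotation, e.g. "y7" or "b4"
    rel_intensity: float    # expected relative fragment ion intensity
    irt: float              # normalized retention time (iRT)


def top_fragments(rows: List[PQPRow], n: int = 6) -> List[PQPRow]:
    """Keep the n most intense fragment ions per (peptide, precursor charge)."""
    grouped: Dict[Tuple[str, int], List[PQPRow]] = defaultdict(list)
    for row in rows:
        grouped[(row.peptide, row.precursor_charge)].append(row)

    selected: List[PQPRow] = []
    for group in grouped.values():
        ranked = sorted(group, key=lambda r: r.rel_intensity, reverse=True)
        selected.extend(ranked[:n])
    return selected


# Placeholder library entries: invented m/z, intensity and iRT values.
placeholder_intensities = [10.0, 100.0, 35.0, 80.0, 5.0, 60.0, 20.0, 90.0]
library = [
    PQPRow(protein="PROT_A", peptide="PEPTIDEK",
           precursor_mz=500.0, precursor_charge=2,
           fragment_mz=300.0 + 50.0 * i, fragment_charge=1,
           fragment_type=f"y{i + 1}", rel_intensity=intensity, irt=42.0)
    for i, intensity in enumerate(placeholder_intensities)
]

pqps = top_fragments(library, n=6)
print([row.fragment_type for row in pqps])
```

Grouping by peptide and precursor charge reflects the point that each charge state of a peptide carries its own set of fragment ions; real pipelines typically apply further criteria, such as those indicated in Fig 2, before a row is accepted into the final PQP table.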
These libraries frequently include a sample fractionation step prior to DDA analysis, which can be beneficial from a sensitivity perspective for post-acquisition peptide queries (Rosenberger et al, 2014; Zi et al, 2014; Selevsek et al, 2015). This is because the sensitivity and coverage of single-shot DDA analysis are frequently lower than those of SWATH-MS data (Bruderer et al, 2015; Kelstrup et al, 2018). Therefore, the SWATH-MS data would not be fully covered by a “single shot” DDA spectral library generation strategy. The strategy focusing on repeated DDA analysis of non-fractionated samples is sometimes favoured because it is straightforward to implement; however, primarily for library completeness and quality reasons, other sources of peptides can be considered.

Chemically synthesized peptides have long been proposed as a source of prior knowledge, in particular with respect to the development of SRM assays (Kuster et al, 2005; Picotti et al, 2010), and large-scale efforts to synthesize and measure peptides for several organisms on a proteome-wide scale have been reported (Picotti et al, 2013; Schubert et al, 2013; Kusebauch et al, 2016; Zolg et al, 2017). This approach has several advantages: (i) all proteins can be represented in the library, irrespective of whether they have been previously observed, (ii) high-quality MS2 spectra and derived PQPs can be generated, because the synthetic peptides can be analysed at very high concentrations and represent ground truth, and (iii) the error rate in the spectral library generated should be close to zero. In some cases, the selection of which peptides to synthesize for a given protein has been driven by prior empirical observation. However, since so far no species has had absolutely complete proteome coverage by DDA methods, sets of proteotypic peptides for each protein have also been computationally predicted (Mallick et al, 2007).
Since computational prediction methods have turned out to be less reliable than expected (Searle et al, 2015), empirical peptide selection is still preferred over computational prediction. More recent efforts to create synthetic proteomes for general purposes in proteomics have extended to a very large scale (> 330,000 human peptides) by synthesizing much larger numbers of peptides per protein and by also including PTMs and common sequence variants (www.proteometools.org; Zolg et al, 2017).

A useful extension of this approach is to create full-length proteins by recombinant methods or in vitro transcription/translation systems. In this way, the most suitable peptides per protein for analysis can be determined empirically (Stergachis et al, 2011; Matsumoto et al, 2017). While those large-scale synthetic peptide and protein MS resources have not yet been fully exploited for SWATH-MS analysis, it seems likely that they will be useful resources going forward. Hybrid libraries consisting of endogenous samples and synthetic peptides to increase coverage may also be an attractive option and have proved useful in the case of an organism-scale library for Mycobacterium tuberculosis (Schubert et al, 2015b).

Comprehensive versus sample-specific spectral libraries

A natural extension of the ideas discussed above is to attempt to characterize peptides by DDA from all proteins in a given species to generate a

Reference(s)