Article · Open access · Peer reviewed

Tandem Mass Spectral Libraries of Peptides in Digests of Individual Proteins: Human Serum Albumin (HSA)

2014; Elsevier BV; Volume: 13; Issue: 9; Language: English

10.1074/mcp.o113.037135

ISSN

1535-9484

Authors

Qian Dong, Xinjian Yan, Lisa E. Kilpatrick, Yuxue Liang, Yuri A. Mirokhin, Jeri S. Roth, Paul A. Rudnick, Stephen E. Stein

Topic(s)

Metabolomics and Mass Spectrometry Studies

Abstract

This work presents a method for creating a mass spectral library containing tandem spectra of identifiable peptide ions in the tryptic digestion of a single protein. Human serum albumin (HSA) was selected for this purpose owing to its ubiquity, high level of characterization and availability of digest data. The underlying experimental data consisted of ∼3000 one-dimensional LC-ESI-MS/MS runs with ion-trap fragmentation. In order to generate a wide range of peptides, studies covered a broad set of instrument and digestion conditions using multiple sources of HSA and trypsin. Computer methods were developed to enable the reliable identification and reference spectrum extraction of all peptide ions identifiable by current sequence search methods. This process made use of both MS2 (tandem) spectra and MS1 (electrospray) data. Identified spectra were generated for 2918 different peptide ions, using a variety of manually validated filters to ensure spectrum quality and identification reliability. The resulting library was composed of 10% conventional tryptic and 29% semitryptic peptide ions, along with 42% tryptic peptide ions with known or unknown modifications, which included both analytical artifacts and post-translational modifications (PTMs) present in the original HSA. The remaining 19% contained unexpected missed cleavages or were under- or over-alkylated. The methods described can be extended to create equivalent spectral libraries for any target protein. Such libraries have a number of applications in addition to their known advantages of speed and sensitivity, including the ready re-identification of known PTMs, rejection of artifact spectra and a means of assessing sample and digestion quality.

Shotgun proteomics is a widely used and evolving method for determining the protein composition of a biological mixture (1. Washburn M.P. Wolters D. Yates 3rd, J.R. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 2001; 19: 242-247; 2. Mallick P. Kuster B. Proteomics: a pragmatic perspective. Nat. Biotechnol.
2010; 28: 695-709; 3. Nagaraj N. Wisniewski J.R. Geiger T. Cox J. Kircher M. Kelso J. Pääbo S. Mann M. Deep proteome and transcriptome mapping of a human cancer cell line. Mol. Syst. Biol. 2011; 7: 548). It most often involves the digestion of denatured proteins by trypsin, followed by the identification of product peptides and the use of this information to infer protein identities and possibly targeted post-translational modifications (PTMs).

Abbreviations used: HSA, human serum albumin; MS1, full MS scan; MS2, tandem MS scan; PTM, post-translational modification; DTT, dithiothreitol; IAA, iodoacetamide; TCEP, tris(2-carboxyethyl)phosphine; TRIS, tris-hydroxymethyl-aminomethane; NIH, National Institutes of Health; NCI, National Cancer Institute; CPTAC, Clinical Proteomic Technology Assessment for Cancer; LTQ, linear trap quadrupole; NIST, National Institute of Standards and Technology; FDR, false discovery rate; MRAB, median relative abundance; PIIF, peptide ion identification frequency; XIC, extracted ion chromatogram; NBR, number of basic residues; PSIG, peptide identification significance.

However, because digestion is a highly complex chemical process, a large proportion of identifiable products are not specifically targeted and are therefore invisible to the analysis. These include unexpected and unwanted peptides that interfere with the analysis. Others may contain modifications of biological origin, which, unless specifically targeted, can be lost among the forest of artifacts (4. Baldwin M. Protein identification by mass spectrometry: issues to be considered. Mol. Cell. Proteomics. 2004; 3: 1-9; 5. Nesvizhskii A.I. Roos F.F. Grossmann J. Vogelzang M. Eddes J.S. Gruissem W. Baginsky S. Aebersold R. Dynamic spectrum quality assessment and iterative computational analysis of shotgun proteomic data: toward more efficient identification of post-translational modifications, sequence polymorphisms, and novel peptides. Mol. Cell. Proteomics. 2005; 5: 652-670; 6. Picotti P. Aebersold R. Domon B. The implications of proteolytic background for shotgun proteomics. Mol. Cell. Proteomics. 2007; 6: 1589-1598).

This paper describes methods for building a tandem mass spectral library capable of characterizing all identifiable peptides in a tryptic digest of a selected protein. Spectral libraries are known to provide an effective way of reusing this spectral information to quickly, reliably, and sensitively determine peptide identities (7. Yates 3rd, J.R. Morgan S.F. Gatlin C.L. Griffin P.R. Eng J.K. Method to compare collision-induced dissociation spectra of peptides: potential for library searching and subtractive analysis. Anal. Chem. 1998; 70: 3557-3565; 8. Frewen B.E. Merrihew G.E. Wu C.C. Noble W.S. MacCoss M.J. Analysis of peptide MS/MS spectra from large-scale proteomics experiments using spectrum libraries. Anal. Chem.
2006; 78: 5678-5684; 9. Craig R. Cortens J.C. Fenyo D. Beavis R.C. Using annotated peptide mass spectrum libraries for protein identification. J. Proteome Res. 2006; 5: 1843-1849; 10. Lam H. Deutsch E.W. Eddes J.S. Eng J.K. King N. Stein S.E. Aebersold R. Development and validation of a spectral library searching method for peptide identification from MS/MS. Proteomics. 2007; 7: 655-667; 11. Lam H. Aebersold R. Building and searching tandem mass (MS/MS) spectral libraries for peptide identification in proteomics. Methods. 2011; 54: 424-431). These identifications can serve several purposes, including 1) ensuring that all previously identified peptides are identified regardless of search engine settings, 2) tagging artifact peptides that might otherwise lead to false positive identifications, 3) ensuring the identification of known and identifiable biological post-translational modifications without explicitly looking for them, and 4) providing a list of artifact peptides for assessing the quality of the sample preparation process.

HSA, human serum albumin, was selected as the target protein for library development partly because of its ubiquity, making up >50% of the total protein in blood (12. Theodore P. All about albumin: biochemistry, genetics, and medical applications. Academic Press, San Diego, California, 1995; 13. Fanali G. Masi A. Trezza V. Marino M. Fasano M. Ascenzi P. Human serum albumin: from bench to bedside. Mol. Aspects Med. 2012; 33: 209-290) and therefore being found in many biological samples, and partly because of the considerable background information available for its digestion products (14. Kratz F. Albumin as a drug carrier: design of prodrugs, drug conjugates, and nanoparticles. J. Control. Release.
2008; 132: 171-183; 15. Barber M.D. Ross J.A. Fearon K.C. Changes in nutritional, functional, and inflammatory markers in advanced pancreatic cancer. Nutr. Cancer. 1999; 35: 106-110; 16. Koga M. Kasayama S. Clinical impact of glycated albumin as another glycemic control marker. Endocrine J. 2010; 57: 751-762; 17. Roohk H.V. Zaidi A.R. A review of glycated albumin as an intermediate glycation index for controlling diabetes. J. Diabet. Sci. Technol. 2008; 2: 1114-1121; 18. Gundry R. Fu Q. Jelinek C. Van Eyk J.E. Cotter R. Investigation of an albumin-enriched fraction of human serum and its albuminome. Proteomics Clin. Appl. 2007; 1: 73-88; 19. Bar-Or D. Rael L.T. Bar-Or R. Slone D.S. Craun M.L. Case report: The formation and rapid clearance of a truncated albumin species in a critically ill patient. Clin. Chim. Acta. 2006; 365: 346-349). However, despite the long-standing interest in this protein (20. Mingetti P.P. Ruffner D.E. Kuang W.J. Dennison O.E. Hawkins J.W. Beattie W.G. Dugaiczyk A. Molecular structure of the human albumin gene is revealed by nucleotide sequence within q11–22 of chromosome 4. J. Biol. Chem. 1986; 261: 6747-6757; 21. Kobayashi K. Summary of recombinant human serum albumin development. Biologicals. 2006; 34: 55-59), a thorough determination of its digestion products has not been reported. HSA is composed of 585 amino acids and yields a wide range of tryptic peptides, including many with missed or irregular cleavages and a variety of both native and analytical modifications.
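
The peptide classes named above (conventional tryptic, missed-cleavage, and semitryptic) can be illustrated with a minimal Python sketch. It is not the authors' software. It assumes the simplest cleavage rule (trypsin cuts C-terminal to every K or R, with no proline exception), and it uses an illustrative 20-residue sequence rather than the full 585-residue HSA sequence. The function names `cleavage_sites`, `tryptic_peptides`, and `semitryptic_variants` and the `min_len` cutoff are hypothetical.

```python
def cleavage_sites(seq):
    """Positions after which trypsin is assumed to cut (C-terminal to K or R)."""
    return [i + 1 for i, aa in enumerate(seq[:-1]) if aa in "KR"]


def tryptic_peptides(seq, max_missed=0):
    """Return (peptide, missed_cleavages) pairs for fully tryptic peptides."""
    bounds = [0] + cleavage_sites(seq) + [len(seq)]
    peptides = []
    for i in range(len(bounds) - 1):
        # j walks over the allowed end boundaries: 0..max_missed skipped sites
        for j in range(i + 1, min(i + 2 + max_missed, len(bounds))):
            peptides.append((seq[bounds[i]:bounds[j]], j - i - 1))
    return peptides


def semitryptic_variants(peptide, min_len=5):
    """Truncations of a zero-missed-cleavage tryptic peptide.

    Removing residues from one end leaves exactly one tryptic terminus,
    because such a peptide has no internal K or R.
    """
    n_truncated = [peptide[k:] for k in range(1, len(peptide) - min_len + 1)]
    c_truncated = [peptide[:k] for k in range(min_len, len(peptide))]
    return n_truncated + c_truncated


if __name__ == "__main__":
    example = "DAHKSEVAHRFKDLGEENFK"  # illustrative sequence, not the full protein
    for pep, missed in tryptic_peptides(example, max_missed=1):
        print(missed, pep)
    print(semitryptic_variants("DLGEENFK"))
```

Even for this short sequence, allowing one missed cleavage raises the count from four to seven fully tryptic peptides, and each peptide adds several semitryptic variants. This growth of the candidate list is what makes a single-protein search space large.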
At first sight, the analysis of just one protein may appear straightforward, because it is common practice in proteomics to search for thousands of proteins in a biological sample. However, a thorough analytical characterization of HSA peptide ions requires a very different method of analysis. It needs to deal with the wide diversity of digestion products, many of which cannot be predicted in advance and whose relative concentrations are likely to depend on complex chemical processes that cannot be fully controlled. Products include peptides with missed and irregular cleavages, under- or over-alkylation, unexpectedly high and low charge states, and an uncertain number of modifications, including unknown modifications (i.e., so-called blind modifications (22. Chen Y. Chen W. Cobb M.H. Zhao Y. PTMap: A sequence alignment software for unrestricted, accurate, and full-spectrum identification of post-translational modification sites. Proc. Natl. Acad. Sci. U.S.A. 2009; 106: 761-766; 23. Tanner S. Payne S.H. Dasari S. Shen Z. Wilmarth P.A. David L.L. Loomis W.F. Briggs S.P. Bafna V. Accurate annotation of peptide modifications through unrestrictive database search. J. Proteome Res. 2008; 7: 170-181)). Furthermore, the process of identifying such peptides is prone to misidentification by accidental "homologies" (two different peptides yielding an overlapping set of y/b ions). Including these variant peptides leads to a dramatic increase in the number of both true and false HSA peptide identifications compared with those of the commonly sought tryptic peptides (24. Wa C. Cerny R. Hage D.S. Obtaining high sequence coverage in matrix-assisted laser desorption time-of-flight mass spectrometry for studies of protein modification: analysis of human serum albumin as a model. Anal. Biochem. 2006; 349: 229-241; 25. Aldini G.
Gamberoni L. Orioli M. Beretta G. Regazzoni L. Maffei F.R. Carini M. Mass spectrometric characterization of covalent modification of human serum albumin by 4-hydroxy-trans-2-nonenal. J. Mass Spectrom. 2006; 41: 1149-1161) at a given score threshold. This paper describes a series of methods designed to first produce all possible identifications and then to reject false identifications using a variety of filters to generate a reliable and comprehensive library of reference spectra for a single protein.

Most of the mass spectral data used for building the HSA library came from 2035 LTQ runs and 522 LTQ/Orbitrap runs (Thermo Fisher Scientific, San Jose, CA, see Disclaimer). Many of these were generated for two studies examining digestion variability (26. Lowenthal M.S. Liang Y. Phinney K.W. Stein S.E. Quantitative bottom-up proteomics depends on digestion conditions. Anal. Chem. 2013; 1: 551-558; 27. Walmsley S.J. Rudnick P.A. Liang L. Dong Q. Stein S.E. Nesvizhskii A.I. Comprehensive analysis of protein digestion using six trypsins reveals the origin of trypsin as a significant source of variability in proteomics. J. Proteome Res. 2013; 12: 5666-5680). These served to generate peptides over a wide range of conditions and HSA sources, including 12 HSA samples from five vendors, eight sources of trypsin, and a range of denaturing/digestion conditions. High temperature (90 °C) and urea (6 M) were the most commonly used denaturing conditions. Most commonly, dithiothreitol (DTT) was the reducing agent, iodoacetamide (IAA) the alkylation agent and tris-hydroxymethyl-aminomethane (TRIS) the buffer. Concentrations of these were varied, as were those of HSA and trypsin. Other runs employed organic denaturants or no denaturant, cleavable surfactants, tris(2-carboxyethyl)phosphine (TCEP) as a reducing agent, and widely varying digestion times (5 min to 2 days).
Also included were 355 runs of digests of a plasma-like protein mix from the NIH/NCI-supported Clinical Proteomic Technology Assessment for Cancer (CPTAC) program (http://proteomics.cancer.gov/programs/CPTAC/), comprising 200 LTQ and 155 LTQ/Orbitrap runs (28. Tabb D.L. Vega-Montoto L. Rudnick P.A. Variyath A.M. Ham A.J. Bunk D.M. Kilpatrick L.E. Billheimer D.D. Blackman R.K. Cardasis H.L. Carr S.A. Clauser K.R. Jaffe J.D. Kowalski K.A. Neubert T.A. Regnier F.E. Schilling B. Tegeler T.J. Wang M. Wang P. Whiteaker J.R. Zimmerman L.J. Fisher S.J. Gibson B.W. Kinsinger C.R. Mesri M. Rodriguez H. Stein S.E. Tempst P. Paulovich A.G. Liebler D.C. Spiegelman C. Repeatability and reproducibility in proteomic identifications by liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2010; 9: 761-776; 29. Paulovich A.G. Billheimer D. Ham A.J. Vega-Montoto L. Rudnick P.A. Tabb D.L. Wang P. Blackman R.K. Bunk D.M. Cardasis H.L. Clauser K.R. Kinsinger C.R. Schilling B. Tegeler T.J. Variyath A.M. Wang M. Whiteaker J.R. Zimmerman L.J. Fenyo D. Carr S.A. Fisher S.J. Gibson B.W. Mesri M. Neubert T.A. Regnier F.E. Rodriguez H. Spiegelman C. Stein S.E. Tempst P. Liebler D.C. Interlaboratory study characterizing a yeast performance standard for benchmarking LC-MS platform performance. Mol. Cell. Proteomics. 2010; 9: 242-254; 30. Rudnick P.A. Clauser K.R. Kilpatrick L.E. Tchekhovskoi D.V. Neta P. Blonder N. Billheimer D.D. Blackman R.K. Bunk D.M. Cardasis H.L. Ham A.J. Jaffe J.D. Kinsinger C.R. Mesri M. Neubert T.A. Schilling B. Tabb D.L. Tegeler T.J. Vega-Montoto L. Variyath A.M. Wang M. Wang P. Whiteaker J.R. Zimmerman L.J. Carr S.A. Fisher S.J. Gibson B.W. Paulovich A.G. Regnier F.E. Rodriguez H. Spiegelman C. Tempst P. Liebler D.C. Stein S.E. Performance metrics for liquid chromatography-tandem mass spectrometry systems in proteomics analyses. Mol. Cell. Proteomics.
2010; 9: 225-241). Some 122 spectra from the NIST Human library were also included (described later).

The method developed for building this single-protein spectral library was derived from the methods currently used for building the NIST tandem mass spectral libraries of tryptic peptides from digests of biological protein samples (31. Stein S.E. Rudnick P.A. (Eds.) NIST peptide tandem mass spectral libraries. Human peptide mass spectral reference data, H. sapiens, ion trap, Official Build Date: Feb. 4, 2009. National Institute of Standards and Technology, Gaithersburg, MD, 20899. Downloaded from http://peptide.nist.gov on October 17, 2012; 32. Loevenich S.N. Brunner E. King N.L. Deutsch E.W. Stein S.E. Aebersold R. Hafen E. The Drosophila melanogaster PeptideAtlas facilitates the use of peptide data for improved fly proteomics and genome annotation. BMC Bioinformatics. 2009; 11: 10-59). As in that earlier work, initial identifications were made from ion-trap fragmentation spectra derived from tryptic digests using four sequence search engines (OMSSA (33. Geer L.Y. Markey S.P. Kowalak J.A. Wagner L. Xu M. Maynard D.M. Yang X. Shi W. Bryant S.H. Open mass spectrometry search algorithm. J. Proteome Res. 2004; 3: 958-964), X!Tandem (34. Craig R. Beavis R.C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004; 20: 1466-1467), Comet (35. Keller A. Eng J. Zhang N. Li X. Aebersold R. A uniform proteomics MS/MS analysis platform utilizing open XML file formats. Mol. Syst. Biol. 2005; 1: 2005-2017), and ProteinProspector (36. Clauser K.R. Baker P. Burlingame A.L. Role of accurate mass measurement (+/−10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal. Chem.
1999; 71: 2871-2882)), but used a FASTA file containing only the HSA sequence (see Supplemental Table S1) and its reverse. It was found that to reliably identify both long, highly charged peptides and peptides containing a wide range of modifications, two separate sets of searches were necessary. Otherwise, incorrect high-scoring semitryptic peptides with unusual modifications could overwhelm correct identifications of conventional tryptic peptides, especially those with multiple missed cleavages. The first search allowed up to two missed cleavages and four charge states as well as one nontryptic terminus (semitryptic) and included a list of 22 categories of HSA-targeted modifications (16 in Table IV and 6 in Table V). The second search allowed up to four missed cleavage sites and six charge states, did not allow semitryptic peptides, and permitted only common modifications (variable cysteine alkylation, methionine oxidation, ammonia loss from N-terminal Gln and carbamidomethyl-Cys, and water loss from N-terminal Glu). Results of these searches were merged. To find unidentified modifications, two additional search engines, namely InsPecT (37. Tanner S. Shu H.J. Frank A. Wang L.C. Zandi E. Mumby M. Pevzner P.A. Bafna V. InsPecT: identification of posttranslationally modified peptides from tandem mass spectra. Anal. Chem. 2005; 77: 4626-4639) and TagRecon (38. Dasari S. Chambers M.C. Slebos R.J. Zimmerman L.J. Ham A.J.L. Tabb D.L. TagRecon: high-throughput mutation identification through sequence tagging. J. Proteome Res. 2010; 9: 1716-1726), served to identify single, untargeted modifications with mass shifts at specific residues between −300 and 300 Da. The list of the 22 specified modifications just described was partly built by examining and assigning some of these identifications.
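
As a rough illustration of how a mass shift reported by a blind search might be assigned to a known modification, the following Python sketch compares an observed shift with a handful of delta masses copied from Table IV. It is a sketch under stated assumptions, not the procedure the authors used. The function name `candidate_mods` and the 0.01 Da tolerance are hypothetical, and a tolerance that tight would only be meaningful for accurate-mass precursor data.

```python
# (label, delta mass in Da, modified site) for a subset of the Table IV rows
KNOWN_MODS = [
    ("Oxidation", 15.9949, "M, H, W"),
    ("Carbamyl", 43.0058, "N-terminus, K, T, M"),
    ("Formyl", 27.9949, "N-terminus, K, S, T"),
    ("Cation:Na", 21.9819, "D, E"),
    ("Dehydrated", -18.0106, "D, S, T"),
    ("Gln->pyro-Glu", -17.0265, "Q at N-terminus"),
    ("Methyl", 14.0157, "K, H"),
    ("Glu->pyro-Glu", -18.0106, "E at N-terminus"),
    ("Deamidated", 0.9840, "N, Q"),
]


def candidate_mods(observed_shift, tol=0.01):
    """Return (label, site) pairs whose delta mass lies within tol of the shift.

    Shifts outside the -300..300 Da window used for the blind searches are
    ignored. The tolerance is an illustrative assumption.
    """
    if not -300.0 <= observed_shift <= 300.0:
        return []
    return [
        (label, site)
        for label, delta, site in KNOWN_MODS
        if abs(observed_shift - delta) <= tol
    ]


if __name__ == "__main__":
    print(candidate_mods(15.995))
    print(candidate_mods(-18.011))
```

A shift near −18.011 Da returns two candidates, Dehydrated and Glu->pyro-Glu, because the two share the same delta mass. Only the modified site distinguishes them, so the mass shift alone cannot settle the assignment.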
Parent and fragment tolerances of 0.2 m/z and 0.8 m/z, respectively, were used at this stage.

Table IV. Sixteen categories of modifications sorted by percent of total ions

| Modification label | Delta mass | Modified site | Modified ions | % Ions | % Total MRAB |
|---|---|---|---|---|---|
| Oxidation | +15.9949 | M, H, W | 145 | 5.0 | 1.27 |
| Carbamyl | +43.0058 | N-terminus, K, T, M | 121 | 4.2 | 2.26 [a] |
| Formyl | +27.9949 | N-terminus, K, S, T | 112 | 3.8 | 0.54 |
| Cation:Na | +21.9819 | D, E | 89 | 3.1 | 0.50 |
| Cation:Fe[II] | +53.9193 | E … [b] | 77 | 2.6 | 0.91 |
| Cation:Ca[II] | +37.9469 | E … [b] | 58 | 2.0 | 1.71 |
| Dehydrated | −18.0106 | D, S, T | 54 | 1.9 | 0.37 |
| Arg [c] | +156.1011 | N- or C-terminus | 45 | 1.5 | 0.08 |
| Lys [c] | +128.0950 | N- or C-terminus | 8 | 0.3 | 0.01 |
| Gln->pyro-Glu | −17.0265 | Q at N-terminus | 46 | 1.6 | 1.73 |
| Methyl | +14.0157 | K, H | 43 | 1.5 | 0.21 |
| Pyro-carbamidomethyl | +39.9949 | C at N-terminus | 31 | 1.1 | 1.64 |
| Glu->pyro-Glu | −18.0106 | E at N-terminus | 19 | 0.7 | 0.06 |
| Deamidated | +0.9840 | N, Q | 4 | 0.1 | 0.01 |
| Vicidisulfide [d] | −2.0157 | C-C | 3 | 0.1 | 0.01 |
| Dioxidation | +31.9898 | W | 2 | 0.1 | 0.004 |
| Delta:H(2)C(2) [e] | +26.0157 | N-terminus | 1 | 0.03 | 0.0003 |

[a] Includes only runs with urea as denaturant.
[b] Our data revealed that the Fe and Ca adducts can also be attached to many other residues, such as L, G, S, T, P, V.
[c] Addition of arginine or lysine on the N- or C-terminus due to transpeptidation catalyzed by trypsin.
[d] Vicinal disulfide: internal disulfide observed on several adjacent HSA cysteines. These were only observed in runs without a reducing agent.
[e] Formation of Schiff base on N-terminus; see Reference 60 (Chalkley R.J. Baker P.R. Medzihradszky K.F. Lynn A.J. Burlingame A.L. In-depth analysis of tandem mass spectrometry data from disparate instrument types. Mol. Cell. Proteomics. 2008; 7: 2386-2398).

Table I. Five quality filters and one flag used for quality assessment of HSA peptide ion spectra

| Filter | Data type | Description | Rejection threshold |
|---|---|---|---|
| 1. Ion significance | MS1 | Median relative abundance (MRAB) and peptide ion identification frequency (PIIF) | MRAB = 0 or PIIF ≤ 0.01 |
| 2. m/z error | MS1 | Actual and absolute median m/z deviation | ≥0.25 m/z for LTQ; ≥5 ppm for Orbitrap |
| 3. Unidentified fragment ions | MS2 | Unassigned abundance (subfilter 1); unassigned abundance and numbers of peaks (subfilter 2) | Subfilter 1 ≥ 0.32; subfilter 2 ≥ 0.36 |
| 4. Insufficient ions above the precursor m/z | MS2 | Fraction of the largest 20 fragment ions above precursor m/z | ≤0.2 for charge 2, ≤0.3 for charge 3, or ≤0.36 for charge states higher than 3 |
| 5. Principal charge state | Peptide charge assignment | Number of basic residues (NBR) and charge state (CS) | NBR − CS > 0 |

| Flag | Data type | Description | Flagging threshold |
|---|---|---|---|
| 1. Gaps in charge state distribution | Peptide charge assignment | Charge states of a given peptide | Gap in the charge states |

Table V. Six categories of posttranslational modifications (PTMs) identified in HSA. Sites of modification are marked by the parenthesized label after the modified residue. MRAB, median relative abundance; PIIF, peptide ion identification frequency; Cysteinyl, cysteinylation; Cys34 oxidation adducts, +2O, +3O, or +O and −2H; Acetyl, acetylation; Hex, glycation; Phospho, phosphorylation.

| Category | PTM | m/z | z | Peptide sequence | Modified site | Delta mass | MRAB | PIIF [a] |
|---|---|---|---|---|---|---|---|---|
| 1 | Cysteinylation | 1276.638 | 2 | ALVLIAFAQYLQQC(Cysteinyl)PFEDHVK | Cys34 | 119.00 | 0.0119 | 0.25 |
| 1 | Cysteinylation | 851.428 | 3 | ALVLIAFAQYLQQC(Cysteinyl)PFEDHVK | Cys34 | 119.00 | 0.2585 | 0.58 |
| 1 | Cysteinylation | 638.823 | 4 | ALVLIAFAQYLQQC(Cysteinyl)PFEDHVK | Cys34 | 119.00 | 0.1753 | 0.46 |
| 1 | Cysteinylation | 871.929 | 4 | DLGEENFKALVLIAFAQYLQQC(Cysteinyl)PFEDHVK | Cys34 | 119.00 | 0.0245 | 0.46 |
| 2 [b] | Sulfinic acid | 822.427 | 3 | ALVLIAFAQYLQQC(+2O)PFEDHVK | Cys34 | 31.99 | 0.0066 | 0.17 |
| 2 [b] | Sulfonic acid | 1241.136 | 2 | ALVLIAFAQYLQQC(+3O)PFEDHVK | Cys34 | 47.99 | 0.0009 | 0.17 |
| 2 [b] | Sulfonic acid | 827.767 | 3 | ALVLIAFAQYLQQC(+3O)PFEDHVK | Cys34 | 47.99 | 0.0050 | 0.12 |
| 2 [b] | Sulfinamide | 816.760 | 3 | ALVLIAFAQYLQQC(+O,-2H)PFEDHVK | Cys34 | 13.98 | 0.0002 | 0.01 |
| 3 | Truncation | 963.512 | 1 | (-DA)HKSEVAHR | N-term | −186.06 | 0.0003 | 0.05 |
| 3 | Truncation | 482.260 | 2 | (-DA)HKSEVAHR | N-term | −186.06 | 0.0094 | 0.07 |
| 3 | Truncation | 900.515 | 1 | LVAASQAALG(-L) | C-term | −113.08 | 0.0171 | 0.62 |
| 3 | Truncation | 450.762 | 2 | LVAASQAALG(-L) | C-term | −113.08 | 0.0205 | 0.44 |
| 4 | Glycation | 931.082 | 3 | LVNEVTEFAK(Hex)TCVADESAENCDK | Lys51 | 162.05 | 0.0022 | 0.29 |
| 4 | Glycation | 605.304 | 3 | AEFAEVSK(Hex)LVTDLTK | Lys234 | 162.05 | 0.0122 | 0.53 |
| 4 | Glycation | 736.388 | 3 | VFDEFK(Hex)PLVEEPQNLIK | Lys378 | 162.05 | 0.0016 | 0.26 |
| 4 | Glycation | 430.923 | 3 | K(Hex)QTALVELVK | Lys525 | 162.05 | 0.0117 | 0.39 |
| 5 | Acetylation | 989.545 | 1 | LK(Acetyl)CASLQK [c] | Lys199 | 42.01 | 0.0001 | 0.01 |
| 5 | Acetylation | 495.277 | 2 | LK(Acetyl)CASLQK [c] | Lys199 | 42.01 | 0.0121 | 0.82 |
| 5 | Acetylation | 585.859 | 2 | K(Acetyl)QTALVELVK | Lys525 | 42.01 | 0.0003 | 0.07 |
| 6 | Phosphorylation | 789.776 | 2 | TCVADES(Phospho)AENCDK [c] | Ser58 | 79.97 | 0.0001 | 0.03 |
| 6 | Phosphorylation | 860.456 | 2 | KVPQVST(Phospho)PTLVEVSR | Thr420 | 79.97 | 0.0005 | 0.15 |
| 6 | Phosphorylation | 573.973 | 3 | KVPQVST(Phospho)PTLVEVSR | Thr420 | 79.97 | 0.0008 | 0.16 |

[a] PIIF was calculated a) for cysteinylation using 24 LTQ non-reducing runs, b) for Cys34 oxidation, N- or C-terminal truncation, glycation, and acetylation using 350 LTQ-Orbitrap runs, and c) for phosphorylation using 170 LTQ-Orbitrap runs in CPTAC studies (26–27).
[b] Category 2, Cys34 oxidation, has three oxidized forms (sulfinic acid, sulfonic acid, and sulfinamide).
[c] All cysteines in categories 4–6 are alkylated.

Scores from each of the search engines were normalized using results of searching a combined HSA forward and reversed sequence database. This method refined scores using fractions of unassigned fragment abundances and peptide classes. Tentative identifications were determined based upon a formal 5% false discovery rate (FDR) using a target-decoy approach (39. Elias J.E. Gygi S.P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods. 2007; 4: 207-214). Owing to the large variety of peptides allowed, even this single protein generated sufficient decoy hits to allow setting a statistically meaningful FDR. Manual examination showed that the computed score threshold was sufficiently low not to miss any of the conventional peptides expected to be generated in HSA digestion. Note also that the actual FDR was far higher than 5% because of the wide search space employed and the consequent generation of many false "homologous" peptide identifications.

The wide peptide search space generated a large number of incorrect identifications at search scores appropriate for reliable identification of conventional tryptic peptides. Ideally, scores would depend on the "prior probability" (40. Stein S.
Mass spectral reference libraries: an ever-expanding resource for chemical identification. Anal. Chem. 2012; 84: 7274-7282) that a particular variety of peptide ion would be present in the digest; of course, this is not done by present methods. Rejection of these unusual and less predictable peptides requires post-processing analyses. To some degree, this was done by adjusting scores of certain classes of peptides (31. Stein S.E. Rudnick P.A. (Eds.) NIST peptide tandem mass spectral libraries. Human peptide mass spectral reference data, H. sapiens, ion trap, Official Build Date: Feb. 4, 2009. National Institute of Standards and Technology, Gaithersburg, MD, 20899. Downloaded from http://peptide.nist.gov on October 17, 2012; 32. Loevenich S.N. Brunner E. King N.L. Deutsch E.W. Stein S.E. Aebersold R. Hafen E. The Drosophila melanogaster PeptideAtlas facilitates the use of peptide data for improved fly proteomics and genome annotation. BMC Bioinformatics. 2009; 11: 10-59), but this was found to

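The target-decoy step cited above (reference 39) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that the FDR at a given score is estimated as the number of decoy (reversed-sequence) hits divided by the number of target hits at or above that score, which is one common estimator; the text does not state the exact formula used. The function name `score_threshold_at_fdr` and the example scores are hypothetical, and tied scores are not treated specially.

```python
def score_threshold_at_fdr(hits, max_fdr=0.05):
    """Lowest target score at which the estimated FDR stays within max_fdr.

    hits: iterable of (score, is_decoy) pairs, higher score = better match.
    The FDR estimate is decoys / targets among hits at or above the score
    (an assumed estimator). Returns None if no threshold qualifies.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = 0
    decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
            continue
        targets += 1
        if decoys / targets <= max_fdr:
            best = score  # keep extending the accepted list downward
    return best


if __name__ == "__main__":
    # Hypothetical scores: 40 high-scoring targets, then targets and decoys mixed
    hits = [(float(s), False) for s in range(100, 60, -1)]
    hits += [(60.5, True), (60.0, False), (59.0, False),
             (58.5, True), (58.0, True), (57.0, False)]
    print(score_threshold_at_fdr(hits, max_fdr=0.05))
```

With the hypothetical scores shown, raising `max_fdr` to 0.10 lowers the threshold to include the last target hit. This is the trade-off between coverage and false identifications that the later quality filters are meant to address.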
Reference(s)