Article Open access Peer-reviewed

The DegraBase: A Database of Proteolysis in Healthy and Apoptotic Human Cells

2012; Elsevier BV; Volume: 12; Issue: 3 Language: English

10.1074/mcp.o112.024372

ISSN

1535-9484

Authors

Emily Crawford, Julia E. Seaman, Nicholas J. Agard, Gerald W. Hsu, Olivier Julien, Sami Mahrus, Huy Cong Xuan Nguyen, Kazutaka Shimbo, Hikari A. I. Yoshihara, Min Zhuang, Robert J. Chalkley, James A. Wells

Topic(s)

Protease and Inhibitor Mechanisms

Abstract

Proteolysis is a critical post-translational modification for regulation of cellular processes. Our lab has previously developed a technique for specifically labeling unmodified protein N termini, the α-aminome, using the engineered enzyme, subtiligase. Here we present a database, called the DegraBase (http://wellslab.ucsf.edu/degrabase/), which compiles 8090 unique N termini from 3206 proteins directly identified in subtiligase-based positive enrichment mass spectrometry experiments in healthy and apoptotic human cell lines. We include both previously published and unpublished data in our analysis, resulting in a total of 2144 unique α-amines identified in healthy cells, and 6990 in cells undergoing apoptosis. The N termini derive from three general categories of proteolysis with respect to cleavage location and functional role: translational N-terminal methionine processing (∼10% of total proteolysis), sites close to the translational N terminus that likely represent removal of transit or signal peptides (∼25% of total), and finally, other endoproteolytic cuts (∼65% of total). Induction of apoptosis causes relatively little change in the first two proteolytic categories, but dramatic changes are seen in endoproteolysis. For example, we observed 1706 putative apoptotic caspase cuts, more than double the total annotated sites in the CASBAH and MEROPS databases. In the endoproteolysis category, there are a total of nearly 3000 noncaspase nontryptic cleavages that are not currently reported in the MEROPS database. These studies significantly increase the annotation for all categories of proteolysis in human cells and allow public access for investigators to explore interesting proteolytic events in healthy and apoptotic human cells.

Annotation of the human α-aminome, the full set of unmodified protein N termini, can provide a wealth of information regarding protein turnover, protein trafficking, and protease activity (1. Arnesen T. Towards a functional understanding of protein N-terminal acetylation. PLoS Biol. 
2011; 9: e1001074). The vast majority of protein N termini in eukaryotic cells are cotranslationally blocked by acetylation through the action of N-acetyl transferases (2. Starheim K.K. Gevaert K. Arnesen T. Protein N-terminal acetyltransferases: when the start matters. Trends Biochem. Sci. 2012; 37: 152-161). Free α-amines occur on some proteins that are never N-terminally acetylated, and can also be regenerated by signal or transit peptide removal during protein trafficking, and endo- or exoproteolysis during protein maturation and signaling. Thus, there has been considerable effort to develop unbiased proteomic methods to characterize the α-aminome in healthy and diseased states (3. van den Berg B.H. Tholey A. Mass spectrometry-based proteomics strategies for protease cleavage site identification. Proteomics. 2012; 12: 516-529; 4. Staes A. Impens F. Van Damme P. Ruttens B. Goethals M. Demol H. Timmerman E. Vandekerckhove J. Gevaert K. Selecting protein N-terminal peptides by combined fractional diagonal chromatography. Nat. Protoc. 2011; 6: 1130-1141; 5. Impens F. Colaert N. Helsens K. Plasman K. Van Damme P. Vandekerckhove J. Gevaert K. MS-driven protease substrate degradomics. Proteomics. 2010; 10: 1284-1296; 6. auf dem Keller U. Schilling O. Proteomic techniques and activity-based probes for the system-wide study of proteolysis. Biochimie. 2010; 92: 1705-1714; 7. Drag M. Bogyo M. Ellman J.A. Salvesen G.S. Aminopeptidase fingerprints, an integrated approach for identification of good substrates and optimal inhibitors. J. Biol. Chem. 2010; 285: 3310-3318; 8. Mahrus S. Trinidad J.C. Barkan D.T. Sali A. Burlingame A.L. Wells J.A. 
Global sequencing of proteolytic cleavage sites in apoptosis by specific labeling of protein N termini. Cell. 2008; 134: 866-876). We have developed a positive enrichment method in which the α-amines of intracellular (8) or extracellular proteins (9. Wildes D. Wells J.A. Sampling the N-terminal proteome of human blood. Proc. Natl. Acad. Sci. U.S.A. 2010; 107: 4561-4566) can be specifically and directly tagged and captured, without pretreatment or protection, using subtiligase, an engineered peptide ligase (Fig. 1A) (10. Chang T.K. Jackson D.Y. Burnier J.P. Wells J.A. Subtiligase: a tool for semisynthesis of proteins. Proc. Natl. Acad. Sci. U.S.A. 1994; 91: 12544-12548; 11. Jackson D.Y. Burnier J. Quan C. Stanley M. Tom J. Wells J.A. A designed peptide ligase for total synthesis of ribonuclease A with unnatural catalytic residues. Science. 1994; 266: 243-247). Following purification, tryptic digestion, and LC-MS/MS, the protein sequence and exact site of proteolysis are readily identified. We have applied this approach to study proteolysis by caspases, which are cysteine-class, aspartate-specific proteases, during cellular apoptosis (8; 12. Agard N.J. Mahrus S. Trinidad J.C. Lynn A. Burlingame A.L. Wells J.A. 
Global kinetic analysis of proteolysis via quantitative targeted proteomics. Proc. Natl. Acad. Sci. U.S.A. 2012; 109: 1913-1918; 13. Shimbo K. Hsu G.W. Nguyen H. Mahrus S. Trinidad J.C. Burlingame A.L. Wells J.A. Quantitative profiling of caspase-cleaved substrates reveals different drug-induced and cell-type patterns in apoptosis. Proc. Natl. Acad. Sci. U.S.A. 2012; 109: 12432-12437; 14. Crawford E.D. Seaman J.E. Barber 2nd, A.E. David D.C. Babbitt P.C. Burlingame A.L. Wells J.A. Conservation of caspase substrates across metazoans suggests hierarchical importance of signaling pathways over specific targets and cleavage site motifs in apoptosis. Cell Death Differ. 2012; 19: 2040-2048), and inflammatory response (15. Agard N.J. Maltby D. Wells J.A. Inflammatory stimuli regulate caspase substrate profiles. Mol. Cell Proteomics. 2010; 9: 880-893). These studies, in a variety of cell types and apoptotic inducers, have revealed much about the targets, substrate recognition, timing, logic, and evolution of caspase cleavage events. These efforts have generated a huge amount of data that requires systematic compilation, organization, and normalization so that it can be shared and queried easily by all investigators and compared with other databases describing proteolytic events (16. Luthi A.U. Martin S.J. The CASBAH: a searchable database of caspase substrates. Cell Death Differ. 2007; 14: 641-650; 17. Rawlings N.D. Barrett A.J. Bateman A. MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2012; 40: D343-350; 18. Lange P.F. Huesgen P.F. Overall C.M. 
TopFIND 2.0–linking protein termini with proteolytic processing and modifications altering protein function. Nucleic Acids Res. 2012; 40: D351-361). Here we present the results of both previously published and new experiments that detect α-amines in both untreated and apoptotic human cells. These studies reveal new translational N-terminal processing, signal and transit peptide removal, and other proteolytic events associated with normal protein maturation and function in healthy cells. Comparing these data to the apoptotic dataset reveals that the greatest changes in apoptosis are caused by endoproteolysis, owing to the induction of caspases as well as other proteases. We find a total of 1706 putative caspase sites in nearly 1300 different human proteins. We further find an additional 2900 noncaspase, nontryptic, nontransit, and nonsignal peptide cleavage sites in 1415 proteins. In addition to the analyses described here, we provide a publicly available database, the DegraBase, that is dynamic, expandable, searchable, and readily accessible (http://wellslab.ucsf.edu/degrabase/). With this database, investigators can query all 8090 unique α-amines detected with high confidence from 26,043 peptide observations in both previously published (8; 12; 13. Shimbo K. Hsu G.W. Nguyen H. Mahrus S. Trinidad J.C. Burlingame A.L. Wells J.A. 
Quantitative profiling of caspase-cleaved substrates reveals different drug-induced and cell-type patterns in apoptosis. Proc. Natl. Acad. Sci. U.S.A. 2012; 109: 12432-12437) and new subtiligase α-aminome labeling experiments. The DegraBase substantially expands annotated intracellular proteolytic events in healthy and apoptotic cells.

Jurkat, THP-1, DB, RPMI 8226, MM1-S and U266 human cell lines were acquired from the American Type Culture Collection (ATCC, Manassas, VA) and were cultured under the recommended conditions. When cells reached a density of 1 × 10⁶ cells/ml, an apoptotic inducer (doxorubicin, etoposide, bortezomib, FasL, CD95, staurosporine, or TRAIL) was added from 1000x stock (for individual experimental details, see supplemental Table 1A). Cell viability and caspase activity were monitored by CellTiter-Glo, Caspase-Glo (Promega, Madison, WI) and Ac-DEVD-AFC activity assays. Cells were harvested by centrifugation after 0–40 h, washed with phosphate buffered saline solution, pelleted, and stored at –80°C. For untreated experiments, healthy cells cultured under the same conditions were harvested without any inducer added. The lysis and N-terminal labeling were performed as described previously (8; 12; 13. Shimbo K. Hsu G.W. Nguyen H. Mahrus S. Trinidad J.C. Burlingame A.L. Wells J.A. Quantitative profiling of caspase-cleaved substrates reveals different drug-induced and cell-type patterns in apoptosis. Proc. Natl. Acad. 
Sci. U.S.A. 2012; 109: 12432-12437; 15). For experiments not previously published, the following protocol was used. Cells were lysed in a bicine buffer with Triton or SDS containing the protease inhibitors EDTA, PMSF, E-64, z-VAD-fmk, and AEBSF. Proteins were reduced with 2 mM tris(2-carboxyethyl)phosphine hydrochloride at 90°C for 15 min and, once cooled, alkylated with 4 mM iodoacetamide in the dark for 1 h, then quenched with 10 mM dithiothreitol. Labeling was performed with 1 mM of a biotinylated synthesized peptide ester called TEVest and 1 μM subtiligase for at least an hour at room temperature (11; 19. Yoshihara H.A. Mahrus S. Wells J.A. Tags for labeling protein N-termini with subtiligase for proteomics. Bioorg. Med. Chem. Lett. 2008; 18: 6000-6003). Four different TEVest peptide esters were used to facilitate the identification of the labeled products; they were identical except for the small tag left after processing to aid in mass spectrometry recognition: serine-tyrosine (SY), glycine-tyrosine (GY), phenylalanine (Phe), or 2-aminobutyric acid (Abu). The biotinylated proteins were separated by gel filtration or precipitation and captured on NeutrAvidin agarose beads (Pierce, Rockford, IL). The samples were digested with sequencing grade modified trypsin (Promega, Madison, WI) before or after capture. After capture, the labeled peptides were released with recombinant TEV protease and collected. 
Samples were desalted by chromatography with C18 ZipTip Pipette Tips (Millipore, Billerica, MA) or C18 high-performance liquid chromatography (HPLC) (Waters, Milford, MA). Further offline strong cation exchange fractionation was performed on some samples. For further individual experimental details, see supplemental Table S1A. For all experiments, samples were separated by reverse phase HPLC coupled to a mass spectrometer: QSTAR Pulsar, QSTAR XL, QSTAR Elite (Applied Biosystems, Foster City, CA), LTQ-Orbitrap XL or QExactive (Thermo Fisher Scientific, San Jose, CA). Spectra were converted into peak lists for database searching using the Mascot dll in Analyst for QSTAR instruments or using an in-house script based on the Raw_Extract script in Xcalibur v2.4 (Thermo Fisher Scientific). Peptide identification was performed using Protein Prospector version 5.10.0 (20. Chalkley R.J. Baker P.R. Medzihradszky K.F. Lynn A.J. Burlingame A.L. In-depth analysis of tandem mass spectrometry data from disparate instrument types. Mol. Cell Proteomics. 2008; 7: 2386-2398). Search parameter mass allowances were tailored for each instrument: 100 ppm precursor and 0.15 Da fragment for QSTAR instruments, 20 ppm precursor and 0.6 Da fragment for LTQ-Orbitrap XL, and 20 ppm precursor and 0.8 Da fragment for QExactive. All searches were performed with constant modification of the peptide N terminus with the appropriate TEVest tag, variable modifications of carbamidomethylation of cysteines and oxidation of methionine, and allowing for up to three missed tryptic cleavages. All datasets were searched assuming tryptic specificity at the peptide C terminus, but no cleavage specificity at the N terminus. All fractions (including re-analysis of previously published data) were searched against the human SwissProt library release 2012_03 (20,255 entries) to provide consistent accession number annotations for all data. 
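The search constraints described above can be illustrated with a minimal Python sketch. It is not the Protein Prospector implementation; the function names, the 0-based indexing, and the simplified trypsin rule (cleavage after K or R, ignoring any proline exception) are assumptions made for illustration. Only the tolerance values and the limit of three missed cleavages come from the text.

```python
# Precursor (ppm) and fragment (Da) tolerances as quoted in the text.
TOLERANCES = {
    "QSTAR": {"precursor_ppm": 100.0, "fragment_da": 0.15},
    "LTQ-Orbitrap XL": {"precursor_ppm": 20.0, "fragment_da": 0.6},
    "QExactive": {"precursor_ppm": 20.0, "fragment_da": 0.8},
}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_within_tolerance(observed_mz, theoretical_mz, instrument):
    """True if the precursor error is inside the instrument's ppm window."""
    limit = TOLERANCES[instrument]["precursor_ppm"]
    return abs(ppm_error(observed_mz, theoretical_mz)) <= limit


def is_semi_tryptic(protein, start, end, max_missed=3):
    """Check a candidate peptide protein[start:end] (0-based, end exclusive).

    The C terminus must be tryptic (ends in K or R, or is the protein
    C terminus); the N terminus is unconstrained. Internal K/R residues
    are counted as missed cleavages (simplified rule, hypothetical helper).
    """
    peptide = protein[start:end]
    if not peptide:
        return False
    c_term_ok = peptide[-1] in "KR" or end == len(protein)
    missed = sum(1 for aa in peptide[:-1] if aa in "KR")
    return c_term_ok and missed <= max_missed
```

Leaving the N terminus unconstrained is what allows a search of this kind to report N termini generated by any protease, not only by trypsin.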
Maximum expectation value scores for protein and peptide of 0.02 were employed as acceptance criteria. Searches against a decoy library of random and reversed protein sequences revealed an average false discovery rate (FDR) across all datasets of 0.55%.

The DegraBase framework was created using FileMakerPro version 9.0, and houses three types of data: the sample, peptide and N terminus/protein tables (Fig. 1B). Experimental parameters are entered by investigators, mass spectrometry data are imported from files created by Protein Prospector, and protein- and cleavage site-specific annotation data are imported from a number of external databases including UniProtKB (21. Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res. 2012; 40: D71-75), the CASBAH (16), and MEROPS (17). Full documentation, including FileMakerPro scripts for data analysis and Perl scripts for processing of UniProtKB data before input, is available as supplemental File S1. The DegraBase also exists as an HTML-based website (http://wellslab.ucsf.edu/degrabase/) to allow for more accessible searching. Abundance data were taken from PaxDB version 2.1 using the integrated dataset called "Weighted average of 'H. sapiens PeptideAtlas Build May 2010'(weighting 50%), 'H. sapiens PeptideAtlas Build March 2009',(weighting 50%)" available from the downloads tab at www.pax-db.org (22. Wang M. Weiss M. Simonovic M. Haertinger G. Schrimpf S.P. Hengartner M.O. von Mering C. PaxDb, a Database of Protein Abundance Averages Across All Three Domains of Life. Mol. Cell Proteomics. 
2012; 11: 492-500). Sequence logos were made using iceLogo with the whole human SwissProt library as background (23. Colaert N. Helsens K. Martens L. Vandekerckhove J. Gevaert K. Improved visualization of protein consensus sequences by iceLogo. Nat. Methods. 2009; 6: 786-787). All logo images were made with the percent difference scoring system, except when stated as "Filled Logos" representing amino acid frequency, not information content. Significance was determined by chi-square analyses. Data for methionine processing, mitochondrial transit peptide removal and signal peptide removal were compared with SwissProt library release 2012_03. Mitochondrial localization was determined based on the MitoCarta database (24. Pagliarini D.J. Calvo S.E. Chang B. Sheth S.A. Vafai S.B. Ong S.E. Walford G.A. Sugiana C. Boneh A. Chen W.K. Hill D.E. Vidal M. Evans J.G. Thorburn D.R. Carr S.A. Mootha V.K. A mitochondrial protein compendium elucidates complex I disease biology. Cell. 2008; 134: 112-123). Gene Ontology (GO) term enrichment was determined using the GO::TermFinder software (25. Boyle E.I. Weng S. Gollub J. Jin H. Botstein D. Cherry J.M. Sherlock G. GO::TermFinder–open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 2004; 20: 3710-3715). (The abbreviations used are: GO, gene ontology; LC-MS/MS, liquid chromatography tandem mass spectrometry; mTP, mitochondrial transit peptide.) A list of unique proteins for each dataset was created and uploaded to the database and tested for enrichment against the human SwissProt background using all evidence codes except ND (No biological Data available) and IEA (Inferred from Electronic Annotation).
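As a rough illustration of what testing a protein list for term enrichment against a background involves, the sketch below computes a one-sided hypergeometric p value for a single term. This is a generic textbook formulation written for this note, not the GO::TermFinder code, and it applies no multiple testing correction; the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, background_with_term,
                       sample_size, sample_with_term):
    """P(X >= sample_with_term) under a hypergeometric null.

    background_size:      proteins in the background (e.g. a proteome)
    background_with_term: background proteins annotated with the term
    sample_size:          proteins in the dataset being tested
    sample_with_term:     dataset proteins annotated with the term
    """
    total = comb(background_size, sample_size)
    upper = min(background_with_term, sample_size)
    tail = sum(
        comb(background_with_term, k)
        * comb(background_size - background_with_term, sample_size - k)
        for k in range(sample_with_term, upper + 1)
    )
    return tail / total


# Toy numbers only: 5 of 20 background proteins carry the term,
# and all 5 proteins in the sample carry it.
p = enrichment_p_value(20, 5, 5, 5)
```

In practice one p value is computed per term, which is why a correction for multiple testing is applied before a cutoff is used.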
Enriched terms were defined using a corrected p value cutoff of less than 0.01. To compare terms between datasets, a pairwise chi-square test was performed using the Benjamini-Hochberg multiple testing correction procedure.

Given the massive amount of data generated from multiple experiments under different conditions, it was necessary to create a simple and normalized database. The DegraBase is a relational database built to house our α-aminomics data (Fig. 1B). It is available in three formats (see supplemental Information): a FileMakerPro file (supplemental File S2), an Excel document containing worksheets for each of the major tables (supplemental File S3), and a web interface (http://wellslab.ucsf.edu/degrabase/) where users may search by substrate name or accession number. Full documentation of the database is available in supplemental File S1. The current DegraBase contains a total of 26043 independent peptide identifications from 44 different proteomic labeling experiments (11 untreated and 33 apoptotic) (Fig. 1C and supplemental Table 1A). There are a total of 8090 unique N terminus identifications from 3206 proteins. We subdivided our data into three sets: (1) untreated, (2) apoptotic, and (3) apoptotic caspase-cleaved. In a separate study using our labeling method, we have seen that there is cell line- and drug-specific variability in the data, but most differences show up in detected abundance of cleavage product over time rather than the presence or absence (reported here) of the specific identified N termini (13). 
Therefore, we were comfortable pooling our multiple apoptotic experiments together to compare all proteins detected in all untreated cells tested versus those undergoing apoptosis. The untreated dataset contains all observations from the 11 experiments performed in five different cell lines (supplemental Table S1B). This dataset has 3732 identified N termini corresponding to 2144 unique N terminus start sites from 1239 proteins. The apoptotic dataset consists of all observations from the 33 experiments using seven different chemotherapeutic inducers in five cell lines (supplemental Table S1C). This generated a total of 22311 independent peptide identifications, corresponding to 6990 unique N terminus sites from 3020 different proteins. This reflects the dramatic activation of caspases following the induction of apoptosis in our samples, also observed with caspase activity and cell death assays (data not shown). We defined the third dataset, the apoptotic caspase-cleaved dataset, as a subset of the apoptotic dataset that includes all apoptotic aspartic acid-cleaved N termini (supplemental Table S1D). This dataset includes 1706 unique N termini from 1268 proteins, and in combination with our previous studies, MEROPS and CASBAH, increases the number of published human caspase-cleavage events to over 2200. The apoptotic dataset contains 1706 aspartate-cleaved peptides compared with the 140 seen in the untreated dataset, reflecting a dramatic induction of caspase activity.

To estimate to what degree the α-aminome MS data are biased by protein abundance in cells, we compared the datasets to PaxDB (22. Wang M. Weiss M. Simonovic M. Haertinger G. Schrimpf S.P. Hengartner M.O. von Mering C. PaxDb, a Database of Protein Abundance Averages Across All Three Domains of Life. Mol. Cell Proteomics. 
2012; 11: 492-500), a database that provides an independent estimate of relative protein abundance based on MS spectral counting data. All three α-aminome datasets cover more than six orders of magnitude of ppm (supplemental Fig. S1A–S1C). Only for the small set of low abundance proteins did our α-aminome identification tail off, which presumably reflects the limits of detection of the methodology. There is a slight enrichment for higher abundance proteins overall (supplemental Fig. S1D).

At the protein level, there is a large overlap between the untreated and apoptotic datasets; 1053 of the 1239 proteins (85%) from untreated cells were also found in the apoptotic dataset (Fig. 2A). In contrast, we observed a smaller overlap between datasets when considering the particular N termini within each protein (Fig. 2B); only 1328 of the 2144 untreated N termini (62%) were labeled under apoptotic conditions. There is a small set of 361 proteins, but only 129 N termini, that overlap between the untreated and apoptotic caspase-cleaved datasets. The presence of caspase-cleaved products in healthy cells likely reflects the low level of apoptosis that occurs in any healthy cell population, and these products make up a very small portion of the total untreated set. The protein overlap may represent apoptotic caspase substrates that also undergo endoproteolysis in healthy cells by noncaspases. Interestingly, many of the proteolytic substrates in healthy cells are cleaved at different positions upon induction of apoptosis.

To compare the functional properties of the different datasets, we performed Gene Ontology (GO) term enrichment using GO::TermFinder (supplemental Table S2) (25. Boyle E.I. Weng S. Gollub J. Jin H. Botstein D. Cherry J.M. Sherlock G. GO::TermFinder–open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 
2004; 20: 3710-3715). We looked at the terms unique to each dataset to identify specific process, function or component annotations related to healthy or dying cellular states. The untreated dataset was enriched in terms related to homeostatic functions like metabolic and biosynthetic processes (specifically related to ribosomal, coenzyme, amino acids and fatty acids, NADH dehydrogenase, and isomerase functions), the mitochondrial proton-transporting ATP synthase complex, and organelle envelope lumen (prominently related to the endoplasmic reticulum). In the apoptotic set, we compared the significant terms from caspase substrates to the noncaspase apoptotic terms, and to the terms unique to the apoptotic set only. The caspase substrates are enriched in the regulation of transcription, and there were many terms related to cell morphogenesis, specifically chromosome and microtubule structure, which are known to change and break down during apoptosis. The noncaspase apoptotic enriched terms in process, function and component ontologies relate to chromatin assembly (especially DNA binding, vesicle coating, and targeting), signal transduction involved in DNA damage and cell cycle checkpoints, and nucleotide catabolic processes. We also saw enrichment in the noncaspase apoptotic set for proteins associated with terms for proteolysis and cell death.

We next analyzed the precise sequences surrounding the N termini identified in each dataset. We used iceLogo (23) to visualize the sequence specificity for cleavage events for each dataset using the human SwissProt database to establish background amino acid frequencies (Fig. 3). 
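To make the idea of comparing cleavage-site residue frequencies with a background concrete, here is a minimal Python sketch. It is not iceLogo itself: it assumes an eight-residue window (four residues on each side of the cleaved bond), a single background frequency per residue, and a plain difference of percentages with no significance testing. All function names and the toy sequences are hypothetical.

```python
from collections import Counter


def cleavage_window(protein, nterm_index, flank=4):
    """Residues flanking a cleavage site, N-terminal side first.

    nterm_index is the 0-based index of the first residue of the newly
    generated N terminus. Returns None if the window would run past
    either end of the protein.
    """
    start, end = nterm_index - flank, nterm_index + flank
    if start < 0 or end > len(protein):
        return None
    return protein[start:end]


def position_frequencies(windows):
    """Percent frequency of each residue at each window position."""
    n = len(windows)
    width = len(windows[0])
    return [
        {aa: 100.0 * c / n for aa, c in Counter(w[i] for w in windows).items()}
        for i in range(width)
    ]


def percent_difference(windows, background):
    """Observed minus background percent frequency, per position and residue."""
    result = []
    for column in position_frequencies(windows):
        residues = set(column) | set(background)
        result.append(
            {aa: column.get(aa, 0.0) - background.get(aa, 0.0) for aa in residues}
        )
    return result


# Toy example: two windows and a made-up two-residue background (percent).
toy_windows = ["DEVDGSAA", "LEHDSAAA"]
toy_background = {"D": 5.0, "G": 7.0}
toy_diff = percent_difference(toy_windows, toy_background)
```

With `flank=4`, index 3 of each window is the residue immediately before the cleaved bond and index 4 is the first residue of the new N terminus, so positive values at those two columns indicate residues over-represented around the cut.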
The cleavage sites are presented in the standard Schechter-Berger form, with the scissile bond between the P1 residue and the P1′ residue (26. Schechter I. Berger A. On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. 1967; 27: 157-162). All three logos show a strong preference for small amino acids (glycine, serine, or alanine) at the P1′ position, but significant differences at the P1 position. In healthy cells, there is enrichment for cleavage sites following lysine, arginine, and methionine. The methionine cleavages mainly represent N-terminal methionine processing. The large number of cuts following basic residues is consistent with a high activity of trypsin-like enzymes in both healthy and apoptotic cells. In apoptotic cells this tryptic-like activity is overshadowed by the large number of caspase cleavages following aspartic acid residues. The apoptotic caspase-cleaved dataset shows a degenerate spe

Reference(s)