Article Open access Peer-reviewed

Proteomics Reveals N-Linked Glycoprotein Diversity in Caenorhabditis elegans and Suggests an Atypical Translocation Mechanism for Integral Membrane Proteins

2007; Elsevier BV; Volume: 6; Issue: 12 Language: English

10.1074/mcp.m600392-mcp200

ISSN

1535-9484

Authors

Hiroyuki Kaji, Junichi Kamiie, Hirotaka Kawakami, Kazuki Kido, Yoshio Yamauchi, Takashi Shinkawa, Masato Taoka, Nobuhiro Takahashi, Toshiaki Isobe

Topic(s)

Enzyme Structure and Function

Abstract

Protein glycosylation is one of the most common post-translational modifications in eukaryotes and affects various aspects of protein structure and function. To facilitate studies of protein glycosylation, we paired glycosylation site-specific stable isotope tagging of lectin affinity-captured N-linked glycopeptides with mass spectrometry and determined 1,465 N-glycosylated sites on 829 proteins expressed in Caenorhabditis elegans. The analysis shows the diversity of protein glycosylation in eukaryotes in terms of glycosylation sites and oligosaccharide structures attached to polypeptide chains and suggests the substrate specificity of oligosaccharyltransferase, a single multienzyme complex in C. elegans that incorporates an oligosaccharide moiety en bloc to newly synthesized polypeptides. In addition, topological analysis of 257 N-glycosylated proteins containing a putative single transmembrane segment that were identified based on the relative positions of glycosylation sites and transmembrane segments suggests that an atypical non-cotranslational mechanism translocates large N-terminal segments from the cytosol to the endoplasmic reticulum lumen in the absence of signal sequence function.

The abbreviations used are: PTM, post-translational modification; PNGase, peptide-N-glycanase; IGOT, isotope-coded glycosylation site-specific tagging; Con A, concanavalin A; WGA, wheat germ agglutinin; GaL6, galectin 6; ER, endoplasmic reticulum; 2D, two-dimensional; OST, oligosaccharyltransferase; SRP, signal recognition particle; LDL, low density lipoprotein.

Protein post-translational modifications (PTMs) such as proteolysis or addition of a chemical group to one or more amino acid residues may change the properties of a protein. A large body of evidence suggests that PTMs are critical for various cell regulatory and signaling processes (1. Mann M. Jensen O.N. Proteomic analysis of post-translational modifications. Nat. Biotechnol. 2003; 21: 255-261), and thus the analysis of the status of PTMs on proteins is a major objective of proteomics research. The development of certain new technologies for mapping PTMs on a proteomic scale has begun to yield fruitful results (2. Ballif B.A. Villén J. Beausoleil S.A. Schwartz D. Gygi S.P.
Phosphoproteomic analysis of the developing mouse brain. Mol. Cell. Proteomics. 2004; 3: 1093-1101; 3. Gruhler A. Olsen J.V. Mohammed S. Mortensen P. Færgeman N.J. Mann M. Jensen O.N. Quantitative phosphoproteomics applied to the yeast pheromone signaling pathway. Mol. Cell. Proteomics. 2005; 4: 310-327; 4. Liu T. Qian W.-J. Gritsenko M.A. Camp II, D.G. Monroe M.E. Moore R.J. Smith R.D. Human plasma N-glycoproteome analysis by immunoaffinity subtraction, hydrazide chemistry, and mass spectrometry. J. Proteome Res. 2005; 4: 2070-2080; 5. Liu T. Qian W.-J. Gritsenko M.A. Xiao W. Moldawer L.L. Kaushal A. Monroe M.E. Varnum S.M. Moore R.J. Purvine S.O. Maier R.V. Davis R.W. Tompkins R.G. Camp II, D.G. Smith R.D. Inflammation and the Host Response to Injury Large Scale Collaborative Research Program. High dynamic range characterization of the trauma patient plasma proteome. Mol. Cell. Proteomics. 2006; 5: 1899-1913; 6. Wang L. Li F. Sun W. Wu S. Wang X. Zhang L. Zheng D. Wang J. Gao Y. Concanavalin A-captured glycoproteins in healthy human urine. Mol. Cell. Proteomics. 2006; 5: 560-562; 7. Matsumoto M. Hatakeyama S. Oyamada K. Oda Y. Nishimura T. Nakayama K.I. Large-scale analysis of the human ubiquitin-related proteome. Proteomics. 2005; 5: 4145-4151; 8. Wykoff D.D. O'Shea E.K. Identification of sumoylated proteins by systematic immunoprecipitation of the budding yeast proteome. Mol. Cell. Proteomics. 2005; 4: 73-83). Because all PTMs are accompanied by a change in the molecular mass of a protein, MS-based analysis is often selected for large scale PTM analyses.
Among the approximately 200 different known PTMs (9. Krishna R.G. Wold F. Post-translational modification of proteins. Adv. Enzymol. Relat. Areas Mol. Biol. 1993; 67: 265-298), protein glycosylation is one of the most common in eukaryotes: on average there are potential targets in more than half of the genes encoded in eukaryotic genomes (10. Apweiler R. Hermjakob H. Sharon N. On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim. Biophys. Acta. 1999; 1473: 4-8). Protein glycosylation plays a role in protein folding, subcellular localization, turnover, activity, protein-protein interactions, etc. and contributes significantly to physiology as evidenced by the growing number of human diseases with defects in glycoconjugate assembly and processing (11. Freeze H.H. Aebi M. Altered glycan structures: the molecular basis of congenital disorders of glycosylation. Curr. Opin. Struct. Biol. 2005; 15: 490-498). Thus, the analysis of protein glycosylation is important for both basic biology and clinical applications, including the discovery of protein biomarkers for diagnosis and drug discovery.

Previous studies show that protein glycosylation is quite diverse because the oligosaccharide structure may vary widely between different proteins. In addition, a single protein can be glycosylated at multiple sites, and subsequent processing may differentially or partially modify an oligosaccharide attached at each site. These factors generate the observed complexity of glycoprotein structure and cause difficulties in characterizing protein glycosylation on a proteomic scale.
At present, little is known about the final structure of most glycoproteins; however, the specific structure of each oligosaccharide and the rate of the modification(s) are often critical to individual glycoprotein function, and defects in these processes may cause disease (13. Moloney D.J. Panin V.M. Johnston S.H. Chen J. Shao L. Wilson R. Wang Y. Stanley P. Irvine K.D. Haltiwanger R.S. Vogt T.F. Fringe is a glycosyltransferase that modifies Notch. Nature. 2000; 406: 369-375). Thus, the mechanisms by which protein glycosylation is regulated remain a challenging problem for proteomics research.

Currently two methods allow large scale glycoprotein analysis directly from a complex biological mixture, and both methods utilize MS-based shotgun technology but differ in the way glycopeptides are collected. One of the methods captures glycopeptides, regardless of the glycan structure, on a solid support by chemical coupling between the cis-diol group of the glycan and hydrazide on the support, and then N-linked glycopeptides are released specifically from the support by peptide-N-glycanase (PNGase) digestion (14. Zhang H. Li X.J. Martin D.B. Aebersold R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003; 21: 660-666; 15. Ramachandran P. Boontheung P. Xie Y. Sondej M. Wong D.T. Loo J.A. Identification of N-linked glycoproteins in human saliva by glycoprotein capture and mass spectrometry. J. Proteome Res. 2006; 5: 1493-1503). Another method captures a subset of glycopeptides by lectin affinity chromatography (16. Kaji H. Saito H. Yamauchi Y. Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol.
2003; 21: 667-672; 17. Bunkenborg J. Pilch B.J. Podtelejnikov A.V. Wisniewski J.R. Screening for N-glycosylated proteins by liquid chromatography mass spectrometry. Proteomics. 2004; 4: 454-465; 18. Lewandrowski U. Moebius J. Walter U. Sickmann A. Elucidation of N-glycosylation sites on human platelet proteins. A glycoproteomic approach. Mol. Cell. Proteomics. 2006; 5: 226-233). The type of glycopeptides captured by this method depends on the specificity of the lectin used; however, comprehensive analysis of glycoproteins can be achieved by using multiple lectin columns with distinct binding specificity (e.g. non-reducing end oligosaccharides). This approach, termed isotope-coded glycosylation site-specific tagging (IGOT), includes a step to remove the glycan moiety of glycopeptides with PNGase in ¹⁸O-labeled water (16. Kaji H. Saito H. Yamauchi Y. Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003; 21: 667-672). When the enzyme releases N-linked glycans in H₂¹⁸O, the glycosylated Asn residue (in the consensus tripeptide sequence for N-linked glycosylation, Asn-Xaa-(Ser/Thr) where Xaa is any amino acid except Pro) is converted to Asp with concomitant incorporation of ¹⁸O from water (19. Gonzalez J. Takao T. Hori H. Besada V. Rodriguez R. Padron G. Shimonishi Y. A method for determination of N-glycosylation sites in glycoproteins by collision-induced dissociation analysis in fast atom bombardment mass spectrometry: identification of the positions of carbohydrate-linked asparagine in recombinant alpha-amylase by treatment with peptide-N-glycosidase F in 18O-labeled water. Anal. Biochem.
1992; 205: 151-158). This PNGase-mediated incorporation of the ¹⁸O tag distinguishes formerly glycosylated peptides from non-glycosylated peptides whose Asp residues arose by non-enzymatic deamidation of Asn. The conversion of Asn to Asp via ¹⁸O incorporation in the glycosylation consensus sequence strongly indicates that the peptide was formerly N-glycosylated.

In this study, we paired IGOT with automated multidimensional liquid chromatography-MS technology and identified 1,465 N-glycosylated sites on 829 proteins expressed in Caenorhabditis elegans. We report here the diversity of protein glycosylation and the specificity of the oligosaccharyltransferase of C. elegans that incorporates an oligosaccharide moiety en bloc into nascent polypeptide chains. Based on the analysis of the relative positions of N-glycosylation sites and putative transmembrane segments of 257 potential integral membrane glycoproteins identified in this study, we also suggest that an atypical, non-cotranslational mechanism determines the topology of integral membrane glycoproteins.

The coding sequence of the C. elegans galectin 6 (GaL6) cDNA (provided by Dr. Hirabayashi, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan) was inserted into the Escherichia coli expression vector pET and introduced into E. coli BL21(DE3)pLysS (20. Hirabayashi J. Ubukata T. Kasai K. Purification and molecular characterization of a novel 16-kDa galectin from the nematode Caenorhabditis elegans. J. Biol. Chem. 1996; 271: 2497-2505). The transformant was cultured in M9CA medium containing 0.2 mg/ml ampicillin at 37 °C, and gene expression was induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside at a midlog phase of growth (A₆₀₀ = 0.6–0.8). After further cultivation for 3 h, E.
coli cells were lysed by sonication at 4 °C in 50 mM sodium phosphate buffer, pH 7.5, and centrifuged at 10,000 × g for 30 min. The supernatant was then applied to an asialofetuin column (Toyopearl 650M, 2.5-cm inner diameter × 5 cm) equilibrated with 50 mM sodium phosphate buffer, pH 7.5, at a flow rate of 0.5 ml/min. After washing the column with the equilibration buffer, the adsorbed GaL6 was eluted with the same buffer containing 0.2 M lactose. The purified GaL6 (20 mg) was immobilized on TSK-GEL Tresyl-5PW (2 ml; TOSOH) according to the protocol provided by the supplier and was packed into a 4.6-mm-inner diameter × 10-cm column.

C. elegans strain N2 was cultured in liquid medium at 20 °C as described previously (16. Kaji H. Saito H. Yamauchi Y. Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003; 21: 667-672; 21. Mawuenyega K.G. Kaji H. Yamauchi Y. Shinkawa T. Saito H. Taoka M. Takahashi N. Isobe T. Large-scale identification of Caenorhabditis elegans proteins by multidimensional liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2003; 2: 23-35). A mixed growth phase culture of the worm (5–20 g, wet weight) was lysed by sonication in 5 volumes of TBS (50 mM Tris-HCl, pH 7.5, 150 mM NaCl) containing a protease inhibitor mixture (Sigma), and the homogenate was centrifuged at 1,000 × g for 10 min at 4 °C to remove cell debris. The soluble extract was then centrifuged at 100,000 × g for 30 min at 4 °C to separate the soluble and insoluble protein fractions. Each fraction was solubilized in 7 M guanidine HCl in 0.5 M Tris-HCl, pH 8.6, containing 50 mM EDTA, and the proteins were reduced with dithiothreitol and S-carbamoylmethylated with iodoacetamide (22. Taoka M. Yamauchi Y. Shinkawa T. Kaji H. Motohashi W. Nakayama H. Takahashi N.
Isobe T. Only a small subset of the horizontally transferred chromosomal genes in Escherichia coli are translated into proteins. Mol. Cell. Proteomics. 2004; 3: 780-787). The S-carbamoylmethylated proteins were dialyzed against 10 mM HEPES-NaOH, pH 7.5, and digested with Nα-tosylphenylalanyl chloromethyl ketone-treated trypsin (Pierce) at an enzyme:substrate ratio of 1:50 at 37 °C. After 18 h, an aliquot of protease inhibitor mixture (Sigma) was added to the mixture to stop digestion and to protect the lectin columns.

In our earlier attempts, we prepared an N-glycosylated protein fraction by lectin affinity chromatography of C. elegans crude extract and then obtained N-glycosylated peptides from a tryptic digest of the glycosylated protein fraction by a second round of lectin affinity chromatography (16. Kaji H. Saito H. Yamauchi Y. Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003; 21: 667-672). In this study, however, we modified the procedure to more efficiently identify the integral membrane glycoproteins; the crude protein extract was first digested with trypsin after S-carbamoylmethylation in 7 M guanidine HCl, and then the N-glycosylated peptides were recovered by lectin affinity chromatography. To increase the purity of glycopeptides, we incorporated an additional "hydrophilic interaction" chromatography step (23. Wada Y. Tajiri M. Yoshida S. Hydrophilic affinity isolation and MALDI multiple-stage tandem mass spectrometry of glycopeptides for glycoproteomics. Anal. Chem. 2004; 76: 6560-6565) before PNGase-mediated ¹⁸O labeling (described later).

To collect N-glycosylated peptides, the tryptic digests of soluble and insoluble protein fractions of C.
elegans were subjected to affinity chromatography on three lectin columns, concanavalin A (Con A) (LA-Con A; 4.6-mm inner diameter × 15 cm; Seikagaku Corp., Tokyo, Japan), wheat germ agglutinin (WGA) (LA-WGA; 4.6-mm inner diameter × 15 cm; Seikagaku Corp.), or GaL6 (4.6-mm inner diameter × 10 cm). Approximately 50–200 mg of peptide mixture was applied to each column equilibrated with 10 mM HEPES-NaOH, pH 7.5. After washing the column with the equilibration buffer, adsorbed glycopeptides were recovered by elution with the buffer containing a cognate sugar: 0.2 M α-methyl mannopyranoside for the Con A column, 0.2 M N-acetyl-D-glucosamine (GlcNAc) for the WGA column, or 0.2 M lactose for the GaL6 column. To maximize the recovery of glycosylated peptides, the flow-through fraction of the first chromatography was applied again to the same lectin column, and the chromatography was repeated as described above. The glycopeptide fractions from individual lectin columns of the first and second rounds of chromatography were combined for subsequent steps.

The N-glycosylated peptide mixture recovered by lectin affinity chromatography (10–20 ml containing 200–500 μg of peptides) was added to an equal volume of ethanol (EtOH) and 4 volumes of 1-butanol (BuOH) and was applied immediately to a Sepharose CL-4B column (5-mm inner diameter × 50 mm) equilibrated with the solvent H₂O:EtOH:BuOH = 1:1:4 (v/v/v). After washing the column with the same solvent, adsorbed glycopeptides were eluted with H₂O:EtOH, 1:1 (v/v). The column eluent was monitored at 220 nm, and the recovered glycopeptides were quantitated fluorometrically after reaction with o-phthalaldehyde (24. Ishida Y. Fujita T. Asai K. New detection and separation method for amino acids by high-performance liquid chromatography. J. Chromatogr. 1981; 204: 143-148).

N-Glycosylated peptides were labeled specifically with ¹⁸O by IGOT as described previously (16. Kaji H. Saito H. Yamauchi Y.
Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003; 21: 667-672). Briefly, the sample glycopeptides were dried under vacuum to remove solvent containing H₂¹⁶O and then redissolved in 0.1 M Tris base prepared in H₂¹⁸O (≥99 atom % ¹⁸O; Taiyo Nippon Sanso Corp., Tokyo, Japan). The peptide solution was then adjusted to pH 8–9 with a minimal volume of acetic acid, and then PNGase-A (lyophilized; Seikagaku Corp.), dissolved in H₂¹⁸O, was added to a final concentration of 1 milliunit/10 μg of peptide. The reaction was incubated overnight at 37 °C in a sealed polypropylene tube.

The deglycosylated ¹⁸O-tagged peptide mixture (approximately 5–10 μg) was analyzed by automated 2D LC-MS/MS. The instrument used was a miniaturized version of that described previously (25. Isobe T. Yamauchi Y. Taoka M. Takahashi N. Automated two-dimensional liquid chromatography/tandem mass spectrometry for large-scale protein analysis. In: Simpson R.D. Proteins and Proteomics. Cold Spring Harbor Press, Cold Spring Harbor, NY. 2003: 869-876; 26. Natsume T. Yamauchi Y. Nakayama H. Shinkawa T. Yanagida M. Takahashi N. Isobe T. A direct nanoflow liquid chromatography-tandem mass spectrometry system for interaction proteomics. Anal. Chem. 2002; 74: 4725-4733) and was equipped with a first dimensional microscale cation-exchange column (1-mm inner diameter × 50 mm) of Bioassist-S (7-μm particles; TOSOH) and a second-dimensional direct nanoflow spray tip reversed phase column (150-μm inner diameter × 50 mm) of Mightysil-C18 (3-μm particles; Kanto Chemicals) connected in tandem through an electric column switching valve and an automated solvent desalting device.
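As a rough check on the ¹⁸O tag introduced by the IGOT step described above, the expected mass shift for an Asn converted to ¹⁸O-containing Asp can be worked out from standard monoisotopic masses. The sketch below is illustrative only: the function names are hypothetical, and the constants are textbook monoisotopic values rather than numbers taken from the paper.

```python
# Monoisotopic masses in Da (standard values, rounded to 6 decimals).
ASN_RESIDUE = 114.042927   # asparagine residue, C4H6N2O2
ASP_RESIDUE = 115.026943   # aspartic acid residue, C4H5NO3
O16 = 15.994915
O18 = 17.999160


def igot_mass_shift() -> float:
    """Mass gained when a glycosylated Asn becomes Asp carrying one 18O."""
    deamidation = ASP_RESIDUE - ASN_RESIDUE  # Asn -> Asp in ordinary water
    isotope = O18 - O16                      # 18O incorporated instead of 16O
    return deamidation + isotope


def mz_shift(charge: int) -> float:
    """Shift seen on the m/z axis for a peptide ion of the given charge."""
    return igot_mass_shift() / charge


if __name__ == "__main__":
    print(f"neutral mass shift: {igot_mass_shift():.3f} Da")
    for z in (2, 3):  # charge states selected for MS/MS in this study
        print(f"z = +{z}: m/z shift {mz_shift(z):.3f}")
```

The shift comes to roughly 3 Da per tagged site, about 2 Da more than plain deamidation, which is what lets an enzymatically tagged site be told apart from a spontaneously deamidated one.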
The chromatography was performed automatically under the time-dependent control program, and the eluate was directly sprayed into a high resolution Q-TOF hybrid mass spectrometer (Q-TOF Ultima; Waters-Micromass) at a flow rate of 100 nl/min. The spectrometer was operated in a data-dependent MS/MS mode where a full MS scan (1 s, m/z 400–1500) was followed by two MS/MS scans (1 s each, m/z 100–1500). The two most intense precursor ions with a charge state (z) of +2 or +3 were dynamically selected and subjected to collision-induced dissociation with a collision energy as recommended by the manufacturer and a dynamic exclusion duration of 30 s. The total analysis time for a single 2D nano-LC-MS/MS process was 24 h.

The large volume of MS/MS data generated by the 2D nano-LC-MS/MS analysis was converted to text files using MassLynx software (version 4.0, Micromass). The peak list files were then created with smoothing by the Savitzky-Golay method (window channels, ±3) using the same software and processed by the Mascot algorithm (version 1.9, Matrix Science, Ltd.) to assign peptides on the C. elegans Wormpep 124 protein sequence database (22,259 entries, www.sanger.ac.uk/Projects/C_elegans/WORMBASE/current/wormpep.shtml). The database search was performed with the parameters as described previously (16. Kaji H. Saito H. Yamauchi Y. Shinkawa T. Taoka M. Hirabayashi J. Kasai K. Takahashi N. Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003; 21: 667-672; 21. Mawuenyega K.G. Kaji H. Yamauchi Y. Shinkawa T. Saito H. Taoka M. Takahashi N. Isobe T. Large-scale identification of Caenorhabditis elegans proteins by multidimensional liquid chromatography-tandem mass spectrometry. J. Proteome Res.
2003; 2: 23-35) except that we defined a custom modification, "deamidation with ¹⁸O (asparagine + 3 Da)," for the deamidation of Asn incorporating ¹⁸O. We first screened the candidate peptides for probability-based Mowse scores that exceeded their thresholds (p < 0.05) and for more than three MS/MS signals assigned to y- or b-ions; finally we selected "identified peptides" that contained one or more aspartic acid residues tagged with ¹⁸O on the basis of their MS/MS spectra. If a prospective "identified peptide" did not contain the consensus tripeptide sequence for N-linked glycosylation (Asn-Xaa-(Ser/Thr)), the data were eliminated regardless of the match score. The resulting dataset was finally evaluated by in-house software STEM (27. Shinkawa T. Taoka M. Yamauchi Y. Ichimura T. Kaji H. Takahashi N. Isobe T. STEM: a software tool for large-scale proteomic data analyses. J. Proteome Res. 2005; 4: 1826-1831) to remove unreliable Mascot peptide identifications and redundant assignments and to integrate the results with key parameters of the experiment.

The transmembrane segment and the signal peptide of proteins were predicted by SignalP 3.0 (28. Bendtsen J.D. Nielsen H. von Heijne G. Brunak S. Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 2004; 340: 783-795) and/or ConPredII (29. Arai M. Mitsuke H. Ikeda M. Xia J.-X. Kikuchi T. Satake M. Shimizu T. ConPred II: a consensus prediction method for obtaining transmembrane topology models with high reliability. Nucleic Acids Res. 2004; 32: W390-W393) bioinformatics tools. Because the glycosylation reaction takes place within the lumen of the endoplasmic reticulum (ER), N-linked glycoproteins should also have a signal sequence and/or a transmembrane segment as discussed later.
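The consensus-sequence filter applied to candidate peptides above can be sketched in a few lines. This is a minimal illustration, not the authors' STEM software: the function names and example peptides are hypothetical, and it assumes that the tagged positions are 0-based indices into the database (Asn-containing) peptide sequence and that a peptide is kept when at least one tagged position sits on a consensus Asn.

```python
import re

# Asn-Xaa-(Ser/Thr) with Xaa != Pro. The lookahead keeps matches zero-width
# after the Asn, so overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide: str) -> list[int]:
    """0-based indices of Asn residues that start a consensus tripeptide."""
    return [m.start() for m in SEQUON.finditer(peptide)]


def keep_peptide(peptide: str, tagged_positions: list[int]) -> bool:
    """Keep a candidate only if an 18O-tagged residue lies on a sequon Asn.

    Assumption for this sketch: the caller supplies the positions that the
    search engine reported as carrying the 'deamidation with 18O' modification.
    """
    sequons = set(sequon_positions(peptide))
    return any(pos in sequons for pos in tagged_positions)


if __name__ == "__main__":
    # Hypothetical peptides, for illustration only.
    for seq, tags in [("AGNVSK", [2]), ("AGNPSK", [2]), ("NNTSR", [1])]:
        print(seq, sequon_positions(seq), keep_peptide(seq, tags))
```

One limitation of working on peptide strings alone: a consensus tripeptide that straddles a tryptic cleavage site is not visible within the peptide, so a fuller implementation would check the motif against the parent protein sequence.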
We identified these two structural elements in ∼25% of the 22,500 genes predicted from the genome sequence of C. elegans (Wormpep), suggesting that there are ∼6,000 potential targets for N-linked glycosylation.

To catalog N-glycosylated proteins expressed in C. elegans and to study details of protein glycosylation, we used IGOT coupled with MS-based proteomics. To increase the coverage, we used three types of lectin columns with different binding specificity for the oligosaccharide attached to the polypeptide chain; thus, the columns contained immobilized Con A, WGA, and GaL6 (20. Hirabayashi J. Ubukata T. Kasai K. Purification and molecular characterization of a novel 16-kDa galectin from the nematode Caenorhabditis elegans. J. Biol. Chem. 1996; 271: 2497-2505), which are specific for the non-reducing end of Man, GlcNAc, and Gal, respectively. In addition, the lectin affinity chromatography was performed with tryptic peptide mixtures derived from soluble and insoluble protein fractions of C. elegans crude extract (see "Experimental Procedures"). The glycopeptide mixtures were further purified by hydrophilic interaction chromatography on Sepharose CL-4B, subjected to IGOT (i.e. N-glycanase-mediated ¹⁸O labeling), and analyzed by automated 2D nano-LC-MS/MS shotgun technology to identify ¹⁸O-labeled formerly N-glycosylated peptides. To maximize the number of identifications, the shotgun analysis was repeated three times for each peptide mixture prepared by Con A, WGA, and GaL6 affinity chromatography of the soluble/insoluble fractions. Supplemental Table 1 lists all the candidate glycosylated peptides in C. elegans identified in this study, and all their MS/MS spectra are shown in Supplemental Fig. 1-1 to 1-9. Supplemental Table 2 lists the C. elegans N-glycosylated proteins and the number of glycosylation sites identified in this study.
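Combining the repeated runs from the three lectin columns into one non-redundant site list amounts to a set union keyed on protein and site position. The sketch below shows one way this bookkeeping could be done; the function names, protein identifiers, and positions are hypothetical toy values, not data from the supplemental tables.

```python
from collections import defaultdict


def merge_sites(runs: dict[str, list[tuple[str, int]]]) -> dict[tuple[str, int], set[str]]:
    """Map each unique (protein, Asn position) site to the lectins that captured it."""
    seen: dict[tuple[str, int], set[str]] = defaultdict(set)
    for lectin, sites in runs.items():
        for site in sites:
            seen[site].add(lectin)
    return dict(seen)


def summarize(merged: dict[tuple[str, int], set[str]]) -> dict:
    """Count unique sites, unique proteins, and sites by number of lectins."""
    by_lectin_count: dict[int, int] = defaultdict(int)
    for lectins in merged.values():
        by_lectin_count[len(lectins)] += 1
    proteins = {protein for protein, _ in merged}
    return {
        "sites": len(merged),
        "proteins": len(proteins),
        "by_lectin_count": dict(by_lectin_count),
    }


if __name__ == "__main__":
    # Toy identifications, for illustration only.
    runs = {
        "ConA": [("protA", 10), ("protA", 55), ("protB", 7)],
        "WGA": [("protA", 10), ("protC", 3)],
        "GaL6": [("protA", 10), ("protB", 7)],
    }
    print(summarize(merge_sites(runs)))
```

Keeping the set of lectins per site, rather than only the union, also makes it straightforward to ask which sites were recovered by two or by all three columns.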
We identified 1,204 N-glycosylated sites on 686 proteins from Con A-captured glycopeptide mixtures, 474 sites on 276 proteins from WGA-captured mixtures, and 382 sites on 330 proteins from GaL6-captured mixtures. After eliminating redundant identifications, we identified 1,465 N-glycosylated sites on 829 unique proteins (Fig. 1 and Supplemental Table 2). The number of glycosylated sites assigned on each protein ranged from 1 to 24 with an average of 1.5. The glycoproteins we identified were quite diverse in terms of subcellular localization and function, etc., yet many (approximately 50%) were integral membrane proteins such as cell surface receptors, transporters, channels, extracellular matrix proteins, and proteases.

Previous studies have shown that most glycans liberated from C. elegans membrane proteins contain neutral sugars and have an oligomannose-type structure and that approximately 80% of N-linked glycans in C. elegans have a non-reducing end mannose that is recognized by Con A (30. Natsuka S. Adachi J. Kawaguchi M. Nakakita S. Hase S. Ichikawa A. Ikura K. Structural analysis of N-linked glycans in Caenorhabditis elegans. J. Biochem. 2002; 131: 807-813; 31. Hirabayashi J. Hayama K. Kaji H. Isobe T. Kasai K. Affinity capturing and gene assignment of soluble glycoproteins produced by the nematode Caenorhabditis elegans. J. Biochem. 2002; 132: 103-114). These glycans lack sialic acid as the C. elegans genome has no sialyltransferase gene, implying that the glycan structure is relatively simple as compared with that of mammalian cells (32. Natsuka S. Comparative biochemical view of N-glycans. Trends Glycosci. Glycotechnol. 2005; 17: 229-236). Thus, our lectin affinity analysis of glycopeptides showed that the largest subset of N-glycoproteins was identified from the Con A-captured peptide mixtures (Fig.
1); however, the glycopeptides collected by each lectin column overlapped significantly (Supplemental Table 3). For example, 24 glycopeptides assigned for him-4 (F15G9.4) were identified using the Con A column, whereas some of these glycopeptides were also recovered from the WGA and GaL6 columns, suggesting that the him-4 product has a highly heterogeneous glycan structure. Of the 1,465 glycosylated sites we determined, 138 sites on 105 proteins were found redundantly in the peptides captured by the three lectin columns, and 317 sites on 228 proteins were found in the peptides captured by two of the lectin columns. Although a subset of those peptides should have hybrid-type glycan structures that would be recognized by multiple lectins, our study implies that most of the worm glycoproteins have complex glycoforms that are typical of eukaryotes. It should be noted that glycan structures may be heterogeneous not only at the protein level but also at the level of individual sites: each glycosylation site may carry a complex series of N-linked glycans if one particular peptide on a single protein is recovered by multiple lectin columns (e.g.

Reference(s)