Article Open access Peer-reviewed

Mass Spectrometric Characterization of Proteins from the SARS Virus

2003; Elsevier BV; Volume: 2; Issue: 5; Language: English

10.1074/mcp.m300048-mcp200

ISSN

1535-9484

Authors

Oleg V. Krokhin, Yan Li, Anton Andonov, Heinz Feldmann, Ramon Flick, Steven J.M. Jones, Ute Ströher, Nathalie Bastien, Kumar Dasuri, Keding Cheng, J. Neil Simonsen, Hélène Perreault, John A. Wilkins, Werner Ens, Frank Plummer, Kenneth G. Standing

Topic(s)

SARS-CoV-2 detection and testing

Abstract

A new coronavirus has been implicated as the causative agent of severe acute respiratory syndrome (SARS). We have used convalescent sera from several SARS patients to detect proteins in the culture supernatants from cells exposed to lavage from another SARS patient. The most prominent protein in the supernatant was identified by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) as a ∼46-kDa species. This was found to be a novel nucleocapsid protein that matched almost exactly one predicted by an open reading frame in the recently published nucleotide sequence of the same virus isolate (>96% coverage). A second viral protein corresponding to the predicted ∼139-kDa spike glycoprotein has also been examined by MALDI-TOF MS (42% coverage). After peptide N-glycosidase F digestion, 12 glycosylation sites in this protein were confirmed. The sugars attached to four of the sites were also identified. These results suggest that the nucleocapsid protein is a major immunogen that may be useful for early diagnostics, and that the spike glycoprotein may present a particularly attractive target for prophylactic intervention in combating SARS.
The recent clinical identification of a novel type of atypical pneumonia without a clearly defined etiology, together with epidemiological evidence of high transmissibility, has prompted the World Health Organization to issue a rare travel advisory. The new entity has been called severe acute respiratory syndrome (SARS); it apparently began in Guangdong province in China in November of 2002 and has since spread to Hong Kong, Singapore, Vietnam, Canada, the U.S., Taiwan, and several European countries. (Footnote 1: The abbreviations used are: SARS, severe acute respiratory syndrome; NML, National Microbiology Laboratory; MALDI, matrix-assisted laser desorption/ionization; PNGase F, peptide N-glycosidase F; HPLC, high-performance liquid chromatography; MS, mass spectrometry; MS/MS, tandem mass spectrometry; QqTOF, quadrupole/time-of-flight; TOF, time-of-flight.) The outbreak in Canada began in late February 2003 in a traveler returning from Hong Kong whose exposure was to the index case in the Hong Kong epidemic (a physician who had cared for SARS cases in Guangdong province in the People's Republic of China). The Canadian index case died 9 days after the disease onset, and a 43-year-old male relative became ill 2 days after exposure and died of the adult respiratory distress syndrome 15 days after the illness began (1. Poutanen S.M. Low D.E. Henry B. Finkelstein S.
Rose D. Green K. Tellier R. Draker R. Adachi D. Ayers M. Chan A.K. Skowronski D.M. Salit I. Simor A.E. Slutsky A.S. Doyle P.W. Krajden M. Petric M. Brunham R.C. McGeer A.J. National Microbiology Laboratory, Canada; Canadian Severe Acute Respiratory Syndrome Study Team. Identification of severe acute respiratory syndrome in Canada. N. Engl. J. Med. 2003; 348: 1995-2005). Subsequently, Canada has faced the largest SARS outbreak outside of Asia, with at least 351 probable or suspected cases and 27 deaths, mostly in the Toronto area (2. Booth C.M. Matukas L.M. Tomlinson G.A. Rachlis A.R. Rose D.B. Dwosh H.A. Walmsley S.L. Mazzulli T. Avendano M. Derkach P. Ephtimios I.E. Kitai I. Mederski B.D. Shadowitz S.B. Gold W.L. Hawryluck L.A. Rea E. Chenkin J.S. Cescon D.W. Poutanen S.M. Detsky A.S. Clinical features and short-term outcomes of 144 patients with SARS in the greater Toronto area. J. Am. Med. Assoc. May 6, 2003; 289: 2801-2809; 3. Health Canada. Latest Canadian Numbers on SARS. http://www.hc-sc.gc.ca/english/protection/warnings/sars/sars_numbers.html). Samples from patients with suspected or probable SARS in Canada have been referred to the National Microbiology Laboratory (NML), Health Canada, for laboratory diagnostics. This laboratory, part of the Canadian Science Centre for Human and Animal Health, is Canada's national reference center for infectious diseases and houses the only Class 4 containment facilities in the country. NML has played an active role in an intensive international collaborative effort among 11 laboratories around the world that suggested a distinct coronavirus may be etiologically involved. In particular, the laboratory prepared the nucleotide samples for the first successful effort to determine the genome sequence for the coronavirus (4. Marra M.A. Jones S.J. Astell C.R. Holt R.A. Brooks-Wilson A. Butterfield Y.S. Khattra J. Asano J.K. Barber S.A. Chan S.Y. Cloutier A. Coughlin S.M. Freeman D. Girn N.
Griffith O.L. Leach S.R. Mayo M. McDonald H. Montgomery S.B. Pandoh P.K. Petrescu A.S. Robertson A.G. Schein J.E. Siddiqui A. Smailus D.E. Stott J.M. Yang G.S. Plummer F. Andonov A. Artsob H. Bastien N. Bernard K. Booth T.F. Bowness D. Czub M. Drebot M. Fernando L. Flick R. Garbutt M. Gray M. Grolla A. Jones S. Feldmann H. Meyers A. Kabani A. Li Y. Normand S. Stroher U. Tipples G.A. Tyler S. Vogrig R. Ward D. Watson B. Brunham R.C. Krajden M. Petric M. Skowronski D.M. Upton C. Roper R.L. The genome sequence of the SARS-associated coronavirus. Science. May 1, 2003; 300: 1399-1404), a result soon confirmed by several other laboratories (see, for example, Ref. 5: Rota P.A. Oberste M.S. Monroe S.S. Nix W.A. Campagnoli R. Icenogle J.P. Penaranda S. Bankamp B. Maher K. Chen M.H. Tong S. Tamin A. Lowe L. Frace M. DeRisi J.L. Chen Q. Wang D. Erdman D.D. Peret T.C. Burns C. Ksiazek T.G. Rollin P.E. Sanchez A. Liffick S. Holloway B. Limor J. McCaustland K. Olsen-Rasmussen M. Fouchier R. Gunther S. Osterhaus A.D. Drosten C. Pallansch M.A. Anderson L.J. Bellini W.J. Characterization of a novel coronavirus associated with severe acute respiratory syndrome. Science. May 1, 2003; 300: 1394-1399). Nevertheless, the genome sequence merely provides a template for the construction of the viral proteins. Thus, an alternative strategy is to examine the proteins themselves, and mass spectrometry has proved to be an efficient tool for this purpose (6. Thomas J.J. Bakhtiar R. Siuzdak G. Mass spectrometry in viral proteomics. Acc. Chem. Res. 2001; 33: 179-187). The University of Manitoba time-of-flight mass spectrometry laboratory has already been active in characterizing viral proteins (7. Seifers D.L. Harvey T.L. Haber S. She Y.-M. Chernushevich I. Ens W. Standing K.G. Natural infection of sorghum by foxtail mosaic disease in Kansas. Plant Dis. 1999; 83: 905-912; 8. Seifers D.L. Salomon R. Marie-Jeanne V. Alliot B. Signoret P. Haber S.
Loboda A. Ens W. She Y.-M. Standing K.G. Characterization of a novel potyvirus isolated from maize in Israel. Phytopathology. 2000; 90: 505-513; 9. She Y.-M. Haber S. Seifers D.L. Loboda A. Chernushevich I. Perreault H. Ens W. Standing K.G. Determination of the complete amino acid sequence for the coat protein of brome mosaic virus by time-of-flight mass spectrometry. J. Biol. Chem. 2001; 276: 20039-20047; 10. Swanson M.I. She Y.-M. Ens W. Brown E.G. Coombs K.M. Mammalian reovirus core protein micro 2 initiates at the first start codon and is acetylated. Rapid Commun. Mass Spectrom. 2002; 16: 2317-2324; 11. Mendez I.I. She Y.M. Ens W. Coombs K.M. (2003) Digestion pattern of reovirus outer capsid protein σ3 determined by mass spectrometry. Virology, in press), so it was natural for NML to enlist the university laboratory (late in March) as a collaborator in the analysis of the SARS proteins. The first results of this collaboration are described below.
Clinical specimens obtained from the original case cluster were extensively investigated for the presence of bacterial and viral pathogens (1). Nasopharyngeal swab and bronchoalveolar lavage fluids from several of these patients were found to be positive by reverse transcription-PCR for human metapneumovirus and the novel coronavirus (1). Inoculation of the bronchoalveolar lavage fluid from the 43-year-old male patient into Vero E6 cells produced a strong cytopathic effect on day 4 after infection. The second passage of this viral isolate was further used to produce large quantities of the virus. Initially, this virus material was used to assess its antigenicity with convalescent serum samples from SARS patients. The convalescent sera that were previously found to be positive for antibodies to the virus by indirect immunofluorescence assay (Y. Li, unpublished data) reacted strongly in Western blot with a ∼46-kDa protein (Fig. 1A) similar in size to the nucleocapsid protein of coronaviruses (12. Lai M.M.C. Holmes K.V. Fields Virology. 4th Ed. Lippincott Williams & Wilkins, Hagerstown, MD, 2001: 1163-1185). In order to prepare this (and perhaps other SARS-related proteins) for proteolytic digestion, the virus was purified on a 20–60% linear sucrose gradient. Western blotting of the gradient fractions showed that fraction 4 (density, 1.18 g/cm³) reacted strongly with a convalescent serum from a SARS patient. This fraction was run on a Novex 4–12% Bis-Tris gel in 4-morpholinepropanesulfonic acid running buffer (Invitrogen), and stained with Coomassie blue (Fig. 1B). Two bands were then excised from the gel (indicated by arrowheads), one containing the prominent ∼46-kDa protein and the other containing a much weaker protein band with an apparent mass of ∼180 kDa. These were transferred to the university laboratories for in-gel digestion with various proteolytic enzymes.
The excised protein bands were in-gel digested with one of three different enzymes: sequencing-grade modified trypsin (Promega, Madison, WI), Lys-C, or Asp-N (both from Roche Molecular Biochemicals). Digestions were performed according to the procedure described by Shevchenko et al. (13. Shevchenko A. Wilm M. Vorm O. Mann M. Mass spectrometric sequencing of proteins from polyacrylamide gels. Anal. Chem. 1996; 68: 850-858) either in ordinary water or else in a 1:1 H₂¹⁶O:H₂¹⁸O mixture (14. Shevchenko A. Chernushevich I. Ens W. Standing K.G. Thomson B. Wilm M. Mann M. Rapid 'de novo' peptide sequencing by a combination of nanoelectrospray, isotopic labeling and a quadrupole/time-of-flight mass spectrometer. Rapid Commun. Mass Spectrom. 1997; 11: 1015-1024; 15. Stewart I.I. Thomson T. Figeys D. ¹⁸O labeling: A tool for proteomics. Rapid Commun. Mass Spectrom. 2001; 15: 2456-2465; 16. Yao X. Alfonso C. Fenselau C. Dissection of proteolytic ¹⁸O labeling: Endoprotease catalyzed ¹⁶O to ¹⁸O exchange of truncated peptide substrates. J. Proteome Res. 2003; 2: 147-152) prepared from 98% H₂¹⁸O (Isotec, Miamisburg, OH) and ordinary water. Unless otherwise noted, all other chemicals were purchased from Sigma. The extracts containing the peptide mixture were lyophilized and resuspended in 5.5 μl of 0.5% trifluoroacetic acid in water; 0.5 μl of the resulting sample was then mixed 1:1 with 2,5-dihydroxybenzoic acid matrix solution (150 mg/ml in water:acetonitrile 1:1) and deposited on the gold surface of a matrix-assisted laser desorption/ionization (MALDI) target. The remaining 5 μl was separated into fractions by micro-high-performance liquid chromatography (μHPLC), and the individual fractions were deposited on a target for subsequent mass spectrometric analysis. Chromatographic separations were performed using an Agilent 1100 Series system (Agilent Technologies, Wilmington, DE).
Deionized (18 MΩ) water and HPLC-grade acetonitrile were used for the preparation of eluents. Samples (5 μl) were injected onto a 150 μm × 150 mm column (Vydac 218 TP C18, 5 μm; Grace Vydac, Hesperia, CA) and eluted with a linear gradient of 1–80% acetonitrile (0.1% trifluoroacetic acid) in 60 min. The column effluent (4 μl/min) was mixed on-line with dihydroxybenzoic acid matrix solution (0.5 μl/min) and deposited by a small computer-controlled robot onto a movable gold target at 1-min intervals (17. Krokhin O. Qian Y. McNabb J.R. Spicer V. Standing K.G. Ens W. An off-line interface between HPLC and orthogonal MALDI TOF. American Society for Mass Spectrometry, 50th ASMS Conference on Mass Spectrometry and Allied Topics. American Society for Mass Spectrometry, Santa Fe, NM, 2002). The vast majority of the tryptic fragments were eluted within 40 min under the HPLC conditions used, so 40 fractions were normally collected.
Our original intention was to postpone any detailed analysis of the higher mass protein to a subsequent investigation. Later, when we decided to include this effort in the present measurements, the only materials that we had available were two lyophilized samples from digests of the larger protein (∼180-kDa band), one from a tryptic digest and one from a Lys-C digest. The sample from the tryptic digest was separated by HPLC and used for analysis of the glycosylated peptides. The sample from the Lys-C digest was digested twice more, first by peptide N-glycosidase F (PNGase F; Roche Molecular Biochemicals) to remove the asparagine-linked glycosylation (18. Tarentino A.L. Gomez C.M. Plummer Jr., T.H. Deglycosylation of asparagine-linked glycans by peptide:N-glycosidase F. Biochemistry. 1985; 24: 4665-4671), then by trypsin to produce smaller fragments (both digestions in ordinary water).
The spots on the gold targets were analyzed individually, both by single mass spectrometry (MS) and by tandem mass spectrometry (MS/MS) in the Manitoba/Sciex prototype quadrupole/TOF (QqTOF) mass spectrometer (subsequent commercial model sold as QSTAR by Applied Biosystems/MDS Sciex, Foster City, CA) (19. Loboda A.V. Krutchinsky A.N. Bromirski M. Ens W. Standing K.G. A tandem quadrupole/time-of-flight mass spectrometer with a matrix-assisted laser desorption/ionization source: Design and performance. Rapid Commun. Mass Spectrom. 2000; 14: 1047-1057). In this instrument, ions are produced by irradiation of the target with photon pulses from a 20-Hz nitrogen laser (VSL 337ND, Spectra-Physics, Mountain View, CA) with 300 μJ energy per pulse. Orthogonal injection of ions from the quadrupole into the TOF section normally produces a mass resolving power of 10,000 (full width at half-maximum) and an accuracy within a few millidaltons in the TOF spectra in both MS and MS/MS modes, as long as the ion peak is reasonably intense.
Fig. 2A shows the m/z spectrum of the mixture of peptides resulting from tryptic digestion of the ∼46-kDa protein in ordinary water, before HPLC fractionation. Note that Tx-y indicates a tryptic fragment containing amino acid residues x to y in Fig. 2 and in subsequent tables and discussion. A small region of this spectrum is expanded in Fig. 2B, and an HPLC fraction containing some of the same ions is shown in Fig. 2C. Here, the most intense ion in Fig. 2B has moved to a different fraction, but some of the weaker ions are much more prominent. It is clear that individual peptide peaks are considerably easier to distinguish after HPLC separation; spectra of the fractions are dramatically simpler and have a signal-to-noise ratio improved by a factor of ∼10 or more. Initial efforts to identify the protein (based on data base searching against the peptide fingerprint) failed to yield any significant matches, suggesting that it was a novel protein.
De novo peptide sequencing was therefore undertaken in order to characterize it. For this purpose, samples were digested in the presence of a 50/50 mixture of ordinary water and H₂¹⁸O, as described above, because the addition of either ¹⁸O or ¹⁶O during enzymatic cleavage yields spectra containing both species and thus distinguishes fragments containing the C terminus from those containing the N terminus by their distinctive isotopic patterns (14–16). In order to determine the amino acid sequence of the proteolytic fragments, each clearly observed peptide ion was selected in turn as a parent by the mass-selecting quadrupole of the QqTOF instrument and subjected to collisionally induced dissociation in the collision cell. For example, the resulting daughter ion spectrum from the m/z = 2297 parent is shown in Fig. 3, where the advantages of the ¹⁶O/¹⁸O addition for distinguishing the C- and N-terminal ions are clearly evident. The y ions, which contain the C terminus, all show the doublet structure superimposed on the usual isotopic pattern, whereas the b ions, containing the N terminus, have a normal pattern.
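The b- and y-ion series described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residue masses and the ¹⁸O/¹⁶O mass difference are standard monoisotopic values rather than numbers taken from the article, the function names `fragment_ions` and `labeled_doublets` are hypothetical, and the peptide is the 2297-Da tryptic fragment whose deduced sequence appears in Table II.

```python
# Monoisotopic residue masses (Da); standard values, not taken from the article.
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276
O18_SHIFT = 17.999160 - 15.994915  # 18O minus 16O, about 2.0042 Da


def fragment_ions(seq):
    """Return singly charged b- and y-ion m/z lists (b1..bn, y1..yn)."""
    b, y = [], []
    total = 0.0
    for aa in seq:              # b ions grow from the N terminus
        total += MONO[aa]
        b.append(total + PROTON)
    total = 0.0
    for aa in reversed(seq):    # y ions grow from the C terminus
        total += MONO[aa]
        y.append(total + WATER + PROTON)
    return b, y


def labeled_doublets(y_ions):
    """Pair each y ion with the 18O-labeled partner expected after digestion
    in a 1:1 H2(16)O:H2(18)O mixture (one oxygen incorporated at the new
    C terminus). b ions carry no label and are therefore left unpaired."""
    return [(mz, mz + O18_SHIFT) for mz in y_ions]


b, y = fragment_ions("WYFYYLGTGPEASLPYGANK")
print(f"b2 = {b[1]:.3f}  y1 = {y[0]:.3f}  y20 (MH+) = {y[-1]:.3f}")
for light, heavy in labeled_doublets(y)[:3]:
    print(f"y doublet: {light:.3f} / {heavy:.3f}")
```

The computed values should agree with the calculated columns of Table I to within rounding; a y ion is recognized in the spectrum by a partner peak about 2.004 Da higher, which a b ion lacks.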
A comparison between the measured m/z values and the masses calculated from the deduced sequence is given in Table I.

Table I. Calculated and measured masses for b and y ions from MS/MS measurements of the 2297.092-Da tryptic fragment
y-ion  m/z found  MH+ calculated  Δm (mDa)  Residue  Δm (mDa)  m/z found  MH+ calculated  b-ion
y1     –          147.113         –         K        2         2279.083   2279.081        b20
y2     261.138    261.156         −18       N        –         –          2150.986        b19
y3     332.193    332.193         0         A        –         –          2036.943        b18
y4     389.197    389.215         −18       G        –         –          1965.906        b17
y5     552.283    552.278         5         Y        0         1908.885   1908.885        b16
y6     649.328    649.331         −3        P        −13       1745.809   1745.822        b15
y7     762.413    762.415         −2        L        21        1648.790   1648.769        b14
y8     849.438    849.447         −9        S        8         1535.694   1535.685        b13
y9     920.478    920.484         −6        A        −15       1448.638   1448.653        b12
y10    1049.522   1049.526        −4        E        19        1377.637   1377.616        b11
y11    1146.571   1146.579        −8        P        –         –          1248.573        b10
y12    1203.592   1203.601        −9        G        −4        1151.516   1151.520        b9
y13    1304.636   1304.648        −8        T        −5        1094.494   1094.499        b8
y14    1361.654   1361.670        −16       G        1         993.452    993.451         b7
y15    1474.739   1474.754        −15       L        −21       936.409    936.430         b6
y16    1637.818   1637.817        1         Y        11        823.357    823.346         b5
y17    1800.884   1800.881        3         Y        9         660.291    660.282         b4
y18    1947.970   1947.949        21        F        1         497.220    497.219         b3
y19    2111.022   2111.012        10        Y        −5        350.145    350.150         b2
y20    2297.086   2297.092        −6        W        –         –          187.087         b1

Further examples are provided in the supplemental material. Fig. S1 shows the daughter ions from dissociation of the 1144-Da N-terminal peptide, indicating deletion of the N-terminal methionine and acetylation of the resulting N-terminal serine. Fig. S2 shows a comparison between HPLC-separated ions from tryptic and Lys-C digestions, respectively, showing alternate cleavages at adjacent lysines. Fig. S3 shows a spectrum of the parent ion containing the C terminus, the one C-terminal peptide that shows no doublet structure. A comparison of experimental m/z values and masses calculated for the deduced sequences of all the peptides observed in tryptic digests is given in Table II.
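As a rough check on how modified peptides such as the acetylated N-terminal fragment are matched to a calculated MH+ value, the following Python sketch adds modification mass shifts to the unmodified peptide mass. The residue and modification masses are standard monoisotopic values, not taken from the article; the function name `mh_plus` and the modification labels are hypothetical; and reading the table markers `M*` as oxidized methionine and `Q**` as N-terminal pyroglutamate is an assumption that the mass differences between paired rows of Table II appear to support.

```python
# Monoisotopic residue masses (Da); standard values, not taken from the article.
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276

# Mass shifts (Da) for the modifications assumed here (hypothetical labels).
MODS = {
    "acetyl": 42.010565,      # N-terminal acetylation (+C2H2O)
    "oxidation": 15.994915,   # methionine oxidation (+O)
    "pyro_glu": -17.026549,   # N-terminal Gln to pyroglutamate (-NH3)
}


def mh_plus(seq, mods=()):
    """Singly protonated monoisotopic mass of a peptide with optional mods."""
    mass = sum(MONO[aa] for aa in seq) + WATER + PROTON
    return mass + sum(MODS[m] for m in mods)


print(f"Ac-SDNGPQSNQR:      {mh_plus('SDNGPQSNQR', ['acetyl']):.3f}")
print(f"MASGGGETALALLLLDR:  {mh_plus('MASGGGETALALLLLDR'):.3f}")
print(f"  with oxidized Met: {mh_plus('MASGGGETALALLLLDR', ['oxidation']):.3f}")
print(f"pyroGlu-YNVTQAFGR:  {mh_plus('QYNVTQAFGR', ['pyro_glu']):.3f}")
```

Comparing these numbers with the "MH+ calculated" column of Table II is one way to see which modification a given row implies.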
In both Tables I and II, most observed m/z values and the masses calculated for the deduced amino acid sequences agree within ∼10 mDa, lending credibility to the assignments; the anomalously large deviations observed for a few ions in Table I correspond to peaks of very low intensity.

Table II. Measured m/z and calculated MH+ values of peptides found in tryptic digest of the 46-kDa protein
m/z measured  MH+ calculated (Da)  Δm (mDa)  Residues start–end  Peptide sequence
601.303       601.305              −3        204–209             GNSPAR
601.322       601.331              1         103–107             ELSPR
698.357       698.358              −1        144–149             DHIGTR
708.330       708.335              −5        96–102              GGDGKM*K
711.333       711.331              2         294–299             QGTDYK
746.387       746.383              4         356–361             HIDAYK
749.354       749.354              0         178–185             GGSQASSR
805.378       805.380              −2        196–203             NSTPGSSR
831.462       831.457              5         227–233             LNQLESK
876.452       876.461              −9        101–107             M*KELSPR
886.401       886.406              −5        170–177             GFYAEGSR
916.478       916.478              0         362–369             TFPPTEPK
928.543       928.546              −3        348–355             DNVILLNK
946.513       946.511              2         62–68               EELRFPR
1105.548      1105.553             −5        339–347             LDDKDPQFK
1144.493      1144.499             −6        1–10                Ac-SDNGPQSNQR
1154.579      1154.580             −1        376–385             TDEAQPLPQR
1166.557      1166.559             −2        267–276             Q**YNVTQAFGR
1183.589      1183.586             3         267–276             QYNVTQAFGR
1202.610      1202.613             −3        238–248             GQQQQGQTVTK
1282.678      1282.675             3         375–385             KTDEAQPLPQR
1330.698      1330.708             −10       238–249             GQQQQGQTVTKK
1410.774      1410.771             3         376–387             KKTDEAQPLPQR
1611.698      1611.692             6         406–421             QLQNSM*SGASADSTQA
1684.895      1684.891             4         128–143             EGIVWVATEGALNTPK
1687.898      1687.905             −7        210–226             MASGGGETALALLLLDR
1703.897      1703.900             −3        210–226             M*ASGGGETALALLLLDR
1774.838      1774.836             2         278–293             GPEQTQGNFGDQDLIR
1850.833      1850.827             6         15–32               ITFGGPTDSTDNNQNGGR
1851.814      1850.811             3         15–32               ITFGGPTDSTDNNQN***GGR
1875.879      1875.879             0         389–405             Q**PTVTLLPAADM*DDFSR
1892.905      1892.906             −1        389–405             QPTVTLLPAADM*DDFSR
1930.944      1930.937             7         277–293             RGPEQTQGNFGDQDLIR
2005.008      2005.006             2         388–405             KQPTVTLLPAADMDDFSR
2015.081      2015.081             0         339–355             LDDKDPQFKDNVILLNK
2021.005      2021.001             4         388–405             KQPTVTLLPAADM*DDFSR
2077.048      2077.043             5         320–338             IGM*EVTPSGTWLTYHGAIK
2091.126      2091.120             −2        150–169             NPNNNAATVLQLPQGTTLPK
2151.002      2151.010             6         69–88               GQGVPINTNSGPDDQIGYYR
2252.062      2252.071             −9        300–319             HWPQIAQFAPSASAFFGMSR
2297.078      2297.092             −14       108–127             WYFYYLGTGPEASLPYGANK
2307.128      2307.111             17        69–89               GQGVPINTNSGPDDQIGYYRR
2324.187      2324.190             −3        41–61               RPQGLPNNTASWFTALTQHGK
2516.325      2516.339             −14       210–233             M*ASGGGETALALLLLDRLNQLESK

The MS and MS/MS measurements just described were applied first to the peptides resulting from tryptic digests of the gel band, listed in Table I, and then to the products of a Lys-C digest. BLAST searching (20. Altschul S.F. Madden T.L. Schäffer A.A. Zhang J. Zhang Z. Miller W. Lipman D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997; 25: 3389-3402; 21. Shevchenko A. Sunyaev S. Loboda A. Shevchenko A. Bork P. Ens W. Standing K.G. Charting the proteomes of organisms with unsequenced genomes by MALDI-quadrupole time-of-flight mass spectrometry and BLAST homology searching. Anal. Chem. 2001; 73: 1917-1926) of the total GenBank™ protein data base with these peptides was then undertaken in order to search for homology. The most definitive example was provided by the 2297-Da tryptic peptide. In that case, the highest rated results of the BLAST search are shown in Fig. 4; all are coronavirus nucleocapsid proteins, and all yield BLAST scores of 40 to 41, with E values of 0.003. Moreover, the highest rated hit in the BLAST search that is not a coronavirus protein (a bacterial protein, in this case) had a score of only 29 and a high E value of 9.4. Thus, the ∼46-kDa protein is clearly a coronavirus nucleocapsid protein; indeed, there is complete agreement between the first 10 residues and those found by BLAST in a region of the coronaviruses that is highly conserved. On the other hand, only three out of the next nine residues agree with any of the other viruses, so the SARS virus is significantly different from any of the other coronaviruses.
BLAST searches with the other peptides led to similar conclusions; in particular, they strengthened the evidence for significant differences between the SARS coronavirus and any other coronavirus in the data base. By April 12, these measurements had been carried out and most of them analyzed, yielding almost complete sequence information on the individual peptides, as summarized in Table II. The task of fitting together the peptides was not yet done, however, because there were still a number of ambiguities in their order. To sort out this problem, an Asp-N digestion had also been carried out (but not yet separated on the HPLC), and Glu-C and perhaps Arg-C digestions were planned as soon as sufficient material was available. However, these measurements turned out to be unnecessary, because at that stage a nucleotide sequence of infectious material (also prepared by NML) was obtained by a group at the Michael Smith Genome Centre in Vancouver (4) (GenBank™ accession number AY274119), soon followed by similar results from several other laboratories (see, for example, Ref. 5). It soon became clear to us that the open reading frame identified by the Vancouver group as specifying the coronavirus nucleocapsid protein did in fact predict the amino acid sequence of the ∼46-kDa protein that we were analyzing, as might be expected from the BLAST homology reported above. Consequently, we were able to remove the remaining ambiguities in ordering the proteolytic fragments listed in Table II. A comparison of our results with the predicted sequence is shown in Fig. S4A; the mass spectral data cover more than 96% of the predicted sequence and include both C and N termini. The mass spectra also indicate removal of the N-terminal methionine and oxidation of all other methionines, as well as acetylation of the resulting N-terminal serine, as shown in Fig. S1. The N-terminal deletion and acetylation presumably occur as a result of post-translational modifications (22. Krishna R. Wold F. Posttranslational modifications. In: Angeletti R.H. Proteins: Analysis and Design. Elsevier, Academic Press, San Diego, CA, 1998: 121-206), which of course could not be predicted from the nucleotide data. Otherwise, our results confirm the predicted sequence (GenBank™ accession number AY274119), a result consistent with the samples being derived from the same infectious source at NML.
In addition to the almost completely defined ∼46-kDa protein, we have partially characterized a protein that appeared as a very weak band at an apparent mass of ∼180 kDa in the gel separation (Fig. 1B).
Despite the low intensity, 39 peptides in the initial tryptic digest were found to belong to the ∼139-kDa "spike protein" predicted by the nucleotide sequence (GenBank™ accession number AY274119), and 36 of these were sufficiently intense for MS/MS measurements, which confirmed the identification (30% coverage). A summary of the data and the coverage for this protein is given in Table S1. This protein is homologous to spike proteins in other coronaviruses, which contain a large number of potential glycosylation sites (NXT or NXS). Thus, they are usually assumed to be extensively glycosylated and to act as attachment proteins. Indeed, the predicted sequence of the spike protein of the SARS coronavirus contains 23 of these potential N-glycosylation sites, of which 17 are identified as likely sites by the NetNGlyc 1.0 server (available at www.cbs.dtu.dk/services/NetNGlyc). (O-glycosylation may also be possible, but has not been examined here.) To investigate glycosylation in the spike protein, a tryptic digest was treated with PNGase F to remove the glycans, as described above. This step converts asparagine residues to

Reference(s)