NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Nishihara S, Angata K, Aoki-Kinoshita KF, et al., editors. Glycoscience Protocols (GlycoPODv2) [Internet]. Saitama (JP): Japan Consortium for Glycobiology and Glycotechnology; 2021-.

Identification of intact glycopeptides by liquid chromatography/tandem mass spectrometry followed by database search

, Dr.
iGCORE, Nagoya University
Corresponding author.

Last Revision: March 28, 2022.

Introduction

Mass spectrometric analysis of intact glycopeptides is difficult because a glycopeptide combines two chemically different oligomers: a peptide and one or more oligosaccharides. In common MS-based peptide identification, the precursor ion is partially fragmented, preferably at a single peptide bond per molecule, by, e.g., collision-induced dissociation (CID), and the fragment mass spectrum (MS/MS or MS2 spectrum) is acquired. However, the glycosidic bonds between the peptide and the glycan and between monosaccharides are weaker than peptide bonds. CID therefore cleaves the glycosidic bonds preferentially and leaves the peptide bonds largely intact, so the peptide portion cannot be identified. The first CID of a selected N-glycopeptide ion often generates the peptide ion (called Y0) and the peptide retaining only the innermost GlcNAc of the chitobiose core (Y1). Selecting one of these ions for a second round of CID generates peptide fragment ions, and the peptide portion may then be assigned from the resulting fragment spectrum (MS/MS/MS or MS3) (Figure 1).

Higher-energy collisional dissociation (HCD), which is now widely available, cleaves peptide and glycosidic bonds simultaneously. Using the HCD MS2 spectrum, intact glycopeptides can be identified with improved sensitivity (Figure 1). Several software tools for identifying glycopeptides from HCD spectra are available and under active development (1). O-Glycopeptides carrying relatively short glycans, such as Tn-antigen (O-GalNAc), T-antigen (core 1; Gal-GalNAc), sialyl T, and disialyl T, can be identified using HCD MS2, and the glycosylation site can also be assigned using electron-transfer dissociation (ETD) MS2, as described elsewhere in this protocol series, GlycoPOD. The Orbitrap Fusion tribrid mass spectrometer offers more powerful MS2 acquisition modes, such as HCD fragment-triggered HCD/ETD/EThcD/ETciD. This section introduces a method for identifying N-glycopeptides from HCD MS2 spectra.
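As a minimal sketch of the Y0/Y1 arithmetic described above (not part of the original protocol), the following Python computes the expected m/z of the Y0 ion (peptide alone) and the Y1 ion (peptide plus one HexNAc) from a peptide's neutral monoisotopic mass. The function name `y_ion_mz` and the example peptide mass are hypothetical; the HexNAc residue mass (203.07937) and the proton mass (1.007276) are standard monoisotopic values.

```python
# Monoisotopic masses in Da (standard values).
HEXNAC_RESIDUE = 203.07937  # C8H13NO5, one GlcNAc/GalNAc residue
PROTON = 1.007276


def y_ion_mz(peptide_mass: float, n_hexnac: int = 0, charge: int = 1) -> float:
    """m/z of a Y-type ion: the intact peptide carrying n_hexnac HexNAc residues.

    n_hexnac=0 gives Y0 (peptide only); n_hexnac=1 gives Y1 (peptide + GlcNAc).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral = peptide_mass + n_hexnac * HEXNAC_RESIDUE
    return (neutral + charge * PROTON) / charge


# Hypothetical peptide with a neutral monoisotopic mass of 1500.0000 Da.
example_mass = 1500.0
y0 = y_ion_mz(example_mass, n_hexnac=0, charge=1)
y1 = y_ion_mz(example_mass, n_hexnac=1, charge=1)
print(f"Y0 (1+): {y0:.4f}")
print(f"Y1 (1+): {y1:.4f}")
print(f"Y1 (2+): {y_ion_mz(example_mass, 1, 2):.4f}")
```

For singly charged ions, Y1 lies exactly one HexNAc residue mass above Y0, which is the spacing to look for when checking that a candidate Y0/Y1 pair belongs to the same peptide.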

Protocol

This chapter describes a protocol for identifying intact glycopeptides in a complex glycopeptide mixture using two database search engines, Mascot and Byonic, based on MS/MS spectra acquired by liquid chromatography–mass spectrometry (LC-MS). As described in the Introduction, a number of MS/MS-based search engines have been developed (1); Byonic performed well in that comparative study.

Materials

1.

Glycopeptide sample

Instruments

1.

LC-MS system: Nanoflow LC: Ultimate 3000 (Thermo Fisher Scientific, Waltham, MA, US)

2.

LC-MS system: Electrospray ionization (ESI)–tandem mass spectrometer equipped with HCD: Orbitrap Fusion tribrid mass spectrometer (Thermo Fisher Scientific)

3.

Database search engines: Mascot server (ver. 2.6.2, Matrix Science, Boston, MA, US) and Byonic (Protein Metrics, Cupertino, CA, US)

Methods

1.

Preparation of a peptide or glycopeptide sample

a.

Prepare a peptide or a glycopeptide sample as described in other sections in this series.

2.

LC-MS analysis of (glyco)peptide sample

a.

Analyze the (glyco)peptide sample using an LC–ESI–tandem mass spectrometer equipped with HCD.

b.

Representative LC-MS conditions: Mode: positive; MS1 detector: Orbitrap; MS1 resolution: 120,000 at m/z 200; MS1 mass range: m/z 400–2,000; Activation: HCD; Normalized collision energy: 30%; Data acquisition: data-dependent or data-dependent HCD fragment-triggered HCD acquisition; Trigger: m/z 204.0872 ([HexNAc+H]+); MS2 detector: Orbitrap; MS2 resolution: 15,000; and MS2 first mass: m/z 135.

3.

Database search by Mascot server

a.

Convert the MS raw data to Mascot generic format (MGF) using Mascot Distiller, run through Mascot Daemon, and then deconvolute the MS2 spectra.

b.

Search the HCD MS2 spectra (MGF) using the Mascot server and any appropriate protein sequence database. Assume only one or two glycan compositions as variable modifications in a single search, e.g., M5 (HexNAc2Hex5), hybrid (HexNAc3Hex6), or biantennary (HexNAc4Hex5). Customize each glycosylation modification by defining neutral losses and ignore masses, because CID may cause partial neutral losses of mono- or oligosaccharides and generate glycan fragments, which carry no information for peptide assignment. For example, in the case of HexNAc4Hex5, set neutral losses of Hex, Hex2, Hex2HexNAc1, Hex2HexNAc2, …, Hex5HexNAc3, and Hex5HexNAc4, and set ignore masses such as [HexNAc+H]+ (m/z 204.0872), [Hex1HexNAc1+H]+ (m/z 366.1400), etc. (Note 1)

c.

Representative search conditions: Method: MS/MS ion search; Database: SwissProt_UniProtKB_isoform; Enzyme: Trypsin; Fixed modifications: Carbamidomethyl (Cys); Variable modifications: Ammonia-loss (N-term Cys), Gln->pyroGlu (N-term Gln), Oxidation (Met), and glycosylation, e.g., Hex5HexNAc2 (customized with neutral losses and ignore masses); Peptide mass tolerance: 7 ppm; Fragment mass tolerance: 0.02 Da; Maximum missed cleavages: 2; and Instrument type: Electrospray ionization–Fourier transform ion cyclotron resonance.

d.

Representative annotated MS2 spectra are shown in Figures 2 and 3.

4.

Database search using Byonic

a.

Perform the Byonic search via Proteome Discoverer using the MS raw data.

b.

Set the Proteome Discoverer workflow to use Spectrum Selector.

c.

Representative search conditions: Database: SwissProt_UniProtKB_isoform; Enzyme: Trypsin_KR (full); Maximum missed cleavages: 2; Precursor mass tolerance: 7 ppm; Fragmentation type: HCD; Fragment mass tolerance: 0.02 Da; Modifications: Static: Carbamidomethyl (Cys) and Dynamic: Ammonia-loss (N-term Cys), Gln->pyroGlu (N-term Gln), and Oxidation (Met); and Glycan database: e.g., N-glycan 132 human. For Byonic, each dynamic modification can be set as either “common” or “rare,” and the two categories have separate limits on the number of occurrences per peptide. Set an appropriate category and limit for each modification according to the rules of the software. Searching with numerous O-glycan modifications takes time.

d.

A representative annotated MS2 spectrum is shown in Figure 4 (Note 2).
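The HexNAc oxonium ion used as the acquisition trigger in step 2b can also serve as a quick post hoc check on the MGF file from step 3a. The sketch below, which is an illustration and not a feature of Mascot or Byonic, lists the spectra in an MGF text that contain a peak within the fragment tolerance of m/z 204.0872. The function name, the default tolerance (taken from the 0.02 Da fragment tolerance above), and the two-spectrum example are assumptions; only the basic `BEGIN IONS`/`END IONS` block layout with `TITLE=` lines and whitespace-separated peak lists is handled.

```python
HEXNAC_OXONIUM_MZ = 204.0872  # trigger value quoted in step 2b


def spectra_with_oxonium(mgf_text, target_mz=HEXNAC_OXONIUM_MZ, tol_da=0.02):
    """Return TITLEs of MGF spectra that have a peak within tol_da of target_mz."""
    hits = []
    title = None
    found = False
    in_ions = False
    for raw in mgf_text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            in_ions, title, found = True, None, False
        elif line == "END IONS":
            if in_ions and found:
                hits.append(title)
            in_ions = False
        elif in_ions and line.startswith("TITLE="):
            title = line[len("TITLE="):]
        elif in_ions and line and line[0].isdigit():
            # Peak line: "m/z intensity [charge]"; only m/z is needed here.
            mz = float(line.split()[0])
            if abs(mz - target_mz) <= tol_da:
                found = True
    return hits


# Hypothetical two-spectrum MGF: only the first contains the HexNAc oxonium ion.
example_mgf = """\
BEGIN IONS
TITLE=scan_1
PEPMASS=1015.4321
CHARGE=3+
138.0550 1200.0
204.0869 8500.0
366.1398 4300.0
END IONS
BEGIN IONS
TITLE=scan_2
PEPMASS=650.3300
CHARGE=2+
175.1190 900.0
204.1340 300.0
END IONS
"""

print(spectra_with_oxonium(example_mgf))
```

A spectrum that lacks this ion is unlikely to come from a glycopeptide, so a filter of this kind may help when manually reviewing search results.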

Notes

1.

In a Mascot search, setting many glycan compositions as variable modifications is not advisable; searching with one or two selected compositions is recommended. Furthermore, customizing the modifications, including neutral losses and ignore masses, is important for increasing the chance of identification. Once a reliable identification is obtained, glycopeptide ions sharing the same peptide but carrying different glycan compositions may be found at retention times near that of the identified glycopeptide.

2.

Byonic is a powerful search engine because it can consider numerous glycan compositions in a single search. However, it has been pointed out that setting criteria for the confidence of its identifications is difficult (2). It may therefore be necessary to confirm the presence of glycan diagnostic ions, the agreement of the Y0/Y1 ions with the identified peptide, and a partial peptide sequence by manual inspection.
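To make the neutral-loss and ignore-mass settings in Note 1 concrete, the sketch below computes monoisotopic masses for the glycan compositions named explicitly in step 3b. It is an illustration only, not Mascot's own modification format: the helper names are hypothetical, the residue masses are standard monoisotopic values, and the compositions elided by "…" in step 3b are deliberately not filled in. The ignore masses quoted in the protocol (204.0872 and 366.1400) are reproduced by adding the mass of a hydrogen atom (1.007825) to the residue sum; adding a bare proton (1.007276) gives values about 0.0005 lower, which is well within the 0.02 Da fragment tolerance.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {"Hex": 162.05282, "HexNAc": 203.07937}
HYDROGEN = 1.007825  # hydrogen atom; reproduces the values quoted in step 3b


def glycan_mass(composition):
    """Sum of residue masses for a composition such as {"Hex": 5, "HexNAc": 4}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


# Neutral losses listed explicitly in step 3b for HexNAc4Hex5.
neutral_losses = [
    {"Hex": 1},
    {"Hex": 2},
    {"Hex": 2, "HexNAc": 1},
    {"Hex": 2, "HexNAc": 2},
    {"Hex": 5, "HexNAc": 3},
    {"Hex": 5, "HexNAc": 4},
]
for comp in neutral_losses:
    label = "".join(f"{name}{count}" for name, count in comp.items())
    print(f"neutral loss {label}: {glycan_mass(comp):.4f}")

# Ignore masses: singly charged glycan fragment ions named in step 3b.
ignore_masses = [
    round(glycan_mass({"HexNAc": 1}) + HYDROGEN, 4),
    round(glycan_mass({"Hex": 1, "HexNAc": 1}) + HYDROGEN, 4),
]
print("ignore masses:", ignore_masses)
```

The largest neutral loss, Hex5HexNAc4, corresponds to loss of the whole glycan and therefore to the mass difference between the precursor and the Y0 ion.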

References

1.
Kawahara R, Chernykh A, Alagesan K, Bern M, Cao W, Chalkley RJ, Cheng K, Choo MS, Edwards N, Goldman R, Hoffmann M, Hu Y, Huang Y, Kim JY, Kletter D, Liquet B, Liu M, Mechref Y, Meng B, Neelamegham S, Nguyen-Khuong T, Nilsson J, Pap A, Park GW, Parker BL, Pegg CL, Penninger JM, Phung TK, Pioch M, Rapp E, Sakalli E, Sanda M, Schulz BL, Scott NE, Sofronov G, Stadlmann J, Vakhrushev SY, Woo CM, Wu HY, Yang P, Ying W, Zhang H, Zhang Y, Zhao J, Zaia J, Haslam SM, Palmisano G, Yoo JS, Larson G, Khoo KH, Medzihradszky KF, Kolarich D, Packer NH, Thaysen-Andersen M. Community evaluation of glycoproteomics informatics solutions reveals high-performance search strategies for serum glycopeptide analysis. Nat Methods. 2021 Nov;18(11):1304–1316. [PMC free article: PMC8566223] [PubMed: 34725484] [CrossRef]
2.
Go EP, Zhang S, Ding H, Kappes JC, Sodroski J, Desaire H. The opportunity cost of automated glycopeptide analysis: case study profiling the SARS-CoV-2 S glycoprotein. Anal Bioanal Chem. 2021 Dec;413(29):7215–7227. [PMC free article: PMC8390178] [PubMed: 34448030] [CrossRef]

Footnotes

The authors declare no competing or financial interests.

Figures

Figure 1:

Acquisition of fragment mass spectra (MS/MS spectra) for the identification of intact N-glycopeptides by database search. The peptide portion of a glycopeptide is difficult to fragment using collision-induced dissociation (CID) because the glycosidic bonds of the glycan are weaker than peptide bonds and break preferentially, as shown in the upper middle spectrum. The peptide bonds are therefore not cleaved, and the peptide sequence cannot be determined. This CID often generates ions of the peptide and of the peptide retaining a single GlcNAc (called Y0 and Y1 ions, respectively). If a Y0 or Y1 ion is selected and fragmented again to acquire a fragment ion spectrum (MS/MS/MS), the peptide sequence becomes assignable. Higher-energy collisional dissociation (HCD) is now available and fragments both peptide and glycosidic bonds simultaneously. Using these MS2 spectra, the glycan composition and peptide sequence of an intact glycopeptide can be identified by database search.

Figure 2:

An annotated higher-energy collisional dissociation (HCD) MS2 spectrum of a glycopeptide carrying M5 (an oligomannose-type glycan containing five mannoses) identified by Mascot search. A series of y-ions was assigned, and glycan fragment ions, called diagnostic ions, were also detected. Setting ignore masses is important to prevent uninformative glycan-derived ions from being selected for peptide-derived ion scoring.

Figure 3:

An annotated higher-energy collisional dissociation (HCD) MS2 spectrum of a glycopeptide carrying a core-fucosylated biantennary glycan identified by Mascot search. The upper glycan structure is the one most often observed for this composition. From this MS2 spectrum, the glycan composition of this glycopeptide is presumed to be Hex(5)HexNAc(4)Fuc(1). The presence of core fucose may be confirmed by a CID MS2 spectrum showing the Y1+Fuc ion.

Figure 4:

An annotated higher-energy collisional dissociation (HCD) MS2 spectrum of a glycopeptide carrying an M5 glycan identified by Byonic search. As in this case, when the glycan is attached near a terminus of the peptide and many peptide fragment ions can be assigned consecutively, the likelihood of glycopeptide identification increases.

Copyright Notice

Licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 Unported license. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Bookshelf ID: NBK593989; PMID: 37590718