1. Field of the Invention
The present invention relates to a novel bioinformatics platform for the identification and quantification of N-linked glycopeptides based on mass spectrometry.
2. Description of the Related Art
Human blood is a mixture of various proteins, of which at least 50% are glycoproteins. However, it is often not possible to characterize all glycopeptides in a sample, because glycoprotein samples are highly complex and diverse and because glycopeptides give relatively lower mass spectrometry intensities than non-glycosylated peptides. Since the introduction of high resolution mass spectrometers, the analysis of glycans and glycoproteins has advanced rapidly. Nevertheless, the bioinformatics techniques necessary for the identification and quantification of glycoproteins have not kept pace. It is known that many therapeutic proteins are glycosylated in a variety of forms and in very complicated ways. There are two main approaches to identifying and quantifying the glycosylation of such glycoproteins. One is the identification and quantification of glycans released by enzymatic or chemical cleavage; the other is the identification and quantification of glycopeptides. However, the analysis of released glycans has the drawback that glycosylation site-specific information is lost. If glycopeptides can be identified and quantified as they are, not only can tumor identification and progression be checked, but extensive information can also be obtained on the glycosylation site, the size of the N-linked sugar chain, and the number of growing side chains, in addition to the glycoprotein itself.
In general, methods for analyzing the glycosylation of glycoproteins include a step of concentrating the glycoproteins at the protein level. In this invention, however, a standard glycoprotein sample was used. Specifically, peptides and glycopeptides were obtained by hydrolyzing the standard glycoprotein sample with trypsin, and were then analyzed with a high resolution mass spectrometer. The resulting tandem mass spectra (MS/MS) and mass spectra (MS) were compared with a glycopeptide database to identify and quantify the glycopeptides.
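The tryptic hydrolysis step can be illustrated in silico. The following is a minimal sketch, not the procedure of the invention: it assumes the common rule that trypsin cleaves after K or R unless the next residue is P, and the consensus N-glycosylation sequon N-X-S/T (X not P). The function names and the example sequence are hypothetical.

```python
import re

# N-glycosylation sequon N-X-[S/T], X != P; lookahead finds overlapping sites.
SEQUON = re.compile(r"(?=N[^P][ST])")


def tryptic_peptides(protein):
    """Yield (start_index, peptide) for a full tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    Missed cleavages are not modeled in this sketch.
    """
    start = 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            yield start, protein[start:i + 1]
            start = i + 1


def candidate_glycopeptides(protein):
    """Return [(peptide, [1-based protein positions of sequon N])]."""
    sites = [m.start() for m in SEQUON.finditer(protein)]
    result = []
    for start, peptide in tryptic_peptides(protein):
        hits = [s + 1 for s in sites if start <= s < start + len(peptide)]
        if hits:
            result.append((peptide, hits))
    return result


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to exercise the rules above.
    print(candidate_glycopeptides("MKNGSAKPLRNPTKANVTR"))
```

For the hypothetical sequence above, the digest gives MK, NGSAKPLR, NPTK and ANVTR, of which NGSAKPLR (N at position 3) and ANVTR (N at position 16) are reported; NPTK is excluded because the residue after N is P.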
Recent software tools for screening glycopeptides using MS/MS or MS include Peptoonist (David Goldberg et al.; Automated N-Glycopeptide Identification Using a Combination of Single- and Tandem-MS. Journal of Proteome Research 2007, 6, 3995-4005), SimGlycan (glycotools.qa-bio.com/SimGlycan), GlycoMiner (Oliver Ozohanics et al.; GlycoMiner: a new software tool to elucidate glycopeptide composition. Rapid Communications in Mass Spectrometry 2008, 22, 3245-3254), GlycoSpectrumScan (Nandan Deshpande et al.; GlycoSpectrumScan: Fishing Glycopeptides from MS Spectra of Protease Digests of Human Colostrum sIgA. Journal of Proteome Research 2010, 9, 1063-1075), and GlycoPep Grader (Carrie L. Woodin et al.; GlycoPep Grader: A Web-Based Utility for Assigning the Composition of N-Linked Glycopeptides. Analytical Chemistry, 2012).
Because the above software programs were designed to identify glycopeptides using tandem mass spectra alone, the accuracy of their glycopeptide profiling is in doubt, and their quantitative analysis results are inconsistent. A further problem is that they do not support a variety of high resolution mass spectrometers.
To overcome the above problems, the present inventors studied and completed a novel method that uses mass spectra to identify and quantify more efficiently glycopeptides, which are of comparatively low abundance relative to ordinary peptides. First, the glycopeptides of 281 glycoproteins present at high concentration in human serum (Terry Farrah et al.; A High-Confidence Human Plasma Proteome Reference Set with Estimated Concentrations in PeptideAtlas. Molecular & Cellular Proteomics, 2011) were stored, and the theoretical isotopic distributions of these glycopeptides were modeled. The results were compiled into a database. In this invention, the isotopic distribution of a glycopeptide is obtained using both MS/MS and MS and is compared with the glycopeptide database to identify the glycopeptide accurately, and the glycopeptide is quantified by calculating the area in its ion chromatograms.
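The modeling of a theoretical isotopic distribution and the area-based quantification described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the algorithm, so the sketch uses a coarse nominal-mass convolution of natural isotope abundances, a cosine score for comparing observed and theoretical patterns, and the trapezoidal rule for the chromatogram area. All function names and the example elemental composition are hypothetical.

```python
import math

# Natural isotope abundances indexed by nominal mass shift (+0, +1, +2, ...).
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}


def convolve(a, b, max_peaks):
    """Multiply two abundance polynomials, keeping the first max_peaks terms."""
    out = [0.0] * min(len(a) + len(b) - 1, max_peaks)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out


def isotopic_distribution(formula, max_peaks=6):
    """Relative intensities of the M, M+1, ... peaks for {element: count}."""
    dist = [1.0]
    for element, count in formula.items():
        base, power, n = ISOTOPES[element], [1.0], count
        while n:  # exponentiation by squaring: base ** count
            if n & 1:
                power = convolve(power, base, max_peaks)
            base = convolve(base, base, max_peaks)
            n >>= 1
        dist = convolve(dist, power, max_peaks)
    total = sum(dist)
    return [p / total for p in dist]  # renormalize after truncation


def cosine_similarity(observed, theoretical):
    """Score in [0, 1] for non-negative intensity patterns."""
    dot = sum(o * t for o, t in zip(observed, theoretical))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(
        sum(t * t for t in theoretical))
    return dot / norm if norm else 0.0


def xic_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram."""
    pts = list(zip(times, intensities))
    return sum((t1 - t0) * (y0 + y1) / 2.0
               for (t0, y0), (t1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    # Hypothetical elemental composition, for illustration only.
    theo = isotopic_distribution({"C": 60, "H": 100, "N": 10, "O": 30, "S": 1})
    print([round(p, 4) for p in theo])
    print(xic_area([0.0, 1.0, 2.0], [0.0, 10.0, 0.0]))
```

Truncating each convolution to max_peaks terms does not change the retained low-order peaks, because higher-order terms never contribute to lower ones; only the final renormalization is affected. A production implementation would work with exact isotope masses rather than nominal mass bins.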