Protein glycosylation is ubiquitous in biological systems, essential for cell survival, and one of the most common protein modifications in cells. It often determines protein folding, trafficking, and stability, and it regulates many cellular events, especially cell-cell communication, cell-matrix interactions, and cellular responses to environmental cues. Glycoproteins carry a wealth of information about cellular developmental and disease states, and aberrant protein glycosylation is directly related to human diseases, including cancer and infectious diseases. Global analysis of protein glycosylation is therefore critical for understanding glycoprotein functions and for identifying glycoproteins as biomarkers and drug targets. However, because many glycoproteins are of low abundance and glycans are heterogeneous, it is extraordinarily challenging to comprehensively analyze glycoproteins in complex biological samples.
Currently, mass spectrometry (MS)-based proteomics provides a unique opportunity to globally analyze protein modifications, including glycosylation. However, effective enrichment prior to MS analysis is imperative for each type of protein modification. For example, as phosphoprotein enrichment methods have matured, the global analysis of protein phosphorylation has advanced tremendously, from the identification of several hundred phosphorylation sites a decade ago to over ten thousand sites in recent studies.
To comprehensively analyze protein glycosylation in complex biological samples, several glycoprotein/glycopeptide enrichment methods have been reported, including lectin-based methods, hydrazide chemistry-based methods, and hydrophilic interaction liquid chromatography (HILIC). Currently, lectin-based methods are most commonly used to enrich glycopeptides prior to MS analysis. Because of the inherent specificity of lectins, each type of lectin recognizes only a specific glycan structure, and thus neither a single lectin nor a combination of several lectins can universally enrich all glycosylated peptides or proteins. HILIC has also been used extensively to enrich glycopeptides based on the increased hydrophilicity conferred by their glycans. However, this method lacks specificity because it cannot distinguish glycopeptides from the many hydrophilic non-glycopeptides. Two additional methods have been reported: isotope-targeted glycoproteomics (IsoTaG) and solid phase extraction of N-linked glycans and glycosite-containing peptides (NGAG). Using IsoTaG, several N-glycopeptides and intact, fully elaborated O-glycopeptides from several proteins were identified across three human cell lines. NGAG was designed for N-glycopeptide enrichment, and several unique N-glycopeptides were identified with it in mammalian cells. Predictions and computational results indicate that glycosylation is the most common protein modification. Despite the considerable progress made in the past decade, there is still a substantial gap between the number of glycoproteins reported in the literature and the number present in complex biological samples. Effective enrichment of glycopeptides/glycoproteins will profoundly advance the global analysis of protein glycosylation through MS-based proteomics.
Boronic acid (BA) has previously been shown to have potential for enriching glycopeptides for the global analysis of protein glycosylation because it forms reversible covalent interactions with glycans. However, these interactions are relatively weak, so low-abundance glycoproteins are not effectively enriched. Accordingly, methods and compositions for enriching glycoproteins are needed that overcome the problems posed by the low abundance of many glycoproteins and the heterogeneity of glycans.