The recognition that carbohydrates play a key role in biological processes of living organisms has made their study of great importance for medicine and basic science. The understanding of carbohydrates has lagged behind that of other types of biological molecules because of the immense complexity and variety of these molecules and the lack of availability of analytic and synthetic tools that enable scientists to differentiate one form from another.
Forms of Carbohydrates in Nature
In nature, carbohydrates exist as polymers known as polysaccharides, which consist of a series of monosaccharides covalently attached by glycosidic bonds to form both branched and linear macromolecules. In addition, polysaccharides or, more commonly, oligosaccharides may be coupled to macromolecules such as proteins or lipids to form glycoproteins or glycolipids. Unlike naturally occurring polysaccharides, the oligosaccharides associated with protein or lipid consist of a relatively small subset of monosaccharide types.
Oligosaccharides associated with glycoproteins have been the focus of much of the carbohydrate research to date largely because the biological properties of these molecules are diverse and their relatively short monosaccharide sequences make the oligosaccharides amenable to study.
Structural Features of Glycoproteins
Glycoproteins are classified into two groups according to the linkage of the oligosaccharide to the protein. The O-glycosyl linked oligosaccharides, including the mucin type, the proteoglycan type, the collagen type and the extensin type, are bonded to the hydroxyl oxygen of L-serine or L-threonine. The N-glycosyl linked oligosaccharides are bound to the amido nitrogen of asparagine in a tripeptide generally of the form Asn-Xaa-Ser/Thr (where Xaa represents any amino acid). The N-linked oligosaccharides are further differentiated into three subgroups: the high mannose type, the complex type and the hybrid type. N-linked oligosaccharides are frequently branched, with branching commonly occurring either at a mannose residue or at an N-acetylglucosamine residue. These branched structures are called biantennary if there are two branches and triantennary if there are three branches.
The oligosaccharide can be characterized by its sequence of monosaccharides. The oligosaccharide is attached at its reducing end to the amino acid sequence of the protein, while the non-reducing end is found at the terminal monosaccharide at the other end of the oligosaccharide. Other important characteristics of oligosaccharides are the glycosidic bonds that connect individual monosaccharides. The glycosidic bonds obtain their numerical assignment according to the carbons in the monosaccharide ring where linkage occurs. The carbons are numbered in a clockwise direction from 1 to 6. Any of these carbons can be involved in the glycosidic bond, although commonly the carbon-1 on the monosaccharide closer to the non-reducing end forms a glycosidic bond with any other carbon on the monosaccharide toward the reducing end of the oligosaccharide. Because the anomeric carbon of a monosaccharide is asymmetric, the glycosidic bond occurs in two anomeric configurations, the alpha and the beta anomer. The type of anomer is determined by the position of the reactive hydroxyl group on that carbon. FIG. 1 illustrates the possible linkage configurations that may exist between two monosaccharides.
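By way of illustration only, the linkage nomenclature described above can be enumerated in a short Python sketch. The sketch assumes, beyond what the text states, that the monosaccharide toward the reducing end is an aldohexopyranose (for example glucose or galactose) whose free hydroxyl groups are on carbons 2, 3, 4 and 6; the function name `possible_linkages` and the bracketed notation are hypothetical conveniences and are not a reproduction of FIG. 1.

```python
from itertools import product

ANOMERS = ("alpha", "beta")

# Assumption (not stated in the text): the acceptor is an aldohexopyranose
# such as glucose or galactose, whose free hydroxyl groups sit on carbons
# 2, 3, 4 and 6. Carbon 5 is tied up in the ring and 1-1 linkages are omitted.
ACCEPTOR_CARBONS = (2, 3, 4, 6)


def possible_linkages(donor, acceptor):
    """List every anomer/position combination for donor carbon-1 -> acceptor."""
    return [
        f"{donor}({anomer}1-{carbon}){acceptor}"
        for anomer, carbon in product(ANOMERS, ACCEPTOR_CARBONS)
    ]


if __name__ == "__main__":
    for linkage in possible_linkages("Gal", "Gal"):
        print(linkage)
```

Under these assumptions a single pair of monosaccharides already admits eight distinct disaccharides (two anomers at each of four positions), which indicates why sequence information alone does not define an oligosaccharide.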
Synthesis and Degradation of Oligosaccharides
Oligosaccharides are synthesized by a battery of enzymes in the cell known as glycosidases and glycosyltransferases. Typically, an oligosaccharide is assembled on a lipid carrier and transferred to the appropriate amino acid within the protein to be glycosylated. Glycosidase trimming and glycosyltransferase-mediated synthesis follow, in which individual monosaccharides or preassembled oligosaccharide units are removed or added. In addition, microscopic reversibility may occur when the exoglycosidases, which are usually hydrolytic enzymes, act as transferases in a synthetic role (Ichikawa et al. 1992, Anal. Biochem. 202:215-238). In some cases, removal of a monosaccharide results in a conformational change that facilitates further chain synthesis (Camirand et al. 1991, J. Biol. Chem. 266:15120-15127). While not wishing to be bound by theory, one cause of inter-cellular variability in glycosylation patterns for a single protein may be the different amounts and types of glycosidases and glycosyltransferases available in any single cell.
The availability of individual glycosidases and glycosyltransferases depends on the nutritional environment of the cell (Goochee and Monica 1990, Bio/Technology 6:67-71), the type of cell (Sheares and Robbins 1986, PNAS 83:1993) and its homeostatic state (Kobata 1988, Gann Monogr. Cancer Res. 34:3-13). Associated with the variation in amounts and types of these intracellular enzymes is the occurrence of multiple glycoforms of a single glycoprotein (Parekh et al. 1987, EMBO 6:1233-1244). These glycoforms differ in their oligosaccharide sequence and linkage characteristics as well as in the position and number of attachment sites of the oligosaccharide to the protein. Variation in glycosylation of a single glycoprotein made in different cell types is an important aspect of recombinant protein therapeutic production because of the possible impact of structural heterogeneity on biological function (Sasaki et al. 1987, J. Biol. Chem. 262:12059-12076; Dube et al. 1988, J. Biol. Chem. 263:17516-17521; Lund et al. 1993, Human Antib. Hybridomas 4:20-25; Parekh et al. 1989, Biochem. 28:7644-7662; Kagawa et al. 1988, J. Biol. Chem. 263:17508-17515; Parekh et al. 1989, Biochem. 28:7662-7669; Parekh et al. 1989, Biochem. 28:7670-7679).
Not only does the glycosylation pattern of a single protein vary according to the cell in which it is made, but certain glycosylation events may be characteristic of certain evolutionarily related animal species only. Galili et al. 1987, Immunology 84:1369-1373 and Galili et al. 1988, J. Biol. Chem. 263:17755-17762 identified the occurrence of Galα1-3Gal in non-primate mammals and New World monkeys, a glycosylation pattern that was absent in humans and Old World monkeys. The absence of this structure could be demonstrated because the disaccharide elicits an immune response in humans. The immune response to atypical glycosylation patterns presents an as yet unsolved antigenicity problem that arises from using glycoproteins derived from or manufactured in non-primate sources.
Oligosaccharides are degraded by glycosidases that are often highly specific for the glycosidic linkage and the stereochemistry of the oligosaccharide. An example of the influence of remotely located monosaccharides on the digestion of oligosaccharides is found in human patients suffering from fucosidosis. These patients lack the exoglycosidase required to remove fucose from N-linked oligosaccharides prior to digestion with endoglycosidase. The fucose interferes with the enzymatic activity of the endoglycosidase and causes undigested oligosaccharides to be excreted in their urine. (Kobata 1984, The Biology of Carbohydrates, Eds., Ginsberg and Robbins, Wiley, N.Y. vol. 2, pp. 87-162.)
The Biological Impact of Glycosylation of Proteins
The importance of correct synthesis and degradation of oligosaccharides for the organism has been demonstrated in diseases which result from a single defective glycosidase giving rise to incorrect processing of carbohydrate structures. In the example cited above, disease results from the absence of a fucosidase resulting in incorrect processing of the glycoprotein. Other examples include human α-mannosidosis in which the major lysosomal α-mannosidase activity is severely deficient (Gasperi et al. 1992, J. Biol. Chem. 267:9706-9712). Aberrant oligosaccharide structures have also been associated with cancer (Sano et al. 1992, J. Biol. Chem. 267:1522-1527).
The oligosaccharide side chains of glycoproteins have been implicated in such cellular processes as protection of peptide chains against proteolytic attack, facilitation of secretion to the cell surface, induction and maintenance of the protein conformation in a biologically active form, clearance of glycoproteins from plasma, and the provision of antigenic determinants in differentiation and development. In fact, at any developmental stage, cells may have solved the biosynthetic problem of controlled variation not by making just one glycoprotein but by coding for large repertoires of a protein, each variant having a different covalently attached oligosaccharide (glycoform). The extent of variability that arises from multiple glycosylation sites on a peptide, or indeed from multiple forms of a single glycosylation site, has been discussed by Rademacher et al. 1988, Ann. Rev. Biochem. 57:785-838, for recombinant proteins. Because the characteristics of a glycoprotein as well as its biological properties and function vary according to the sequence and structure of the attached oligosaccharides (Cumming 1991, Glycobiology 1:115-130), the analysis of glycoprotein structure has become an important requirement in characterizing recombinant pharmaceutical proteins.
New methods of analysis are required to facilitate quality control of manufactured pharmaceutical grade recombinant protein and to permit rapid, low cost and reliable characterization of oligosaccharides that distinguishes between closely related structures (Spellman 1990, Anal. Chem. 62:1714-1722). New methods to manipulate and modify oligosaccharides on glycoproteins are desirable to improve production levels from cells and to optimize the biological function of proteins as therapeutic agents.
A rapid and simple method of oligosaccharide sequence and linkage analysis would have utility in directing synthesis and analyzing function of glycoproteins and carbohydrates in general as well as providing insights into the causes and implications of microheterogeneity in glycosylated molecules made in different organisms, organs or cells as well as within a single cell.
Methods of Analyzing Carbohydrate Structures
Existing methods for analyzing carbohydrate structure rely on complex multi-step procedures. Techniques such as mass spectrometry, NMR, fast atom bombardment, complex chromatography techniques (high pressure liquid chromatography, gas phase chromatography, ion-exchange and reverse-phase chromatography) and complex series of chemical reactions (methylation analysis, periodate oxidation and various hydrolysis reactions) have all been used in various combinations to determine the sequence of oligosaccharides and the features of their glycosidic linkages. Each method can provide certain pieces of information about carbohydrate structure but each has disadvantages. For example, fast atom bombardment (Dell 1987, Advances in Carbohydrate Chemistry and Biochemistry 45:19-73) can provide some size and sequence data but does not provide information on linkage positions or anomeric configuration. NMR is the most powerful tool for analyzing carbohydrates (Vliegenthart et al. 1983, Advances in Carbohydrate Chemistry 41:209-375) but is relatively insensitive and requires large quantities of analyte. These methods have been reviewed by Spellman 1990, Anal. Chem. 62:1714-1722; Lee et al. 1990, Applied Biochem. and Biotech. 23:53-80; Geisow 1992, Bio/technology 10:277-280; Kobata 1984. Many of the above procedures require expensive equipment as well as considerable technical expertise and technical support for their operation, which limits their use to a few specialist laboratories.
Carbohydrate Analyses Using Glycosidases
Enzymes have been used at various stages of carbohydrate analysis as one step in the multi-step analyses. These enzymes include glycoamidases, which cleave between the glycan portion and the amino acid (commonly asparagine) of the protein with which the glycan is associated. Most important are the endoglycosidases and exoglycosidases, which are both hydrolases and are so named because of their ability to specifically cleave glycosidic bonds either within the carbohydrate structure (endo-) or at the terminal monosaccharides (exo-) at the non-reducing end of the molecule.
Endoglycosidases have been described that cleave oligosaccharides at the reducing end at the penultimate monosaccharide to the amino acid attachment site on the peptide. Five endo-β-N-acetylglucosaminidases have been purified sufficiently for use in structural studies, each having a different substrate specificity (Kobata 1984). In addition, an endo-α-N-acetylgalactosaminidase has also been isolated (Umemoto et al. 1977, J. Biol. Chem. 252:8609-8614; Bhavanandan et al. 1976, Biochem. Biophys. Res. Commun. 70:738-745). The specificity of these endoglycosidases makes them powerful tools in analyzing oligosaccharide structure. At this time, endoglycosidases have limited applicability due to the small number of characterized enzymes currently commercially available. An increased number of characterized endoglycosidases having different specificities would be of utility in carbohydrate analyses.
Oligosaccharides released by endoglycosidase digestion or by chemical means may be further characterized by exoglycosidase digestion. Exoglycosidases are hydrolases that cleave monosaccharide units from the non-reducing terminus of oligosaccharides and polysaccharides. Because exoglycosidases have known specificities for different terminal monosaccharides as well as for different anomeric forms, they have been used to sequence oligosaccharides. Sequential exoglycosidase digestion used in conjunction with gel permeation chromatography was first described by Yamashita et al. in 1982 (Methods in Enzymology 83:105-126). Edge et al. (1992, PNAS 89:6338-6342) described multiplex enzyme reaction digestions and the determination of a sequence by analysis of arrays of enzyme digestions. The power of sequencing oligosaccharides using glycosidases has been limited by the availability of enzymes with well-characterized substrate specificities. The limited range of substrates for analyzing glycosidase activity has also resulted in incomplete data on glycosidic linkages between monosaccharides. As a result, it has been necessary to conduct methylation analysis to determine glycosidic linkages subsequent to sequence analysis.
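The logic of sequential exoglycosidase digestion can be sketched in Python as follows. This is a simplified illustration and not the procedure of any cited reference: it assumes a linear chain, idealized enzymes that cleave every matching terminal residue to completion, and no contaminating activities. The class and function names are hypothetical, and the enzyme specificities are loosely borrowed from Table 1 with the linkage preferences (">") ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    sugar: str   # e.g. "Gal"
    anomer: str  # "alpha" or "beta"
    carbon: int  # carbon of the next residue (toward the reducing end) it is linked to


@dataclass(frozen=True)
class Exoglycosidase:
    name: str
    sugar: str
    anomer: str
    carbons: frozenset  # linkage positions the idealized enzyme accepts

    def cleaves(self, residue):
        return (residue.sugar == self.sugar
                and residue.anomer == self.anomer
                and residue.carbon in self.carbons)


def digest(chain, enzyme):
    """Remove matching residues from the non-reducing terminus (index 0)."""
    chain = list(chain)
    removed = []
    while chain and enzyme.cleaves(chain[0]):
        removed.append(chain.pop(0))
    return removed, chain


def sequence(chain, panel):
    """Cycle through the enzyme panel, logging which enzyme releases residues."""
    remaining = list(chain)
    log = []
    progress = True
    while remaining and progress:
        progress = False
        for enzyme in panel:
            removed, remaining = digest(remaining, enzyme)
            if removed:
                log.append((enzyme.name, len(removed)))
                progress = True
    return log, remaining


# Illustrative antenna, non-reducing end first: Gal(beta1-4)GlcNAc(beta1-2)Man(alpha1-3)...
ANTENNA = [Residue("Gal", "beta", 4), Residue("GlcNAc", "beta", 2), Residue("Man", "alpha", 3)]

# Hypothetical idealized panel; specificities simplified from Table 1.
PANEL = [
    Exoglycosidase("alpha-mannosidase", "Man", "alpha", frozenset({2, 3, 6})),
    Exoglycosidase("beta-N-acetylglucosaminidase", "GlcNAc", "beta", frozenset({2, 3, 4, 6})),
    Exoglycosidase("beta-galactosidase", "Gal", "beta", frozenset({4})),
]

if __name__ == "__main__":
    log, remaining = sequence(ANTENNA, PANEL)
    print(log, remaining)
```

In this model the order in which enzymes release residues gives the monosaccharide sequence and anomeric configuration, but the linkage position is known only to within the set of positions each enzyme accepts. This reflects the limitation noted above that methylation analysis has been needed to assign linkages.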
Exoglycosidases have been isolated from diverse sources including bacteria, viruses, plants and mammals and have specificities for sialic acid (α anomer), galactose (α and β), N-acetylglucosamine (α and β), N-acetylgalactosamine (α and β), and mannose (α and β) (Sano et al. 1992, J. Biol. Chem. 267:1522-1527; Moremen et al. 1991, J. Biol. Chem. 266:16876-16885; Camirand et al. 1991, J. Biol. Chem. 266:15120-15127; Gasperi et al. 1992, J. Biol. Chem. 267:9706-9712; Ziegler et al. 1991, Glycobiology 1:605-614; Schatzle et al. 1992, J. Biol. Chem. 267:4000-4007).
Glycosidases in the prior art have in most cases been defined by their substrate specificity, so that the characterization of the enzyme is limited by the availability of suitable substrates and the complexity of the assay. Furthermore, enzymes in the prior art are frequently named in an arbitrary fashion, where the names suggest biological activities that have never been demonstrated. Limitations in the characterization of crude extracts or purified enzymes arise in the prior art because of the lack of suitable assays that identify what substrates are cleaved and what substrates are not cleaved by any single enzyme. Associated with the problems of characterizing the enzymes are problems of identifying contaminating glycosidase activity. Furthermore, not only are glycosidase preparations commonly contaminated with other glycosidases, they are also contaminated with proteases. The limitations in characterizing enzymes cited in the prior art and the difficulties in obtaining substantially pure preparations of glycosidases are reflected in the sparsity of the list of commercially available glycosidases (see Table 1).
The substrates most commonly used in the prior art are derivatized monosaccharides (p-nitrophenyl-monosaccharide or 4-methylumbelliferyl monosaccharide). Whereas these substrates may provide information on some of the monosaccharides that are recognized by glycosidases, no information on glycosidic bond cleavage specificities can be obtained because the monosaccharide is chemically linked to the chromogenic marker and is not linked through a glycosidic linkage to a second monosaccharide. In addition, the derivatized substrates are of limited use in characterizing the recognition site of a glycosidase. Glycosidases that cleave the monosaccharide derivative do not always cleave the same monosaccharide in an oligosaccharide. Likewise, glycosidases that cleave an oligosaccharide may not cleave a derivatized substrate (Gasperi et al. 1992, J. Biol. Chem. 267:9706-9712).
A systematic approach is required to develop a set of labelled oligosaccharides suitable for characterizing the recognition site and the glycosidic cleavage site of a glycosidase. In addition to providing suitable substrates, simple rapid methods of analyzing the products of a single or multiple glycosidase reaction are required to accomplish the screening of a single glycosidase against multiple substrates or of multiple glycosidases against a single substrate.
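The kind of screening contemplated above, in which one glycosidase is tested against multiple substrates or multiple glycosidases against one substrate, amounts to filling in and reading an enzyme-by-substrate matrix. The following Python sketch shows one possible way to tabulate such results. The enzyme names, substrate labels and cleavage outcomes are placeholders invented for illustration and are not experimental data; the function names are likewise hypothetical.

```python
from collections import defaultdict


def specificity_profiles(observations):
    """Pivot {(enzyme, substrate): cleaved} into per-enzyme cleaved/uncleaved sets."""
    profiles = defaultdict(lambda: {"cleaved": set(), "not_cleaved": set()})
    for (enzyme, substrate), cleaved in observations.items():
        key = "cleaved" if cleaved else "not_cleaved"
        profiles[enzyme][key].add(substrate)
    return dict(profiles)


def distinguishing_enzymes(profiles, substrate_a, substrate_b):
    """Enzymes observed to cleave exactly one of the two substrates.

    A substrate that was never assayed with an enzyme counts as not cleaved.
    """
    return sorted(
        enzyme
        for enzyme, profile in profiles.items()
        if (substrate_a in profile["cleaved"]) != (substrate_b in profile["cleaved"])
    )


# Placeholder observations, for illustration only (not measured results).
OBSERVATIONS = {
    ("enzyme_A", "Gal(beta1-3)GlcNAc"): True,
    ("enzyme_A", "Gal(beta1-4)GlcNAc"): True,
    ("enzyme_B", "Gal(beta1-3)GlcNAc"): False,
    ("enzyme_B", "Gal(beta1-4)GlcNAc"): True,
}

if __name__ == "__main__":
    profiles = specificity_profiles(OBSERVATIONS)
    print(distinguishing_enzymes(profiles, "Gal(beta1-3)GlcNAc", "Gal(beta1-4)GlcNAc"))
```

Recording the substrates that are not cleaved alongside those that are cleaved is deliberate, since the text identifies the absence of such negative data as a limitation of prior art assays.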
Many of the glycosidases that are currently available have important limitations as analytic reagents (Jacob, et al., 1994, Methods Enzymol. 230:280-299). These include the following:
1) Contamination of exoglycosidase preparations with other exoglycosidase impurities that results in ambiguous digestion results.
2) Lack of specificity of the exoglycosidase for a specific glycosidic linkage. Glycosidases that have been characterized appear to recognize multiple linkages, some of these linkages being preferentially recognized over others. It would be desirable to identify the extent of preference of any given glycosidase for a single linkage.
Furthermore, as analytic reagents, the repertoire of available exoglycosidases of varying specificities does not provide sufficient range to analyze and differentiate many of the linear or branched structures that occur in nature.
Of the available glycosidases, there is a deficit of substantially pure highly specific enzymes that have defined and reproducible substrate specificities to perform carbohydrate analyses. The deficiency in the availability of these enzymes for carbohydrate analyses is caused at least in part by the lack of available techniques to isolate novel glycosidases and to characterize their substrate specificities. The availability of a wide range of glycosidases that have defined monosaccharide and glycosidic linkage preferences would eliminate the existing requirement for additional types of analysis such as methylation analysis to fully characterize an oligosaccharide and would provide a powerful tool in rapid characterization of novel carbohydrate structures and their biological properties.
Source of Exoglycosidases
A limited number of exoglycosidases are commercially available (see Table 1). In addition, a large number of exoglycosidases have been isolated from a variety of organisms as described above. A partial list of exoglycosidases known to be useful for sequence determinations is provided by Linhardt et al. 1992, International Publication Number WO 92/02816. An additional list of exoglycosidases is provided by Haughland 1993, International Publication Number WO 93/04074. Comprehensive reviews of glycosidases are provided by Conzelman et al. 1987, Advances in Enzymology 60:89; Flowers et al. 1979, Advances in Enzymology 48:29; Kobata 1979, Anal. Biochem. 100:1-14.
Although glycosidases that are presently available have generally been isolated and manufactured from natural sources, Schatzle et al. 1992, J. Biol. Chem. 267:4000-4007, have reported cloning and sequencing the lysosomal enzyme α-mannosidase isolated from Dictyostelium discoideum. Although Schatzle et al. characterized the structural properties of the enzyme, the substrate specificity with regard to glycosidic linkages was not revealed.
TABLE 1
COMMERCIALLY AVAILABLE GLYCOSIDASES

ENZYME                       SOURCE                      SUPPLIER       LINKAGE SPECIFICITY
β-N-Acetylglucosaminidase    Streptococcus pneumoniae    OGS, BMB       1-2,3>4,6 (+ GalNAc)
                             Chicken liver               OGS            1-3,4 (+ GalNAc)
                             Bovine kidney               BMB            ? (+ GalNAc)
α-Fucosidase                 Almond meal                 G, OGS         1-3,4
                             Streptomyces sp 142         T              1-3,4
                             Arthrobacter                T              1-2
                             Chicken liver               OGS            1-2,4,6
                             Fusarium oxysporum          S              1-2,4
                             Bovine epididymis           OGS            1-6>>2,3,4
                             Bovine kidney               BMB            ?
α-Galactosidase              Coffee bean                 BMB, OGS       1-3,4,6
                             Mortierella vinacea         S              1-4,6
β-Galactosidase              Streptococcus pneumoniae    OGS, BMB, S    1-4
                             Bovine testes               OGS, BMB       1-3,4>6
                             Jack bean                   OGS, S         1-3,4>6
                             Chicken liver               OGS            1-3,4
α-Mannosidase                Jack bean                   OGS, BMB, S    1-2,6>3
                             Aspergillus saitoi          OGS            1-2

Suppliers: BMB, Boehringer Mannheim; G, Genzyme; OGS, Oxford GlycoSystems; S, Seikagaku; T, Takara.
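To make the coverage of Table 1 easier to inspect, its entries can be encoded and queried as in the Python sketch below. The sketch assumes that the digits following "1-" in each specificity entry denote the linkage positions cleaved; it ignores the preference markers (">" and ">>"), the "(+ GalNAc)" annotations and the supplier codes, and treats "?" as unknown. The names `TABLE_1`, `acceptor_carbons` and `sources_for` are hypothetical.

```python
import re

# (enzyme, source, linkage specificity) transcribed from Table 1.
TABLE_1 = [
    ("beta-N-Acetylglucosaminidase", "Streptococcus pneumoniae", "1-2,3>4,6"),
    ("beta-N-Acetylglucosaminidase", "Chicken liver", "1-3,4"),
    ("beta-N-Acetylglucosaminidase", "Bovine kidney", "?"),
    ("alpha-Fucosidase", "Almond meal", "1-3,4"),
    ("alpha-Fucosidase", "Streptomyces sp 142", "1-3,4"),
    ("alpha-Fucosidase", "Arthrobacter", "1-2"),
    ("alpha-Fucosidase", "Chicken liver", "1-2,4,6"),
    ("alpha-Fucosidase", "Fusarium oxysporum", "1-2,4"),
    ("alpha-Fucosidase", "Bovine epididymis", "1-6>>2,3,4"),
    ("alpha-Fucosidase", "Bovine kidney", "?"),
    ("alpha-Galactosidase", "Coffee bean", "1-3,4,6"),
    ("alpha-Galactosidase", "Mortierella vinacea", "1-4,6"),
    ("beta-Galactosidase", "Streptococcus pneumoniae", "1-4"),
    ("beta-Galactosidase", "Bovine testes", "1-3,4>6"),
    ("beta-Galactosidase", "Jack bean", "1-3,4>6"),
    ("beta-Galactosidase", "Chicken liver", "1-3,4"),
    ("alpha-Mannosidase", "Jack bean", "1-2,6>3"),
    ("alpha-Mannosidase", "Aspergillus saitoi", "1-2"),
]


def acceptor_carbons(spec):
    """Parse e.g. '1-6>>2,3,4' into {2, 3, 4, 6}; unknown ('?') gives an empty set."""
    if not spec.startswith("1-"):
        return frozenset()
    return frozenset(int(digit) for digit in re.findall(r"\d", spec[2:]))


def sources_for(enzyme, carbon):
    """Sources of the named enzyme listed as cleaving a 1-<carbon> linkage."""
    return [
        source
        for name, source, spec in TABLE_1
        if name == enzyme and carbon in acceptor_carbons(spec)
    ]


if __name__ == "__main__":
    print(sources_for("alpha-Fucosidase", 2))
    print(sources_for("beta-Mannosidase", 4))  # not listed in Table 1
```

A query for an activity that is absent from the table, such as a β-mannosidase, returns an empty list, and most listed enzymes accept more than one linkage position. Both observations are consistent with the limited and overlapping repertoire described in the text.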
For the foregoing reasons, there is a need for novel substantially pure glycosidases suitable as reagents having defined substrate specificities and where the purified enzyme preparations are in a form that provides reproducible cleavage activity. Furthermore, there is a need for methods of isolating and manufacturing a wide array of these enzymes suitable for analyzing the wide variety of carbohydrate structures that occur in nature. Furthermore, there is a need for rapid, low cost, simple methods of carbohydrate analysis so as to characterize the substrate specificities of the enzymes; to provide rapid low cost methods of sequencing carbohydrate structures; and to modify carbohydrate moieties on glycoproteins and glycolipids for purposes of altering the biological properties of such molecules. The availability of a rapid, low cost, simple method of carbohydrate analysis would provide many opportunities to analyze the wide variety of carbohydrate structures that occur in nature, to understand the functions of these molecules and to modify their biological properties for useful purposes by manipulating their structures.