The United States Government has certain rights in this invention by virtue of National Institutes of Health grants No. CA26712, GN31318, and CA14051.
Glycoproteins, proteins with covalently bound sugars, are found in plants, animals, insects, and even many unicellular eukaryotes such as yeast. They occur within cells in both soluble and membrane-bound forms, in the intercellular matrix, and in extracellular fluids. The carbohydrate moieties of these glycoproteins can participate directly in the biological activity of the glycoproteins in a variety of ways: protection from proteolytic degradation, stabilization of protein conformation, and mediation of inter- and intracellular recognition. Examples of glycoproteins include enzymes, serum proteins such as immunoglobulins and blood clotting factors, cell surface receptors for growth factors and infectious agents, hormones, toxins, lectins and structural proteins.
Natural and recombinant proteins are being used as therapeutic agents in humans and animals. In many cases a therapeutic protein will be most efficacious if it has an appreciable circulatory lifetime. At least four general mechanisms can contribute to a shortened circulatory lifetime for an exogenous protein: proteolytic degradation, clearance by the immune system if the protein is antigenic or immunogenic, clearance by cells of the liver or reticulo-endothelial system that recognize specific exposed sugar units on a glycoprotein, and clearance through the glomerular basement membrane of the kidney if the protein is of low molecular weight. The oligosaccharides of a glycoprotein can exert a strong effect on the first three of these clearance mechanisms.
The oligosaccharide chains of glycoproteins are attached to the polypeptide backbone by either N- or O-glycosidic linkages. In the case of N-linked glycans, there is an amide bond connecting the anomeric carbon (C-1) of a reducing-terminal N-acetylglucosamine (GlcNAc) residue of the oligosaccharide and a nitrogen of an asparagine (Asn) residue of the polypeptide. In animal cells, O-linked glycans are attached via a glycosidic bond between N-acetylgalactosamine (GalNAc), galactose (Gal), or xylose and one of several hydroxyamino acids, most commonly serine (Ser) or threonine (Thr), but also hydroxyproline or hydroxylysine in some cases. The O-linked glycans in the yeast Saccharomyces cerevisiae are also attached to serine or threonine residues, but, unlike the glycans of animals, they consist of one to several .alpha.-linked mannose (Man) residues. Mannose residues have not been found in the O-linked oligosaccharides of animal cells.
The biosynthetic pathways of N- and O-linked oligosaccharides are quite different. O-linked glycan synthesis is relatively simple, consisting of a step-by-step transfer of single sugar residues from nucleotide sugars by a series of specific glycosyltransferases. The nucleotide sugars which function as the monosaccharide donors are uridine-diphospho-GalNAc (UDP-GalNAc), UDP-GlcNAc, UDP-Gal, guanosine-diphospho-fucose (GDP-Fuc), and cytidine-monophospho-sialic acid (CMP-SA). N-linked oligosaccharide synthesis, which is much more complex, is described below.
The initial steps in the biosynthesis of N-linked glycans have been preserved with little change through evolution from the level of unicellular eukaryotes such as yeast to higher plants and man. For all of these organisms, initiation of N-linked oligosaccharide assembly does not occur directly on the Asn residues of the protein, but rather involves preassembly of a lipid-linked precursor oligosaccharide which is then transferred to the protein during or very soon after its translation from mRNA. This precursor oligosaccharide, which has the composition Glc.sub.3 Man.sub.9 GlcNAc.sub.2 and the structure shown in FIG. 1A, is synthesized while attached via a pyrophosphate bridge to a polyisoprenoid carrier lipid, a dolichol. This assembly is a complex process involving at least six distinct membrane-bound glycosyltransferases. Some of these enzymes transfer monosaccharides from nucleotide sugars, while others utilize dolichol-linked monosaccharides as sugar donors. After assembly of the lipid-linked precursor is complete, another membrane-bound enzyme transfers it to sterically accessible Asn residues which occur as part of the sequence -Asn-X-Ser/Thr-. The requirement for steric accessibility is presumably responsible for the observation that denaturation is usually required for in vitro transfer of precursor oligosaccharide to exogenous proteins.
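The acceptor sequence -Asn-X-Ser/Thr- can be located in a polypeptide sequence with a simple pattern search. The following is a minimal illustrative sketch in Python, not part of the described method; the function name and the example sequence are hypothetical. It assumes one-letter amino acid codes and, as an assumption not stated in the text above, treats X as any residue other than proline. A match indicates only a candidate site, since steric accessibility is also required.

```python
import re

# -Asn-X-Ser/Thr- in one-letter code: N, any residue, then S or T.
# Assumption (not stated in the text): X excludes proline (P).
# The lookahead lets overlapping candidate sites be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that begin an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence used only for illustration.
example = "MKNASLLNPTQNGTA"
print(find_sequons(example))  # [3, 12]
```

In this example the Asn at position 8 is skipped because it is followed by proline under the stated assumption.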
Glycosylated Asn residues of newly-synthesized glycoproteins transiently carry only one type of oligosaccharide, Glc.sub.3 Man.sub.9 GlcNAc.sub.2. Modification, or "processing," of this structure generates the great diversity of structures found on mature glycoproteins, and it is the variation in the type or extent of this processing which accounts for the observation that different cell types often glycosylate even the same polypeptide differently.
The processing of N-linked oligosaccharides is accomplished by the sequential action of a number of membrane-bound enzymes and begins immediately after transfer of the precursor oligosaccharide Glc.sub.3 Man.sub.9 GlcNAc.sub.2 to the protein. In broad terms, N-linked oligosaccharide processing can be divided into three stages: removal of the three glucose residues, removal of a variable number of mannose residues, and addition of various sugar residues to the resulting trimmed "core," i.e., the Man.sub.3 GlcNAc.sub.2 portion of the original oligosaccharide closest to the polypeptide backbone. A simplified outline of the processing pathway is shown in FIG. 2.
Like the assembly of the precursor oligosaccharide, the removal of the glucose residues in the first stage of processing has been preserved through evolution. In yeast and in vertebrates, all three glucose residues are trimmed to generate N-linked Man.sub.9 GlcNAc.sub.2. Processing sometimes stops with this structure, but usually it continues to the second stage with removal of mannose residues. Here the pathway for yeast diverges from that in vertebrate cells.
As shown in FIG. 1B, four of the mannose residues of the Man.sub.9 GlcNAc.sub.2 moiety are bound by .alpha.1.fwdarw.2 linkages. By convention the arrow points toward the reducing terminus of an oligosaccharide, or in this case, toward the protein-bound end of the glycan; .alpha. or .beta. indicate the anomeric configuration of the glycosidic bond; and the two numbers indicate which carbon atoms on each monosaccharide are involved in the bond. The four .alpha.1.fwdarw.2-linked mannose residues can be removed by Mannosidase I to generate N-linked Man.sub.5-8 GlcNAc.sub.2, all of which are commonly found on vertebrate glycoproteins. Oligosaccharides with the composition Man.sub.5-9 GlcNAc.sub.2 are said to be of the "high-mannose" type.
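The linkage convention just described can be made concrete with a small parser. The following is an illustrative sketch in Python only; the names Linkage and parse_linkage are hypothetical. It reads a linkage written in the plain-text form used in this document (for example, .alpha.1.fwdarw.2) and returns the anomeric configuration, the carbon on the sugar farther from the reducing terminus, and the carbon on the sugar nearer to it.

```python
import re
from typing import NamedTuple


class Linkage(NamedTuple):
    anomer: str        # "alpha" or "beta"
    from_carbon: int   # carbon on the sugar farther from the reducing terminus
    to_carbon: int     # carbon on the sugar nearer the reducing terminus


# Matches the plain-text encoding used in this document, e.g. ".beta.1.fwdarw.4".
_LINKAGE = re.compile(r"\.(alpha|beta)\.(\d)\.fwdarw\.(\d)")


def parse_linkage(text):
    """Parse one linkage token; raise ValueError if it is not recognized."""
    match = _LINKAGE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a linkage token: {text!r}")
    anomer, c_from, c_to = match.groups()
    return Linkage(anomer, int(c_from), int(c_to))


print(parse_linkage(".alpha.1.fwdarw.2"))
print(parse_linkage(".beta.1.fwdarw.4"))
```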
As shown in FIG. 2, protein-linked Man.sub.5 GlcNAc.sub.2 (Structure M-c) can serve as a substrate for GlcNAc transferase I, which transfers a .beta.1.fwdarw.2-linked GlcNAc residue from UDP-GlcNAc to the .alpha.1.fwdarw.3-linked mannose residue to form GlcNAcMan.sub.5 GlcNAc.sub.2 (Structure M-d). Mannosidase II can then complete the trimming phase of the processing pathway by removing two mannose residues to generate a protein-linked oligosaccharide with the composition GlcNAcMan.sub.3 GlcNAc.sub.2 (Structure M-e). This structure is a substrate for GlcNAc transferase II, which can transfer a .beta.1.fwdarw.2-linked GlcNAc residue to an .alpha.1.fwdarw.6-linked mannose residue (not shown).
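The trimming and early elongation steps described above can be followed as changes in monosaccharide composition. The following is an illustrative sketch in Python, not a model of enzyme specificity; the function and variable names are hypothetical. It assumes that Mannosidase I removes all four .alpha.1.fwdarw.2-linked mannose residues, which is only one of the outcomes described above, and it tracks residue counts only, not branch positions or linkages.

```python
# Composition of the precursor oligosaccharide, Glc3Man9GlcNAc2.
PRECURSOR = {"Glc": 3, "Man": 9, "GlcNAc": 2}

# (stage or enzyme named in the text, sugar affected, change in residue count)
# Assumption: Mannosidase I is taken to remove all four alpha1->2 mannoses.
STEPS = [
    ("glucose removal", "Glc", -3),
    ("Mannosidase I", "Man", -4),
    ("GlcNAc transferase I", "GlcNAc", +1),
    ("Mannosidase II", "Man", -2),
    ("GlcNAc transferase II", "GlcNAc", +1),
]


def trace_processing(start=PRECURSOR, steps=STEPS):
    """Return [(label, composition), ...] for the start and after each step."""
    comp = dict(start)
    history = [("precursor", dict(comp))]
    for label, sugar, delta in steps:
        comp[sugar] += delta
        if comp[sugar] < 0:
            raise ValueError(f"{label} would remove more {sugar} than present")
        history.append((label, dict(comp)))
    return history


for label, comp in trace_processing():
    print(f"{label:22s} Glc={comp['Glc']} Man={comp['Man']} GlcNAc={comp['GlcNAc']}")
```

Under these assumptions the composition after Mannosidase II is three mannose and three GlcNAc residues, corresponding to GlcNAcMan.sub.3 GlcNAc.sub.2.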
It is at this stage that the true complexity of the processing pathway begins to unfold. Simply stated, monosaccharides are sequentially added to the growing oligosaccharide chain by a series of membrane-bound Golgi glycosyltransferases, each of which is highly specific with respect to the acceptor oligosaccharide, the donor sugar, and the type of linkage formed between the sugars. Each type of cell has an extensive but discrete set of these glycosyltransferases. These can include at least four more distinct GlcNAc transferases (producing .beta.1.fwdarw.3, .beta.1.fwdarw.4, or .beta.1.fwdarw.6 linkages); three galactosyltransferases (producing .beta.1.fwdarw.4, .beta.1.fwdarw.3, and .alpha.1.fwdarw.3 linkages); two sialyltransferases (one producing .alpha.2.fwdarw.3 and another, .alpha.2.fwdarw.6 linkages); three fucosyltransferases (producing .alpha.1.fwdarw.2, .alpha.1.fwdarw.3, .alpha.1.fwdarw.4 or .alpha.1.fwdarw.6 linkages); and a growing list of other enzymes responsible for a variety of unusual linkages. The cooperative action of these glycosyltransferases produces a diverse family of structures collectively referred to as "complex" oligosaccharides. These may contain two (for example, Structure M-f in FIG. 2), three (for example, FIG. 1C or Structure M-g in FIG. 2), or four outer branches attached to the invariant core pentasaccharide, Man.sub.3 GlcNAc.sub.2. These structures are referred to in terms of the number of their outer branches: biantennary (two branches), triantennary (three branches) or tetraantennary (four branches). The size of these complex glycans varies from a hexasaccharide (on rhodopsin) to very large polylactosaminylglycans, which contain one or more outer branches with repeating (Gal.beta.1.fwdarw.4GlcNAc.beta.1.fwdarw.3) units (on several cell surface glycoproteins such as the erythrocyte glycoprotein Band 3 and the macrophage antigen Mac-2).
Despite this diversity, the specificities of the glycosyltransferases do produce some frequently recurring structures. For example, the outer branches of many complex N-linked oligosaccharides consist of all or part of the sequence:
SA.alpha.2.fwdarw.3(6)Gal.beta.1.fwdarw.4GlcNAc.beta.1.fwdarw.
One or two of these trisaccharide moieties may be attached to each of the two .alpha.-linked mannose residues of the core pentasaccharide, as in Structures M-f and M-g in FIG. 2.
Unlike transcription of DNA or translation of mRNA, which are highly reproducible events, oligosaccharide biosynthesis does not take place on a template. As a consequence, considerable heterogeneity is usually observed in the oligosaccharide structures of every glycoprotein. The differences are most commonly due to variations in the extent of processing. The single glycosylation site of the chicken egg glycoprotein ovalbumin, for example, contains a structurally related "family" of at least 18 different oligosaccharides, the great majority of which are of the high-mannose or related "hybrid" type (for example, Structure M-h in FIG. 2). Many glycoproteins contain multiple glycosylated Asn residues, and each of these may carry a distinct family of oligosaccharides. For example, one site may carry predominantly high-mannose glycans, another may carry mostly fucosylated biantennary complex chains, and a third may carry fucose-free tri- and tetraantennary complex structures. Again, all of these glycans will contain the invariant Man.sub.3 GlcNAc.sub.2 core.
As discussed above, the initial stages of N-linked oligosaccharide synthesis in the yeast Saccharomyces cerevisiae closely resemble those occurring in vertebrate cells. As in higher organisms, lipid-linked Glc.sub.3 Man.sub.9 GlcNAc.sub.2 is assembled, its oligosaccharide chain is transferred to acceptor Asn residues of proteins, and its three glucose residues are removed soon after transfer. Yeast cells can remove only a single mannose residue, however, so that the smallest and least-processed N-linked glycans have the composition Man.sub.8-9 GlcNAc.sub.2. Processing can stop at this stage or continue with the addition of as many as 50 or more .alpha.-linked mannose residues to Man.sub.8 GlcNAc.sub.2 (FIG. 2, Structure Y-c) to generate a mannan (for example, Structure Y-d). Just as glycoproteins in mammalian cells may have predominantly high-mannose oligosaccharides at one glycosylated Asn residue and highly processed complex glycans at another, yeast glycoproteins such as external invertase commonly have some glycosylation sites with Man.sub.8-9 GlcNAc.sub.2 chains, while other sites carry mannans.
Unlike eukaryotic cells, bacteria lack the enzymatic machinery to assemble lipid-linked Glc.sub.3 Man.sub.9 GlcNAc.sub.2 or transfer it to proteins. Thus, although proteins synthesized in E. coli contain many -Asn-X-Ser/Thr- sequences, they are not glycosylated.
From the foregoing discussion, it is apparent that the glycosylation status of a glycoprotein will depend on the cell in which it is produced. The glycans of a protein synthesized in cultured mammalian cells will resemble those of the same protein isolated from a natural animal source such as a tissue but are unlikely to be identical. Proteins glycosylated by yeast contain high-mannose oligosaccharides and mannans, and proteins synthesized in a bacterium such as E. coli will not be glycosylated because the necessary enzymes are absent.
The precise composition and structure of the carbohydrate chain(s) on a glycoprotein can directly influence its serum lifetime, since cells in the liver and reticulo-endothelial system can bind and internalize circulating glycoproteins with specific carbohydrates. Hepatocytes have receptors on their surfaces that recognize oligosaccharide chains with terminal (i.e., at the outermost end(s) of glycans relative to the polypeptide) Gal residues, macrophages contain receptors for terminal Man or GlcNAc residues, and hepatocytes and lymphocytes have receptors for exposed fucose residues. No sialic acid-specific receptors have been found, however. Although somewhat dependent on the spatial arrangement of the oligosaccharides, as a general rule, the greater the number of exposed sugar residues recognized by cell surface receptors in the liver and reticulo-endothelial system, the more rapidly a glycoprotein will be cleared from the serum. Because of the absence of sialic acid-specific receptors, however, oligosaccharides with all branches terminated, or "capped," with sialic acid will not promote the clearance of the protein to which they are attached.
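The receptor specificities listed above can be summarized as a lookup from exposed sugar residue to the cell types that recognize it. The following is an illustrative sketch in Python; the table and function names are hypothetical, and the code is a qualitative bookkeeping aid only. It ignores the spatial arrangement of the oligosaccharides and makes no quantitative prediction of clearance rate.

```python
# Exposed residue -> cell types with receptors for it, as listed in the text.
# Sialic acid (SA) is absent because no SA-specific receptors are described.
RECEPTORS = {
    "Gal": ["hepatocytes"],
    "Man": ["macrophages"],
    "GlcNAc": ["macrophages"],
    "Fuc": ["hepatocytes", "lymphocytes"],
}


def clearance_receptors(exposed_residues):
    """Map each recognizing cell type to the exposed residues it can bind."""
    hits = {}
    for sugar in exposed_residues:
        for cell in RECEPTORS.get(sugar, []):
            hits.setdefault(cell, []).append(sugar)
    return hits


# Biantennary glycan with both branches capped by sialic acid: no recognition.
print(clearance_receptors(["SA", "SA"]))   # {}
# One branch missing its sialic acid cap, exposing galactose.
print(clearance_receptors(["SA", "Gal"]))  # {'hepatocytes': ['Gal']}
```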
The presence and nature of the oligosaccharide chain(s) on a glycoprotein can also affect important biochemical properties in addition to its recognition by sugar-specific receptors on liver and reticulo-endothelial cells. Removal of the carbohydrate from a glycoprotein will usually decrease its solubility, and it may also increase its susceptibility to proteolytic degradation by destabilizing the correct polypeptide folding pattern and/or unmasking protease-sensitive sites. For similar reasons, the glycosylation status of a protein can affect its recognition by the immune system.
It is therefore an objective of the present invention to provide a method for modifying oligosaccharide chains of glycoproteins isolated from natural sources or produced from recombinant DNA in yeast, insect, plant or vertebrate cells in a manner that increases serum lifetime or targets the protein to specific cell types.
It is another object of the invention to provide an in vitro method for glycosylating proteins produced from bacterial, yeast, plant, viral or animal DNA in a manner that enhances stability and effective biological activity.
It is a further objective of the invention to provide a method for glycosylation of proteins or modification of oligosaccharide chains on glycoproteins which is efficient, reproducible and cost-effective.