The present invention relates generally to the biosynthesis of glycans found as free oligosaccharides or covalently bound to proteins and glycolipids. In particular, this invention relates to a family of nucleic acids encoding UDP-N-acetylglucosamine: N-acetylgalactosamine-β1,6-N-acetylglucosaminyltransferases (Core-β1,6-N-acetylglucosaminyltransferases), which add N-acetylglucosamine to the hydroxy group at C6 of 2-acetamido-2-deoxy-D-galactose (GalNAc) in O-glycans of the core 1 and the core 3 type, thereby forming the core 2 and core 4 types. Previously, two members of this family have been identified and designated C2GnT1 and C2GnT2.
This invention is more particularly related to a gene encoding a third member of this family of O-glycan β1,6-N-acetylglucosaminyltransferases, termed C2GnT3, probes to the DNA encoding C2GnT3, DNA constructs comprising DNA encoding C2GnT3, recombinant plasmids and recombinant methods for producing C2GnT3, recombinant methods for stably transforming or transfecting cells for expression of C2GnT3, methods for identification of agents with the ability to inhibit or stimulate C2GnT3 biological activity, and methods for identification of DNA polymorphism in patients. In the U.S. Provisional patent application No. 60/150,488 filed on Aug. 24, 1999, from which the present application claims priority, this novel Core 2 β6GlcNAc-transferase isoform was identified and designated C2GnTII. The designation C2GnTII has here been replaced by the designation C2GnT3 in accordance with its scientific publication (14).
O-linked protein glycosylation involves an initiation stage in which a family of N-acetylgalactosaminyltransferases catalyzes the addition of N-acetylgalactosamine to serine or threonine residues (1). Further assembly of O-glycan chains involves several successive or alternative biosynthetic reactions: i) formation of simple mucin-type core 1 structures by UDP-Gal: GalNAcα-R β1,3Gal-transferase activity; ii) conversion of core 1 to complex-type core 2 structures by UDP-GlcNAc: Galβ1-3GalNAcα-R β1,6GlcNAc-transferase activities; iii) direct formation of complex mucin-type core 3 by UDP-GlcNAc: GalNAcα β1,3GlcNAc-transferase activities; and iv) conversion of core 3 to core 4 by UDP-GlcNAc: GlcNAcβ1-3GalNAcα-R β1,6GlcNAc-transferase activity. The formation of β1,6GlcNAc branches (reactions ii and iv) may be considered a key controlling event of O-linked protein glycosylation leading to structures produced upon differentiation and malignant transformation (2-6). For example, increased formation of GlcNAcβ1-6GalNAc branching in O-glycans has been demonstrated during T-cell activation, during the development of leukemia, and for immunodeficiencies like Wiskott-Aldrich syndrome and AIDS (7; 8). Core 2 branching may play a role in tumor progression and metastasis (9). In contrast, many carcinomas show changes from complex O-glycans found in normal cell types to immaturely processed simple mucin-type O-glycans such as T (Thomsen-Friedenreich antigen; Galβ1-3GalNAcα1-R), Tn (GalNAcα1-R), and sialosyl-Tn (NeuAcα2-6GalNAcα1-R) (10). The molecular basis for this has been extensively studied in breast cancer, where it was shown that specific downregulation of a core 2 β6GlcNAc-transferase was responsible for the observed lack of complex type O-glycans on the mucin MUC1 (6).
O-glycan core assembly may therefore be controlled by inverse changes in the expression level of Core-β1,6-N-acetylglucosaminyltransferases and the sialyltransferases forming sialyl-T and sialyl-Tn.
Interestingly, the metastatic potential of tumors has been correlated with increased expression of core 2 β6GlcNAc-transferase activity (5). The increase in core 2 β6GlcNAc-transferase activity was associated with increased levels of poly N-acetyllactosamine chains carrying sialyl-Lex, which may contribute to tumor metastasis by altering selectin-mediated adhesion (4; 11). O-glycan core assembly is regulated by the expression of key enzyme activities; however, epigenetic factors including posttranslational modification, topology, or competition for substrates may also play a role in this process (11).
Changes in surface carbohydrates of T-cells have been identified during development and activation. O-glycan branches of the core 2 type are restricted to immature thymocytes of the thymic cortex but are no longer exposed on the surface of mature medullary thymocytes (17). Core 2 structures on T-cell surface proteins are ligands for the S-type lectin galectin-1, which participates in the interaction between thymocytes and thymic epithelia (18). The elimination of Core 2 structures from the thymocyte cell surface was found to be essential for controlled apoptosis mediated by galectin-1 (19).
Core 2 β6GlcNAc-transferase activity is carried out by more than one enzyme isoform. The first Core 2 β6GlcNAc-transferase isoform was initially identified as a critical enzyme in blood cell development and differentiation and designated leukocyte form or L-Form (C2GnT-L) (12). The gene encoding C2GnT-L has been cloned by expression cloning from a cDNA library of the human promyelocytic leukemia cell line HL-60 (13). This gene has now been renamed as C2GnT1 (14). Using the C2GnT1 sequence as a probe for BLAST analysis of the human expressed sequence tag database, a homologous gene encoding a second Core 2 β6GlcNAc-transferase isoform has been identified and designated C2/4GnT (15) and C2GnT-M (16). This gene has now been renamed as C2GnT2 (14).
C2GnT1 was predicted to control synthesis of core 2 selectin ligands in leukocytes and lymphoid tissues; however, mice deficient in C2GnT1 exhibited only partial reduction in selectin ligand production and no significant changes in lymphocyte homing properties (Ellies, L. G., et al. 1998, Immunity 9: 881-890). One possible explanation for these results would be the expression of additional Core 2 β6GlcNAc-transferases. C2GnT2 does not appear to be a candidate, as its expression pattern is restricted to mucus-secreting organs (15, 16).
Consequently, there exists a need in the art for detecting as yet unidentified UDP-N-acetylglucosamine: Galactose-β1,3-N-acetylgalactosamine-α-R (GlcNAc to GalNAc) β1-6 N-acetylglucosaminyltransferases and identifying the primary structures of the genes encoding such enzymes. The present invention meets this need, and further presents other related advantages.
The present invention provides isolated nucleic acids encoding human UDP-N-acetylglucosamine: N-acetylgalactosamine β1,6 N-acetylglucosaminyltransferase 3 (C2GnT3), including cDNA and genomic DNA. C2GnT3 has acceptor substrate specificities comparable to C2GnT1 (14). The complete nucleotide sequence encoding C2GnT3 is set forth in SEQ ID NO: 1 and in FIG. 1.
Variations in one or more nucleotides may exist among individuals within a population due to natural allelic variation. Any and all such nucleic acid variations are within the scope of the invention. DNA sequence polymorphisms may also occur which lead to changes in the amino acid sequence of a C2GnT3 polypeptide. These amino acid polymorphisms are also within the scope of the present invention. In addition, species variations, i.e., variations in nucleotide sequence naturally occurring among different species, are within the scope of the invention.
Among Core 2 β6GlcNAc-transferases, C2GnT3 appears to be the dominant isoform in thymus (14). Thus, C2GnT3 is likely to have important functions during thymocyte development as well as T-cell maturation and homing (14). The identification of agents with the ability to inhibit or stimulate C2GnT3 enzymatic activity therefore has potential diagnostic and therapeutic applications in related diseases.
Access to the gene encoding C2GnT3 allows production of a glycosyltransferase for use in formation of core 2-based O-glycan modifications on oligosaccharides, glycoproteins and glycosphingolipids. This enzyme can be used, for example, in pharmaceutical or other commercial applications that require synthetic addition of core 2-based O-glycans to these or other substrates, in order to produce appropriately glycosylated glycoconjugates having particular enzymatic, immunogenic, or other biological and/or physical properties.
In one aspect, the invention encompasses isolated nucleic acids comprising the nucleotide sequence of nucleotides 1-1362 as set forth in FIG. 1 or sequence-conservative or function-conservative variants thereof. Also provided are isolated nucleic acids hybridizable with nucleic acids having the sequence as set forth in FIG. 1 or fragments thereof or sequence-conservative or function-conservative variants thereof. Preferably, the nucleic acids are hybridizable with C2GnT3 sequences under conditions of intermediate stringency, and, most preferably, under conditions of high stringency. In one embodiment, the DNA sequence encodes the amino acid sequence shown in FIG. 1, from methionine (amino acid no. 1) to serine (amino acid no. 453). In another embodiment, the DNA sequence encodes an amino acid sequence comprising a sequence from proline (no. 39) to serine (no. 453) of the amino acid sequence set forth in FIG. 1.
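The residue and nucleotide coordinates recited above can be cross-checked with simple codon arithmetic. The following is a minimal sketch in Python and is not part of the disclosed sequence data. It assumes that nucleotide 1 of FIG. 1 is the first base of the initiating methionine codon and that the 1362-nucleotide span ends with a three-nucleotide stop codon; both points are assumptions made for illustration, and the function name `codon_span` is hypothetical.

```python
# Assumption: nucleotide 1 is the first base of the Met-1 codon (not confirmed here).
CODING_NUCLEOTIDES = 1362   # nucleotides 1-1362, as recited for FIG. 1
FULL_LENGTH_RESIDUES = 453  # Met-1 to Ser-453, as recited for FIG. 1
STOP_CODON_LENGTH = 3       # assumption: the span includes one stop codon


def codon_span(first_residue, last_residue):
    """Return the 1-based, inclusive nucleotide span encoding the given residues."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and non-empty")
    start = 3 * (first_residue - 1) + 1  # first base of the first codon
    end = 3 * last_residue               # last base of the last codon
    return start, end


def residue_count(first_residue, last_residue):
    """Number of residues in an inclusive residue range."""
    return last_residue - first_residue + 1


if __name__ == "__main__":
    full = codon_span(1, FULL_LENGTH_RESIDUES)
    fragment = codon_span(39, FULL_LENGTH_RESIDUES)
    print("full-length coding span:", full)
    print("Pro-39 to Ser-453 span:", fragment)
    print("fragment length (residues):", residue_count(39, FULL_LENGTH_RESIDUES))
    # Under the stated assumptions the codons plus a stop codon fill the span.
    print("consistent:", full[1] + STOP_CODON_LENGTH == CODING_NUCLEOTIDES)
```

Under these assumptions, 453 codons plus one stop codon account for all 1362 nucleotides, and the fragment beginning at proline 39 comprises 415 residues.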
In a related aspect, the invention provides nucleic acid vectors comprising C2GnT3 DNA sequences, including but not limited to those vectors in which the C2GnT3 DNA sequence is operably linked to a transcriptional regulatory element, with or without a polyadenylation sequence. Cells comprising these vectors are also provided, including without limitation transiently and stably expressing cells. Viruses, including bacteriophages, comprising C2GnT3-derived DNA sequences are also provided. The invention also encompasses methods for producing C2GnT3 polypeptides. Cell-based methods include without limitation those comprising: introducing into a host cell an isolated DNA molecule encoding C2GnT3, or a DNA construct comprising a DNA sequence encoding C2GnT3; growing the host cell under conditions suitable for C2GnT3 expression; and isolating C2GnT3 produced by the host cell. A method for generating a host cell with de novo stable expression of C2GnT3 comprises: introducing into a host cell an isolated DNA molecule encoding C2GnT3 or an enzymatically active fragment thereof (such as, for example, a polypeptide comprising amino acids 39-453 of the sequence set forth in FIG. 1), or a DNA construct comprising a DNA sequence encoding C2GnT3 or an enzymatically active fragment thereof; selecting and growing host cells in an appropriate medium; and identifying stably transfected cells expressing C2GnT3. The stably transfected cells may be used for the production of C2GnT3 enzyme for use as a catalyst and for recombinant production of peptides or proteins with appropriate glycosylation. For example, eukaryotic cells, whether normal or diseased cells, having their glycosylation pattern modified by stable transfection as above, or components of such cells, may be used to deliver specific glycoforms of glycopeptides and glycoproteins, for example as immunogens for vaccination.
In yet another aspect, the invention provides isolated C2GnT3 polypeptides, including without limitation polypeptides having the sequence set forth in FIG. 1, polypeptides having the sequence of amino acids 39-453 as set forth in FIG. 1, and a fusion polypeptide consisting of at least amino acids 39-453 as set forth in FIG. 1 fused in frame to a second sequence, which may be any sequence that is compatible with retention of C2GnT3 enzymatic activity in the fusion polypeptide.
Suitable second sequences include without limitation those comprising an affinity ligand or a reactive group.
In a related aspect, methods are disclosed for the identification of agents with the ability to inhibit or stimulate the enzymatic activity of C2GnT3. Assays utilizing C2GnT3 to screen for potential inhibitors or stimulators thereof are encompassed by the invention. Furthermore, methods of using C2GnT3 in the structure-based design of inhibitors or stimulators thereof are also an aspect of the invention. Such a design would comprise the steps of determining the three-dimensional structure of the C2GnT3 polypeptide, analyzing the three-dimensional structure for the likely binding sites of donor and/or acceptor substrates, synthesizing a molecule that incorporates a predicted reactive site, and determining the inhibiting or stimulating activity of the molecule.
In another aspect of the present invention, methods are disclosed for screening for mutations in the coding region of the C2GnT3 gene using genomic DNA isolated from, e.g., blood cells of patients. In one embodiment, the method comprises: isolation of DNA from a patient; PCR amplification of the coding exon; DNA sequencing of amplified exon DNA fragments; and establishing therefrom potential structural defects of the C2GnT3 gene associated with disease.
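The final step of the screening method, establishing potential structural defects from the sequenced exon, can be illustrated by comparing a patient-derived coding sequence with a reference coding sequence. The following Python sketch is illustrative only and is not the claimed method. It assumes the two sequences are already aligned, of equal length, in frame, and differ only by single-nucleotide substitutions; the function name `find_substitutions` and the toy sequences are hypothetical and are not taken from SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, built in TCAG order (one-letter amino acids, '*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_substitutions(reference, patient):
    """List single-nucleotide substitutions between two aligned coding sequences.

    Each entry is (nucleotide position, reference base, patient base,
    codon number, reference amino acid, patient amino acid); positions are 1-based.
    """
    reference = reference.upper()
    patient = patient.upper()
    if len(reference) != len(patient):
        raise ValueError("sequences must be aligned and of equal length")
    if len(reference) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    variants = []
    for index, (ref_base, alt_base) in enumerate(zip(reference, patient)):
        if ref_base == alt_base:
            continue
        codon_start = index - index % 3  # first base of the affected codon
        ref_codon = reference[codon_start:codon_start + 3]
        alt_codon = patient[codon_start:codon_start + 3]
        variants.append((
            index + 1,
            ref_base,
            alt_base,
            codon_start // 3 + 1,
            CODON_TABLE.get(ref_codon, "X"),  # 'X' marks ambiguous bases
            CODON_TABLE.get(alt_codon, "X"),
        ))
    return variants


if __name__ == "__main__":
    # Hypothetical toy sequences for illustration only (Met-Pro-Ser).
    toy_reference = "ATGCCCTCA"
    toy_patient = "ATGCTCTCA"
    for variant in find_substitutions(toy_reference, toy_patient):
        print(variant)
```

A substitution that changes the encoded amino acid is a candidate structural defect, whereas one that leaves it unchanged is synonymous. Insertions and deletions would require a proper alignment step, which this sketch deliberately omits.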
In accordance with an aspect of the invention there is provided a method of, and products for (i.e. kits), diagnosing and monitoring conditions mediated by C2GnT3 by determining, in a biological sample, the presence of nucleic acid molecules and polypeptides of the invention.
Still further, the invention provides a method for evaluating a test compound for its ability to modulate the biological activity of a C2GnT3 polypeptide of the invention. For example, a substance that inhibits or enhances the catalytic activity of a C2GnT3 polypeptide may be evaluated. “Modulate” refers to a change or an alteration in the biological activity of a polypeptide of the invention. Modulation may be an increase or a decrease in activity, a change in characteristics, or any other change in the biological, functional, or immunological properties of the polypeptide.
Compounds which modulate the biological activity of a polypeptide of the invention may also be identified using the methods of the invention by comparing the pattern and level of expression of a nucleic acid molecule or polypeptide of the invention in biological samples, tissues and cells in the presence and in the absence of the compounds.
In an embodiment of the invention a method is provided for screening a compound for effectiveness as an antagonist of a polypeptide of the invention, comprising the steps of a) contacting a sample containing said polypeptide with a compound, under conditions wherein antagonist activity of said polypeptide can be detected, and b) detecting antagonist activity in the sample.
Methods are also contemplated that identify compounds or substances (e.g. polypeptides), which interact with C2GnT3 nucleic acid regulatory sequences (e.g. promoter sequences, enhancer sequences, negative modulator sequences).
The nucleic acids, polypeptides, and substances and compounds identified using the methods of the invention, may be used to modulate the biological activity of a C2GnT3 polypeptide of the invention, and they may be used in the treatment of conditions mediated by C2GnT3 such as proliferative diseases including cancer, and thymus-related disorders. Accordingly, the nucleic acids, polypeptides, substances and compounds may be formulated into compositions for administration to individuals suffering from one or more of these conditions. Therefore, the present invention also relates to a composition comprising one or more of a polypeptide, nucleic acid molecule, or substance or compound identified using the methods of the invention, and a pharmaceutically acceptable carrier, excipient or diluent. A method for treating or preventing these conditions is also provided comprising administering to a patient in need thereof, a composition of the invention.
The present invention in another aspect provides means necessary for production of gene-based therapies directed at the thymus. These therapeutic agents may take the form of polynucleotides comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of a C2GnT3 nucleic acid placed in appropriate vectors or delivered to target cells in more direct ways.
Having provided a novel C2GnT3, and nucleic acids encoding same, the invention accordingly further provides methods for preparing oligosaccharides. In specific embodiments, the invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising a donor substrate, and an acceptor substrate in the presence of a C2GnT3 polypeptide of the invention.
In accordance with a further aspect of the invention, there are provided processes for utilizing polypeptides or nucleic acid molecules, for in vitro purposes related to scientific research, synthesis of DNA, and manufacture of vectors.
These and other aspects of the present invention will become evident upon reference to the following detailed description and drawings.