Enzymatic protein glycosylation involves an initiation stage in which glycosyltransferases catalyze the addition of a monosaccharide, or, in the case of asparagine N-linked glycosylation, a preformed oligosaccharide, to an amino acid residue in a given protein. The initiation step of protein glycosylation may be considered the key controlling event leading to the formation of a given glycopeptide linkage (glycoconjugate type), and it involves the essential recognition events between protein and glycosyltransferase which determine the specific sites of glycan attachment. Processing of glycan chains involves the cooperative action of a subset of the estimated hundreds of different glycosyltransferases, each successively adding a monosaccharide to the growing glycan chain. Identification and characterization of the glycan structures of glycoproteins, as well as the specific sites of glycan attachment, are important for understanding the structure of a given glycoprotein, its function, and its immunobiology.
The glycosylation of serine and threonine residues during mucin-type O-linked protein glycosylation is catalyzed by a family of polypeptide GalNAc-transferases (EC 2.4.1.41). Two distinct human GalNAc-transferase genes, GalNAc-T1 and -T2, have previously been cloned and characterized (Homa et al. 1993; Hagen et al. 1993; White et al. 1995). In preliminary studies, the specificity of recombinant GalNAc-T1 and -T2 with respect to polypeptide acceptors (i.e., acceptor substrates) has been analyzed. Comparison of the total acceptor substrate specificity of recombinant GalNAc-T1 and -T2 with the substrate specificities previously described in extracts of various organs showed that several peptides served as substrates only for GalNAc-transferase enzymes present in the organ extracts (Sorensen et al. 1995).
Matsuura et al. (1988) reported a tumor-associated de novo O-glycosylation of fibronectin in the IIICS region with the peptide sequence VTHPGY (SEQ ID NO:3). In a later study, Matsuura et al. (1989) reported that O-glycosylation of this epitope was achieved only by transferase-containing extracts from fetal tissue or tumor tissue and not by extracts from normal tissue. Recombinant GalNAc-T1 and GalNAc-T2 have not been found to catalyze O-glycosylation of this peptide sequence.
A peptide derived from the Human Immunodeficiency Virus (HIV IIIB) gp120 (GRAFVTIGKIG; SEQ ID NO:4) was found to be an effective acceptor substrate for crude GalNAc-transferase extracts from several organs (Sorensen et al. 1995). However, purified GalNAc-T2 (Clausen et al. 1994; Sorensen et al. 1995) and recombinant GalNAc-T1 and GalNAc-T2 did not catalyze glycosylation of this substrate. These findings implicate additional GalNAc-transferases.
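As an illustration only, the candidate acceptor residues in the two peptides discussed above can be located with a short sketch. It assumes nothing beyond the statement that mucin-type O-glycosylation occurs at serine and threonine residues; the function name ser_thr_positions is hypothetical, and the presence of a serine or threonine does not by itself predict that a given GalNAc-transferase will glycosylate it.

```python
# Illustrative sketch: list serine (S) and threonine (T) residues in a peptide.
# These are only *candidate* sites for mucin-type O-glycosylation; whether a
# site is actually used depends on the GalNAc-transferase, which this code
# does not model.

def ser_thr_positions(peptide):
    """Return (1-based position, residue) pairs for every S or T in the peptide."""
    return [
        (index, residue)
        for index, residue in enumerate(peptide.upper(), start=1)
        if residue in "ST"
    ]


if __name__ == "__main__":
    # Peptide sequences as given in the text (one-letter amino acid code).
    peptides = {
        "fibronectin IIICS (SEQ ID NO:3)": "VTHPGY",
        "HIV gp120 (SEQ ID NO:4)": "GRAFVTIGKIG",
    }
    for name, sequence in peptides.items():
        print(name, sequence, ser_thr_positions(sequence))
```

Each of the two peptides contains a single threonine and no serine, so in both cases the threonine is the only candidate site for GalNAc attachment.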
Families of glycosyltransferases with related but distinct acceptor and/or donor substrate specificities may be encoded by homologous genes showing segments of sequence similarity (Schachter, 1994; Kleene et al., 1993). The human GalNAc-transferases T1 and T2 share a segment of 61 amino acids with 82% sequence similarity, and this segment is also found in a homologous gene from C. elegans (White et al. 1995; EMBL accession # L16621).
At present, knowledge of the key controlling event of initiation of O-glycosylation of proteins is limited to the involvement of two GalNAc-transferase genes, GalNAc-T1 and GalNAc-T2, and their encoded enzymes. The action of the two hitherto identified enzymes does not account for all observed O-glycosylation, with the fibronectin and HIV gp120 peptides being notable examples of O-glycosylation not mediated by said enzymes. Access to additional existing GalNAc-transferase genes would allow production of enzymes capable of performing such O-glycosylation initiation. Such enzymes could be used, for example, in pharmaceutical or other commercial applications that require synthetic O-glycosylation of these or other substrates that are not acted upon by GalNAc-T1 or -T2, in order to produce glycosylated polypeptides having particular enzymatic, immunogenic, or other biological and/or physical properties.
Consequently, there exists a need in the art for additional UDP-N-acetyl-α-D-galactosamine:polypeptide N-acetylgalactosaminyltransferases and the primary structure of the genes encoding these enzymes. The present invention meets this need, and further presents other related advantages.