Secretory granule proteoglycans comprise one of several families of mammalian proteoglycans, all of which are highly acidic macromolecules possessing at least one sulfated glycosaminoglycan chain covalently bound to a peptide core. Tantravahi, R. V., et al., Proc. Natl. Acad. Sci. U.S.A. 83:9207 (1986). Various proteoglycan families have been proposed in the human based on differential reactivity to panels of monoclonal antibodies, differential peptide mapping after proteolytic treatment, different sizes after translation, and, in some cases, different amino acid sequences. They are further differentiated by their eventual location in the cell (e.g., intracellular, extracellular, and/or pericellular matrix) and by the nature of the carbohydrate residues attached to their peptide cores. For instance, proteoglycans that are localized to extracellular matrices appear to belong to a family that is distinct from the subfamily of more hydrophobic proteoglycans intercalated into the plasma membrane. Tantravahi, R. V., et al., supra.
Although there are at least 12 distinct proteins in the human that exist as proteoglycan peptide cores, the complete primary structure and/or chromosomal location of only a few of these proteoglycans are presently known and no gene has been isolated that encodes the complete peptide core of any proteoglycan.
A cDNA that encodes the dermatan sulfate proteoglycan peptide core that resides in the extracellular matrix around fibroblasts has been cloned from a human embryonic fibroblast cell line. Krusius, T., and Ruoslahti, E., Proc. Natl. Acad. Sci. U.S.A. 83:7683 (1986). Its nucleotide sequence predicts a peptide core of 40,000 Mᵣ with three potential glycosaminoglycan initiation sites and three potential sites for N-linked oligosaccharides.
Cell surface glycoproteins may also exist as proteoglycans. For example, the transferrin receptor and the invariant chain of the class II antigens have been reported to be proteoglycans on the plasma membrane of human skin fibroblasts and human lymphoid tissues, respectively. Giacoletto, K. S., et al., J. Exp. Med. 164:1422 (1986). The amino acid sequences deduced from the cDNAs that encode the transferrin receptor and the invariant chain have revealed serine-glycine glycosaminoglycan initiation sites. Another proteoglycan, which contains chondroitin sulfate glycosaminoglycans, has been described on the surfaces of human melanoma cells. Bumol, T. F., and Reisfeld, R. A., Proc. Natl. Acad. Sci. U.S.A. 79:1245-1249 (1982). The gene that encodes its peptide core, which exceeds 240,000 Mᵣ, is predicted to reside on chromosome 15. Rettig, W. J., et al., Science 231:1281 (1986).
The first clear evidence for the existence of proteoglycans stored within the granules of a cell was obtained from studies of the rat skin mast cell. However, it has become increasingly apparent that a number of cells which participate in immune and inflammatory responses, including mucosal mast cells, basophils, eosinophils, neutrophils, macrophages, platelets, and natural killer cells, also contain proteoglycans in their granules. The presence of a family of proteoglycans that resides inside cells rather than on the plasma membrane or in the extracellular matrix suggests that these molecules may be important in the functions of such cells for tumor surveillance and host defense against bacterial, viral, fungal, and parasitic pathogens. Stevens, R. L., "Intracellular Proteoglycans in Cells of the Immune System," In: Biology of Proteoglycans (1987), herein incorporated by reference, reviews the evidence for the localization of the proteoglycans of mast cells, basophils, and natural killer cells within the secretory granule, the unique structural features of these proteoglycans, and their possible functions in the immune response.
In Bourdon et al., Proc. Natl. Acad. Sci. U.S.A. 82:1322 (1985), a rat chondroitin sulfate proteoglycan peptide core cDNA was identified and sequenced. The selection of the cDNA clone pPG-1 from a cDNA library prepared from L2 rat yolk sac tumor poly(A)⁺ mRNA was accomplished by using oligonucleotides derived from two regions of the NH₂-terminal protein sequence of the L2 proteoglycan. The use of oligonucleotides from two different parts of the NH₂-terminal amino acid sequence was essential for the identification of the desired cDNA clones. Several clones that hybridized with one or the other of the 17-mer oligonucleotides but not the 11-mer were obtained. One of these clones was partially sequenced. The resulting sequence did not have complete homology with the appropriate 17-mer probe and did not code for the NH₂-terminal peptide sequence of the proteoglycan.
The amino acid sequence inferred from the pPG-1 proteoglycan peptide core cDNA clone revealed the complete primary structure of the mature proteoglycan peptide core produced by this rat tumor cell. The proteoglycan peptide core coding region, identified on the basis of inferred amino acid sequence homology with the proteoglycan peptide core NH₂-terminal amino acid sequence, codes for a 104 amino acid core protein with a calculated molecular weight of 10,190 daltons.
The amino acid sequence of the rat L2 cell proteoglycan peptide core contains three structural regions: a 14 amino acid NH₂-terminal region, followed by a 49 amino acid serine-glycine repeat region and a 41 amino acid COOH-terminal region. The functions of the NH₂- and COOH-terminal regions are unknown, although it is thought that they play a role in determining interactions between the proteoglycan and both cell surfaces and extracellular molecules.
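The region lengths and the calculated mass quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the numbers come from the text, while the names `REGIONS`, `region_spans`, and the residue numbering (1-based, starting at the mature NH2 terminus) are assumptions made for the example. The mean residue mass it reports, roughly 98 daltons, lies between the residue masses of glycine (about 57 daltons) and serine (about 87 daltons) on one side and the figure of about 110 daltons often used for typical proteins on the other, which is in line with a core rich in small residues.

```python
# Region lengths and calculated mass as quoted in the text for the rat L2 core.
# Dictionary order is the NH2 -> COOH order of the regions.
REGIONS = {
    "NH2-terminal": 14,
    "serine-glycine repeat": 49,
    "COOH-terminal": 41,
}
CORE_LENGTH = 104      # amino acids, from the text
CORE_MASS_DA = 10190   # calculated molecular weight in daltons, from the text


def region_spans(regions):
    """Return 1-based inclusive (start, end) residue spans for each region.

    The numbering scheme is an assumption for illustration only.
    """
    spans = {}
    start = 1
    for name, length in regions.items():
        spans[name] = (start, start + length - 1)
        start += length
    return spans


spans = region_spans(REGIONS)
total_length = sum(REGIONS.values())
mean_residue_mass = CORE_MASS_DA / CORE_LENGTH

for name, (start, end) in spans.items():
    print(f"{name}: residues {start}-{end}")
print(f"total length: {total_length}")
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

The three regions tile the 104-residue core with no gap or overlap, so the quoted lengths are mutually consistent.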
It is also thought that the function of the serine-glycine repeat region in the middle of the molecule is to serve as a recognition and receptor site for the attachment of chondroitin sulfate side chains. The attachment of chondroitin sulfate and heparin chains onto all proteoglycan cores is accomplished via O-glycosyl linkage to serine. Moreover, it is known that glycine residues are also involved in glycosaminoglycan attachment, since glycine is abundant in proteoglycans, and synthetic peptides containing alternating serine and glycine can serve as acceptors for glycosaminoglycan chain initiation. The extent of serine O-glycosylation in the L2 proteoglycan has been estimated to be near 60%, indicating that at least 14 of the serine residues of the core protein bear a chondroitin sulfate chain. Bourdon et al. noted that the structure of the serine-glycine region closely parallels that predicted for the glycosaminoglycan attachment region of a rat heparin proteoglycan, indicating that the serine-glycine repeat may be a general feature of at least a subset of proteoglycans. Because the pronase-resistant glycosaminoglycan attachment region of rat mast cell heparin proteoglycan contains only serine and glycine amino acids, it was proposed that 15-20 serine residues alternate with glycine residues, with heparin chains being attached to at least two of every three serines. It would appear that at least two proteoglycans, the rat mast cell heparin proteoglycan and the rat yolk sac tumor proteoglycan, have identical or nearly identical glycosaminoglycan attachment regions. The codon usage for serine and glycine is quite restricted in the pPG-1 cDNA, with 81% of the serine residues being coded for by two of six possible codons and 70% of the glycine residues being coded for by one of its four possible codons.
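A codon-usage bias of the kind described for pPG-1 can be tallied as in the sketch below. It is a minimal illustration, not the method used by Bourdon et al.: the codon sets are those of the standard genetic code, but the function names and the short `toy_cds` sequence are hypothetical and are not taken from pPG-1. The sequence was simply constructed so that a few codons dominate.

```python
from collections import Counter

# Standard genetic code: six serine codons and four glycine codons.
SER_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}


def codon_usage(cds, codons):
    """Count in-frame occurrences of each codon in `codons` within `cds`.

    `cds` is read from its first base in frame 0; a trailing partial
    codon is ignored. RNA input (U) is accepted and treated as T.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    triplets = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(t for t in triplets if t in codons)


def top_fraction(usage, n):
    """Fraction of residues encoded by the `n` most frequently used codons."""
    total = sum(usage.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in usage.most_common(n)) / total


# Hypothetical serine-glycine repeat coding sequence (NOT the pPG-1 sequence).
toy_cds = "AGTGGC" * 4 + "TCTGGC" * 3 + "TCAGGA" * 2 + "AGCGGT"

ser_usage = codon_usage(toy_cds, SER_CODONS)
gly_usage = codon_usage(toy_cds, GLY_CODONS)

print("serine, top 2 of 6 codons:", top_fraction(ser_usage, 2))
print("glycine, top 1 of 4 codons:", top_fraction(gly_usage, 1))
```

Applied to a real coding sequence, `top_fraction(ser_usage, 2)` and `top_fraction(gly_usage, 1)` would correspond to the two percentages quoted in the text, provided the sequence is supplied in the correct reading frame.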
In addition to chondroitin sulfate side chains, many other proteoglycans have O- and N-linked oligosaccharides. These oligosaccharide chains are linked to the proteoglycan peptide core through either threonine (serine) O-glycosyl or asparagine N-glycosyl acceptor recognition sequences. An examination of the rat L2 cell proteoglycan peptide core amino acid sequence does not reveal any asparagine oligosaccharide acceptor sites that have the sequence X-asparagine-Y-serine. Bourdon, M. A. et al., supra; Hughes, R. C., Prog. Biophys. Mol. Biol. 26:189-268 (1973).
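An examination of a peptide sequence for asparagine-linked acceptor sites can be sketched as a simple pattern scan. The example below is an illustration under stated assumptions: it searches for the commonly cited N-glycosylation consensus Asn-X-Ser/Thr with X being any residue except proline, which is broader than the serine-only pattern named in the text. The function name and the test peptides are hypothetical and are not the L2 sequence.

```python
import re

# Commonly cited N-glycosylation consensus: Asn, any residue except Pro,
# then Ser or Thr. This pattern is an assumption for illustration.
# The lookahead lets overlapping candidate sites all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glycosylation_sites(peptide):
    """Return 0-based positions of Asn residues that start a candidate sequon.

    `peptide` is a string of one-letter amino acid codes.
    """
    return [match.start() for match in SEQUON.finditer(peptide.upper())]


# Hypothetical peptides, not taken from any sequence in the text.
with_sites = "MNASGNPSGSGNGT"
ser_gly_only = "SGSGSGSGSGSG"

print(find_n_glycosylation_sites(with_sites))    # candidate Asn positions
print(find_n_glycosylation_sites(ser_gly_only))  # no asparagine, no sites
```

A peptide for which the scan returns an empty list, as with the all serine-glycine example, has no candidate asparagine acceptor sites under this pattern. A match only indicates a candidate site; whether it is actually glycosylated cannot be decided from sequence alone.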
In Stevens et al., J. Biol. Chem. 260:14194-14200 (1985), the applicant and several other investigators report an analysis of the structure of the protein of the intracellular chondroitin sulfate E proteoglycan from the interleukin 3-dependent mouse mast cell. The analysis revealed that the sum of the glycine, serine, and glutamic acid/glutamine residues accounted for 70% of the total amino acids in the core peptide. The authors note that the mouse mast cell chondroitin sulfate E proteoglycan is similar to the rat serosal mast cell heparin proteoglycan in that both are packaged in secretory granules, are protease-resistant, have highly sulfated glycosaminoglycans, and have peptide cores that are rich in serine and glycine. They also noted that the mouse chondroitin sulfate E proteoglycan has a peptide core somewhat similar to that of the chondroitin sulfate proteoglycan isolated by Bourdon et al., supra, from the rat yolk sac tumor in size and in Ser, Gly, and Glx content.
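The kind of composition figure reported in that analysis, the combined share of Gly, Ser, and Glx residues, can be computed from a sequence as in the following sketch. It is illustrative only: the function name and the 20-residue `toy_core` peptide are hypothetical, and the peptide was constructed so that the combined fraction comes out at 70%. Glutamic acid (E) and glutamine (Q) are pooled as Glx, since the text reports them together.

```python
from collections import Counter


def combined_fraction(peptide, residues="GSEQ"):
    """Fraction of `peptide` made up of the given one-letter residues.

    The default pools Gly (G), Ser (S), and Glx (E plus Q).
    """
    peptide = peptide.upper()
    if not peptide:
        return 0.0
    counts = Counter(peptide)
    return sum(counts[r] for r in set(residues.upper())) / len(peptide)


# Hypothetical 20-residue peptide, not a sequence from the text.
toy_core = "SGSGSGSGEQSGSGAKLDVP"

fraction = combined_fraction(toy_core)
print(f"Gly + Ser + Glx: {fraction:.0%} of {len(toy_core)} residues")
```

Passing a different `residues` string, for example `"GS"`, gives the serine plus glycine share alone, which is the quantity compared between the mouse and rat peptide cores.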
Tantravahi et al., Proc. Natl. Acad. Sci. U.S.A. 83:9207-9210 (1986), reported on the use of the pPG-1 probe disclosed in Bourdon et al., supra, and subclones thereof to demonstrate that the same gene is expressed in mouse bone marrow-derived mast cells (BMMC), which contain intracellular chondroitin sulfate E proteoglycan, and in rat serosal mast cells, which contain intracellular heparin proteoglycan.