1. Field of the Invention
This invention relates to the field of proteoglycans and of cell surface receptors for biological effector molecules, more particularly the use of genetic engineering to define a class of proteoglycans and their constituent functional domains, particularly their glycosaminoglycan attachment regions. The invention includes the use of recombinant DNA vectors to produce proteins in prokaryotic cells and proteoglycans in eukaryotic cells, and a variety of techniques to link the functional domains to biological effector molecules, cell surface receptors, drugs, antibodies, diagnostic agents, and components of microorganisms.
2. Description of the Background
The cellular behavior responsible for the development, repair and maintenance of tissues is regulated, in large part, by interactions between cells and components of their microenvironment. These interactions are mediated by cell surface molecules acting as receptors that bind large insoluble matrix molecules, growth factors, enzymes, and other molecules that induce responses which result in changes of cellular phenotype. Several proteins associated with the cell surface can bind these components. These proteins differ in their specificity and affinity and in their mode of association with the cell surface.
The present inventors have studied a lipophilic proteoglycan containing both heparan sulfate and chondroitin sulfate that is found at the surface of mouse mammary epithelial cells and that behaves as a high affinity receptor specific for multiple components of the interstitial matrix. This proteoglycan has been given the name syndecan-1. The proteoglycan binds the epithelial cells via its heparan sulfate chains to collagen types I, III, and V (Koda, J. E., Rapraeger, A., and Bernfield, M., J. Biol. Chem. (1985) 260: 8157-8162), fibronectin (Saunders, S. and Bernfield, M., J. Cell Biol. (1988) 106: 423-430), and thrombospondin. When its extracellular domain (ectodomain) is cross-linked at the cell surface, it associates intracellularly with the actin cytoskeleton, and the isolated proteoglycan binds directly or indirectly to F-actin (Rapraeger, A., and Bernfield, M., J. Biol. Chem. (1985) 260: 4103-4109). Cultured cells shed the ectodomain from their apical surfaces as a nonlipophilic proteoglycan that contains all of the glycosaminoglycan of the intact molecule. Upon suspension of these cells, the extracellular domain is cleaved from the cell surface; the proteoglycan is not replaced while the cells are suspended (Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096). The proteoglycan is mainly on epithelia in mature tissues (Hayashi, K., Hayashi, M., Jalkanen, M., Firestone, J. H., Trelstad, R. L., and Bernfield, M., J. Histochem. Cytochem. (1987) 35: 1079-1088).
Syndecan-1 undergoes substantial regulation; its size, glycosaminoglycan composition and location at the cell surface vary between cell types, and its expression changes during development. The proteoglycan is located exclusively at the basolateral cell surface of simple epithelia but surrounds stratified epithelial cells. At basolateral cell surfaces, it appears to contain two heparan sulfate and two chondroitin sulfate chains, but where it surrounds cells, it contains only a single heparan sulfate chain and a single small chondroitin sulfate chain (Sanderson, R. D., and Bernfield, M., Proc. Natl. Acad. Sci. USA (1987) 238: 491-497). In self-renewing epithelial cell populations, such as the epidermis or vagina, the proteoglycan is lost when the cells terminally differentiate (Hayashi, K., Hayashi, M., Boutin, E., Cunha, G. R., Bernfield, M., and Trelstad, R. L., J. Lab. Invest. (1988) 58: 68-76). In embryos, the proteoglycan is transiently lost when epithelia change their shape and is transiently expressed by mesenchymal cells undergoing morphogenetic tissue interaction.
Heparan sulfate proteoglycans are ubiquitous on the surfaces of adherent cells and bind various ligands including extracellular matrix, growth factors, proteinase inhibitors, and lipoprotein lipase; see Fransson, L., Trends Biochem. Sci. (1987) 12: 406-411, and Bernfield et al. (1992) Annu. Rev. Cell. Biol. 8:365-93. However, despite much study of these molecules, no core protein structure was known for any such cell surface proteoglycan prior to this invention.
For general background on genetic engineering, see Watson, J. D., The Molecular Biology of the Gene, 4th Ed., Benjamin, Menlo Park, Calif., (1988).
Accordingly, it is an object of this invention to provide eukaryotic cells capable of providing useful quantities of syndecan and proteins of similar function from multiple species.
It is a further object of this invention to provide a recombinant DNA vector containing a heterologous segment encoding syndecan-1 or a related protein that is capable of being inserted into a microorganism or eukaryotic cell and expressing the encoded protein.
It is still another object of this invention to provide a DNA or RNA segment of defined structure that can be produced synthetically or isolated from natural sources and that can be used in the production of the desired recombinant DNA vectors or that can be used to recover related genes from other sources.
It is yet another object of this invention to provide a peptide that can be produced synthetically in a laboratory or by a microorganism which will mimic the activity of natural syndecan-1 core protein and which can be used to produce proteoglycans and glycosaminoglycans in eukaryotic cells in a reproducible and standardized manner.
It is yet a further object of this invention to provide novel heparan sulfate attachment sequences which are identified by combinatorial mutagenesis.
It is another object of this invention to provide chimeric molecules which comprise at least a heparan sulfate glycosaminoglycan chain derived from a syndecan. The chimeric molecule can be, by way of illustration, a fusion protein which includes a functional heparan sulfate attachment sequence placed into other proteins which normally do not have heparan sulfate glycosaminoglycan chains.
It is yet a further object of this invention to provide therapeutic agents comprising heparan sulfate glycosaminoglycans to act agonistically or antagonistically to a biological activity.
These and other objects of the invention as will hereinafter become more readily apparent have been accomplished by providing an isolated proteoglycan having a core polypeptide molecular weight of about 30 kD to about 35 kD, and comprising a hydrophilic amino terminal extracellular region, a hydrophilic carboxy terminal cytoplasmic region, a transmembrane hydrophobic region between said cytoplasmic and extracellular regions, a protease susceptible cleavage sequence extracellularly adjacent the transmembrane region of the peptide, and at least one glycosylation site for attachment of a heparan sulfate chain to said extracellular region, said glycosylation site comprising a heparan sulfate attachment sequence represented by a formula Xac-Z-Ser-Gly-Ser-Gly SEQ ID NO. [44], where Xac represents an amino acid residue having an acidic sidechain, and Z represents from 1 to 10 amino acid residues. The proteoglycan can include at least one heparan sulfate glycosaminoglycan attached at said glycosylation site, as well as at least one chondroitin sulfate glycosaminoglycan attached at other sites on the protein.
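By way of illustration only, the attachment sequence formula Xac-Z-Ser-Gly-Ser-Gly can be expressed as a simple search over a one-letter amino acid string. The sketch below is not part of the disclosed methods; the function name find_attachment_sites is hypothetical, and it assumes that Xac is limited to aspartic acid (D) or glutamic acid (E). The test fragment is taken from the region P-P-E-D-Q-D-G-S-G-D-D-S-D-N-F-S-G-S-G-T-G of the peptide formulas set out in this section.

```python
import re

# Assumption: "acidic sidechain" (Xac) is taken to mean Asp (D) or Glu (E).
ACIDIC = set("DE")

def find_attachment_sites(seq, min_gap=1, max_gap=10):
    """Return start indices of Ser-Gly-Ser-Gly runs that are preceded by an
    acidic residue with min_gap..max_gap residues (Z) in between."""
    hits = []
    # A lookahead is used so that overlapping SGSG runs are all visited.
    for match in re.finditer(r"(?=SGSG)", seq):
        i = match.start()
        # Xac may sit at any index p with min_gap <= i - p - 1 <= max_gap.
        lo = max(0, i - max_gap - 1)
        hi = max(0, i - min_gap)  # exclusive end; clamped so it never goes negative
        if ACIDIC & set(seq[lo:hi]):
            hits.append(i)
    return hits

fragment = "PPEDQDGSGDDSDNFSGSGTG"
print(find_attachment_sites(fragment))
```

A run such as D-S-G-S-G is not reported by this sketch, because the formula requires Z to contain at least one residue.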
Particularly preferred are peptides of
(a) a first formula:
M-R-R-A-A-L-W-L-W-L-C-A-L-A-L-R-L-Q-P-A-L-P-Q-I-V-A-V-N-V-P-P-E-D-Q-D-G-S-G-D-D-S-D-N-F-S-G-S-G-T-G-A-L-P-D-T-L-S-R-Q-T-P-S-T-W-K-D-V-W-L-L-T-A-T-P-T-A-P-E-P-T-S-S-N-T-E-T-A-F-T-S-V-L-P-A-G-E-K-P-E-E-G-E-P-V-L-H-V-E-A-E-P-G-F-T-A-R-D-K-E-K-E-V-T-T-R-P-R-E-T-V-Q-L-P-I-T-Q-R-A-S-T-V-R-V-T-T-A-Q-A-A-V-T-S-H-P-H-G-G-M-Q-P-G-L-H-E-T-S-A-P-T-A-P-G-Q-P-D-H-Q-P-P-R-V-E-G-G-G-T-S-V-I-K-E-V-V-E-D-G-T-A-N-Q-L-P-A-G-E-G-S-G-E-Q-D-F-T-F-E-T-S-G-E-N-T-A-V-A-A-V-E-P-G-L-R-N-Q-P-P-V-D-E-G-A-T-G-A-S-Q-S-L-L-D-R-K-E-V-L-G-G-V-I-A-G-G-L-V-G-L-I-F-A-V-C-L-V-A-F-M-L-Y-R-M-K-K-K-D-E-G-S-Y-S-L-E-E-P-K-Q-A-N-G-G-A-Y-Q-K-P-T-K-Q-E-E-F-Y-A amino acids 23-311 of SEQ ID NO. 2
(b) a second formula:
Q-I-V-A-V-N-V-P-P-E-D-Q-D-G-S-G-D-D-S-D-N-F-S-G-S-G-T-G-A-L-P-D-T-L-S-R-Q-T-P-S-T-W-K-D-V-W-L-L-T-A-T-P-T-A-P-E-P-T-S-S-N-T-E-T-A-F-T-S-V-L-P-A-G-E-K-P-E-E-G-E-P-V-L-H-V-E-A-E-P-G-F-T-A-R-D-K-E-K-E-V-T-T-R-P-R-E-T-V-Q-L-P-I-T-Q-R-A-S-T-V-R-V-T-T-A-Q-A-A-V-T-S-H-P-H-G-G-M-Q-P-G-L-H-E-T-S-A-P-T-A-P-G-Q-P-D-H-Q-P-P-R-V-E-G-G-G-T-S-V-I-K-E-V-V-E-D-G-T-A-N-Q-L-P-A-G-E-G-S-G-E-Q-D-F-T-F-E-T-S-G-E-N-T-A-V-A-A-V-E-P-G-L-R-N-Q-P-P-V-D-E-G-A-T-G-A-S-Q-S-L-L-D-R-K-E-V-L-G-G-V-I-A-G-G-L-V-G-L-I-F-A-V-C-L-V-A-F-M-L-Y-R-M-K-K-K-D-E-G-S-Y-S-L-E-E-P-K-Q-A-N-G-G-A-Y-Q-K-P-T-K-E-E-F-Y-A amino acids 23-311 of SEQ ID NO. 2
(c) a third formula in which at least one amino acid in said first formula or said second formula is replaced by a different amino acid, with the proviso that the replacements do not substantially alter attachment of a syndecan heparan sulfate glycosaminoglycan chain to the proteoglycan,
(d) a fourth formula in which from 1 to 15 amino acids are absent from either the amino terminal, the carboxy terminal, or both terminals of said first formula, said second formula, or said third formula, or
(e) a fifth formula in which from 1 to 10 additional amino acids are attached sequentially to the amino terminal, carboxy terminal, or both terminals of said first formula, said second formula, or said third formula,
as well as salts of compounds having said formulas.
DNA and RNA molecules, recombinant DNA vectors, and modified microorganisms or eukaryotic cells comprising a nucleotide sequence that encodes any of the peptides indicated above are also part of the present invention. In particular, sequences comprising all or part of the following DNA sequence, a complementary DNA or RNA sequence, or a corresponding RNA sequence are especially preferred:
ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTG CCCTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGA TGACTCTGACAACTTCTCTGGCTCTGGCACAGGTGCTTTGCCAGATACTTTGTCACGG CAGACACCTTCCACTTGGAAGGACGTGTGGCTGTTGACAGCCACGCCCACAGCTCCAG AGCCCACCAGCAGCAACACCGAGACTGCTTTTACCTCTGTCCTGCCAGCCGGAGAGAA GCCCGAGGAGGGAGAGCCTGTGCTCCATGTAGAAGCAGAGCCTGGCTTCACTGCTCCG GACAAGGAAAGGAGGTCACCACCAGGCCCAGGGAGACCGTGCAGCTCCCCATCACCCA ACGGGCCTCAACAGTCAGAGTCACCACAGCCCAGGCAGCTGTCACATCTCATCCGCAC GGGGGCATGCAACCTGGCCTCCATGAGACCTCGGCTCCCACAGCACCTGGTCAACCTG ACCATCAGCCTCCACGTGTGGAGGGTGGCGGCACTTCTGTCATCAAAGAGGTTGTCGA GGATGGAACTGCCAATCAGCTTCCCGCAGGAGAGGGCTCTGGAGAACAAGACTTCACC TTTGAAACATCTGGGGAGAACCAGCTGTGGCTGCCGTAGAGCCCGGCCTGCGGAATCA GCCCCCGGTGGACGAAGGAGCCCAGGTGCTTCTCAGAGCCTTTTGGACAGGAAGGAAG TGCTCCCACCTCTCATTGCCGGAGCCTAGTGGGCCTCATCTTTGCTGTGTGCCTGGTG GCTTTCATGCTGTACCGGATGAAGAGAAGGACGAAGGCAGCTACTCCTTCCAGGAGCC CAAACAAGCCAATGGCGGTGCCTACAAACCCACCAAGCAGGAGGAGTTCTACGCC amino acids 240-1172 of SEQ ID NO. 1
DNA and RNA molecules containing segments of the larger sequence are also provided for use in carrying out preferred aspects of the invention relating to the production of such peptides by the techniques of genetic engineering and the production of oligonucleotide probes.
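The relationship between the DNA sequence and the peptide formulas can be checked with a short translation routine. The following is a minimal sketch that assumes the standard genetic code; the names translate and reverse_complement are hypothetical helpers and are not drawn from this disclosure. Only the opening 45 nucleotides of the sequence listed above are used.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string in frame 1; whitespace is ignored and any
    trailing partial codon is dropped. '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

# First 45 nucleotides of the DNA sequence given above.
start = "ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTG"
print(translate(start))               # MRRAALWLWLCALAL
print(reverse_complement("ATGAGA"))   # TCTCAT
```

The translated residues correspond to the opening M-R-R-A-A-L-W-L-W-L-C-A-L-A-L of the first formula. A routine of this kind can also serve as a first check when selecting segments of the larger sequence for oligonucleotide probes.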