This invention relates to the field of genetic engineering, and, in particular, relates to the nucleotide sequence of a novel human gene. More particularly, this invention relates to the cDNA sequence of a novel human Growth/Differentiation Factor (hGDF3-2), which is a splice variant of hGDF3. The invention also relates to the polypeptides encoded by the nucleotide sequence, the uses of these polynucleotides and polypeptides, and the methods for producing them.
Transforming Growth Factor-β (TGF-β) was discovered about 15 years ago by biochemical means. It is a protein with many biological regulatory activities. Shortly after its discovery, TGF-β was found to represent a group of growth factors with various functions. In different organisms, these factors exert important regulatory functions on cell growth, differentiation and tissue morphogenesis (Handbook of Experimental Pharmacology, 1990, Vol. 95, p. 419-475, Springer Verlag, Heidelberg). TGF-β, together with these related proteins, forms a superfamily named the TGF-β superfamily. To date, the TGF-β superfamily has over 30 different members. There are four major families in the TGF-β superfamily (Proc Soc Exp Biol Med, 1997, 214(1), 27-40): (1) the Mullerian inhibitory substance (MIS) family, in which MIS regulates Mullerian duct regression in male embryos; (2) the inhibin/activin family, in which inhibins block the release of follicle stimulating hormone (FSH) by pituitary cells and activins stimulate FSH release; (3) the Vg-related family, which includes bone morphogenetic protein (BMP), dorsalin-1 (which regulates the differentiation of the neural tube), growth/differentiation factor GDF-1, DPP, Vg1 in Xenopus and the murine homolog Vgr-1, etc.; (4) the TGF-β family, which includes five isoforms of TGF-β (TGF-β 1-5).
As a representative of this superfamily, TGF-β has been studied extensively and intensively. These studies indicate that TGF-β is a strong endogenous mediator of tissue repair via its stimulatory effects on chemotaxis, angiogenesis, and extracellular matrix (ECM) deposition within the wound environment (Clin Immunol Immunopathol, 1997, 83(1), 25-30). TGF-β also regulates the growth and differentiation of various cells, either positively or negatively (Bioessays, 1997, 19(7), 581-591). Most of the evidence suggests that TGF-β exerts its regulatory effects at the G1 phase of the cell cycle. In addition, TGF-β is reported to induce cell death in some sensitive cell types, including hepatoma, myeloid, and osteoclast cells. In vitro experiments also show that TGF-β regulates the differentiation of various cell lines, though the mechanism is still unknown. The regulatory activity of TGF-β on cell growth and differentiation naturally leads to consideration of its potential application in chemotherapy and cancer therapy. A considerable number of reports address these topics (Clin Immunol Immunopathol, 1997, 83(1), 25-30; Bioessays, 1997, 19(7), 581-591).
Members of the TGF-β family have been found in many species, e.g., Xenopus, fowl, mice, swine, bovine, etc. Human TGF-β1, TGF-β2 and TGF-β3 were cloned in the 1980s. Among them, the sequencing of TGF-β1 was completed by Derynck R et al. in 1985 (Nature, 1985, 316(6030), 701-705). By analyzing the sequence encoding TGF-β1, they found that functional TGF-β was produced by processing of a precursor that was much longer than the mature protein. This phenomenon was later found to be common in the TGF-β superfamily. In 1988, the TGF-β2 and TGF-β3 nucleotide sequences were obtained by Madisen L et al. and Ten Dijke P et al., respectively (Proc Natl Acad Sci USA, 1988, 85(13), 4715-4719; DNA, 1988, 7(1), 1-8). Sequence comparison showed that the homology between TGF-β2, TGF-β3 and TGF-β1 was 70%-80%.
With the steady improvement of gene cloning and sequencing techniques, more and more members of the TGF-β superfamily have been cloned since 1990. Alexandra C. et al. reported in 1993 a novel member of the TGF-β superfamily, murine Growth/Differentiation Factor 3 (GDF-3) (J. Biol. Chem., 1993, 268(5), 3444-3449). The homology between GDF-3 and other members of the TGF-β superfamily is not very high, but GDF-3 still contains the characteristic conserved sequence of the TGF-β superfamily. In particular, it lacks the fourth of the seven conserved cysteines of the superfamily, indicating that it might have some particular property.
The human homologue of GDF-3 was cloned in 1998 (Oncogene, 1998, 16, 95-103). This protein is highly homologous to murine GDF-3 and was therefore named hGDF-3. Nevertheless, it is noteworthy that hGDF-3 is much shorter than GDF-3, mainly because it lacks nearly 50 residues at the N-terminus. Moreover, two residues corresponding to residues 128 and 248 of murine GDF-3 are also deleted in hGDF-3. These differences in hGDF-3 are thought to result from alternative splicing or genetic evolution.
Prior to this invention, no other form of hGDF3 had been isolated or disclosed.
One purpose of the invention is to provide a new polynucleotide which encodes a splice variant of human growth/differentiation factor hGDF3. The splice variant of hGDF3 of the invention is named hGDF3-2.
Another purpose of the invention is to provide a novel protein, which is named hGDF3-2.
Still another purpose of the invention is to provide a new method for preparing said new hGDF3-2 protein by recombinant techniques.
The invention also relates to the uses of said hGDF3-2 protein and its coding sequence.
In one aspect, the invention provides an isolated DNA molecule, which comprises a nucleotide sequence encoding a polypeptide having human hGDF3-2 protein activity, wherein said nucleotide sequence shares at least 70% homology to the nucleotide sequence of nucleotides 14-1105 in SEQ ID NO: 5, or said nucleotide sequence can hybridize to the nucleotide sequence of nucleotides 14-1105 in SEQ ID NO: 5 under moderate stringency. Preferably, said nucleotide sequence encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 6. More preferably, the sequence comprises the nucleotide sequence of nucleotides 14-1105 in SEQ ID NO: 5.
Further, the invention provides an isolated hGDF3-2 polypeptide, which comprises a polypeptide having the amino acid sequence of SEQ ID NO: 6, its active fragments, and its active derivatives. Preferably, the polypeptide is a polypeptide having the amino acid sequence of SEQ ID NO: 6.
The invention also provides a vector comprising said isolated DNA.
The invention further provides a host cell transformed with said vector.
In another aspect, the invention provides a method for producing a polypeptide with the activity of hGDF3-2 protein, which comprises:
(a) forming an hGDF3-2 protein expression vector comprising the nucleotide sequence encoding the polypeptide having the activity of hGDF3-2 protein, wherein said nucleotide sequence is operably linked with an expression regulatory sequence, and said nucleotide sequence shares at least 70% homology to the nucleotide sequence of positions 14-1105 in SEQ ID NO: 5;
(b) introducing the vector of step (a) into a host cell, thereby forming a recombinant cell of hGDF3-2 protein;
(c) culturing the recombinant cell of step (b) under the conditions suitable for the expression of hGDF3-2 polypeptides;
(d) isolating the polypeptides having the activity of hGDF3-2 protein.
In one embodiment of the present invention, the isolated polynucleotide has a full length of 1141 nucleotides, whose detailed sequence is shown in SEQ ID NO: 5. The open reading frame (ORF) is located at nucleotides 14-1105.
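The coordinates given above imply some simple arithmetic, sketched below in Python. The only values taken from the text are the 1141-nucleotide length and the ORF position 14-1105; the function name `orf_summary` is hypothetical, and whether the stated ORF range includes the stop codon is an assumption (exposed as a parameter), since the text does not say.

```python
# Values stated in the text: a 1141-nt cDNA with an ORF at nucleotides 14-1105
# (treated here as 1-based, inclusive coordinates).
CDNA_LENGTH = 1141
ORF_START = 14
ORF_END = 1105


def orf_summary(start, end, total_length, includes_stop=True):
    """Summarize an ORF given 1-based inclusive coordinates.

    includes_stop is an assumption: the text does not state whether the
    stop codon is counted inside nucleotides 14-1105.
    """
    orf_nt = end - start + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = orf_nt // 3
    residues = codons - 1 if includes_stop else codons
    return {
        "orf_nt": orf_nt,               # nucleotides in the ORF
        "codons": codons,               # number of triplets
        "residues": residues,           # encoded amino acids under the assumption
        "utr5_nt": start - 1,           # nucleotides upstream of the ORF
        "utr3_nt": total_length - end,  # nucleotides downstream of the ORF
    }


if __name__ == "__main__":
    print(orf_summary(ORF_START, ORF_END, CDNA_LENGTH))
```

Under these assumptions the ORF spans 1092 nucleotides (364 codons), flanked by 13 nucleotides upstream and 36 nucleotides downstream.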
In the present invention, the term “isolated” or “purified” or “substantially pure” DNA refers to a DNA or fragment which has been isolated from the sequences which flank it in a naturally occurring state. The term also applies to a DNA or DNA fragment which has been isolated from other components naturally accompanying the nucleic acid and from proteins naturally accompanying it in the cell.
In the present invention, the term “hGDF3-2 protein encoding sequence” or “hGDF3-2 polypeptide encoding sequence” refers to a nucleotide sequence encoding a polypeptide having the activity of hGDF3-2 protein, such as the nucleotide sequence of positions 14-1105 in SEQ ID NO: 5 or its degenerate sequence. A degenerate sequence means a sequence formed by replacing one or more codons in the ORF of nucleotides 14-1105 in SEQ ID NO: 5 with degenerate codons which encode the same amino acid. Because of codon degeneracy, a sequence having a homology as low as about 70% to the sequence of nucleotides 14-1105 in SEQ ID NO: 5 can also encode the sequence shown in SEQ ID NO: 6. The term also refers to nucleotide sequences that hybridize to the nucleotide sequence of nucleotides 14-1105 in SEQ ID NO: 5 under moderate stringency or, preferably, under high stringency. In addition, the term also refers to sequences having a homology of at least 70%, preferably 80%, more preferably 90% to the nucleotide sequence of nucleotides 14-1105 in SEQ ID NO: 5.
The term also refers to variants of the sequence in SEQ ID NO: 5 which are capable of encoding a protein having the same function as human hGDF3-2 protein. These variants include, but are not limited to, deletions, insertions and/or substitutions of several nucleotides (typically 1-90, preferably 1-60, more preferably 1-20, and most preferably 1-10) and additions of several nucleotides (typically less than 60, preferably less than 30, more preferably less than 10, most preferably less than 5) at the 5′ end and/or 3′ end.
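The effect of codon degeneracy on nucleotide homology can be illustrated with a minimal Python sketch. It uses the standard genetic code and a short toy coding sequence (`TOY_CDS`), which is an assumption made for illustration and is not SEQ ID NO: 5. The helper names (`translate`, `percent_identity`, `recode`) are hypothetical, and simple position-by-position identity is used here as a stand-in for "homology".

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Toy coding sequence for illustration only (NOT SEQ ID NO: 5).
TOY_CDS = "ATGCTGAGCCGTTCTGGCTAA"


def translate(seq):
    """Translate a DNA sequence codon by codon ('*' marks a stop codon)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def most_divergent_synonym(codon):
    """Pick the synonymous codon that differs from `codon` at the most positions."""
    aa = CODON_TABLE[codon]
    synonyms = sorted(c for c, a in CODON_TABLE.items() if a == aa)
    return max(synonyms, key=lambda c: sum(x != y for x, y in zip(c, codon)))


def recode(seq):
    """Replace every codon with its most divergent synonym (a degenerate sequence)."""
    return "".join(most_divergent_synonym(seq[i:i + 3]) for i in range(0, len(seq), 3))


if __name__ == "__main__":
    degenerate = recode(TOY_CDS)
    print(translate(TOY_CDS), translate(degenerate))
    print(round(percent_identity(TOY_CDS, degenerate), 1))
```

The recoded sequence translates to the same polypeptide while sharing far fewer nucleotides with the original, which is why a homology threshold well below 100% can still cover sequences encoding an identical protein.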
In the present invention, “substantially pure” proteins or polypeptides refers to those which make up at least 20%, preferably at least 50%, more preferably at least 80%, most preferably at least 90% of the total sample material (by wet weight or dry weight). Purity can be measured by any appropriate method, e.g., in the case of polypeptides, by column chromatography, PAGE or HPLC analysis. A substantially purified polypeptide is essentially free of naturally associated components.
In the present invention, the term “hGDF3-2 polypeptide” or “hGDF3-2 protein” refers to a polypeptide having the activity of hGDF3-2 protein and comprising the amino acid sequence of SEQ ID NO: 6. The term also comprises variants of said amino acid sequence which have the same function as human hGDF3-2. These variants include, but are not limited to, deletions, insertions and/or substitutions of several amino acids (typically 1-50, preferably 1-30, more preferably 1-20, most preferably 1-10), and additions of one or more amino acids (typically less than 20, preferably less than 10, more preferably less than 5) at the C-terminus and/or N-terminus. For example, protein function is usually unchanged when an amino acid residue is substituted by a similar or analogous one. Further, the addition of one or several amino acids at the C-terminus and/or N-terminus usually will not change the function of the protein. The term also includes the active fragments and derivatives of hGDF3-2 protein.
The variants of polypeptide include homologous sequences, allelic variants, natural mutants, induced mutants, proteins encoded by DNA which hybridizes to hGDF3-2 DNA under high or low stringency conditions as well as the polypeptides or proteins retrieved by antisera raised against hGDF3-2 polypeptide. The present invention also provides other polypeptides, e.g., fusion proteins, which include the hGDF3-2 polypeptide or fragments thereof. In addition to substantially full-length polypeptide, the soluble fragments of hGDF3-2 polypeptide are also included. Generally, these fragments comprise at least 10, typically at least 30, preferably at least 50, more preferably at least 80, most preferably at least 100 consecutive amino acids of hGDF3-2 polypeptide.
The present invention also provides analogues of hGDF3-2 protein or polypeptide. Analogues can differ from naturally occurring hGDF3-2 polypeptide by amino acid sequence differences, by modifications which do not affect the sequence, or by both. These polypeptides include genetic variants, both natural and induced. Induced variants can be made by various techniques, e.g., by random mutagenesis using irradiation or exposure to mutagens, or by site-directed mutagenesis or other known molecular biology techniques. Also included are analogues which include residues other than naturally occurring L-amino acids (e.g., D-amino acids) or non-naturally occurring or synthetic amino acids (e.g., beta- or gamma-amino acids). It is understood that the polypeptides of the invention are not limited to the representative polypeptides listed hereinabove.
Modifications (which do not normally alter primary sequence) include in vivo or in vitro chemical derivatization of polypeptides, e.g., acetylation or carboxylation. Also included are modifications of glycosylation, e.g., those made by modifying the glycosylation patterns of a polypeptide during its synthesis and processing or in further processing steps, e.g., by exposing the polypeptide to enzymes which affect glycosylation (e.g., mammalian glycosylating or deglycosylating enzymes). Also included are sequences which have phosphorylated amino acid residues, e.g., phosphotyrosine, phosphoserine, phosphothreonine, as well as sequences which have been modified to improve their resistance to proteolytic degradation or to optimize solubility properties.
The invention also includes antisense sequence of the sequence encoding hGDF3-2 polypeptide. Said antisense sequence can be used to inhibit expression of hGDF3-2 in cells.
The invention also includes probes, typically having 8-100, preferably 15-50 consecutive nucleotides. These probes can be used to detect the presence of nucleic acid molecules coding for hGDF3-2 in samples.
The present invention also includes methods for detecting hGDF3-2 nucleotide sequences, which comprise hybridizing said probes to samples and detecting the binding of the probes. Preferably, the samples are products of PCR amplification. The primers used in PCR amplification correspond to the coding sequence of hGDF3-2 polypeptide and are located at both ends or in the middle of the coding sequence. In general, the length of the primers is 20 to 50 nucleotides.
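As a rough illustration of how primers at both ends of a coding sequence and fixed-length probes relate to the underlying sequence, the following Python sketch may help. The sequence `TOY_CDS` is a made-up example, not the hGDF3-2 coding sequence, and the helper names (`reverse_complement`, `end_primers`, `candidate_probes`) are hypothetical. The sketch checks length only; it is an assumption-laden simplification that ignores melting temperature, GC content, and specificity, all of which matter in practice.

```python
# Complement lookup for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Toy coding sequence for illustration only (NOT the hGDF3-2 sequence).
TOY_CDS = "ATGGCTAAAGGTCTGACCGATTCCGGTAACCTGGAAGTTCACTGGTAA"


def reverse_complement(seq):
    """Return the reverse complement (the antisense strand read 5' to 3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(cds, length=20):
    """Forward primer from the 5' end and reverse primer from the 3' end.

    The 20-50 nt bounds follow the primer length range given in the text.
    """
    if not 20 <= length <= 50:
        raise ValueError("primer length should be 20 to 50 nucleotides")
    if len(cds) < 2 * length:
        raise ValueError("sequence too short for non-overlapping end primers")
    forward = cds[:length]
    reverse = reverse_complement(cds[-length:])
    return forward, reverse


def candidate_probes(seq, length=15):
    """All windows of `length` consecutive nucleotides, as candidate probes."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    fwd, rev = end_primers(TOY_CDS, length=20)
    print("forward:", fwd)
    print("reverse:", rev)
    print("15-nt probe candidates:", len(candidate_probes(TOY_CDS, 15)))
```

The same `reverse_complement` helper also yields the antisense strand of a coding sequence.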
A variety of vectors known in the art, such as those commercially available, are useful in the invention.
In the invention, the term “host cells” includes prokaryotic and eukaryotic cells. Common prokaryotic host cells include Escherichia coli, Bacillus subtilis, and so on. Common eukaryotic host cells include yeast cells, insect cells, and mammalian cells. Preferably, the host cells are eukaryotic cells, e.g., CHO cells, COS cells, and the like.
In another aspect, the invention also includes antibodies, preferably monoclonal antibodies, which are specific for polypeptides encoded by hGDF3-2 DNA or fragments thereof. By “specificity” is meant that an antibody binds to the hGDF3-2 gene products or fragments thereof. Preferably, the antibody binds to the hGDF3-2 gene products or fragments thereof and does not substantially recognize or bind to other antigenically unrelated molecules. Antibodies which bind to hGDF3-2 and block hGDF3-2 protein function and those which do not affect hGDF3-2 function are both included in the invention. The invention also includes antibodies which bind to the hGDF3-2 gene product in its unmodified as well as its modified form.
The present invention includes not only intact monoclonal or polyclonal antibodies, but also immunologically active antibody fragments, e.g., a Fab′ or F(ab′)2 fragment, an antibody light chain, an antibody heavy chain, a genetically engineered single chain Fv molecule (Ladner, et al., U.S. Pat. No. 4,946,778), or a chimeric antibody, e.g., an antibody which contains the binding specificity of a murine antibody, but the remaining portion of which is of human origin.
The antibodies of the present invention can be prepared by various techniques known to those skilled in the art. For example, purified hGDF3-2 gene products or their antigenic fragments can be administered to animals to induce the production of polyclonal antibodies. Similarly, cells expressing hGDF3-2 or its antigenic fragments can be used to immunize animals to produce antibodies. Antibodies of the invention can be monoclonal antibodies, which can be prepared by the hybridoma technique (see Kohler, et al., Nature 256: 495, 1975; Kohler, et al., Eur. J. Immunol. 6: 511, 1976; Kohler, et al., Eur. J. Immunol. 6: 292, 1976; Hammerling, et al., In Monoclonal Antibodies and T Cell Hybridomas, Elsevier, N.Y., 1981). Antibodies of the invention comprise those which block hGDF3-2 function and those which do not affect hGDF3-2 function. Antibodies of the invention can be produced by routine immunological techniques using fragments or functional regions of the hGDF3-2 gene product. These fragments and functional regions can be prepared by recombinant methods or synthesized by a polypeptide synthesizer. Antibodies binding to the unmodified hGDF3-2 gene product can be produced by immunizing animals with gene products produced by prokaryotic cells (e.g., E. coli); antibodies binding to post-translationally modified forms thereof can be obtained by immunizing animals with gene products produced by eukaryotic cells (e.g., yeast or insect cells).
The full-length human hGDF3-2 nucleotide sequence of the invention, or a fragment thereof, can be prepared by PCR amplification, by recombinant methods, or by synthetic methods. For PCR amplification, said sequences can be obtained by designing primers based on the nucleotide sequence disclosed in the invention, especially the sequence of the ORF, and using as a template a cDNA library that is commercially available or prepared by routine techniques known in the art. When the sequence is long, it is usually necessary to perform two or more PCR amplifications and link the amplified fragments together in the correct order.
Once the sequence is obtained, it can be produced in large amounts by recombinant methods. Usually, said sequence is cloned into a vector, which is then transformed into a host cell. The sequence is then isolated from the propagated host cells using conventional techniques.
Further, the sequence can be produced by synthesis. Typically, several small fragments are synthesized and linked together to obtain a long sequence. At present, it is entirely feasible to chemically synthesize the DNA sequence encoding the protein of the invention, or fragments or derivatives thereof. In addition, mutations can be introduced into the protein sequence of the invention by chemical synthesis.
In addition to recombinant techniques, the protein fragments of the invention may also be prepared by direct chemical synthesis using solid phase synthesis techniques (Stewart et al., (1969) Solid-Phase Peptide Synthesis, WH Freeman Co., San Francisco; Merrifield (1963), J. Am. Chem. Soc. 85: 2149-2154). In vitro protein synthesis can be performed manually or automatically, e.g., using a Model 431 Peptide Synthesizer (Applied Biosystems, Foster City, Calif.). The fragments of the protein of the invention can be synthesized separately and linked together by chemical methods so as to produce the full-length molecule.
The sequences encoding the protein of the present invention are also valuable for gene mapping. For example, the accurate chromosome mapping can be performed by hybridizing cDNA clones to a chromosome in metaphase. This technique can use cDNA as short as about 500 bp, or as long as about 2000 bp, or more. For details, see Verma et al., Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York (1988).
Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. Such data are found in, e.g., Mendelian Inheritance in Man (available on-line through Johns Hopkins University Welch Medical Library). The relationships between genes and diseases that have been mapped to the same chromosomal region are then identified through linkage analysis.
Then, the differences in the cDNA or genomic sequence between affected and unaffected individuals can also be determined. If a mutation is observed in some or all of the affected individuals but not in any normal individual, then the mutation is likely to be the causative agent of the disease.
Substances which interact with hGDF3-2, e.g., receptors, inhibitors and antagonists, can be screened by various conventional techniques using the protein of the invention.
The protein, antibody, inhibitor, antagonist or receptor of the invention provides different effects when administered in therapy. Usually, these substances are formulated with a non-toxic, inert and pharmaceutically acceptable aqueous carrier. The pH typically ranges from 5 to 8, preferably from about 6 to 8, although the pH may vary according to the properties of the formulated substances and the diseases to be treated. The formulated pharmaceutical composition is administered by conventional routes including, but not limited to, intramuscular, intraperitoneal, subcutaneous, intracutaneous, or topical administration.
As an example, the human hGDF3-2 protein of the invention may be administered together with a suitable and pharmaceutically acceptable carrier. Examples of carriers include, but are not limited to, saline, buffer solution, glucose, water, glycerin, ethanol, or combinations thereof. The pharmaceutical formulation should be suitable for the delivery method. The human hGDF3-2 protein of the invention may be in the form of injections which are made by conventional methods, using physiological saline or another aqueous solution containing glucose or auxiliary substances. Pharmaceutical compositions in the form of tablets or capsules may be prepared by routine methods. The pharmaceutical compositions, e.g., injections, solutions, tablets, and capsules, should be manufactured under sterile conditions. The active ingredient is administered in a therapeutically effective amount, e.g., from about 1 µg to 5 mg per kg body weight per day. Moreover, the polypeptide of the invention can be administered together with other therapeutic agents.
When the human hGDF3-2 polypeptides of the invention are used as a pharmaceutical, a therapeutically effective amount of the polypeptides is administered to mammals. Typically, the therapeutically effective amount is at least about 10 µg/kg body weight and less than about 8 mg/kg body weight in most cases, and preferably about 10 µg-1 mg/kg body weight. Of course, the precise amount will depend upon various factors, such as the delivery method, the subject's health, and the like, and is within the judgment of the skilled clinician.
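The per-kilogram ranges stated above can be converted to total amounts with simple unit arithmetic, sketched here in Python. The function name `dose_range_mg` and the 70 kg example weight are assumptions for illustration; the default bounds are the "about 10 µg/kg" and "about 8 mg/kg" figures from the text. This is a unit-conversion sketch only and does not replace the clinician's judgment referred to above.

```python
def dose_range_mg(weight_kg, low_ug_per_kg=10.0, high_ug_per_kg=8000.0):
    """Convert a per-kg dose range (in micrograms/kg) to total milligrams.

    Defaults follow the text: about 10 ug/kg up to about 8 mg/kg (8000 ug/kg).
    """
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = weight_kg * low_ug_per_kg / 1000.0
    high_mg = weight_kg * high_ug_per_kg / 1000.0
    return low_mg, high_mg


if __name__ == "__main__":
    # Hypothetical 70 kg subject: general range, then the preferred
    # range of about 10 ug/kg to 1 mg/kg (1000 ug/kg).
    print(dose_range_mg(70.0))
    print(dose_range_mg(70.0, high_ug_per_kg=1000.0))
```

For the hypothetical 70 kg subject, the general range works out to roughly 0.7 mg to 560 mg, and the preferred range to roughly 0.7 mg to 70 mg.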