This application is the national phase under 35 U.S.C. § 371 of PCT International Application No. PCT/EP99/00502, which has an International filing date of Jan. 28, 1999, and which designated the United States of America.
The present invention relates to recombinant proteins obtained from the combination of structural domains derived from the α subunits of hepatocyte growth factor (HGF) and macrophage stimulating protein (MSP).
In particular, the engineered factors of the invention are obtained by combining the hairpin loop and kringle domains of HGF and/or MSP α chains, together with HGF or MSP β chains, to obtain a structure having two superdomains joined by an intervening linker sequence. Moreover, the invention relates to DNA sequences encoding the above-mentioned recombinant proteins, to the expression vectors comprising said DNA sequences and to host cells containing said expression vectors. The recombinant proteins of the present invention are biologically active and protect epithelial cells and other cells from apoptosis induced by chemotherapeutic drugs. Therefore, these molecules can conveniently be used to prevent or treat the toxic side effects of the chemotherapeutic treatment of tumours.
Hepatocyte Growth Factor (HGF) and Macrophage Stimulating Protein (MSP) are highly related proteins, both structurally and functionally (FIGS. 1 and 2). Both factors are secreted as inactive precursors, which are processed by specific proteases that recognise a cleavage site inside the molecule, dividing the protein into two subunits. These subunits, named α chain and β chain, are linked by a disulphide bond. Thus, the mature factor is an α-β dimeric protein. Only the mature (dimeric) form of the factor is able to activate its receptor at the surface of the target cells (the Met tyrosine kinase in the case of HGF and the Ron tyrosine kinase in the case of MSP) and therefore to mediate biological responses (Naldini, L. et al., 1992, EMBO J. 11: 4825-4833; Wang, M. et al., 1994, J. Biol. Chem. 269: 3436-3440; Bottaro, D. et al., 1991, Science 251: 802-804; Naldini, L. et al., 1991, EMBO J. 10: 2867-2878; Wang, M. et al., 1994, Science 266: 117-119; Gaudino, G. et al., 1994, EMBO J. 13: 3524-3532).
The α chain of both factors contains a hairpin loop (HL) structure and four domains with a tangle-like structure named kringles (K1-K4; Nakamura, T. et al., 1989, Nature 342: 440-443; Han, S. et al., 1991, Biochemistry 30: 9768-9780). The precursor also contains a signal sequence (LS) of 31 amino acids (in the case of HGF) or of 18 amino acids (in the case of MSP), which directs the newly formed peptide to the secretory pathway and is removed in the rough endoplasmic reticulum. The β chain contains a sequence box homologous to the typical catalytic domain of serine proteases, but it has no enzymatic activity (Nakamura, T. et al., 1989, Nature 342: 440-443; Han, S. et al., 1991, Biochemistry 30: 9768-9780). Both α and β chains contribute to the binding of the growth factor to the respective receptor (Met for HGF and Ron for MSP).
HGF and MSP polypeptides are able to induce a variety of biological effects besides cell proliferation. The main biological activities of these molecules are: stimulation of cell division (mitogenesis); stimulation of motility (scattering); induction of polarisation and cell differentiation; induction of tubule formation (branched morphogenesis); and increase of cell survival (protection from apoptosis). The tissues that respond to HGF and MSP stimulation are those containing cells that express the respective Met (HGF) and Ron (MSP) receptors. The most important target tissues of these factors are epithelia of different organs, such as liver, kidney, lung, breast, pancreas and stomach, and some cells of the hematopoietic and nervous systems. A detailed review of the biological effects of HGF and MSP in the various tissues can be found in: Tamagnone, L. and Comoglio, P., Cytokine and Growth Factor Reviews, 1997, 8: 129-142, Elsevier Science Ltd.; Zarnegar, R. and Michalopoulos, G., 1995, J. Cell Biol. 129: 1177-1180; Medico, E. et al., 1996, Mol. Biol. Cell, 7: 495-504; Banu, N. et al., 1996, J. Immunol. 156: 2933-2940.
In the case of HGF, the hairpin loop and the first two kringles are known to contain the sites of direct interaction with the Met receptor (Lokker, N. et al., 1992, EMBO J. 11: 2503-2510; Lokker, N. et al., 1994, Protein Engineering 7: 895-903). Two naturally occurring truncated forms of HGF, produced by some cells by alternative splicing, have been described. The first one comprises the first kringle (NK1-HGF; Cioce, V. et al., 1996, J. Biol. Chem. 271: 13110-13115), whereas the second one extends to the second kringle (NK2-HGF; Miyazawa, K. et al., 1991, Eur. J. Biochem. 197: 15-22). NK2-HGF induces cell scattering, but it is not mitogenic as the complete growth factor is (Hartmann, G. et al., 1992, Proc. Natl. Acad. Sci. USA 89: 11574-11578). However, NK2-HGF exhibits mitogenic activity in the presence of heparin, a glycosaminoglycan that binds the first kringle of HGF and is likely to induce dimerization of NK2-HGF (Schwall, R. et al., 1996, J. Cell Biol. 133: 709-718). Moreover NK2-HGF, being a partial agonist of Met, behaves as a competitive inhibitor of HGF as far as the mitogenic activity is concerned (Chan, A. et al., 1991, Science 254: 1382-1385). NK1-HGF has also been described to exert partial stimulation of Met and competitive inhibition of HGF mitogenic activity (Cioce, V. et al., 1996, J. Biol. Chem. 271: 13110-13115).
In the case of MSP, the mode of interaction with the Ron receptor is less well understood: some preliminary studies suggest a situation opposite to that of HGF, i.e. the β chain directly binds the receptor whereas the α chain stabilises the complex (Wang, M. et al., 1997, J. Biol. Chem. 272: 16999-17004).
The therapeutic use of molecules such as HGF and MSP is potentially valuable in a wide range of pathologies (Abdulla, S., 1997, Mol. Med. Today 3: 233). Nevertheless, a number of technical as well as biological complications make the clinical application of these molecules difficult.
For example, HGF was shown to protect kidney cells against programmed cell death (apoptosis) induced by cisplatinum, but at the same time it can induce an undesired proliferation of neoplastic cells. The natural truncated forms NK1 and NK2 of HGF show no problems of proteolytic activation, but they have a reduced biological activity.
The present invention provides recombinant molecules deriving from the combination of structural domains of HGF and MSP α and β subunits, which overcome the problems of the prior art molecules described above.
The molecules of the invention are composed of two superdomains, one obtained by combining HL and K1-K4 domains of HGF and MSP α chains, the other corresponding to HGF or MSP β chain, connected by a linker which may contain a proteolytic cleavage site. This structure allows the recombinant proteins to interact with both Met and Ron receptors, in order to induce biological responses which are synergistic and selective compared with the natural factor and the truncated forms of the prior art.
The present invention relates to recombinant proteins (hereinafter referred to interchangeably as proteins, molecules, engineered factors or recombinant factors) characterised by a structure that comprises two superdomains, one consisting of a combination of HL and K1-K4 domains derived from HGF or MSP α chain, the other corresponding to HGF or MSP β chain, linked by a spacer sequence or a linker. In particular, the invention relates to proteins of general formula (I)
[A]-B-[C]-(D)y    (I)
in which
[A] corresponds to the sequence (LS)m-HL-K1-(K2)n-(K3)o-(K4)p wherein (the numbering of the following amino acids referring to the HGF and MSP sequences as reported in FIGS. 1 and 2, respectively):
LS is an amino acid sequence corresponding to residues 1-31 of HGF or 1-18 of MSP;
HL is an amino acid sequence derived from the α chain of HGF starting between residues 32-70 and ending between residues 96-127; or it is an amino acid sequence derived from the α chain of MSP starting between residues 19-56 and ending between residues 78-109;
K1 is an amino acid sequence derived from the α chain of HGF starting between residues 97-128 and ending between residues 201-205; or it is an amino acid sequence derived from the α chain of MSP starting between residues 79-110 and ending between residues 186-190;
K2 is an amino acid sequence derived from the α chain of HGF starting between residues 202-206 and ending between residues 283-299; or it is an amino acid sequence derived from the α chain of MSP starting between residues 187-191 and ending between residues 268-282;
K3 is an amino acid sequence derived from the α chain of HGF starting between residues 284-300 and ending between residues 378-385; or it is an amino acid sequence derived from the α chain of MSP starting between residues 269-283 and ending between residues 361-369;
K4 is an amino acid sequence derived from the α chain of HGF starting between residues 379-386 and ending between residues 464-487; or it is an amino acid sequence derived from the α chain of MSP starting between residues 362-370 and ending between residues 448-481;
m, n, o and p are each 0 or 1;
the sum n+o+p is an integer from 0 to 3, with the proviso that n ≥ o ≥ p;
B is selected from the sequence 488-491 of HGF, the sequence 478-489 of MSP, optionally preceded by a spacer of 1 to 13 amino acids, a consensus sequence for protease or an uncleavable sequence;
[C] is the sequence of HGF β chain starting between amino acid residues 490 to 492 and ending at residue 723; or it is the sequence of MSP β chain starting between amino acid residues 484 to 486 and ending at residue 711; with the proviso that, when [A] coincides with HGF or MSP α chain, [C] corresponds to MSP or HGF β chain, respectively;
D is the sequence W-Z, wherein W is a conventional proteolytic site and Z is any sequence useful for the purification of the protein on nickel or affinity columns; y is 0 or 1.
Non-limiting examples of W are consensus sequences for enterokinase protease, thrombin, factor Xa and IgA protease.
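The index constraints of general formula (I) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names `valid_kringle_flags` and `domain_layout` are hypothetical and not part of the specification, and the domains are represented by their labels rather than by actual sequences.

```python
from itertools import product


def valid_kringle_flags(n: int, o: int, p: int) -> bool:
    """True when n, o, p are each 0 or 1 and satisfy the proviso n >= o >= p."""
    return all(v in (0, 1) for v in (n, o, p)) and n >= o >= p


def domain_layout(m: int, n: int, o: int, p: int, y: int) -> str:
    """Return the domain order of formula (I) for the given indices (labels only)."""
    if m not in (0, 1) or y not in (0, 1):
        raise ValueError("m and y must be 0 or 1")
    if not valid_kringle_flags(n, o, p):
        raise ValueError("n, o, p must be 0 or 1 with n >= o >= p")
    parts = (["LS"] if m else []) + ["HL", "K1"]          # [A]: optional LS, then HL-K1
    parts += [name for name, flag in (("K2", n), ("K3", o), ("K4", p)) if flag]
    parts += ["B", "C"] + (["D"] if y else [])            # linker, second superdomain, optional tag
    return "-".join(parts)


# Enumerate every (n, o, p) combination allowed by the proviso.
allowed = [flags for flags in product((0, 1), repeat=3) if valid_kringle_flags(*flags)]
print(allowed)                        # [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
print(domain_layout(1, 1, 1, 1, 1))   # LS-HL-K1-K2-K3-K4-B-C-D
```

Under this reading, the proviso means that a kringle can be present only if all preceding kringles are present, so the [A] superdomain is always truncated from its C-terminal end.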
Preferred proteins of general formula (I) are those in which: HL domain is the sequence of HGF α chain ranging from amino acids 32 to 127, or the sequence of MSP α chain ranging from amino acids 19 to 98; K1 domain is the sequence of HGF α chain ranging from amino acids 128 to 203, or the sequence of MSP α chain ranging from amino acids 99 to 188; K2 domain is the sequence of HGF α chain ranging from amino acids 204 to 294, or the sequence of MSP α chain ranging from amino acids 189 to 274; K3 domain is the sequence of HGF α chain ranging from amino acids 286 to 383, or the sequence of MSP α chain ranging from amino acids 275 to 367; K4 domain is the sequence of HGF α chain ranging from amino acids 384 to 487, or the sequence of MSP α chain ranging from amino acids 368 to 477; C is the sequence 492-723 of HGF β chain, or the sequence 486-711 of MSP β chain.
Among the possible combinations of the domains of general formula (I), the following (II) and (III) are preferred, concerning two recombinant factors named Alphabet-1 and Alphabet-RTKR, respectively:
LSHGF-HLHGF-K1HGF-K2HGF-K3HGF-K4HGF-BHGF-CβMSP-D    (II) (Alphabet-1)
LSHGF-HLHGF-K1HGF-K2HGF-K3HGF-K4HGF-BF-CβMSP-D    (III) (Alphabet-RTKR)
wherein
LSHGF-HLHGF-K1HGF-K2HGF-K3HGF-K4HGF is the sequence 1-487 of HGF, CβMSP is the sequence 486-711 of MSP, D is the sequence GNSAVD(H)6 (SEQ ID NO:13).
In Alphabet-1 factor, BHGF is the sequence LRVV (SEQ ID NO:14), whereas for Alphabet-RTKR factor, BF is the sequence RTKR-LRVV (SEQ ID NO:15) (RTKR (SEQ ID NO:21) is the cleavage site for furin proteases).
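As an illustration of how formulas (II) and (III) map onto residue ranges, the following sketch assembles the two constructs at the amino acid level. It rests on several assumptions: the helper names `residues` and `assemble` are hypothetical, the hyphen in RTKR-LRVV is read as notation rather than as a residue, and the HGF and MSP inputs are placeholder strings of the stated lengths, not the real sequences of FIGS. 1 and 2.

```python
# Linker (B) and tag (D) sequences as given in the text.
LINKERS = {"Alphabet-1": "LRVV", "Alphabet-RTKR": "RTKR" + "LRVV"}
TAG_D = "GNSAVD" + "H" * 6  # GNSAVD followed by six histidines


def residues(seq: str, start: int, end: int) -> str:
    """Slice a protein sequence using 1-based, inclusive residue numbers."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("residue range lies outside the sequence")
    return seq[start - 1:end]


def assemble(name: str, hgf: str, msp: str, with_tag: bool = True) -> str:
    """Concatenate [A] (HGF 1-487), B (linker), [C] (MSP 486-711) and optionally D."""
    a = residues(hgf, 1, 487)
    b = LINKERS[name]
    c = residues(msp, 486, 711)
    return a + b + c + (TAG_D if with_tag else "")


# Placeholders of the right length only; NOT the real HGF/MSP sequences.
hgf_placeholder = "A" * 723
msp_placeholder = "G" * 711

alphabet_1 = assemble("Alphabet-1", hgf_placeholder, msp_placeholder)
alphabet_rtkr = assemble("Alphabet-RTKR", hgf_placeholder, msp_placeholder)
print(len(alphabet_1), len(alphabet_rtkr))  # 729 733
```

The two constructs differ only in the four additional RTKR residues that precede LRVV in the linker of Alphabet-RTKR.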
The hybrid molecules of the invention are prepared by genetic engineering techniques according to a strategy involving the following steps:
a) construction of DNA encoding the desired protein;
b) insertion of DNA in an expression vector;
c) transformation of a host cell with recombinant DNA (rDNA);
d) culture of the transformed host cell so as to express the recombinant protein;
e) extraction and purification of the produced recombinant protein.
The DNA sequences corresponding to HGF or MSP structural domains can be obtained by synthesis or starting from DNA encoding the two natural factors. For example, screening of cDNA libraries can be carried out using suitable probes, so as to isolate HGF or MSP cDNA. Alternatively, HGF or MSP cDNA can be obtained by reverse transcription from purified mRNA from suitable cells.
cDNAs coding for the fragments of HGF and MSP β chains can be amplified by PCR (Mullis, K. B. and Faloona, F. A., Methods in Enzymol. 155 (1987) 335-350), and the amplification products can be recombined making use of suitable restriction sites, either naturally occurring in the factor sequences or artificially introduced in the oligonucleotide sequence used for the amplification.
In greater detail, one of the above mentioned strategies can be the following:
the portions of DNA encoding the LS, HL, K1, K2, K3 and K4 domains are amplified by PCR from HGF or MSP cDNA and then recombined to obtain the hybrid sequences corresponding to [A] and [C]. Oligonucleotides recognising sequences located at the two ends of the domains to be amplified are used as primers. Primers are designed so as to contain a sequence allowing recombination between the DNA of a domain and the adjacent one. Said recombination can be carried out by endonuclease cleavage and subsequent ligase reaction, or making use of the recombinant PCR method (Innis, M. A. et al., in PCR Protocols, Academic Press, 1990, 177-183).
Subsequently, the cDNA portions encoding the A and C domains are amplified by PCR, wherein the antisense primer used to amplify A and the sense primer used to amplify C are hybrids, i.e. they contain both the 3′-end sequence of A and the 5′-end sequence of C. Between A and C is placed the domain B, a sequence which may encode a proteolytic cleavage site.
Two amplification products with an identical region artificially inserted are thereby obtained. The presence of this identical sequence allows the hybridisation of the two amplification products and thus the subsequent amplification of the recombinant construct containing the domains [A], B and C.
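The hybrid-primer strategy described above can be sketched in a few lines. This is a simplified string-level model, not a laboratory protocol: the template sequences, the 9-nucleotide flank length and the function names (`revcomp`, `hybrid_primers`, `join_by_overlap`) are all hypothetical choices made for the illustration, and the linker DNA shown is just one possible coding sequence for LRVV.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


def hybrid_primers(a_3prime: str, b: str, c_5prime: str) -> tuple[str, str]:
    """Return (antisense primer for A, sense primer for C); both span the A-B-C junction."""
    junction = a_3prime + b + c_5prime
    return revcomp(junction), junction


def join_by_overlap(product_a: str, product_c: str, min_overlap: int = 12) -> str:
    """Fuse two sense strands where the end of the first equals the start of the second."""
    for k in range(min(len(product_a), len(product_c)), min_overlap - 1, -1):
        if product_a.endswith(product_c[:k]):
            return product_a + product_c[k:]
    raise ValueError("the two products share no sufficiently long identical region")


# Hypothetical short templates standing in for the cDNA of [A] and [C].
template_a = "ATGAAACCCGGGTTTACGTCA"
template_c = "GGATCCTTAGCAGCATGA"
linker_b = "CTGCGTGTTGTT"  # one possible coding sequence for LRVV (an assumption)

antisense_a, sense_c = hybrid_primers(template_a[-9:], linker_b, template_c[:9])

# Sense strands of the two amplification products: each now carries the
# identical junction region introduced by its hybrid primer.
product_a = template_a + linker_b + template_c[:9]
product_c = template_a[-9:] + linker_b + template_c

fused = join_by_overlap(product_a, product_c)
print(fused == template_a + linker_b + template_c)  # True
```

In this model the identical region is the 30-nucleotide junction (9 nucleotides of A, the 12-nucleotide linker and 9 nucleotides of C); in practice the overlap length would be chosen according to the annealing conditions.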
The amplified recombinant construct containing the three domains [A], B and [C] is then inserted in a suitable vector. In this step it can be decided whether or not to add the domain D (tag), obtained by synthesis as a double-stranded oligonucleotide, downstream of the domain C.
The recombinant expression vector can contain, in addition to the recombinant construct, a promoter, a ribosome binding site, an initiation codon, a stop codon, optionally a consensus site for expression enhancers.
The vector can also comprise a selection marker for isolating the host cells containing the DNA construct. Yeast or bacteria plasmids, such as plasmids suitable for Escherichia coli, can be used as vectors, as well as bacteriophages, viruses, retroviruses, or DNA.
The vectors are cloned preferably in bacterial cells, for example in Escherichia coli, as described in Maniatis, Molecular Cloning, Cold Spring Harbor Laboratory, New York, 1982, and the colonies can be selected, for example, by hybridisation with radiolabelled oligonucleotide probes; subsequently, the rDNA sequence extracted from the positive colonies is determined by known methods.
The vector with the recombinant construct can be introduced in the host cell according to the competent cell method, the protoplast method, the calcium phosphate method, the DEAE-dextran method, the electric impulses method, the in vitro packaging method, the viral vector method, the micro-injection method, or other suitable techniques.
Host cells can be prokaryotic or eukaryotic, such as bacteria, yeasts or mammalian cells, and they will be such as to effectively produce the recombinant protein.
After transformation, cells are grown in a suitable medium, which can be for example MEM, DMEM or RPMI 1640 in the case of mammalian host cells.
The recombinant protein is secreted in the culture medium, from which it can be recovered and purified with different methods, such as size exclusion, adsorption, affinity chromatography, salting-out, precipitation, dialysis and ultrafiltration.
A simple, rapid system for the production of the molecules of the invention is, for example, transient expression in mammalian cells.
Accordingly, the plasmid containing the recombinant DNA fragment, for example pMT2 (Sambrook, J. et al., Molecular Cloning, Cold Spring Harbor Laboratory Press, 1989), is transfected into suitable recipient cells, such as COS-7 (Sambrook, J. et al., supra), by the calcium phosphate technique or other equivalent techniques. Some days after transfection, the conditioned medium of the transfected cells is collected, cleared by centrifugation and analysed for its factor content. For this analysis, antibodies directed against HGF or MSP, or against any tag sequence, can be used: the supernatant is immunoprecipitated and then analysed by western blot with the same antibody. The supernatant containing the recombinant factor can also be used directly for biochemical and biological tests. If domain D is a poly-histidine tag sequence, the protein can be purified, for example, by adsorption on a nickel resin column and subsequent elution with imidazole.
The ability of the recombinant factors, correctly synthesized and matured in eukaryotic cells, to bind both Met and Ron receptors has been tested. It has been found that hybrid factors containing HGF α chain and MSP β chain, i.e. the domains more directly involved in the binding with Met and Ron, respectively, are correctly synthesized by eukaryotic cells. The maturation (cleavage of the proteolytic site) takes place in the presence of serum, on a reduced but significant fraction of said proteins.
Moreover, it has been shown that the modification of the sequence of the proteolytic site permits the maturation of the hybrid factor also in the absence of serum.
Among the applications of the recombinant molecules of the invention, the following can be cited:
prevention of myelotoxicity; in particular they can be used for the expansion of marrow precursors, to increase proliferation of the hematopoietic precursors or to stimulate their entry into the circulation;
prevention of liver and kidney toxicity, and of mucositis following antineoplastic treatments; in particular the recombinant factors can be used to prevent toxicity (apoptosis) on differentiated cell elements of liver, kidney and mucosa of the gastrointestinal tract, and to stimulate stem cell elements of the skin and mucosae to allow the regeneration of germinative layers;
prevention of chemotherapeutic neurotoxicity.
In general, the proteins of the invention provide the following advantages, compared with the parent molecules HGF and MSP:
the capability of binding both Met and Ron receptors gives these molecules a wider activity;
by modification of the proteolytic site, hybrid factors can be obtained which are activated by proteases of the endoplasmic reticulum (such as furins) during their synthesis;
when the proteolytic site is removed, permanently immature forms of the factors can be obtained, having a potential partial agonistic or antagonistic activity;
the different functional domains can be combined so as to modulate the biological effects, increasing the favourable ones and reducing the undesired ones (for example, favouring protection from apoptosis over cell proliferation).
The invention is also to be considered as directed at amino acid and nucleotide sequences referring to formula (I) and having modifications which derive, for example, from the degeneracy of the genetic code, without therefore modifying the amino acid sequence, or from the deletion, substitution, insertion, inversion or addition of nucleotides and/or bases according to all the possible methods known in the art.
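The point about the degeneracy of the genetic code can be made concrete with a small sketch: two different nucleotide sequences that both encode the LRVV linker. The codon table below is deliberately partial (a handful of standard codons for Leu, Arg and Val), and the two DNA variants are hypothetical examples, not sequences taken from the specification.

```python
# Partial standard codon table: only a few codons for Leu (L), Arg (R), Val (V).
CODONS = {
    "CTG": "L", "CTC": "L", "TTA": "L",
    "CGT": "R", "CGC": "R", "AGA": "R",
    "GTT": "V", "GTC": "V", "GTG": "V",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon using the partial table above."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


variant_1 = "CTGCGTGTTGTT"  # hypothetical coding sequence
variant_2 = "TTAAGAGTCGTG"  # different nucleotides, same amino acids

print(translate(variant_1), translate(variant_2))  # LRVV LRVV
```

Both variants differ at the nucleotide level in every codon yet yield the same peptide, which is the kind of modification the paragraph above refers to.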
Furthermore, the invention relates to the expression vectors comprising a sequence encoding a protein of general formula (I), which can be plasmids, bacteriophages, viruses, retroviruses, or others, and to host cells containing said expression vectors.
Finally, the invention relates to the use of the recombinant proteins as therapeutic agents, and to pharmaceutical compositions containing an effective amount of the recombinant proteins together with pharmacologically acceptable excipients.