Colony stimulating factors (CSFs), which stimulate the differentiation and/or proliferation of bone marrow cells, have generated much interest because of their therapeutic potential for restoring depressed levels of hematopoietic stem cell-derived cells. CSFs in both human and murine systems have been identified and distinguished according to their activities. For example, granulocyte-CSF (G-CSF) and macrophage-CSF (M-CSF) stimulate the in vitro formation of neutrophilic granulocyte and macrophage colonies, respectively, while GM-CSF and interleukin-3 (IL-3) have broader activities and stimulate the formation of macrophage colonies as well as neutrophilic and eosinophilic granulocyte colonies. IL-3 also stimulates the formation of mast, megakaryocyte and pure and mixed erythroid colonies.
U.S. Pat. No. 4,877,729 and U.S. Pat. No. 4,959,455 disclose a gibbon IL-3 cDNA and a deduced human IL-3 DNA sequence and the protein sequences for which they code. The hIL-3 disclosed has serine rather than proline at position 8 in the protein sequence.
International Patent Application (PCT) WO 88/00598 discloses gibbon- and human-like IL-3. The hIL-3 contains a Ser8->Pro8 replacement. Suggestions are made to replace Cys by Ser, thereby breaking the disulfide bridge, and to replace one or more amino acids at the glycosylation sites.
U.S. Pat. No. 4,810,643 discloses a DNA sequence encoding human G-CSF.
WO 91/02754 discloses a fusion protein comprised of GM-CSF and IL-3 which has increased biological activity compared to GM-CSF or IL-3 alone. Also disclosed are nonglycosylated IL-3 and GM-CSF analog proteins as components of the multi-functional chimeric hematopoietic receptor agonist.
WO 92/04455 discloses fusion proteins composed of IL-3 fused to a lymphokine selected from the group consisting of IL-3, IL-6, IL-7, IL-9, IL-11, EPO and G-CSF.
WO 95/21197 and WO 95/21254 disclose fusion proteins capable of broad multi-functional hematopoietic properties.
GB 2,285,446 relates to the c-mpl ligand (thrombopoietin) and various forms of thrombopoietin which are shown to influence the replication, differentiation and maturation of megakaryocytes and megakaryocyte progenitors and which may be used for the treatment of thrombocytopenia.
EP 675,201 A1 relates to the c-mpl ligand (megakaryocyte growth and development factor, MGDF), allelic variations of c-mpl ligand and c-mpl ligand attached to water soluble polymers such as polyethylene glycol.
WO 95/21920 provides the murine and human c-mpl ligand and polypeptide fragments thereof. The proteins are useful for in vivo and ex vivo therapy for stimulating platelet production.
U.S. Pat. No. 4,703,008 by Lin, F-K. discloses a cDNA sequence encoding erythropoietin, methods of production and uses for erythropoietin.
WO 91/05867 discloses analogs of human erythropoietin having a greater number of sites for carbohydrate attachment than human erythropoietin, such as EPO (Asn69), EPO (Asn125, Ser127), EPO (Thr125) and EPO (Pro124, Thr125).
WO 94/24160 discloses erythropoietin muteins which have enhanced activity, specifically amino acid substitutions at positions 20, 49, 73, 140, 143, 146, 147 and 154.
WO 94/28391 discloses the native flt3 ligand protein sequence and a cDNA sequence encoding the flt3 ligand, methods of expressing flt3 ligand in a host cell transfected with the cDNA and methods of treating patients with a hematopoietic disorder using flt3 ligand.
U.S. Pat. No. 5,554,512 is directed to human flt3 ligand as an isolated protein, DNA encoding the flt3 ligand, host cells transfected with cDNAs encoding flt3 ligand and methods for treating patients with flt3 ligand.
WO 94/26891 provides mammalian flt3 ligands, including an isolate that has an insertion of 29 amino acids, and fragments thereof.
Rearrangement of Protein Sequences
In evolution, rearrangements of DNA sequences play an important role in generating a diversity of protein structure and function. Gene duplication and exon shuffling provide an important mechanism for rapidly generating diversity and thereby provide organisms with a competitive advantage, especially since the basal mutation rate is low (Doolittle, Protein Science 1:191-200, 1992).
The development of recombinant DNA methods has made it possible to study the effects of sequence transposition on protein folding, structure and function. The approach used in creating new sequences resembles that of naturally occurring pairs of proteins that are related by linear reorganization of their amino acid sequences (Cunningham, et al., Proc. Natl. Acad. Sci. U.S.A. 76:3218-3222, 1979; Teather & Erfle, J. Bacteriol. 172: 3837-3841, 1990; Schimming et al., Eur. J. Biochem. 204: 13-19, 1992; Yamiuchi and Minamikawa, FEBS Lett. 260:127-130, 1991; MacGregor et al., FEBS Lett. 378:263-266). The first in vitro application of this type of rearrangement to proteins was described by Goldenberg and Creighton (J. Mol. Biol. 165:407-413, 1983). A new N-terminus is selected at an internal site (breakpoint) of the original sequence, the new sequence having the same order of amino acids as the original from the breakpoint until it reaches an amino acid that is at or near the original C-terminus. At this point the new sequence is joined, either directly or through an additional portion of sequence (linker), to an amino acid that is at or near the original N-terminus, and the new sequence continues with the same sequence as the original until it reaches a point that is at or near the amino acid that was N-terminal to the breakpoint site of the original sequence, this residue forming the new C-terminus of the chain.
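The rearrangement described above can be illustrated with a minimal sketch. It assumes the simplest case, in which the join is made exactly at the original C- and N-termini (no residues deleted) and the breakpoint is given as the 1-based position of the residue that becomes the new N-terminus. The function name, the example sequence and the Gly-Gly-Ser linker are hypothetical and are used only for illustration.

```python
def rearrange_sequence(sequence: str, breakpoint: int, linker: str = "") -> str:
    """Return a sequence-rearranged version of `sequence`.

    `breakpoint` is the 1-based position of the residue that becomes the
    new N-terminus. `linker` is an optional stretch of residues inserted
    between the original C-terminus and the original N-terminus.
    Assumption: the join is made exactly at the original termini.
    """
    if not 1 <= breakpoint <= len(sequence):
        raise ValueError("breakpoint must lie within the sequence")
    # Residues from the breakpoint through the original C-terminus come first.
    c_terminal_part = sequence[breakpoint - 1:]
    # Residues from the original N-terminus up to the residue just before
    # the breakpoint follow; the last of these is the new C-terminus.
    n_terminal_part = sequence[:breakpoint - 1]
    return c_terminal_part + linker + n_terminal_part


# Hypothetical 10-residue sequence, used only for illustration.
original = "MKTAYIAKQR"
rearranged = rearrange_sequence(original, breakpoint=4, linker="GGS")
print(rearranged)  # AYIAKQRGGSMKT
```

With a breakpoint at position 4, residue 4 of the original becomes the new N-terminus and residue 3 becomes the new C-terminus. Passing an empty linker corresponds to joining the original termini directly.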
This approach has been applied to proteins which range in size from 58 to 462 amino acids (Goldenberg & Creighton, J. Mol. Biol. 165:407-413, 1983; Li & Coffino, Mol. Cell. Biol. 13:2377-2383, 1993). The proteins examined have represented a broad range of structural classes, including proteins that contain predominantly α-helix (interleukin-4; Kreitman et al., Cytokine 7:311-318, 1995), β-sheet (interleukin-1; Horlick et al., Protein Eng. 5:427-431, 1992), or mixtures of the two (yeast phosphoribosyl anthranilate isomerase; Luger et al., Science 243:206-210, 1989). Broad categories of protein function are represented in these sequence reorganization studies:
Enzymes
  T4 lysozyme: Zhang et al., Biochemistry 32:12311-12318, 1993; Zhang et al., Nature Struct. Biol. 1:434-438, 1995
  dihydrofolate reductase: Buchwalder et al., Biochemistry 31:1621-1630, 1994; Protasova et al., Prot. Eng. 7:1373-1377, 1995
  ribonuclease T1: Mullins et al., J. Am. Chem. Soc. 116:5529-5533, 1994; Garrett et al., Protein Science 5:204-211, 1996
  Bacillus β-glucanase: Hahn et al., Proc. Natl. Acad. Sci. U.S.A. 91:10417-10421, 1994
  aspartate transcarbamoylase: Yang & Schachman, Proc. Natl. Acad. Sci. U.S.A. 90:11980-11984, 1993
  phosphoribosyl anthranilate isomerase: Luger et al., Science 243:206-210, 1989; Luger et al., Prot. Eng. 3:249-258, 1990
  pepsin/pepsinogen: Lin et al., Protein Science 4:159-166, 1995
  glyceraldehyde-3-phosphate dehydrogenase: Vignais et al., Protein Science 4:994-1000, 1995
  ornithine decarboxylase: Li & Coffino, Mol. Cell. Biol. 13:2377-2383, 1993
  yeast phosphoglycerate dehydrogenase: Ritco-Vonsovici et al., Biochemistry 34:16543-16551, 1995
Enzyme Inhibitor
  basic pancreatic trypsin inhibitor: Goldenberg & Creighton, J. Mol. Biol. 165:407-413, 1983
Cytokines
  interleukin-1β: Horlick et al., Protein Eng. 5:427-431, 1992
  interleukin-4: Kreitman et al., Cytokine 7:311-318, 1995
Tyrosine Kinase Recognition Domain
  α-spectrin SH3 domain: Viguera, et al., J. Mol. Biol. 247:670-681, 1995
Transmembrane Protein
  omp A: Koebnik & Kramer, J. Mol. Biol. 250:617-626, 1995
Chimeric Protein
  interleukin-4-Pseudomonas exotoxin: Kreitman et al., Proc. Natl. Acad. Sci. U.S.A. 91:6889-6893, 1994
The results of these studies have been highly variable. In many cases substantially lower activity, solubility or thermodynamic stability was observed (E. coli dihydrofolate reductase, aspartate transcarbamoylase, phosphoribosyl anthranilate isomerase, glyceraldehyde-3-phosphate dehydrogenase, ornithine decarboxylase, omp A, yeast phosphoglycerate dehydrogenase). In other cases, the sequence rearranged protein appeared to have properties nearly identical to those of its natural counterpart (basic pancreatic trypsin inhibitor, T4 lysozyme, ribonuclease T1, Bacillus β-glucanase, interleukin-1β, α-spectrin SH3 domain, pepsinogen, interleukin-4). In exceptional cases, an unexpected improvement over some properties of the natural sequence was observed, e.g., the solubility and refolding rate for rearranged α-spectrin SH3 domain sequences, and the receptor affinity and anti-tumor activity of transposed interleukin-4-Pseudomonas exotoxin fusion molecule (Kreitman et al., Proc. Natl. Acad. Sci. U.S.A. 91:6889-6893, 1994; Kreitman et al., Cancer Res. 55:3357-3363, 1995).
The primary motivation for these types of studies has been to study the role of short-range and long-range interactions in protein folding and stability. Sequence rearrangements of this type convert a subset of interactions that are long-range in the original sequence into short-range interactions in the new sequence, and vice versa. The fact that many of these sequence rearrangements are able to attain a conformation with at least some activity is persuasive evidence that protein folding occurs by multiple folding pathways (Viguera, et al., J. Mol. Biol. 247:670-681, 1995). In the case of the SH3 domain of α-spectrin, choosing new termini at locations that corresponded to β-hairpin turns resulted in proteins with slightly less stability, but which were nevertheless able to fold.
The positions of the internal breakpoints used in the studies cited here are found exclusively on the surface of proteins, and are distributed throughout the linear sequence without any obvious bias towards the ends or the middle (the variation in the relative distance from the original N-terminus to the breakpoint is ca. 10 to 80% of the total sequence length). The linkers connecting the original N- and C-termini in these studies have ranged from 0 to 9 residues. In one case (Yang & Schachman, Proc. Natl. Acad. Sci. U.S.A. 90:11980-11984, 1993), a portion of sequence has been deleted from the original C-terminal segment, and the connection made from the truncated C-terminus to the original N-terminus. Flexible hydrophilic residues such as Gly and Ser are frequently used in the linkers. Viguera, et al. (J. Mol. Biol. 247:670-681, 1995) compared joining the original N- and C-termini with 3- or 4-residue linkers; the protein with the 3-residue linker was less thermodynamically stable. Protasova et al. (Protein Eng. 7:1373-1377, 1994) used 3- or 5-residue linkers in connecting the original N- and C-termini of E. coli dihydrofolate reductase; only the 3-residue linker produced protein in good yield.