Colony stimulating factors (CSFs), which stimulate the differentiation and/or proliferation of bone marrow cells, have generated much interest because of their therapeutic potential for restoring depressed levels of hematopoietic stem cell-derived cells. Colony stimulating factors in both human and murine systems have been identified and distinguished according to their activities. For example, granulocyte-CSF (G-CSF) and macrophage-CSF (M-CSF) stimulate the in vitro formation of neutrophilic granulocyte and macrophage colonies, respectively, while GM-CSF and interleukin-3 (IL-3) have broader activities and stimulate the formation of macrophage, neutrophilic granulocyte and eosinophilic granulocyte colonies. Certain factors, such as stem cell factor, predominantly affect stem cells.
Small amounts of certain hematopoietic growth factors account for the differentiation of a small number of stem cells into a variety of blood cell progenitors, for the proliferation of those cells, and for the ultimate differentiation of mature blood cells from those lines. However, when hematopoiesis is stressed by chemotherapy, radiation or natural myelodysplastic disorders, a period results during which patients are seriously leukopenic, anemic, neutropenic or thrombocytopenic. The use of hematopoietic factors accelerates hematopoietic regeneration during this compromised period.
Stem cell factor has the ability to stimulate growth of early hematopoietic progenitors which are capable of maturing to erythroid, megakaryocyte, granulocyte, lymphocyte and macrophage cells. Stem cell factor treatment of mammals results in absolute increases in hematopoietic cells of both the myeloid and lymphoid lineages.
EP 0 423 980 discloses novel stem cell factor (SCF) polypeptides including SCF.sup.1-148, SCF.sup.1-157, SCF.sup.1-160, SCF.sup.1-161, SCF.sup.1-162, SCF.sup.1-164, SCF.sup.1-165, SCF.sup.1-183, SCF.sup.1-185, SCF.sup.1-188, SCF.sup.1-189, SCF.sup.1-220, SCF.sup.1-248,
Rearrangement of Protein Sequences
In evolution, rearrangements of DNA sequences serve an important role in generating a diversity of protein structure and function. Gene duplication and exon shuffling provide an important mechanism to rapidly generate diversity and thereby provide organisms with a competitive advantage, especially since the basal mutation rate is low (Doolittle, Protein Science 1:191-200, 1992).
The development of recombinant DNA methods has made it possible to study the effects of sequence transposition on protein folding, structure and function. The approach used in creating new sequences resembles that of naturally occurring pairs of proteins that are related by linear reorganization of their amino acid sequences (Cunningham, et al., Proc. Natl. Acad. Sci. U.S.A. 76:3218-3222, 1979; Teather & Erfle, J. Bacteriol. 172:3837-3841, 1990; Schimming et al., Eur. J. Biochem. 204:13-19, 1992; Yamiuchi and Minamikawa, FEBS Lett. 260:127-130, 1991; MacGregor et al., FEBS Lett. 378:263-266, 1996). The first in vitro application of this type of rearrangement to proteins was described by Goldenberg and Creighton (J. Mol. Biol. 165:407-413, 1983). A new N-terminus is selected at an internal site (breakpoint) of the original sequence. From the breakpoint, the new sequence has the same order of amino acids as the original until it reaches an amino acid that is at or near the original C-terminus. At this point the new sequence is joined, either directly or through an additional portion of sequence (linker), to an amino acid that is at or near the original N-terminus. The new sequence then continues with the same sequence as the original until it reaches a point that is at or near the amino acid that was N-terminal to the breakpoint site of the original sequence; this residue forms the new C-terminus of the chain.
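The rearrangement described above can be illustrated with a minimal Python sketch. It is an illustration only, not a method taken from the cited studies. The function name `circular_permute`, the toy sequence, and the "GGS" linker are hypothetical, and the sketch assumes the join is made exactly at the original termini, whereas the text allows residues "at or near" them.

```python
def circular_permute(sequence: str, breakpoint: int, linker: str = "") -> str:
    """Rearrange a one-letter amino acid sequence around an internal breakpoint.

    `breakpoint` is the 1-based position of the residue that becomes the new
    N-terminus. Assumption: the join uses the exact original C- and N-termini.
    """
    if not 2 <= breakpoint <= len(sequence):
        raise ValueError("breakpoint must be an internal or C-terminal position")
    # Breakpoint residue through the original C-terminus, in the original order.
    head = sequence[breakpoint - 1:]
    # Original N-terminus through the residue just before the breakpoint;
    # the last residue of this segment becomes the new C-terminus.
    tail = sequence[:breakpoint - 1]
    # Join directly (empty linker) or through an additional linker sequence.
    return head + linker + tail


# Toy 10-residue sequence (hypothetical), new N-terminus at position 4.
print(circular_permute("ABCDEFGHIJ", 4))         # DEFGHIJABC
print(circular_permute("ABCDEFGHIJ", 4, "GGS"))  # DEFGHIJGGSABC
```

In both outputs the order of residues within each segment is unchanged; only the position of the chain ends, and optionally the linker, differs from the original sequence.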
This approach has been applied to proteins which range in size from 58 to 462 amino acids (Goldenberg & Creighton, J. Mol. Biol. 165:407-413, 1983; Li & Coffino, Mol. Cell. Biol. 13:2377-2383, 1993). The proteins examined have represented a broad range of structural classes, including proteins that contain predominantly α-helix (interleukin-4; Kreitman et al., Cytokine 7:311-318, 1995), β-sheet (interleukin-1; Horlick et al., Protein Eng. 5:427-431, 1992), or mixtures of the two (yeast phosphoribosyl anthranilate isomerase; Luger et al., Science 243:206-210, 1989). Broad categories of protein function are represented in these sequence reorganization studies:
______________________________________
Enzymes
T4 lysozyme                                         Zhang et al., Biochemistry 32:12311-12318 (1993); Zhang et al., Nature Struct. Biol. 1:434-438 (1995)
dihydrofolate reductase                             Buchwalder et al., Biochemistry 31:1621-1630 (1994); Protasova et al., Prot. Eng. 7:1373-1377 (1995)
ribonuclease T1                                     Mullins et al., J. Am. Chem. Soc. 116:5529-5533 (1994); Garrett et al., Protein Science 5:204-211 (1996)
Bacillus β-glucanase                                Hahn et al., Proc. Natl. Acad. Sci. U.S.A. 91:10417-10421 (1994)
aspartate transcarbamoylase                         Yang & Schachman, Proc. Natl. Acad. Sci. U.S.A. 90:11980-11984 (1993)
phosphoribosyl anthranilate isomerase               Luger et al., Science 243:206-210 (1989); Luger et al., Prot. Eng. 3:249-258 (1990)
pepsin/pepsinogen                                   Lin et al., Protein Science 4:159-166 (1995)
glyceraldehyde-3-phosphate dehydrogenase            Vignais et al., Protein Science 4:994-1000 (1995)
ornithine decarboxylase                             Li & Coffino, Mol. Cell. Biol. 13:2377-2383 (1993)
yeast phosphoglycerate dehydrogenase                Ritco-Vonsovici et al., Biochemistry 34:16543-16551 (1995)

Enzyme Inhibitor
basic pancreatic trypsin inhibitor                  Goldenberg & Creighton, J. Mol. Biol. 165:407-413 (1983)

Cytokines
interleukin-1β                                      Horlick et al., Protein Eng. 5:427-431 (1992)
interleukin-4                                       Kreitman et al., Cytokine 7:311-318 (1995)

Tyrosine Kinase Recognition Domain
α-spectrin SH3 domain                               Viguera, et al., J. Mol. Biol. 247:670-681 (1995)

Transmembrane Protein
omp A                                               Koebnik & Kramer, J. Mol. Biol. 250:617-626 (1995)

Chimeric Protein
interleukin-4-Pseudomonas exotoxin fusion molecule  Kreitman et al., Proc. Natl. Acad. Sci. U.S.A. 91:6889-6893 (1994)
______________________________________
The results of these studies have been highly variable. In many cases, substantially lower activity, solubility or thermodynamic stability was observed (E. coli dihydrofolate reductase, aspartate transcarbamoylase, phosphoribosyl anthranilate isomerase, glyceraldehyde-3-phosphate dehydrogenase, ornithine decarboxylase, omp A, yeast phosphoglycerate dehydrogenase). In other cases, the sequence-rearranged protein appeared to have properties nearly identical to those of its natural counterpart (basic pancreatic trypsin inhibitor, T4 lysozyme, ribonuclease T1, Bacillus β-glucanase, interleukin-1β, α-spectrin SH3 domain, pepsinogen, interleukin-4). In exceptional cases, an unexpected improvement in some properties over the natural sequence was observed, e.g., the solubility and refolding rate for rearranged α-spectrin SH3 domain sequences, and the receptor affinity and anti-tumor activity of transposed interleukin-4-Pseudomonas exotoxin fusion molecule (Kreitman et al., Proc. Natl. Acad. Sci. U.S.A. 91:6889-6893, 1994; Kreitman et al., Cancer Res. 55:3357-3363, 1995).
The primary motivation for these types of studies has been to investigate the role of short-range and long-range interactions in protein folding and stability. Sequence rearrangements of this type convert a subset of interactions that are long-range in the original sequence into short-range interactions in the new sequence, and vice versa. The fact that many of these sequence rearrangements are able to attain a conformation with at least some activity is persuasive evidence that protein folding occurs by multiple folding pathways (Viguera, et al., J. Mol. Biol. 247:670-681, 1995). In the case of the SH3 domain of α-spectrin, choosing new termini at locations that corresponded to β-hairpin turns resulted in proteins with slightly less stability, but which were nevertheless able to fold.
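The conversion of long-range into short-range sequence relationships, and vice versa, can be made concrete by tracking how far apart two residues lie along the chain before and after rearrangement. The following Python sketch is illustrative only. The helper names `new_position` and `separation`, and the 100-residue protein with breakpoint 41 and a 3-residue linker, are hypothetical values chosen for the example, and the sketch assumes the join is made exactly at the original termini.

```python
def new_position(i: int, length: int, breakpoint: int, linker_len: int = 0) -> int:
    """Map 1-based original position i to its 1-based position after rearrangement.

    `breakpoint` is the original position that becomes the new N-terminus.
    Assumption: the original C-terminus is joined to the original N-terminus
    through `linker_len` added residues.
    """
    if not 2 <= breakpoint <= length:
        raise ValueError("breakpoint must lie between 2 and the sequence length")
    if not 1 <= i <= length:
        raise ValueError("position out of range")
    if i >= breakpoint:
        # Residues from the breakpoint onward move to the start of the chain.
        return i - breakpoint + 1
    # Residues before the breakpoint follow the first segment and the linker.
    return (length - breakpoint + 1) + linker_len + i


def separation(i: int, j: int, length: int, breakpoint: int, linker_len: int = 0) -> int:
    """Number of chain positions between original residues i and j after rearrangement."""
    return abs(new_position(i, length, breakpoint, linker_len)
               - new_position(j, length, breakpoint, linker_len))


# Hypothetical 100-residue protein, breakpoint at residue 41, 3-residue linker.
# The original termini (residues 1 and 100) were 99 positions apart.
print(separation(1, 100, 100, 41, 3))   # 4
# Residues 40 and 41 were adjacent in the original sequence.
print(separation(40, 41, 100, 41, 3))   # 102
```

Under these assumptions, the original termini end up separated only by the linker, while the residues flanking the breakpoint become the two ends of the new chain.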
The positions of the internal breakpoints used in the studies cited here are found exclusively on the surface of proteins, and are distributed throughout the linear sequence without any obvious bias towards the ends or the middle (the variation in the relative distance from the original N-terminus to the breakpoint is ca. 10 to 80% of the total sequence length). The linkers connecting the original N- and C-termini in these studies have ranged from 0 to 9 residues. In one case (Yang & Schachman, Proc. Natl. Acad. Sci. U.S.A. 90:11980-11984, 1993), a portion of sequence has been deleted from the original C-terminal segment, and the connection made from the truncated C-terminus to the original N-terminus. Flexible hydrophilic residues such as Gly and Ser are frequently used in the linkers. Viguera, et al. (J. Mol. Biol. 247:670-681, 1995) compared joining the original N- and C-termini with 3- or 4-residue linkers; the protein with the 3-residue linker was less thermodynamically stable. Protasova et al. (Protein Eng. 7:1373-1377, 1994) used 3- or 5-residue linkers in connecting the original N- and C-termini of E. coli dihydrofolate reductase; only the 3-residue linker produced protein in good yield.