The growth hormone (GH) supergene family (Bazan, F. Immunology Today 11: 350-354 (1991); Mott, H. R. and Campbell, I. D. Current Opinion in Structural Biology 5: 114-121 (1995); Silvennoinen, O. and Ihle, J. N. (1996) SIGNALING BY THE HEMATOPOIETIC CYTOKINE RECEPTORS) represents a set of proteins with similar structural characteristics. Each member of this family of proteins comprises a four helical bundle, the general structure of which is shown in FIG. 1. Family members are referred to herein as “four helical bundle polypeptides” or “4HB” polypeptides. Although additional members of the family remain to be identified, known members include the following: growth hormone, prolactin, placental lactogen, erythropoietin (EPO), thrombopoietin (TPO), interleukin-2 (IL-2), IL-3, IL-4, IL-5, IL-6, IL-7, IL-9, IL-10, IL-11, IL-12 (p35 subunit), IL-13, IL-15, oncostatin M, ciliary neurotrophic factor, leukemia inhibitory factor, alpha interferon, beta interferon, gamma interferon, omega interferon, tau interferon, epsilon interferon, granulocyte-colony stimulating factor (G-CSF), granulocyte-macrophage colony stimulating factor (GM-CSF), macrophage colony stimulating factor (M-CSF) and cardiotrophin-1 (CT-1) (“the GH supergene family”). Members of the GH supergene family have similar secondary and tertiary structures, despite the fact that they generally have limited amino acid or DNA sequence identity. The shared structural features allow new members of the gene family to be readily identified. The general structures of family members hGH, EPO, IFNα-2, and G-CSF are shown in FIGS. 2, 3, 4, and 5, respectively.
One member of the GH supergene family is human growth hormone (hGH). Human growth hormone participates in much of the regulation of normal human growth and development. This naturally-occurring single-chain pituitary hormone consists of 191 amino acid residues and has a molecular weight of approximately 22 kDa. hGH exhibits a multitude of biological effects, including linear growth (somatogenesis), lactation, activation of macrophages, and insulin-like and diabetogenic effects, among others (Chawla, R., et al., Ann. Rev. Med. 34:519-547 (1983); Isaksson, O., et al., Ann. Rev. Physiol., 47:483-499 (1985); Hughes, J. and Friesen, H., Ann. Rev. Physiol., 47:469-482 (1985)).
The structure of hGH is well known (Goeddel, D., et al., Nature 281:544-548 (1979)), and the three-dimensional structure of hGH has been solved by x-ray crystallography (de Vos, A., et al., Science 255:306-312 (1992)). The protein has a compact globular structure, comprising a bundle of four amphipathic alpha helices, termed A-D beginning from the N-terminus, which are joined by loops. hGH also contains four cysteine residues, which participate in two intramolecular disulfide bonds: C53 is paired with C165 and C182 is paired with C189. The hormone is not glycosylated and has been expressed in a secreted form in E. coli (Chang, C., et al., Gene 55:189-196 (1987)).
A number of naturally occurring mutants of hGH have been identified. These include hGH-V (Seeberg, DNA 1: 239 (1982); U.S. Pat. Nos. 4,446,235, 4,670,393, and 4,665,180, which are incorporated by reference herein) and a 20-kDa hGH containing a deletion of residues 32-46 of hGH (Kostyo et al., Biochim. Biophys. Acta 925: 314 (1987); Lewis, U., et al., J. Biol. Chem., 253:2679-2687 (1978)). In addition, numerous hGH variants, arising from post-transcriptional, post-translational, secretory, metabolic processing, and other physiological processes, have been reported (Baumann, G., Endocrine Reviews 12: 424 (1991)).
The biological effects of hGH derive from its interaction with specific cellular receptors. The hormone is a member of a family of homologous proteins that include placental lactogens and prolactins. hGH is unusual among the family members, however, in that it exhibits broad species specificity and binds to either the cloned somatogenic (Leung, D., et al., Nature 330:537-543 (1987)) or prolactin (Boutin, J., et al., Cell 53:69-77 (1988)) receptor. Based on structural and biochemical studies, functional maps for the lactogenic and somatogenic binding domains have been proposed (Cunningham, B. and Wells, J., Proc. Natl. Acad. Sci. 88: 3407 (1991)). The hGH receptor is a member of the hematopoietic/cytokine/growth factor receptor family, which includes several other growth factor receptors, such as the interleukin (IL)-3, -4 and -6 receptors, the granulocyte macrophage colony-stimulating factor (GM-CSF) receptor, the erythropoietin (EPO) receptor, as well as the G-CSF receptor. See, Bazan, Proc. Natl. Acad. Sci. USA 87: 6934-6938 (1990). Members of the cytokine receptor family contain four conserved cysteine residues and a tryptophan-serine-X-tryptophan-serine motif positioned just outside the transmembrane region. The conserved sequences are thought to be involved in protein-protein interactions. See, e.g., Chiba et al., Biochem. Biophys. Res. Comm. 184: 485-490 (1992). The interaction between hGH and the extracellular domain of its receptor (hGHbp) is among the best understood hormone-receptor interactions. High-resolution X-ray crystallographic data (Cunningham, B., et al., Science, 254:821-825 (1991)) have shown that hGH has two receptor binding sites and binds two receptor molecules sequentially using distinct sites on the molecule. The two receptor binding sites are referred to as Site I and Site II.
Site I includes the carboxy terminal end of helix D and parts of helix A and the A-B loop, whereas Site II encompasses the amino terminal region of helix A and a portion of helix C. Binding of GH to its receptor occurs sequentially, with Site I binding first. Site II then engages a second GH receptor, resulting in receptor dimerization and activation of the intracellular signaling pathways that lead to cellular responses to the hormone. An hGH mutein in which a G120R substitution has been introduced into Site II is able to bind a single hGH receptor, but is unable to dimerize two receptors. The mutein acts as an hGH antagonist in vitro, presumably by occupying receptor sites without activating intracellular signaling pathways (Fuh, G., et al., Science 256:1677-1680 (1992)).
Recombinant hGH is used as a therapeutic and has been approved for the treatment of a number of indications. hGH deficiency leads to dwarfism, for example, which has been successfully treated for more than a decade by exogenous administration of the hormone. In addition to hGH deficiency, hGH has also been approved for the treatment of renal failure (in children), Turner's Syndrome, and cachexia in AIDS patients. Recently, the Food and Drug Administration (FDA) has approved hGH for the treatment of non-GH-dependent short stature. hGH is also currently under investigation for the treatment of aging, frailty in the elderly, short bowel syndrome, and congestive heart failure.
Recombinant hGH is currently sold as a daily injectable product, with five major products currently on the market: Humatrope™ (Eli Lilly & Co.), Nutropin™ (Genentech), Norditropin™ (Novo-Nordisk), Genotropin™ (Pfizer) and Saizen/Serostim™ (Serono). A significant challenge to using growth hormone as a therapeutic, however, is that the protein has a short in vivo half-life and, therefore, it must be administered by daily subcutaneous injection for maximum effectiveness (MacGillivray, et al., J. Clin. Endocrinol. Metab. 81: 1806-1809 (1996)). Considerable effort is focused on means to improve the administration of hGH agonists and antagonists, by lowering the cost of production, making administration easier for the patient, improving efficacy and safety profile, and creating other properties that would provide a competitive advantage. For example, Genentech and Alkermes formerly marketed Nutropin Depot™, a depot formulation of hGH, for pediatric growth hormone deficiency. While the depot permits less frequent administration (once every 2-3 weeks rather than once daily), it is also associated with undesirable side effects, such as decreased bioavailability and pain at the injection site, and was withdrawn from the market in 2004. Another product, Pegvisomant™ (Pfizer), has also recently been approved by the FDA. Pegvisomant™ is a genetically-engineered analogue of hGH that functions as a highly selective growth hormone receptor antagonist indicated for the treatment of acromegaly (van der Lely, et al., The Lancet 358: 1754-1759 (2001)). Although several of the amino acid side chain residues in Pegvisomant™ are derivatized with polyethylene glycol (PEG) polymers, the product is still administered once-daily, indicating that the pharmaceutical properties are not optimal.
In addition to PEGylation and depot formulations, other administration routes, including inhaled and oral dosage forms of hGH, are under early-stage pre-clinical and clinical development and none has yet received approval from the FDA. Accordingly, there is a need for a polypeptide that exhibits growth hormone activity but that also provides a longer serum half-life and, therefore, more optimal therapeutic levels of hGH and an increased therapeutic half-life.
Interferons are relatively small, single-chain glycoproteins released by cells invaded by viruses or exposed to certain other substances. Interferons are presently grouped into three major classes, designated: 1) leukocyte interferon (interferon-alpha, α-interferon, IFN-α), 2) fibroblast interferon (interferon-beta, β-interferon, IFN-β), and 3) immune interferon (interferon-gamma, γ-interferon, IFN-γ). In response to viral infection, lymphocytes synthesize primarily α-interferon (with omega interferon, IFN-ω), while infection of fibroblasts usually induces production of β-interferon. IFNα and IFNβ share about 20-30 percent amino acid sequence homology. The gene for human IFN-β lacks introns, and encodes a protein possessing 29% amino acid sequence identity with human IFN-α, suggesting that IFN-α and IFN-β genes have evolved from a common ancestor (Taniguchi et al., Nature 285, 547-549 (1980)). By contrast, IFN-γ is synthesized by lymphocytes in response to mitogens. IFNα, IFNβ and IFNω are known to induce MHC Class I antigen expression and are referred to as type I interferons, while IFNγ induces MHC Class II antigen expression, and is referred to as type II interferon.
A large number of distinct genes encoding different species of IFNα have been identified. Alpha interferons fall into two major classes, I and II, each containing a plurality of discrete proteins (Baron et al., Critical Reviews in Biotechnology 10, 179-190 (1990); Nagata et al., Nature 287, 401-408 (1980); Nagata et al., Nature 284, 316-320 (1980); Streuli et al., Science 209, 1343-1347 (1980); Goeddel et al., Nature 290, 20-26 (1981); Lawn et al., Science 212, 1159-1162 (1981); Ullrich et al., J. Mol. Biol. 156, 467-486 (1982); Weissmann et al., Phil. Trans. R. Soc. Lond. B299, 7-28 (1982); Lund et al., Proc. Natl. Acad. Sci. 81, 2435-2439 (1984); Capon et al., Mol. Cell. Biol. 5, 768 (1985)). The various IFN-α species include IFN-αA (IFN-α2), IFN-αB, IFN-αC, IFN-αC1, IFN-αD (IFN-α1), IFN-αE, IFN-αF, IFN-αG, IFN-αH, IFN-αI, IFN-αJ1, IFN-αJ2, IFN-αK, IFN-αL, IFN-α4B, IFN-α5, IFN-α6, IFN-α74, IFN-α76 (IFN-α4a), IFN-α88, and alleles thereof.
Interferons were originally derived from naturally occurring sources, such as buffy coat leukocytes and fibroblast cells, optionally using inducing agents to increase interferon production. Interferons have also been produced by recombinant DNA technology.
The cloning and expression of recombinant IFNαA (also known as IFNα2) was described by Goeddel et al., Nature 287, 411 (1980). The amino acid sequences of IFNαA, B, C, D, F, G, H, K and L, along with the encoding nucleotide sequences, are described by Pestka in Archiv. Biochem. Biophys. 221, 1 (1983). The cloning and expression of mature IFNβ is described by Goeddel et al., Nucleic Acids Res. 8, 4057 (1980). The cloning and expression of mature IFNγ are described by Gray et al., Nature 295, 503 (1982). IFNω has been described by Capon et al., Mol. Cell. Biol. 5, 768 (1985). IFNτ has been identified and disclosed by Whaley et al., J. Biol. Chem. 269, 10864-8 (1994).
Interferons have a variety of biological activities, including anti-viral, immunoregulatory and anti-proliferative properties, and have been utilized as therapeutic agents for treatment of diseases such as cancer, and various viral diseases. As a class, the interferon-α's have been shown to inhibit various types of cellular proliferation, and are especially useful for the treatment of a variety of cellular proliferation disorders frequently associated with cancer, particularly hematologic malignancies such as leukemias. These proteins have shown anti-proliferative activity against multiple myeloma, chronic lymphocytic leukemia, low-grade lymphoma, Kaposi's sarcoma, chronic myelogenous leukemia, renal-cell carcinoma, urinary bladder tumors and ovarian cancers (Bonnem, E. M. et al. (1984) J. Biol. Response Modifiers 3:580; Oldham, R. K. (1985) Hospital Practice 20:71).
Specific examples of commercially available IFN products include IFNγ-1b (Actimmune®), IFNβ-1a (Avonex® and Rebif®), IFNβ-1b (Betaseron®), IFN alfacon-1 (Infergen®), IFNα-2 (Intron A®), IFNα-2a (Roferon-A®), Peginterferon alfa-2a (Pegasys®), and Peginterferon alfa-2b (PEG-Intron®). Some of the problems associated with the production of PEGylated versions of IFN proteins are described in Wang et al. (2002) Adv. Drug Deliv. Rev. 54:547-570; and Pedder, S. C. Semin Liver Dis. 2003; 23 Suppl 1:19-22. Wang et al. characterized positional isomers of PEG-Intron®, and Pedder compared Pegasys® with PEG-Intron®, describing the lability of the PEGylation chemistries used and effects upon formulation. Despite the number of IFN products currently available on the market, there is still an unmet need for interferon therapeutics.
Another member of the GH supergene family is human Granulocyte Colony Stimulating Factor (G-CSF). Naturally-occurring G-CSF is a glycoprotein hormone of about 177 amino acids, having a molecular weight of about 20 kiloDaltons (kDa). The crystal structure of G-CSF is known (Hill et al., (1993) Proc. Natl. Acad. Sci. USA 90:5167-71), and a crystal structure of G-CSF bound to its receptor is also known (Aritomi et al., (1999) Nature, 401:713-717). The three dimensional structure of G-CSF is known at the atomic level. From the three-dimensional structure of G-CSF, predictions of how changes in the amino acid composition of a G-CSF molecule may result in structural changes can be made. These structural characteristics or changes may be correlated with biological activity to design and produce G-CSF analogs.
G-CSF is a pharmaceutically active protein which regulates proliferation, differentiation, and functional activation of neutrophilic granulocytes (Metcalf, Blood 67:257 (1986); Yan, et al. Blood 84(3): 795-799 (1994); Bensinger, et al. Blood 81(11): 3158-3163 (1993); Roberts, et al., Expt'l Hematology 22: 1156-1163 (1994); Neben, et al. Blood 81(7): 1960-1967 (1993); Welte et al. PNAS-USA 82: 1526-1530 (1985); Souza et al. Science 232: 61-65 (1986) and Gabrilove, J. Seminars in Hematology 26:2 1-14 (1989)). G-CSF was purified to homogeneity from cell culture supernatants of the human bladder carcinoma cell line 5637 (Welte et al., Proc. Natl. Acad. Sci. (1985) 82:1526-30). The sequence of the cDNA coding for native G-CSF is known from Souza et al., Science (1986) 232:61-65. As a consequence of alternative splicing in the second intron, two naturally occurring forms of G-CSF exist with 204 or 207 amino acids, of which the first 30 represent a signal peptide (Lymphokines, IRL Press, Oxford, Washington D.C., Editors D. Male and C. Rickwood). The mature protein was shown to have a molecular weight of about 19 kDa and has 5 cysteine residues which can form intermolecular or intramolecular disulfide bridges. Binding studies have shown that G-CSF binds to neutrophilic granulocytes. Little to no binding is observed with erythroid, lymphoid, or eosinophilic cell lines, or with macrophages.
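The relationship between the precursor lengths and the mature lengths cited for G-CSF can be checked with simple arithmetic. The following is a minimal sketch only; the function name `mature_length` and the constant `SIGNAL_PEPTIDE_LENGTH` are illustrative names introduced here, and the numeric values are those stated in the paragraph above.

```python
# Illustrative arithmetic only: residue counts come from the text above,
# and all names in this sketch are hypothetical.
SIGNAL_PEPTIDE_LENGTH = 30  # residues removed from the precursor


def mature_length(precursor_length: int,
                  signal_length: int = SIGNAL_PEPTIDE_LENGTH) -> int:
    """Return the residue count remaining after signal peptide removal."""
    if precursor_length <= signal_length:
        raise ValueError("precursor must be longer than its signal peptide")
    return precursor_length - signal_length


if __name__ == "__main__":
    for precursor in (204, 207):
        print(f"{precursor}-residue precursor -> "
              f"{mature_length(precursor)}-residue mature protein")
```

Under these stated values, the two precursors correspond to mature forms of 174 and 177 residues.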
In humans, endogenous G-CSF is detectable in blood plasma (Jones et al. Bailliere's Clinical Hematology 2:1 83-111 (1989)). G-CSF is produced by fibroblasts, macrophages, T cells, trophoblasts, endothelial cells and epithelial cells and is the expression product of a single copy gene comprised of five exons and four introns located on chromosome seventeen. Transcription of this locus produces an mRNA species which is differentially processed, resulting in two forms of G-CSF mRNA, one version coding for a protein of 177 amino acids, the other coding for a protein of 174 amino acids (Nagata et al. EMBO J 5: 575-581 (1986)), and the form comprised of 174 amino acids has been found to have the greatest specific in vivo biological activity. G-CSF is species cross-reactive, such that when human G-CSF is administered to another mammal such as a mouse, canine or monkey, sustained neutrophil leukocytosis is elicited (Moore et al. PNAS-USA 84: 7134-7138 (1987)).
Human G-CSF can be obtained and purified from a number of sources. Natural human G-CSF (nhG-CSF) can be isolated from the supernatants of cultured human tumor cell lines. The development of recombinant DNA technology, see, for instance, U.S. Pat. No. 4,810,643 (Souza) incorporated herein by reference, has enabled the production of commercial scale quantities of G-CSF in glycosylated form as a product of eukaryotic host cell expression, and of G-CSF in non-glycosylated form as a product of prokaryotic host cell expression.
G-CSF has been found to be useful in the treatment of indications where an increase in neutrophils will provide benefits. G-CSF can mobilize stem and precursor cells from bone marrow and is used to treat patients whose granulocytes have been depleted by chemotherapy, or as a prelude to bone marrow transplants. For example, for cancer patients, G-CSF is beneficial as a means of selectively stimulating neutrophil production to compensate for hematopoietic deficits resulting from chemotherapy or radiation therapy. Other indications include treatment of various infectious diseases and related conditions, such as sepsis, which is typically caused by a metabolite of bacteria. G-CSF is also useful alone, or in combination with other compounds, such as other cytokines, for growth or expansion of cells in culture, for example, for bone marrow transplants.
The G-CSF receptor (G-CSFR) is a member of the hematopoietic/cytokine/growth factor receptor family, which includes several other growth factor receptors, such as the interleukin (IL)-3, -4 and -6 receptors, the granulocyte macrophage colony-stimulating factor (GM-CSF) receptor, the erythropoietin (EPO) receptor, as well as the prolactin and growth hormone receptors. See, Bazan, Proc. Natl. Acad. Sci. USA 87: 6934-6938 (1990). Members of the cytokine receptor family contain four conserved cysteine residues and a tryptophan-serine-X-tryptophan-serine motif positioned just outside the transmembrane region. The conserved sequences are thought to be involved in protein-protein interactions. See, e.g., Chiba et al., Biochem. Biophys. Res. Comm. 184: 485-490 (1992). The G-CSF receptor consists of a single peptide chain with a molecular weight of about 150 kD (Nicola, Immunol. Today 8 (1987), 134).
Glycosylated hG-CSF has been compared with de-glycosylated hG-CSF, prepared by in vitro enzymatic digestion with neuraminidase and endo-α-N-acetylgalactosaminidase, with respect to its stability as a function of pH and temperature (Oh-eda et al., 1990, J. Biol. Chem. 265 (20): 11432-35). The de-glycosylated hG-CSF, dissolved at a concentration of 1 μg/mL in 20 mM phosphate buffer containing 0.2 M NaCl and 0.01% Tween 20 was rapidly inactivated within the pH range of from about pH 7 to about pH 8 after a two-day incubation at 37° C. In contrast, glycosylated hG-CSF retained over 80% of its activity under the same conditions. Furthermore, evaluation of the thermal stability of both forms of hG-CSF, measured by biological assay and calorimetric analysis, indicated that de-glycosylated hG-CSF was less thermally stable than the native form of hG-CSF.
A number of approaches have been taken in order to provide stable, pharmaceutically acceptable G-CSF compositions. One approach to improving the composition stability of G-CSF involves the synthesis of derivatives of the protein. U.S. Pat. No. 5,665,863 discloses the formation of recombinant chimeric proteins comprising G-CSF coupled with albumin, which have new pharmacokinetic properties. U.S. Pat. No. 5,824,784 and U.S. Pat. No. 5,320,840, disclose the chemical attachment of water-soluble polymers to proteins to improve stability and provide protection against proteolytic degradation, and more specifically, N-terminally modified G-CSF molecules carrying chemically attached polymers, including polyethylene glycol.
An alternative approach to increasing stability of G-CSF in composition involves alteration of the amino acid sequence of the protein. U.S. Pat. No. 5,416,195 discloses genetically engineered analogues of G-CSF having improved composition stability, wherein the cysteine residue normally found at position 17 of the mature polypeptide chain, the aspartic acid residue found at position 27, and at least one of the tandem proline residues found at positions 65 and 66, are all replaced with a serine residue. U.S. Pat. No. 5,773,581 discloses genetically engineered analogues of G-CSF that have been covalently conjugated to a water soluble polymer.
Another member of the GH supergene family is human erythropoietin (hEPO). Naturally-occurring erythropoietin (EPO) is a glycoprotein hormone of molecular weight 34 kilodaltons (kDa) that is produced in the mammalian kidney and liver. EPO is a key component in erythropoiesis, inducing the proliferation and differentiation of red cell progenitors. EPO activity also is associated with the activation of a number of erythroid-specific genes, including globin and carbonic anhydrase. See, e.g., Bondurant et al., Mol. Cell. Biol. 5:675-683 (1985); Koury et al., J. Cell. Physiol. 126: 259-265 (1986).
The erythropoietin receptor (EpoR) is a member of the hematopoietic/cytokine/growth factor receptor family, which includes several other growth factor receptors, such as the interleukin (IL)-3, -4 and -6 receptors, the G-CSF receptor (G-CSFR), the granulocyte macrophage colony-stimulating factor (GM-CSF) receptor, as well as the prolactin and growth hormone receptors. See, Bazan, Proc. Natl. Acad. Sci. USA 87: 6934-6938 (1990). Members of the cytokine receptor family contain four conserved cysteine residues and a tryptophan-serine-X-tryptophan-serine motif positioned just outside the transmembrane region. The conserved sequences are thought to be involved in protein-protein interactions. See, e.g., Chiba et al., Biochem. Biophys. Res. Comm. 184: 485-490 (1992).
U.S. Pat. Nos. 5,441,868; 5,547,933; 5,618,698; and 5,621,080 describe DNA sequences encoding human EPO and the purified and isolated polypeptide having part or all of the primary structural conformation and the biological properties of naturally occurring EPO.
The biological effects of hEPO derive from its interaction with specific cellular receptors. The interaction between hEPO and the extracellular domain of its receptor (hEPObp) is well understood. High-resolution X-ray crystallographic data have shown that hEPO has two receptor binding sites and binds two receptor molecules sequentially using distinct sites on the molecule. The two receptor binding sites are referred to as Site I and Site II. Site I includes the carboxy terminal end of helix D and parts of helix A and the A-B loop, whereas Site II encompasses the amino terminal region of helix A and a portion of helix C. Binding of EPO to its receptor occurs sequentially, with Site I binding first. Site II then engages a second EPO receptor, resulting in receptor dimerization and activation of the intracellular signaling pathways that lead to cellular responses to the hormone.
Recombinant hEPO is used as a therapeutic and has been approved for the treatment of human subjects. hEPO deficiency leads to anemia, for example, which has been successfully treated by exogenous administration of the hormone.
Covalent attachment of the hydrophilic polymer poly(ethylene glycol), abbreviated PEG, is a method of increasing water solubility, increasing bioavailability, increasing serum half-life, increasing therapeutic half-life, modulating immunogenicity, modulating biological activity, or extending the circulation time of many biologically active molecules, including proteins, peptides, and particularly hydrophobic molecules. PEG has been used extensively in pharmaceuticals, on artificial implants, and in other applications where biocompatibility, lack of toxicity, and lack of immunogenicity are of importance. In order to maximize the desired properties of PEG, the total molecular weight and hydration state of the PEG polymer or polymers attached to the biologically active molecule must be sufficiently high to impart the advantageous characteristics typically associated with PEG polymer attachment, such as increased water solubility and circulating half life, while not adversely impacting the bioactivity of the parent molecule.
PEG derivatives are frequently linked to biologically active molecules through reactive chemical functionalities, such as lysine, cysteine and histidine residues, the N-terminus and carbohydrate moieties. Proteins and other molecules often have a limited number of reactive sites available for polymer attachment. Often, the sites most suitable for modification via polymer attachment play a significant role in receptor binding, and are necessary for retention of the biological activity of the molecule. As a result, indiscriminate attachment of polymer chains to such reactive sites on a biologically active molecule often leads to a significant reduction or even total loss of biological activity of the polymer-modified molecule. R. Clark et al., (1996), J. Biol. Chem., 271:21969-21977. To form conjugates having sufficient polymer molecular weight for imparting the desired advantages to a target molecule, prior art approaches have typically involved random attachment of numerous polymer arms to the molecule, thereby increasing the risk of a reduction or even total loss in bioactivity of the parent molecule.
Reactive sites that form the loci for attachment of PEG derivatives to proteins are dictated by the protein's structure. Proteins, including enzymes, are composed of various sequences of alpha-amino acids, which have the general structure H2N—CHR—COOH. The alpha amino moiety (H2N—) of one amino acid joins to the carboxyl moiety (—COOH) of an adjacent amino acid to form amide linkages, which can be represented as —(NH—CHR—CO)n—, where the subscript “n” can equal hundreds or thousands. The fragment represented by R can contain reactive sites for protein biological activity and for attachment of PEG derivatives.
For example, in the case of the amino acid lysine, there exists an —NH2 moiety in the epsilon position as well as in the alpha position. The epsilon —NH2 is free for reaction under conditions of basic pH. Much of the art in the field of protein derivatization with PEG has been directed to developing PEG derivatives for attachment to the epsilon —NH2 moiety of lysine residues present in proteins. “Polyethylene Glycol and Derivatives for Advanced PEGylation”, Nektar Molecular Engineering Catalog, 2003, pp. 1-17. These PEG derivatives all have the common limitation, however, that they cannot be installed selectively among the often numerous lysine residues present on the surfaces of proteins. This can be a significant limitation in instances where a lysine residue is important to protein activity, existing in an enzyme active site for example, or in cases where a lysine residue plays a role in mediating the interaction of the protein with other biological molecules, as in the case of receptor binding sites.
A second and equally important complication of existing methods for protein PEGylation is that the PEG derivatives can undergo undesired side reactions with residues other than those desired. Histidine contains a reactive imino moiety, represented structurally as —N(H)—, but many chemically reactive species that react with epsilon —NH2 can also react with —N(H)—. Similarly, the side chain of the amino acid cysteine bears a free sulfhydryl group, represented structurally as —SH. In some instances, the PEG derivatives directed at the epsilon —NH2 group of lysine also react with cysteine, histidine or other residues. This can create complex, heterogeneous mixtures of PEG-derivatized bioactive molecules and risks destroying the activity of the bioactive molecule being targeted. It would be desirable to develop PEG derivatives that permit a chemical functional group to be introduced at a single site within the protein that would then enable the selective coupling of one or more PEG polymers to the bioactive molecule at specific sites on the protein surface that are both well-defined and predictable.
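The heterogeneity that results from non-selective attachment can be illustrated with a simple counting argument. The following is a minimal sketch, assuming an idealized protein with a fixed number of equally reactive sites and at most one polymer chain per site; the site count used in the example and the function names `positional_isomers` and `total_species` are hypothetical and are not taken from any particular protein or reagent.

```python
from math import comb


def positional_isomers(reactive_sites: int, peg_chains: int) -> int:
    """Count distinct placements of peg_chains on reactive_sites,
    assuming at most one chain per site (idealized model)."""
    return comb(reactive_sites, peg_chains)


def total_species(reactive_sites: int) -> int:
    """Count all distinct modified species carrying at least one chain."""
    return 2 ** reactive_sites - 1


if __name__ == "__main__":
    n_sites = 10  # hypothetical number of accessible reactive sites
    for k in (1, 2, 3):
        print(f"{k} chain(s): {positional_isomers(n_sites, k)} positional isomers")
    print(f"all species with at least one chain: {total_species(n_sites)}")
```

Under these assumptions, even a modest number of reactive sites permits a large number of distinct conjugates, whereas a single uniquely reactive site permits exactly one mono-substituted product.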
In addition to lysine residues, considerable effort in the art has been directed toward the development of activated PEG reagents that target other amino acid side chains, including cysteine, histidine and the N-terminus. See, e.g., U.S. Pat. No. 6,610,281 which is incorporated by reference herein, and “Polyethylene Glycol and Derivatives for Advanced PEGylation”, Nektar Molecular Engineering Catalog, 2003, pp. 1-17. A cysteine residue can be introduced site-selectively into the structure of proteins using site-directed mutagenesis and other techniques known in the art, and the resulting free sulfhydryl moiety can be reacted with PEG derivatives that bear thiol-reactive functional groups. This approach is complicated, however, in that the introduction of a free sulfhydryl group can complicate the expression, folding and stability of the resulting protein. Thus, it would be desirable to have a means to introduce a chemical functional group into bioactive molecules that enables the selective coupling of one or more PEG polymers to the protein while simultaneously being compatible with (i.e., not engaging in undesired side reactions with) sulfhydryls and other chemical functional groups typically found in proteins.
As can be seen from a sampling of the art, many of these derivatives that have been developed for attachment to the side chains of proteins, in particular, the —NH2 moiety on the lysine amino acid side chain and the —SH moiety on the cysteine side chain, have proven problematic in their synthesis and use. Some form unstable linkages with the protein that are subject to hydrolysis and therefore decompose, degrade, or are otherwise unstable in aqueous environments, such as in the bloodstream. See Pedder, S. C. Semin Liver Dis. 2003; 23 Suppl 1:19-22 for a discussion of the stability of linkages in PEG-Intron®. Some form more stable linkages, but are subject to hydrolysis before the linkage is formed, which means that the reactive group on the PEG derivative may be inactivated before the protein can be attached. Some are somewhat toxic and are therefore less suitable for use in vivo. Some are too slow to react to be practically useful. Some result in a loss of protein activity by attaching to sites responsible for the protein's activity. Some are not specific in the sites to which they will attach, which can also result in a loss of desirable activity and in a lack of reproducibility of results. In order to overcome the challenges associated with modifying proteins with poly(ethylene glycol) moieties, PEG derivatives have been developed that are more stable (e.g., U.S. Pat. No. 6,602,498, which is incorporated by reference herein) or that react selectively with thiol moieties on molecules and surfaces (e.g., U.S. Pat. No. 6,610,281, which is incorporated by reference herein). There is clearly a need in the art for PEG derivatives that are chemically inert in physiological environments until called upon to react selectively to form stable chemical bonds.
Recently, an entirely new technology in the protein sciences has been reported, which promises to overcome many of the limitations associated with site-specific modifications of proteins. Specifically, new components have been added to the protein biosynthetic machinery of the prokaryote Escherichia coli (E. coli) (e.g., L. Wang, et al., (2001), Science 292:498-500) and the eukaryote Saccharomyces cerevisiae (S. cerevisiae) (e.g., J. Chin et al., Science 301:964-7 (2003)), which have enabled the incorporation of non-genetically encoded amino acids into proteins in vivo. A number of new amino acids with novel chemical, physical or biological properties, including photoaffinity labels and photoisomerizable amino acids, keto amino acids, and glycosylated amino acids have been incorporated efficiently and with high fidelity into proteins in E. coli and in yeast in response to the amber codon, TAG, using this methodology. See, e.g., J. W. Chin et al., (2002), Journal of the American Chemical Society 124:9026-9027; J. W. Chin, & P. G. Schultz, (2002), ChemBioChem 11:1135-1137; J. W. Chin, et al., (2002), Proc. Natl. Acad. Sci. USA 99:11020-11024; and, L. Wang, & P. G. Schultz, (2002), Chem. Comm., 1-10. These studies have demonstrated that it is possible to selectively and routinely introduce chemical functional groups, such as ketone groups, alkyne groups and azide moieties, that are not found in proteins, that are chemically inert to all of the functional groups found in the 20 common, genetically-encoded amino acids and that may be used to react efficiently and selectively to form stable covalent linkages.
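At the sequence level, directing incorporation in response to the amber codon amounts to placing TAG in frame at the chosen residue position of the coding sequence. The following is a minimal sketch of that string manipulation only; the function names and the short toy sequence are hypothetical, the sequence is not drawn from any 4HB gene, and the sketch does not model the orthogonal tRNA and synthetase components described in the cited references.

```python
AMBER_CODON = "TAG"


def introduce_amber(coding_sequence: str, residue_number: int) -> str:
    """Return a copy of coding_sequence with the codon at the 1-based
    residue_number replaced by the amber codon."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codon_count = len(seq) // 3
    if not 1 <= residue_number <= codon_count:
        raise ValueError("residue_number is outside the coding sequence")
    start = (residue_number - 1) * 3
    return seq[:start] + AMBER_CODON + seq[start + 3:]


def amber_positions(coding_sequence: str) -> list[int]:
    """List the 1-based codon positions that read TAG in frame."""
    seq = coding_sequence.upper()
    return [i // 3 + 1
            for i in range(0, len(seq) - 2, 3)
            if seq[i:i + 3] == AMBER_CODON]


if __name__ == "__main__":
    toy_gene = "ATGGCTAAAGGTTAA"  # hypothetical: Met-Ala-Lys-Gly-stop
    mutant = introduce_amber(toy_gene, 3)
    print(mutant)
    print(amber_positions(mutant))
```

In this toy example the third codon (AAA) is replaced by TAG, and the in-frame scan reports position 3 as the only amber site; the terminal TAA stop codon is left unchanged.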
The ability to incorporate non-genetically encoded amino acids into proteins permits the introduction of chemical functional groups that could provide valuable alternatives to the naturally-occurring functional groups, such as the epsilon —NH2 of lysine, the sulfhydryl —SH of cysteine, the imino group of histidine, etc. Certain chemical functional groups are known to be inert to the functional groups found in the 20 common, genetically-encoded amino acids but react cleanly and efficiently to form stable linkages. Azide and acetylene groups, for example, are known in the art to undergo a Huisgen [3+2] cycloaddition reaction in aqueous conditions in the presence of a catalytic amount of copper. See, e.g., Tornoe, et al., (2002) J. Org. Chem. 67:3057-3064; and, Rostovtsev, et al., (2002) Angew. Chem. Int. Ed. 41:2596-2599. By introducing an azide moiety into a protein structure, for example, one is able to incorporate a functional group that is chemically inert to the amines, sulfhydryls, carboxylic acids, and hydroxyl groups found in proteins, but that also reacts smoothly and efficiently with an acetylene moiety to form a cycloaddition product. Importantly, in the absence of the acetylene moiety, the azide remains chemically inert and unreactive in the presence of other protein side chains and under physiological conditions.
The present invention addresses, among other things, problems associated with the activity and production of four helical bundle (4HB) polypeptides, and also addresses the production of a 4HB polypeptide with improved biological or pharmacological properties, such as improved therapeutic half-life.