In multicellular organisms, such as humans, cells communicate with each other through signal transduction pathways, in which a secreted ligand (e.g. a cytokine, growth factor or hormone) binds to its cell surface receptor(s), leading to receptor activation. The receptors are membrane proteins that consist of an extracellular domain responsible for ligand binding, a central transmembrane region, and a cytoplasmic domain responsible for sending the signal downstream. Signal transduction can take place in three ways: paracrine (communication between neighboring cells), autocrine (communication of a cell with itself) and endocrine (communication between distant cells through the circulation), depending on the source of the secreted signal and the location of the target cell expressing the receptor(s). One general mechanism underlying receptor activation, which sets off a cascade of events beneath the cell membrane including the activation of gene expression, is that a polypeptide ligand such as a cytokine is present in an oligomeric form, such as a homo-dimer or homo-trimer, which, when bound to its monomeric receptor at the outer cell surface, leads to oligomerization of the receptor. Signal transduction pathways play a key role in normal cell development and differentiation, as well as in the response to external insults such as bacterial and viral infections. Abnormalities in such signal transduction pathways, in the form of either underactivation (e.g. lack of ligand) or overactivation (e.g. too much ligand), are the underlying causes of pathological conditions and diseases such as arthritis, cancer, AIDS, and diabetes.
One of the current strategies for treating these debilitating diseases involves the use of receptor decoys, such as soluble receptors consisting of only the extracellular ligand-binding domain, to intercept a ligand and thus overcome the overactivation of a receptor. The best example of this strategy is Enbrel®, a dimeric soluble TNF-α receptor-immunoglobulin (IgG) fusion protein created by Immunex (Mohler et al., 1993; Jacobs et al., 1997), which is now part of Amgen. The TNF family of cytokines is one of the major pro-inflammatory signals produced by the body in response to infection or tissue injury. However, abnormal production of these cytokines, for example in the absence of infection or tissue injury, has been shown to be one of the underlying causes of diseases such as arthritis and psoriasis. In its natural state, the TNF-α receptor is present in monomeric form on the cell surface before binding to its ligand, TNF-α, which in contrast exists as a homotrimer (Locksley et al., 2001). Accordingly, fusing a soluble TNF-α receptor with the Fc region of immunoglobulin G1, which is capable of spontaneous dimerization via disulfide bonds (Sledziewski et al., 1992 and 1998), allowed the secretion of a dimeric soluble TNF-α receptor (Mohler et al., 1993; Jacobs et al., 1997). In comparison with the monomeric soluble receptor, the dimeric TNF-α receptor II-Fc fusion has a greatly increased affinity for the homo-trimeric ligand. This provides a molecular basis for its clinical use in treating rheumatoid arthritis (RA), an autoimmune disease in which constitutively elevated TNF-α, a major pro-inflammatory cytokine, plays an important causal role. Although Enbrel® was shown to have a Ki for TNF-α in the pM range (ng/mL) (Mohler et al., 1993), subcutaneous injections of 25 mg twice a week, which translate to a μg/mL level of the soluble receptor, are required for RA patients to achieve clinical benefits (www.enbrel.com).
The high level of recurrent Enbrel® consumption per RA patient has created great pressure on the drug supply as well as a high cost, which limits the accessibility of the drug to millions of potential patients in this country alone.
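The gap described above between a binding constant in the pM range and a dosing level in the μg/mL range can be checked with a simple unit conversion. The following is a minimal sketch, not a pharmacokinetic model. The function names are hypothetical, and the molecular mass of 150 kDa is an assumed, illustrative value for a dimeric receptor-Fc fusion protein rather than a figure taken from the text.

```python
# Unit-conversion sketch for protein concentrations.
# ASSUMPTION: 150 kDa is an illustrative molecular mass for a dimeric
# receptor-Fc fusion protein; it is not a value stated in the text.
ASSUMED_MASS_KDA = 150.0


def pm_to_ng_per_ml(conc_pm: float, mass_kda: float) -> float:
    """Convert a molar concentration in pM to a mass concentration in ng/mL."""
    # 1 pM = 1e-12 mol/L and 1 kDa = 1e3 g/mol, so
    # g/L = conc_pm * mass_kda * 1e-9, and 1 g/L = 1e6 ng/mL.
    return conc_pm * mass_kda * 1e-3


def ug_per_ml_to_pm(conc_ug_per_ml: float, mass_kda: float) -> float:
    """Convert a mass concentration in ug/mL to a molar concentration in pM."""
    # 1 ug/mL = 1e-3 g/L; dividing by (mass_kda * 1e3) g/mol gives mol/L,
    # and multiplying by 1e12 gives pM.
    return conc_ug_per_ml * 1e6 / mass_kda


if __name__ == "__main__":
    print(pm_to_ng_per_ml(1.0, ASSUMED_MASS_KDA), "ng/mL per 1 pM")
    print(ug_per_ml_to_pm(1.0, ASSUMED_MASS_KDA), "pM per 1 ug/mL")
```

Under the assumed mass of 150 kDa, 1 pM corresponds to about 0.15 ng/mL, whereas 1 μg/mL corresponds to roughly 6.7 nM, a difference of more than three orders of magnitude.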
In addition to the TNF-α family of potent proinflammatory cytokines, HIV, the virus that causes AIDS, also uses a homo-trimeric coat protein, gp120, to gain entry into CD-4-positive T helper cells in the body (Kwong et al., 1998). One of the earliest events during HIV infection involves the binding of gp120 to its receptor CD-4, which is uniquely expressed on the cell surface of T helper cells (Clapham et al., 2001). Monomeric soluble CD-4 was shown over a decade ago to be a potent agent against HIV infection (Clapham et al., 1989); however, the excitement was dashed when its potency was shown to be limited to laboratory HIV isolates (Daar et al., 1990). It turned out that HIV strains from AIDS patients, unlike the laboratory isolates, had a much lower affinity for monomeric soluble CD-4, likely due to sequence variation in gp120 (Daar et al., 1990). Although dimeric soluble CD-4-Fc fusion proteins have been made, these decoy CD-4 HIV receptors showed little antiviral effect against naturally occurring HIV strains from AIDS patients, both in the laboratory and in the clinic, due to their low affinity for gp120 (Daar et al., 1990).
Clearly, there is a great need to be able to create secreted homo-trimeric soluble receptors or biologically active proteins whose binding sites dock perfectly with, and hence have higher affinity for, their naturally occurring homo-trimeric ligands, such as the TNF family of cytokines and HIV coat proteins. In theory, such trimeric receptor decoys should have a much higher affinity for their trimeric ligands than their dimeric counterparts do. Such rationally designed soluble trimeric receptor analogs could significantly increase the clinical benefits as well as lower the amount or frequency of drug injections for each patient. To be therapeutically feasible, a trimerizing protein moiety for biologic drug design should satisfy the following criteria. Ideally, it should be part of a naturally secreted protein, like immunoglobulin Fc, that is also abundant in the circulation (non-toxic), human in origin (lack of immunogenicity), relatively stable (long half-life), capable of efficient self-trimerization that is strengthened by inter-chain covalent disulfide bonds, and possessed of an optimal geometry for projecting the soluble receptor to be trimerized so as to confer maximum ligand binding.
Collagen is a family of fibrous proteins that are the major components of the extracellular matrix. It is the most abundant protein in mammals, constituting nearly 25% of the total protein in the body. Collagen plays a major structural role in the formation of bone, tendon, skin, cornea, cartilage, blood vessels, and teeth (Stryer, 1988). The fibrillar types of collagen, I, II, III, IV, V, and XI, are all synthesized as larger trimeric precursors, called procollagens, in which the central uninterrupted triple-helical domain consisting of hundreds of “G-X-Y” repeats (or glycine repeats) is flanked by non-collagenous (NC) domains, the N-propeptide and the C-propeptide (Stryer, 1988). Both the C- and N-terminal extensions are processed proteolytically upon secretion of the procollagen, an event that triggers the assembly of the mature protein into collagen fibrils, which form an insoluble cell matrix (Prockop et al., 1998). BMP-1 is a protease that recognizes a specific peptide sequence of procollagen near the junction between the glycine repeats and the C-prodomain of collagens and is responsible for the removal of the propeptide (Li et al.). The shed trimeric C-propeptide of type I collagen is found in the sera of normal human adults at a concentration in the range of 50-300 ng/mL, with children having a much higher level, which is indicative of active bone formation (Melkko et al.). In people with a familial high serum concentration of the C-propeptide of type I collagen, the level can reach as high as 1-6 μg/mL with no apparent abnormality, suggesting that the C-propeptide is not toxic (Sorva et al.). A structural study of the trimeric C-propeptide of collagen suggested that it is a tri-lobed structure in which all three subunits come together in a junction region near their N-termini to connect to the rest of the procollagen molecule (Bernocco et al.). This geometry, which projects the proteins to be fused in one direction, is similar to that of the Fc dimer.
Type I, IV, V and XI collagens are mainly assembled into heterotrimeric forms consisting of either two α-1 chains and one α-2 chain (for Type I, IV, V), or three different α chains (for Type XI), which are highly homologous in sequence. The type II and III collagens are both homotrimers of the α-1 chain. For type I collagen, the most abundant form of collagen, a stable α-1(I) homotrimer is also formed and is present at variable levels in different tissues (Alvares et al., 1999). Most of these collagen C-propeptide chains can self-assemble into homotrimers when over-expressed alone in a cell. Although the N-propeptide domains are synthesized first, molecular assembly into trimeric collagen begins with the in-register association of the C-propeptides. It is believed that the C-propeptide complex is stabilized by the formation of interchain disulfide bonds, but the necessity of disulfide bond formation for proper chain registration is not clear. The triple helix of the glycine repeats is then propagated from the associated C-termini to the N-termini in a zipper-like manner. This knowledge has led to the creation of non-natural types of collagen matrix by swapping the C-propeptides of different collagen chains using recombinant DNA technology (Bulleid et al., 2001). Non-collagenous proteins, such as cytokines and growth factors, have also been fused to the N-termini of either procollagens or mature collagens to allow new collagen matrix formation, which is intended to allow slow release of the non-collagenous proteins from the cell matrix (Tomita et al., 2001). However, under both circumstances, the C-propeptides must be cleaved before the recombinant collagen fibrils assemble into an insoluble cell matrix.
Although other protein trimerization domains, such as those from GCN4 from yeast (Yang, X. et al., 2000), fibritin from bacteriophage T4 (Frank, S. et al., 2001) and aspartate transcarbamoylase of Escherichia coli (Chen, B. et al., 2004), have been described previously to allow trimerization of heterologous proteins, none of these trimerizing proteins is human in origin, nor are they naturally secreted proteins. As such, any trimeric fusion proteins would have to be made intracellularly, where naturally secreted proteins such as soluble receptors may not only fold incorrectly but also be difficult to purify from thousands of other intracellular proteins. Moreover, the fatal drawback of using such non-human protein trimerization domains (e.g. from yeast, bacteriophage and bacteria) for trimeric biologic drug design will be their immunogenicity in the human body, rendering such fusion proteins ineffective within weeks after injection into the human body.
One secreted protein previously used as a protein trimerization tag is tetranectin, a plasminogen-binding protein of the C-lectin family (Holtet et al.). However, unlike the IgG Fc dimerization tag, the trimeric tetranectin structure is not strengthened by any interchain disulfide bonds, and significant fractions of both monomeric and dimeric tetranectin co-existed with the trimeric structure in solution (Holtet et al.). Physiologically, tetranectin is involved in tissue remodeling, and an increased cell matrix concentration of tetranectin in humans has been linked to multiple cancer types. Recombinant heterologous tetranectin fusion proteins have only been produced intracellularly in E. coli as insoluble inclusion bodies that required refolding to obtain soluble structures (Holtet et al. and Graversen et al.). These unfavorable attributes suggest that tetranectin is not ideal as a protein trimerization tag for therapeutic applications. Nonetheless, a bacterially produced ApoAI-tetranectin fusion protein has been made and patented (Graversen et al.) and is being tested as a therapeutic agent for atherosclerosis.