The present invention relates to polypeptides able to form multimers, particularly trimers, and the manufacture and use of such polypeptides.
A. Coiled-coils
A basic component of the quaternary structure of the present multimerizing polypeptides is the coiled-coil (reviewed in Müller et al., (2000) Meth.Enzymol. 328: 261-283). Coiled-coils are protein domains that take the shape of gently twisted, ropelike bundles. The bundles contain two to five α helices in parallel or antiparallel orientation. The essential feature of coiled-coil sequences is a seven-residue, or heptad, repeat (abcdefg)n with the first (a) and fourth (d) positions commonly occupied by hydrophobic amino acids. The remaining amino acids of the coiled-coil structure are generally polar, and proline is usually excluded due to its disruptive effect on helical architecture.
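As an illustration only, the heptad register described above can be sketched in a few lines of Python. The function names, the set of residues treated as hydrophobic, and the toy sequence are assumptions made for this sketch and are not taken from the cited literature.

```python
# Illustrative sketch: label residues with heptad positions a-g and measure
# how often the core positions (a and d) are hydrophobic.
# ASSUMPTION: this hydrophobic set is a simple working definition.
HYDROPHOBIC = set("AVILMFWC")
REGISTER = "abcdefg"


def heptad_positions(sequence, start="a"):
    """Pair each residue with its heptad position, given the position of the first residue."""
    offset = REGISTER.index(start)
    return [(res, REGISTER[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_hydrophobic_fraction(sequence, start="a"):
    """Fraction of a/d positions occupied by residues in HYDROPHOBIC."""
    core = [res for res, pos in heptad_positions(sequence, start) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


# Hypothetical toy sequence (three identical heptads), not a natural protein.
toy = "LKELEDK" * 3
print(core_hydrophobic_fraction(toy, start="a"))  # register with Leu at a and d
print(core_hydrophobic_fraction(toy, start="b"))  # shifted register
```

Scanning all seven possible values of `start` and keeping the one with the highest fraction is one simple way to guess the register of an unannotated sequence, although dedicated coiled-coil prediction methods use considerably more information.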
This characteristic heptad repeat (also known as a 3,4 hydrophobic repeat) is what forms the structure of the coiled-coil domain, with each residue sweeping about 100° around the helix axis. As a result, the seven residues of the heptad repeat sweep about 700° and fall short of two full turns (720°) by 20°. The lag forms a gentle, left-handed hydrophobic stripe of residues running down an α helix, and the coiled-coil forms when these hydrophobic stripes associate. Deviations from the regular 3,4 spacing of nonpolar residues change the angle of the hydrophobic stripe with respect to the helix axis, altering the crossing angle of the helices and destabilizing the quaternary structure. An example of a common means of diagramming coiled-coil heptad repeats is illustrated in FIG. 1.
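The arithmetic behind the 20° lag can be checked with a short sketch. The value of 100° per residue is the approximate figure given in the text. The function names and the sign convention (negative meaning that the position lags behind the helix rotation) are assumptions made for this illustration.

```python
# Illustrative sketch of the heptad geometry described in the text.
DEGREES_PER_RESIDUE = 100.0  # approximate sweep per residue, as stated in the text
HEPTAD_LENGTH = 7


def heptad_lag(deg_per_residue=DEGREES_PER_RESIDUE, n=HEPTAD_LENGTH):
    """Degrees by which one heptad falls short of two full helical turns."""
    return 2 * 360.0 - deg_per_residue * n


def stripe_drift(n_heptads, deg_per_residue=DEGREES_PER_RESIDUE, n=HEPTAD_LENGTH):
    """Angular offset of the same heptad position in successive heptads,
    wrapped into the range [-180, 180) relative to the first heptad."""
    step = deg_per_residue * n
    return [((step * i + 180.0) % 360.0) - 180.0 for i in range(n_heptads)]


print(heptad_lag())     # lag per heptad, in degrees
print(stripe_drift(4))  # offsets of one position over four heptads
```

Each successive heptad places the same position 20° further behind, so equivalent residues trace a slowly winding stripe along the helix rather than a straight line parallel to its axis.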
Parallel dimers and trimers are the most commonly observed coiled-coil structures. The features that distinguish dimers, trimers, and higher-order oligomers are relatively well understood. The core residues of the heptad repeat (residues a and d) largely determine the oligomerization state, while residues on the edge of the helix (e and g) play secondary roles. Trimers are the default organization for a random distribution of core residues, as other oligomerization states cannot tolerate β-branched amino acids (Ile, Val, and Thr) in the d position for dimers or the a position for tetramers. In contrast, trimers generally permit the presence of β-branched and other hydrophobic amino acids in the core positions.
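The core-position rule stated above can be restated as a small, deliberately coarse sketch. It encodes only the two constraints named in the text (β-branched residues at d disfavor dimers, and at a disfavor tetramers). It is not a validated predictor of oligomerization state, and the function name and inputs are hypothetical.

```python
# Illustrative sketch: flag oligomerization states disfavored by beta-branched
# residues in the core, using only the two rules stated in the text.
BETA_BRANCHED = set("IVT")  # Ile, Val, Thr


def disfavored_states(a_residues, d_residues):
    """Return the set of states disfavored by the given a- and d-position residues.

    a_residues, d_residues: strings of one-letter codes found at the a and d
    positions of the heptad repeats (hypothetical inputs for this sketch).
    """
    disfavored = set()
    if any(res in BETA_BRANCHED for res in d_residues):
        disfavored.add("dimer")     # beta-branched residue at d
    if any(res in BETA_BRANCHED for res in a_residues):
        disfavored.add("tetramer")  # beta-branched residue at a
    return disfavored


print(disfavored_states(a_residues="IIII", d_residues="LLLL"))
print(disfavored_states(a_residues="VLIL", d_residues="ILVL"))
```

When both states are flagged, the trimer is the remaining common arrangement, which matches the description of trimers as the default for a mixed core.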
Coiled-coil fusions have been used to achieve diverse experimental goals. One common use is the replacement of natural oligomerization domains with a heterologous sequence to alter oligomerization state, stability, and/or avidity. Low affinity monomers that do not naturally associate can be oligomerized in order to bind effectively to other multimeric targets. Additionally, the oligomerization domain fusion can be used to mimic the activated state of the native protein that is difficult to achieve with recombinant protein production (see, e.g., Pullen et al. (1999) Biochem. 94:6032). This approach has been particularly effective when producing only specific domains, such as the extracellular or intracellular (cytoplasmic) portion of a protein of interest. Commonly, coiled-coils are genetically fused to the protein of interest via a flexible linker that will provide access for the fusion to a large three-dimensional space. Direct fusions are used for experimental goals that require more rigid molecules, such as those used for crystallization.
A number of model coiled-coil systems have been developed based on the structural information of large structural proteins, such as myosin and tropomyosin (TM43, Lau et al. J Biol Chem; 259: 13253-13261), a group of proteins known as collectins (Hoppe et al. (1994) Protein Sci; 3:1143-1158), or of the dimerization region of DNA regulatory proteins, such as the yeast transcriptional activator protein GCN4-p1 (Landschulz et al. (1988) Science; 240:1759-1764). This last structure is often referred to as a “leucine zipper” or LZ. Derivative model systems from the TM43 have been made, specifically where one leucine per heptad has been switched to phenylalanine. This structure is known as a “phenylalanine zipper” or FZ (Thomas et al. Prog Colloid Polymer Sci; 99: 24-30). A third type of well-known derivative of the LZ is the isoleucine zipper (IZ) (Harbury et al. (1994) Nature 371:80-83).
An important constraint of model coiled-coils is the ability to be produced in the expression host. The lack of disulfide bonds in coiled-coil structures aids their production in heterologous expression systems. However, de novo designed sequences tend to be sensitive to proteolysis. Even if effectively expressed, the relative lack of effectiveness as compared to natural sequences reflects the gaps in the current knowledge about all variables involved in protein interaction (Arndt et al. (2002) Structure 10: 1235-1248). Additionally, the use of model sequences is problematic when the goal of the fusion protein produced is a biologically functional protein.
B. Viral Heptad Repeats
Many viruses produce a fusogenic form of viral envelope glycoproteins. Among the viral genera or families that exhibit this type of fusion protein are Orthomyxovirus (for example, Influenza virus), Filovirus (Ebola virus), Betaretrovirus (Mason-Pfizer Monkey Virus (MPMV)), Gammaretrovirus (Friend Murine Leukemia Virus (FRMLV); Moloney Murine Leukemia Virus (MoMLV)), Deltaretrovirus (Human T-cell Leukemia Virus type 1 (HTLV-1)) and Lentivirus (Human Immunodeficiency Virus type 1 (HIV-1) and Simian Immunodeficiency Virus (SIV)) (reviewed in Cheynet et al. (2005) J Virol; 79:5585-5593). In these viruses, the transmembrane subunit is produced as a fusion protein, encoded by the env gene. This gene product is cleaved into two proteins: the surface protein, which is involved in receptor recognition, and the transmembrane (TM) subunit, which anchors the whole env complex to the membrane and is involved in virus entry through membrane fusion. These proteins are characterized by the presence of heptad repeats within the TM region, which form strong interactions between oligomers of the protein via the formation of a coiled-coil structure with three subunits (Li et al. (1996) J Virol; 70: 1266-1270; Tucker et al. (1991) Virol; 185:710-720).
Crystal structures of several TM proteins have been determined: MoMLV (Fass et al. (1996) Nat Struc Biol; 3:465-469); HIV-1 (Chan et al. (1997) Cell; 89: 263-273); HTLV-1 (Kobe et al. (1999) PNAS; 93:4319-4324). Similar structures were also found in influenza virus (Wilson et al. (1981) Nature; 289: 366-373) and Ebola virus (Malashkevich et al. (1999) PNAS; 96:2662-2667), and this structure has been hypothesized to reflect a common mechanism for the fusion process and viral entry (Chambers et al. J Gen Virol; 71: 3075-3080). Other viruses whose genomes encode heptad repeats are Coronaviruses (Severe Acute Respiratory Syndrome-associated coronavirus (SARS-CoV)) (Bosch et al. J Virol 77: 8801-8811); Herpesvirus (herpes simplex virus 1 (HSV-1) (Gianni et al. (2005) J Virol 79:7042-7049) and Human Cytomegalovirus (CMV)) (Lopper et al. J Virol (2004) 78: 8333-8341); and Paramyxovirus (measles virus) (Buckland et al. (1992) J Gen Virol 73:1703-1707).
Sequence homology in the TM region has also allowed for the identification of endogenous retroviruses in sequence databases. Such searches have successfully identified endogenous retroviruses in many organism genomes, including human, rat, and mouse. One well-known example of a family of human endogenous retroviruses (HERVs) is HERV-W. One locus of this family, ERVWE1, has been shown to encode a full length env open reading frame and produces a protein also known as syncytin (Cheynet et al. (2005) J Virol; 79:5585-5593). Like the viral protein, this protein is produced and cleaved into two separate subunits, a gp50 surface subunit and a gp24 transmembrane subunit. The gp24 subunit includes heptad repeats and the subunits are found associated as homotrimers. Interestingly, this protein is naturally produced in the placenta and may be involved in cell-cell interactions such as the fusion of the placenta to the uterine wall.
There remains a need in the art to adapt natural trimerization sequences for use in the production of biologically active, recombinant fusion proteins. Accordingly, the present application describes the screening, discovery, and development of appropriate natural genetic sequences for trimerization in the recombinant protein art.