This invention relates to the design and production of single-chain biosynthetic constructs herein called “morphons” which mimic the activities of one or more members of the TGF-β superfamily.
The TGF-β superfamily includes five distinct forms of TGF-β (Sporn and Roberts (1990) in Peptide Growth Factors and Their Receptors, Sporn and Roberts, eds., Springer-Verlag: Berlin pp. 419-472), as well as the differentiation factors vg-1 (Weeks and Melton (1987) Cell 51: 861-867), DPP-C polypeptide (Padgett et al. (1987) Nature 325: 81-84), the hormones activin and inhibin (Mason et al. (1985) Nature 318: 659-663; Mason et al. (1987) Growth Factors 1: 77-88), the Mullerian-inhibiting substance, MIS (Cate et al. (1986) Cell 45:685-698), osteogenic and morphogenic proteins OP-1 (PCT/US90/05903), OP-2 (PCT/US91/07654), OP-3 (PCT/WO94/10202), the BMPs (see U.S. Pat. Nos. 4,877,864; 5,141,905; 5,013,649; 5,116,738; 5,108,922; 5,106,748; and 5,155,058), the developmentally regulated protein VGR-1 (Lyons et al. (1989) Proc. Natl. Acad. Sci. USA 86: 4554-4558) and the growth/differentiation factors GDF-1, GDF-3, GDF-9 and dorsalin-1 (McPherron et al. (1993) J. Biol. Chem. 268: 3444-4449; Basler et al. (1993) Cell 73: 687-702).
The proteins of the TGF-β superfamily are disulfide-linked homo- or heterodimers that are expressed as large precursor polypeptide chains containing a hydrophobic signal sequence, a long and relatively poorly conserved N-terminal pro region of several hundred amino acids, a cleavage site, and a mature domain comprising an N-terminal region which varies among the family members and a more highly conserved C-terminal region. This C-terminal region, present in the processed mature proteins of all known family members, contains approximately 100 amino acids with a characteristic cysteine motif having a conserved six or seven cysteine skeleton. Although the position of the cleavage site between the mature and pro regions varies among the family members, the cysteine pattern of the C-terminus of all of the proteins is in the identical format, ending in the sequence Cys-X-Cys-X, SEQ ID No: 44 (Sporn and Roberts (1990), supra).
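By way of illustration only, the C-terminal pattern described above (a sequence ending in Cys-X-Cys-X) can be checked with a short regular expression. The sketch below assumes sequences written in the one-letter amino acid code; the function name and the two test strings are hypothetical and are not sequences of any superfamily member.

```python
import re

# Matches a C-terminus of the form Cys-X-Cys-X, where X is any single residue
# written in the one-letter amino acid code (assumption: uppercase A-Z).
C_TERMINAL_MOTIF = re.compile(r"C[A-Z]C[A-Z]$")


def ends_in_cys_x_cys_x(sequence):
    """Return True if the sequence ends in the pattern Cys-X-Cys-X."""
    return C_TERMINAL_MOTIF.search(sequence.strip().upper()) is not None


# Hypothetical strings used only to exercise the check.
has_motif = ends_in_cys_x_cys_x("MKLVGCGCR")    # ends in C-G-C-R
lacks_motif = ends_in_cys_x_cys_x("MKLVGCGRR")  # ends in C-G-R-R
```

Such a check looks only at the last four residues; it does not establish that a sequence contains the full conserved six or seven cysteine skeleton.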
Recombinant TGF-β1 has been cloned (Derynck et al. (1985) Nature 316: 701-705), and expressed in Chinese hamster ovary cells (Gentry et al. (1987) Mol. Cell. Biol. 7: 3418-3427). Additionally, recombinant human TGF-β2 (deMartin et al. (1987) EMBO J. 6:3673), as well as human and porcine TGF-β3 (Derynck et al. (1988) EMBO J. 7: 3737-3743; Dijke et al. (1988) Proc. Natl. Acad. Sci. USA 85:4715), have been cloned. Expression levels of the mature TGF-β1 protein in COS cells have been increased by substituting cysteine residues located in the pro region of the TGF-β1 precursor with serine residues (Brunner et al. (1989) J. Biol. Chem. 264: 13660-13664).
A unifying feature of the biology of the proteins of the TGF-β superfamily is their ability to regulate developmental processes. These structurally related proteins have been identified as being involved in a variety of developmental events. For example, TGF-β and the polypeptides of the inhibin/activin group appear to play a role in the regulation of cell growth and differentiation. MIS causes regression of the Mullerian duct in development of the mammalian male embryo, and dpp, the gene product of the Drosophila decapentaplegic complex, is required for appropriate dorsal-ventral specification. Similarly, Vg-1 is involved in mesoderm induction in Xenopus, and Vgr-1 has been identified in a variety of developing murine tissues. Regarding bone formation, many of the proteins in the TGF-β supergene family, namely OP-1 and a subset of the BMPs, apparently play the major role. OP-1 (BMP-7) and other osteogenic proteins have been produced using recombinant techniques (U.S. Pat. No. 5,011,691 and PCT Application No. US 90/05903) and shown to be able to induce formation of true endochondral bone in vivo. BMP-2 has been recombinantly produced in monkey COS-1 cells and Chinese hamster ovary cells (Wang et al. (1990) Proc. Natl. Acad. Sci. USA 87:2220-2224).
Recently the family of proteins taught as having osteogenic activity, as judged by the Sampath and Reddi bone formation assay, has been shown to be morphogenic, i.e., capable of inducing the developmental cascade of tissue morphogenesis in a mature mammal (See PCT Application No. US 92/01968). In particular, these proteins are capable of inducing the proliferation of uncommitted progenitor cells, and inducing the differentiation of these stimulated progenitor cells in a tissue-specific manner under appropriate environmental conditions. In addition, the morphogens are capable of supporting the growth and maintenance of these differentiated cells. These morphogenic activities allow the proteins to initiate and maintain the developmental cascade of tissue morphogenesis in an appropriate, morphogenically permissive environment, stimulating stem cells to proliferate and differentiate in a tissue-specific manner, and inducing the progression of events that culminate in new tissue formation. These morphogenic activities also allow the proteins to induce the “redifferentiation” of cells previously stimulated to stray from their differentiation path. Under appropriate environmental conditions it is anticipated that these morphogens also may stimulate the “redifferentiation” of committed cells.
The tertiary and quaternary structures of both TGF-β2 and OP-1 have been determined. Although TGF-β2 and OP-1 exhibit only about 35% amino acid identity in their respective amino acid sequences, the tertiary and quaternary structures of both molecules are strikingly similar. Both TGF-β2 and OP-1 are dimeric in nature and have a unique folding pattern involving six of the seven C-terminal cysteine residues, as illustrated in FIG. 1A. FIG. 1A shows that in each subunit four cysteines bond to generate an eight residue ring, and two additional cysteine residues form a disulfide bond that passes through the ring to form a knot-like structure. With a numbering scheme beginning with the most N-terminal cysteine of the 7 conserved cysteine residues assigned number 1, the 2nd and 6th cysteine residues bond to close one side of the eight residue ring while the 3rd and 7th cysteine residues close the other side. The 1st and 5th conserved cysteine residues bond through the center of the ring to form the core of the knot. The 4th cysteine forms an interchain disulfide bond with the corresponding residue in the other subunit.
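For illustration, the disulfide pairing just described can be written down as a small lookup, using the same numbering of the seven conserved cysteines. This is a minimal sketch only; the constant and function names are hypothetical and are introduced here for convenience.

```python
# Conserved cysteines are numbered 1-7, starting from the most N-terminal one.
RING_BONDS = [(2, 6), (3, 7)]  # each pair closes one side of the eight residue ring
KNOT_BOND = (1, 5)             # passes through the center of the ring
INTERCHAIN_CYS = 4             # bonds to the 4th cysteine of the other subunit


def intrachain_partner(cys):
    """Return the intrachain disulfide partner of a conserved cysteine.

    Returns None for the 4th cysteine, which bonds across subunits instead.
    """
    for a, b in RING_BONDS + [KNOT_BOND]:
        if cys == a:
            return b
        if cys == b:
            return a
    return None


# Map each conserved cysteine to its intrachain partner.
partners = {c: intrachain_partner(c) for c in range(1, 8)}
```

Six of the seven entries in `partners` have an intrachain partner, which corresponds to the six cysteines involved in the folding pattern of each subunit.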
The TGF-β2 and OP-1 monomer subunits comprise three major structural elements and an N-terminal region. The structural elements are made up of regions of contiguous polypeptide chain that possess over 50% secondary structure of the following types: (1) loop, (2) helix and (3) β-sheet. Furthermore, in these regions the N-terminal and C-terminal strands are not more than 7 Å apart. The residues between the 1st and 2nd conserved cysteines (FIG. 1A) form a structural region characterized by an anti-parallel β-sheet finger, referred to herein as the finger 1 region (F1). A ribbon trace of the finger 1 peptide backbone is shown in FIG. 1B. Similarly the residues between the 5th and 6th conserved cysteines in FIG. 1A also form an anti-parallel β-sheet finger, referred to herein as the finger 2 region (F2). A ribbon trace of the finger 2 peptide backbone is shown in FIG. 1D. A β-sheet finger is a single amino acid chain, comprising a β-strand that folds back on itself by means of a β-turn or some larger loop so that the entering and exiting strands form one or more anti-parallel β-sheet structures. The third major structural region, involving the residues between the 3rd and 4th conserved cysteines in FIG. 1A, is characterized by a three turn α-helix referred to herein as the heel region (H). A ribbon trace of the heel peptide backbone is shown in FIG. 1C.
The organization of the monomer structure is similar to that of a left hand where the knot region is located at the position equivalent to the palm, finger 1 is equivalent to the index and middle fingers, the α-helix is equivalent to the heel of the hand, and finger 2 is equivalent to the ring and small fingers. The N-terminal region (not well defined in the published structures) is predicted to be located at a position roughly equivalent to the thumb.
In the dimeric forms of both TGF-β2 and OP-1, the subunits are oriented such that the heel region of one subunit contacts the finger regions of the other subunit with the knot regions of the connected subunits forming the core of the molecule. The 4th cysteine forms a disulfide bridge with its counterpart on the second chain thereby equivalently linking the chains at the center of the palms. The dimer thus formed is an ellipsoidal (cigar shaped) molecule when viewed from the top looking down the two-fold axis of symmetry between the subunits (FIG. 2A). Viewed from the side, the molecule resembles a bent “cigar” since the two subunits are oriented at a slight angle relative to each other (FIG. 2B).
U.S. Pat. Nos.: 5,132,405; 5,091,513; and 5,258,498 and PCT Application No. PCT/US88/01737 disclose how to make single-chain binding proteins which mimic the structure of an immunoglobulin Fv region by linking the C- and N-termini of light- and heavy-chain variable region domains, and how to make multifunctional proteins by linking together separate protein domains which function either independently or in concert. U.S. Pat. Nos.: 4,704,692; 4,881,175; 4,939,666; 4,946,778; and 5,260,203 disclose computer-based techniques for selecting peptide linker sequences for connecting separate protein chains to produce a single-chain protein.
The invention provides a family of single-chain constructs of the TGF-β superfamily (hereinafter called “morphons”) which mimic the physiological effects of one or more members of the superfamily. Specifically, the morphon constructs of the invention bind preferentially to a natural cell surface receptor that typically interacts with a TGF-β superfamily member, and the morphon, upon binding with the receptor, initiates a cascade of events that would occur when the TGF-β superfamily member binds to the receptor.
The morphon constructs differ from the natural TGF-β superfamily members in that they are single-chain proteins which preferably are expressed from a single DNA in a host cell. The natural members of the TGF-β superfamily are dimeric structures wherein the monomer subunits are held together by non-covalent interactions or by one or more disulfide bonds. The TGF-β superfamily members are inactive as monomers. In contrast, the morphon constructs comprise a functional monomer subunit and, therefore, are believed to be more stable than the natural dimers, particularly under reducing conditions. In addition, the morphon constructs may have a molecular weight significantly lower than the natural dimers and thus are likely to diffuse faster and be cleared by the body faster than natural superfamily members.
The morphon constructs preferably are manufactured in accordance with the principles disclosed herein by assembly of nucleotides and/or joining DNA restriction fragments to produce synthetic DNAs. The DNAs are transfected into an appropriate protein expression vehicle, the encoded protein expressed, folded if necessary, and purified. Particular constructs can be tested for agonist activity in vitro. The tertiary structure of the candidate morphon constructs may be iteratively refined and binding modulated by site-directed or nucleotide sequence directed mutagenesis aided by the principles disclosed herein, computer-based protein structure modeling, and recently developed rational drug design techniques to improve or modulate specific properties of a molecule of interest. Known phage display or other nucleotide expression systems may be exploited to produce simultaneously a large number of candidate constructs. The pool of candidate constructs subsequently may be screened for binding specificity using, for example, a chromatography column comprising surface immobilized receptors, salt gradient elution to select for, and to concentrate high binding candidates, and in vitro assays to determine whether or not particular isolated candidates agonize the activity of the template superfamily member. Identification of a useful construct is followed by production of cell lines expressing commercially useful quantities of the construct for laboratory use and ultimately for producing therapeutically useful drugs. It is contemplated also that preferred single-chain constructs, once identified and characterized by the recombinant DNA methodologies described above, may be produced by standard peptide synthesis methodologies.
It has now been discovered how to design, make, test and use single-chain amino acid constructs comprising an amino acid sequence which, when properly folded, assumes a tertiary structure defining a finger 1 region, a finger 2 region, and a heel region which together are complementary to the ligand binding interactive surface of a TGF-β superfamily member receptor. The constructs, upon binding with the receptor, agonize the activity of a TGF-β superfamily member. In one important subset of embodiments, the constructs agonize initiation of cellular differentiation and tissue morphogenesis, e.g., initiate cell transformation leading to new tissue formation, such as bone formation. The constructs comprise an amino acid sequence sufficiently duplicative of the amino acid sequence of a TGF-β superfamily member such that it preferentially binds the cognate receptor for that member.
All of the morphon constructs of the invention comprise regions of amino acid sequences defining the three regions required for utility, namely, finger 1, finger 2, and heel regions, and additional linker sequences which join these regions, maintain them in their proper conformation individually, and maintain their relative positions and orientations in space. Sequences for the finger and heel regions may be copied from the respective finger and heel region sequences of any known TGF-β superfamily member identified herein. Alternatively, the finger and heel regions may be selected from the amino acid sequence of a new member of this superfamily discovered hereafter using the principles disclosed hereinbelow.
The finger, heel, and linker sequences also may be altered by amino acid substitution, for example by exploiting substitute amino acid residues selected in accordance with the principles disclosed in Smith et al. (1990) Proc. Natl. Acad. Sci. USA 87: 118-122, the disclosure of which is incorporated herein by reference. Smith et al. disclose an amino acid class hierarchy, similar to the amino acid hierarchy table set forth in FIG. 10, which may be used to rationally substitute one amino acid for another while minimizing gross conformational distortions of the type which otherwise may inactivate the protein. In any event, it is contemplated that many synthetic finger 1, finger 2, and heel region sequences, having only 70% homology with natural regions, preferably 80%, and most preferably at least 90%, may be used to produce active morphon constructs. It is contemplated also, as disclosed herein, that the size of the constructs may be reduced significantly by truncating the natural finger and heel regions of the template TGF-β superfamily member, while compensating for dimensional changes using linkers as disclosed below.
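As a rough illustration of the 70%, 80%, and 90% thresholds mentioned above, the following sketch scores a candidate region against a template region. It rests on stated assumptions: the two sequences are already aligned to equal length, and homology is approximated by simple position-by-position identity, which is narrower than the class-based substitution hierarchy of Smith et al. The function names, tier labels, and example strings are hypothetical.

```python
def percent_identity(candidate, template):
    """Percentage of aligned positions at which the two sequences are identical.

    Assumes both sequences are pre-aligned and of equal, non-zero length.
    """
    if len(candidate) != len(template):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not template:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(candidate, template) if a == b)
    return 100.0 * matches / len(template)


def homology_tier(pct):
    """Map a percentage onto the tiers named in the text (labels are illustrative)."""
    if pct >= 90:
        return "most preferred"
    if pct >= 80:
        return "preferred"
    if pct >= 70:
        return "acceptable"
    return "below threshold"


# Hypothetical ten-residue strings, not real finger or heel sequences.
template = "ACDEFGHIKL"
close_variant = percent_identity("ACDEFGHIKV", template)    # one mismatch in ten
distant_variant = percent_identity("ACDEFGHAAA", template)  # three mismatches in ten
```

A scoring scheme that credits conservative substitutions would generally assign higher values than plain identity, so this sketch is a conservative stand-in for the homology figures in the text.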
The linker sequences, as described herein, join and maintain the spatial relationship of the finger and heel regions within a monomeric subunit. The linker sequences, typically comprising about 3-13 amino acids, serve to maintain, for example, the cysteine structural motif, namely, the cysteine pattern and knot structure (cysteines 1, 2, 3, 5, 6, and 7, and their respective spatial and bonding relationship) which characterize, and are believed to be essential for maintaining the tertiary structure of, the members of the superfamily. Principles and methods for selecting appropriate polypeptide linker sequences are disclosed hereinbelow.
More specifically, the invention provides a functional morphon construct that is constituted by a finger 1, a finger 2, and a heel region, joined together by peptide linkers having, e.g., 3-13 amino acids, and optionally including an N-terminal sequence upstream of the first cysteine beginning the sequence of the finger 1 region. In the single-chain morphon constructs of the invention, the finger 1, finger 2 and heel regions together define a structure complementary to the ligand binding surface of a TGF-β superfamily member receptor structure and are sufficiently duplicative of a sequence of a member of the TGF-β superfamily such that the construct preferentially binds the receptor. Where the finger 1 region is designated F1, the finger 2 designated F2, and the heel designated H, the constructs of this type can take the form of one of the constructs set forth below, and include:
F1-linker-F2-linker-H
F1-linker-H-linker-F2
F2-linker-F1-linker-H
F2-linker-H-linker-F1
H-linker-F1-linker-F2, and
H-linker-F2-linker-F1.
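The six arrangements listed above are the orderings of the three regions F1, F2, and H, each joined by two linkers. As an illustrative sketch only, they can be enumerated as follows; the function name and the placeholder label "linker" are hypothetical.

```python
from itertools import permutations

REGIONS = ("F1", "F2", "H")


def morphon_arrangements(regions=REGIONS, linker="linker"):
    """Return every ordering of the regions, joined by a linker placeholder."""
    separator = "-{}-".format(linker)
    return [separator.join(order) for order in permutations(regions)]


arrangements = morphon_arrangements()
for arrangement in arrangements:
    print(arrangement)
```

Three regions give 3! = 6 orderings, which matches the number of forms set forth above.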
Generic and specific sequences of amino acids which constitute the finger 1, finger 2, heel regions, and N-terminal sequences are disclosed hereinbelow.
The invention further comprises DNAs encoding the morphon constructs of the invention, cell lines transformed with the DNAs, and methods of making the morphons by culturing transformed cells to produce them followed by purification.
The invention may be understood further, and various of its objects and features better appreciated by referring to the drawings, description, sequence listing, and claims which follow.