Much is now known about the correlation between the amino acid sequence, or primary structure, of a protein and its secondary structure. This knowledge allows some predictions to be made about the energetically favored conformations that a protein of known amino acid sequence will assume in an aqueous solvent. The basic building blocks of protein secondary structure are the α-helix, the β-sheet and the β-turn. Chou et al. (1974) Biochemistry 13:211, 222; Chou et al. (1978) Ann. Rev. Biochem. 47:251-278; and Fasman (1987) Biopolymers 26(supp.):S59-S79 have shown that certain amino acids have a propensity for forming α-helices, while others tend to destabilize helices. One turn of the α-helix contains 3.6 amino acids, and is stabilized in part by hydrogen bonding interactions between amino acid residues and by interactions between neighboring amino acid residues such as vertical ionic bonding between negatively and positively charged amino acids. Helix-forming amino acids include Glu, Ala, His, Leu, Asp, Met, Ser, Lys, Arg, Phe and Trp (see Table 1 for amino acid one letter and three letter codes). Amino acids associated with β-turns and β-pleated sheets are also known. For additional reviews of the relationship between amino acid sequence and secondary structure, see Richardson and Richardson (1988) Science 240:1648-1652; Presta and Rose (1988) Science 240:1632-1641; and Garnier et al. (1978) J. Mol. Biol. 120:97-120. Molecular modeling programs are known in the art which allow predictions of at least portions of the favored secondary structure of amino acid sequences. DeLisi (1988) Science 240:47-52 and von Heijne (1988) Nature 333:605-607 give some discussion of publicly available software for the analysis of amino acid data.
TABLE 1
______________________________________
ABBREVIATIONS AND CONVENTIONS USED
______________________________________
ATEE    N-Acetyl-L-tyrosine ethyl ester
Boc     t-Butyloxycarbonyl
Bom     Benzyloxymethyl
Bzl     Benzyl
BTEE    N-Benzoyl-L-tyrosine ethyl ester
DCM     Dichloromethane
Clz     2-Chlorobenzyloxycarbonyl
Fmoc    9-Fluorenylmethoxycarbonyl
MBHA    p-Methylbenzhydrylamine resin (for synthesis of peptide amides)
Npys    3-Nitro-2-pyridylsulfenyl
Tos     p-Toluenesulfonyl
TFA     Trifluoroacetic acid
ZTONP   N-Benzyloxycarbonyl-L-tyrosine p-nitrophenyl ester

Common Amino Acids
                3 letter  1 letter                  3 letter  1 letter
______________________________________
Alanine         Ala       A         Lysine          Lys       K
Arginine        Arg       R         Methionine      Met       M
Asparagine      Asn       N         Ornithine       Orn
Aspartic Acid   Asp       D         Phenylalanine   Phe       F
Cysteine        Cys       C         Proline         Pro       P
Glutamic Acid   Glu       E         Pyroglutamyl    <Glu      <E
Glutamine       Gln       Q         Serine          Ser       S
Glycine         Gly       G         Threonine       Thr       T
Histidine       His       H         Tryptophan      Trp       W
Isoleucine      Ile       I         Tyrosine        Tyr       Y
Leucine         Leu       L         Valine          Val       V
______________________________________
Representation of blocking groups on amino acids:
A symbol to the left and hyphenated is a blocking group on the α-amino group: Boc-Gly = N-α-Boc-glycine.
A symbol to the right and hyphenated is an ester on the α-carboxyl: Gly-OHBT = hydroxybenzotriazole ester of glycine.
A symbol after the amino acid symbol and in parentheses is a blocking group on the side chain: Tyr(Bzl) = O-benzyl tyrosine.
EXAMPLES:
Boc-Glu(OBzl)-ONp = N-α-Boc glutamic acid-gamma benzyl ester α-nitrophenyl ester,
Boc-Tyr(Bzl)-OBzl = N-α-Boc-O-Benzyl-tyrosine benzyl ester.
______________________________________
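The three-letter to one-letter convention of Table 1 can be sketched as a small lookup. This is only an illustrative helper, not part of any cited work: the names `THREE_TO_ONE` and `to_one_letter` are hypothetical, the mapping uses the standard one-letter codes for the twenty common amino acids, and ornithine, pyroglutamyl, terminal groups and D- markers are deliberately not handled.

```python
# Standard one-letter codes for the twenty common amino acids of Table 1.
# Ornithine (no one-letter code) and pyroglutamyl (<Glu) are left out on purpose.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter codes.

    Hypothetical helper: leading/trailing hyphens are ignored; any code that
    is not one of the twenty common residues raises ValueError.
    """
    residues = [r for r in peptide.strip("-").split("-") if r]
    try:
        return "".join(THREE_TO_ONE[r] for r in residues)
    except KeyError as exc:
        raise ValueError(f"unknown residue code: {exc.args[0]}") from None


if __name__ == "__main__":
    print(to_one_letter("Glu-Gly-His-Pro-Gly-Ser-Gly"))  # EGHPGSG
```

Raising on unknown codes rather than silently skipping them keeps protected or modified residues, such as Tyr(Bzl), from being mistaken for their unmodified forms.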
Interactions between structural components within a protein and the interactions between these structural components and the solvent environment determine the preferred tertiary structure of a protein. Hydrophobic interactions, especially in the internal portions of the protein, are particularly important in determining the most favorable conformation. For example, amphiphilic α-helices, in which the amino acids are positioned such that the helix contains a hydrophobic and a hydrophilic face, will interact to form tight bundles. In such structures, the hydrophobic faces of the α-helices associate, while the hydrophilic faces of the helices at the surface of the structure interact with aqueous solvent. Similarly, if there are two β-pleated sheet structures in which hydrophobic and hydrophilic amino acid residues alternate, the sheets may lie over one another, held by the interactions of the hydrophobic faces of the sheets. The interaction of secondary structure features of proteins is also influenced by the size of the amino acid side chains present and the possible formation of hydrogen bonds, salt bridges and covalent disulfide bonds.
One of the important principles of protein structure is that there is degeneracy in the folding code, i.e., different sequences can interact to form similar secondary structures and similar overall tertiary structures. In other words, the number of possible conformations for proteins is not equal to the number predicted statistically. Thus, it is practical to use the principles of secondary and tertiary structure to generate at least a simple model protein structure. There have been several reports of successful attempts to model a desired protein structural feature.
Regan et al. (1988) Science 241:976-978 detail the de novo design of a helical protein containing 74 amino acids, which design employed information known about the factors which stabilize amino acid α-helices. The model protein design included four identical α-helical regions connected by three identical hairpin loop regions and was intended as an idealized version of the naturally occurring four-helix bundle motif of myohemerythrin and cytochrome c'. A nucleic acid sequence which encoded the designed polypeptide was synthesized and combined with the tac promoter to form a synthetic gene which was then expressed in E. coli to form the designed protein. The helix-forming peptide sequence employed (-Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly) was designed to be amphiphilic so that in the complete protein a four-helix bundle would form with hydrophobic amino acid residues interacting at its interior and hydrophilic amino acid residues directed outward into the aqueous solvent. The hydrophilic residues glutamate and lysine were arranged in the sequence so that they could form ion-pairs along one face of the helix. In addition, each helical region contained negatively charged residues at its amino terminus and positively charged residues at the carboxy terminus to further stabilize helix formation. The specific hairpin loop sequence which was employed to separate the helical regions was: -Pro-Arg-Arg-. Circular dichroism (CD) measurements of the resultant designed protein in aqueous solution indicated that it was predominantly α-helical in structure and size exclusion chromatography indicated the protein was monomeric. No particular function was associated with the designed protein.
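The amphiphilic character of the helix-forming sequence quoted above can be checked with a simple helical-wheel projection. The following is a minimal sketch, not the method of Regan et al.: it assumes an ideal α-helix with 3.6 residues per turn (100 degrees of rotation per residue), places the first Gly at 0 degrees by convention, and uses the hypothetical helper names `wheel_angles` and `angles_of`.

```python
# Helix-forming sequence quoted above, written in one-letter codes:
# Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly
HELIX_SEQ = "GELEELLKKLKELLKG"

# Assumption: ideal alpha-helix, 3.6 residues per turn, i.e. 360 / 3.6 degrees.
DEGREES_PER_RESIDUE = 100.0


def wheel_angles(seq, step=DEGREES_PER_RESIDUE):
    """Return (residue, angle) pairs; residue i sits at i * step degrees, modulo 360."""
    return [(res, (i * step) % 360.0) for i, res in enumerate(seq)]


def angles_of(seq, residues):
    """Sorted wheel angles of every position whose residue is in `residues`."""
    return sorted(angle for res, angle in wheel_angles(seq) if res in residues)


if __name__ == "__main__":
    print("Leu (hydrophobic):", angles_of(HELIX_SEQ, "L"))
    print("Glu/Lys (charged):", angles_of(HELIX_SEQ, "EK"))
```

Under these assumptions the six leucines fall within a 120 degree arc (120 to 240 degrees), while every glutamate and lysine lies on the remaining part of the wheel, which is the hydrophobic face and hydrophilic face arrangement the design calls for.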
Mutter and co-workers (see Mutter (1988) TIBS 13:260-265; Mutter and Tuchscherer (1988) Makromol. Chem. Rapid Commun. 9:437-443; Mutter et al. (1988) Tetrahedron 44:771-785; Mutter et al. (1988) Helvetica Chimica Acta 71:835-846; Mutter et al. (1989) Proteins: Structure, Function, and Genetics 5:13-21; Mutter and Vuilleumier (1989) Angew. Chemie (Intl Ed.) 28:535-679) have described the concept of the template-associated synthetic protein (TASP). In a TASP, component amphiphilic peptides, particularly those preferring α-helical and β-sheet structures, are assembled by covalent bonds to a carrier or template molecule which is said to direct "the peptide chains into protein-like packing arrangements." The resultant molecule has a branched structure in which a number of peptides extend from the template. Oligopeptides, in particular, are employed as template molecules. In these TASPs, as in the 4-helix bundle molecule of Regan et al. supra, the component peptide chains each have identical amino acid sequences.
In nature, protein molecules have a variety of functions. One of the most important and most intensely studied functions of proteins is as enzymes. Enzymes catalyze chemical reactions of their substrates. An enzyme's properties are ultimately determined by its amino acid sequence, i.e. its primary structure. The primary structure directs the formation of any secondary structure and the overall three-dimensional or tertiary structure of the protein. The three-dimensional structure of the protein specifies the relational geometry of the amino acids which are associated with the active site (amino acid residues involved in catalysis), and determines the structure and composition of any substrate binding site. The structural characteristics of an enzyme protein establish the reaction catalyzed, any substrate specificity or selectivity of the reaction catalyzed and the kinetics of that reaction.
There is growing interest in the design of synthetic, small-molecule enzymes to carry out known enzymatic reactions. Such synthetic catalysts are desirable because it is believed that a number of the problems associated with the use of enzyme catalysts, such as instability, can be alleviated while the desirable high selectivity and reaction rates of enzymes can be retained. In addition, an understanding of the correlation between an enzyme's structure and its properties, such as substrate specificity, can be employed to create enzyme-like catalysts having unique properties. For medical applications, there may be additional benefits resulting from the expected lower immunogenicity of low molecular weight synthetic enzymes. Furthermore, the ability to synthesize an enzyme chemically avoids any possible problems in isolating the enzyme in pure form from its natural source, as well as any expression problems associated with production by recombinant DNA techniques. Several strategies have been employed to achieve artificial enzyme activity. Attempts have been made to mimic the substrate binding ability of the enzyme and/or mimic an enzyme active site and thus obtain catalytic activity.
Breslow et al. (1988) Tetrahedron 44:5515-5524 have described a strategy to obtain enzyme-like selective substrate binding in a relatively small molecule. Pyridoxamine phosphate, the cofactor commonly associated with transaminase enzymes, was covalently attached to β-cyclodextrin to produce a substrate-selective reactant. The α-keto acids phenylpyruvic acid and indolepyruvic acid were selectively bound via their aromatic rings in the cyclodextrin cavity, and were converted to phenylalanine and tryptophan, respectively, by reaction with the bound cofactor. In contrast, the pyridoxamine-cyclodextrin did not react with pyruvate to form alanine. Substrate specificity of the reaction is said to result from binding of the aromatic ring of the α-keto acid in the cyclodextrin cavity. In enzyme-catalyzed transaminase reactions, pyridoxamine phosphate is regenerated by amino group transfer from a different amino acid (resulting in transamination). The pyridoxamine-cyclodextrin reactant described is not regenerated after reaction and is therefore not a true catalyst.
Kelly et al. (1989) J. Am. Chem. Soc. 111:3744-3745 have described a bi-substrate reaction template molecule, which brings two substrate molecules in close proximity and thus accelerates the reaction between them. The template molecule, which is composed of linked aromatic rings, was specifically designed to contain binding sites for desired substrates. The binding sites contain amine and keto groups selected for interaction with a specific substrate.
Poly-α-amino acids have been described which act as stereospecific catalysts for epoxidation reactions (Valencia-Parera et al. (1986) J. Coll. Interface Sci. 114:140-148). Polyalanine and chalcone were shown to interact in a water-toluene emulsion in the presence of air to produce an asymmetric epoxychalcone. The mechanism by which stereospecificity is achieved is not understood, but is suggested to be related to the kind of emulsion formed.
A protein with ion channel activity has been designed which contains four antiparallel helices (DeGrado et al. (1989) Science 243:622-628). The amino acid sequence of the protein specifies the four helical regions, which are separated by amino acid sequences which have the requisite flexibility for appropriate folding. The helices are designed so that the "outsides" of the helices are hydrophobic to facilitate positioning within lipid bilayers, and the "insides" of the helices are hydrophilic so that proton-conductive channels are formed when the four helices associate. The bulkiness of the amino acid groups at the interior of the bundle structure prevents the conductance of larger cations.
The synthesis and enzyme-like activity of a designed hemeprotein called a helichrome has been described (Sasaki and Kaiser (1989) J. Am. Chem. Soc. 111:380-383). The helichrome is composed of four identical amphiphilic α-helices bound to a porphyrin ring to which Fe(III) can be complexed. The four helices, each of which has the amino acid sequence: Ala-Glu-Gln-Leu-Leu-Gln-Glu-Ala-Glu-Gln-Leu-Leu-Gln-Glu-Leu-amide, are reported to interact to form a hydrophobic pocket in aqueous solution into which a substrate can bind. The Fe(III) helichrome complex was reported to catalyze the hydroxylation of aniline to p-aminophenol. The kinetic constants of the helichrome-catalyzed hydroxylation of aniline are similar to those observed for the same reaction catalyzed by hemoglobin, indoleamine 2,3-dioxygenase and L-tryptophan 2,3-dioxygenase.
The serine proteases are a family of extensively studied enzymes which catalyze similar reactions (the hydrolysis of peptide and certain ester bonds) and share common structural features in their active site. The serine proteases, which include among others chymotrypsin, trypsin and elastase, are so named because of the presence of a uniquely reactive serine residue at their active site. Even though the amino acid sequences of chymotrypsin, trypsin and elastase are quite different, the enzymes have a very similar tertiary structure. Each contains a similar active site composed of an Asp, His and Ser (positions 102, 57 and 195 in chymotrypsin) in an approximately planar arrangement which form a "charge relay system." The serine proteases also have a substrate binding pocket allowing appropriate positioning of the substrate with respect to the active site. Differences in the structure of the substrate binding pocket of serine proteases are associated with differences in substrate specificity. Because the active site structure and mechanism of catalysis by serine proteases are well understood, several attempts have been made to create artificial enzymes which mimic their catalytic activities.
Mutter (1985) Angew. Chem. Int. Ed. Engl. 24:639-653 reviews several apparently unsuccessful strategies that have been applied in order to mimic α-chymotrypsin activity. He notes that chymotrypsin "mimics" produced by polymerization of the amino acids of the active site resulted in low catalytic activity probably due to the random structure of the polypeptides. He further notes that cyclic peptides containing the active site amino acids, such as cyclo-(Gly-L-His-L-Ser-Gly-L-His-Ser-) display no significant enhancement of hydrolysis of the chymotrypsin substrate, p-nitrophenyl acetate. He then reports the synthesis of the peptide:
Ac-His-Phe-Gly-Cys-D-Phe-Ser-Gly-Glu-Cys-NH₂
which is described as having functional groups oriented by a β-turn and an S-S bond and having a hydrophobic pocket provided by the two phenylalanine residues. The peptide, however, is described as having "low catalytic activity."
Vorherr et al. (1986) Helv. Chim. Acta 69:410-414 described a single-center model for the active site of α-chymotrypsin. The model utilizes the hydrophilic polymer polyethyleneglycol (PEG) to which a peptide containing amino acid residues similar to those of the active site of α-chymotrypsin was attached. In the specific model, glutamate was substituted for aspartate for convenience in synthesis of the peptide. The specific sequence of the peptide employed, Glu-Gly-His-Pro-Gly-Ser-Gly, was predicted to have a high potential for β-turn configuration. It was also predicted that a preferred conformation of the peptide would contain a planar H-bonded structure involving the side chains of Glu, His and Ser which would approximate the interactions of Asp-102, His-57 and Ser-195 in the active site of native chymotrypsin. CD measurements were reported to confirm that the PEG-bound peptide was predominantly in the β-turn conformation which the authors suggest indicates that the active site residues are in the proper geometry for activity. The PEG-bound peptide was, however, reported to display "no increased catalytic activity . . . compared to other functional model compounds . . . in the hydrolysis of p-nitrophenylacetate."
Thus, while some success in modeling desired peptide and protein structures has been achieved, much less success has been achieved in mimicking the function of proteins such as enzymes.