1. Field of the Invention
The present application relates to computer-assisted methods for artificially making proteins (and other biological materials with similar molecular structure) or other macromolecules, to permit rapid manufacture of macromolecular substances with structures not necessarily found in nature.
2. Discussion of Related Art
Various known teachings, and novel teachings first disclosed in the U.S. Pat. No. 4,704,692, which are believed to be related to various ones of the innovations disclosed in the present application will now be discussed. However, applicant specifically notes that not every idea discussed in this section is necessarily prior art. For example, the characterizations of the particular patents and publications discussed may relate them to inventive concepts in a way which is itself based on knowledge of some of the inventive concepts. Moreover, the following discussion attempts to fairly present various suggested technical alternatives (to the best of applicant's knowledge), even though the teachings of some of those technical alternatives may not be "prior art" under the patent laws of the United States or of other countries. Similarly, the Summary of the Invention section of the present application may contain some discussion of prior art teachings, interspersed with discussion of generally applicable innovative teachings and/or specific discussion of the best mode as presently contemplated, and applicant specifically notes that statements made in the Summary section do not necessarily delimit the various inventions claimed in the present application or in related applications.
Proteins (or polypeptides) are linear polymers of amino acids. Since the polymerization reaction which produces a protein results in the loss of one molecule of water from each amino acid, proteins are often said to be composed of amino acid "residues." Natural protein molecules may contain as many as 20 different types of amino acid residues, each of which contains a distinctive side chain. The particular sequence of amino acid residues in a protein defines the primary sequence of the protein.
Proteins fold into a three-dimensional structure. The folding is determined by the sequence of amino acids and by the protein's environment. The remarkable properties of proteins depend directly on the protein's three-dimensional conformation. Thus, this conformation determines the activity or stability of enzymes, the capacity and specificity of binding proteins, and the structural attributes of receptor molecules. Because the three-dimensional structure of a protein molecule is so significant, it has long been recognized that a means for stabilizing a protein's three-dimensional structure would be highly desirable.
The three-dimensional structure of a protein may be determined in a number of ways. Perhaps the best known way of determining protein structure involves the technique of x-ray crystallography. An excellent general review of this technique can be found in Physical Biochemistry, Van Holde, K. E. (Prentice-Hall, NJ (1971)) (especially pages 221-239), which is herein incorporated by reference. Using this technique, it is possible to elucidate three-dimensional structure with remarkable precision. It is also possible to probe the three-dimensional structure of a protein by circular dichroism, by light scattering, or by measuring the absorption and emission of radiant energy (Van Holde, Physical Biochemistry, Prentice-Hall, NJ (1971)). Additionally, protein structure may be determined by neutron diffraction or by nuclear magnetic resonance (Physical Chemistry, 4th Ed., Moore, W. J., Prentice-Hall, NJ (1972), which is hereby incorporated by reference).
The examination of the three-dimensional structure of numerous natural proteins has revealed a number of recurring patterns. Alpha helices, parallel beta sheets, and anti-parallel beta sheets are the most common patterns observed. An excellent description of such protein patterns is provided by R. Dickerson et al., The Structure and Action of Proteins (1969). The assignment of each amino acid to one of these patterns defines the secondary structure of the protein. The helices, sheets and turns of a protein's secondary structure pack together to produce the three-dimensional structure of the protein. The three-dimensional structure of many proteins may be characterized as having internal surfaces (directed away from the aqueous environment in which the protein is normally found) and external surfaces (which are in close proximity to the aqueous environment). Through the study of many natural proteins, researchers have discovered that hydrophobic residues (such as tryptophan, phenylalanine, tyrosine, leucine, isoleucine, valine, or methionine) are most frequently found on the internal surface of protein molecules. In contrast, hydrophilic residues (such as aspartate, asparagine, glutamate, glutamine, lysine, arginine, histidine, serine, threonine, glycine, and proline) are most frequently found on the external protein surface. The amino acids alanine, glycine, serine and threonine are encountered with equal frequency on both the internal and external protein surfaces.
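The residue groupings described above can be expressed as a simple lookup. The following is a minimal sketch only, written in Python with one-letter residue codes. The function names and the tie-breaking rule are illustrative assumptions: because glycine, serine and threonine appear both in the hydrophilic list and in the equal-frequency list, the sketch arbitrarily reports them as "either". Residues not named in the paragraph (for example cysteine) are reported as "unclassified".

```python
# Residue groupings taken from the paragraph above (one-letter codes).
HYDROPHOBIC = set("WFYLIVM")      # Trp, Phe, Tyr, Leu, Ile, Val, Met
HYDROPHILIC = set("DNEQKRHSTGP")  # Asp, Asn, Glu, Gln, Lys, Arg, His, Ser, Thr, Gly, Pro
AMBIVALENT = set("AGST")          # Ala, Gly, Ser, Thr: equal frequency on both surfaces


def surface_tendency(residue: str) -> str:
    """Return the surface on which a residue is most frequently found.

    Assumption (not from the source): the equal-frequency group takes
    precedence where a residue appears in more than one list.
    """
    r = residue.upper()
    if r in AMBIVALENT:
        return "either"
    if r in HYDROPHOBIC:
        return "internal"
    if r in HYDROPHILIC:
        return "external"
    return "unclassified"


def annotate(sequence: str) -> list:
    """Pair every residue of a primary sequence with its surface tendency."""
    return [(r, surface_tendency(r)) for r in sequence]
```

Such a lookup reflects statistical tendencies only; it does not predict where a given residue lies in any particular folded protein.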
Proteins exist in a dynamic equilibrium between a folded, ordered state and an unfolded, disordered state. This equilibrium in part reflects the interactions between the side chains of amino acid residues which tend to stabilize the protein's structure, and, on the other hand, those thermodynamic forces which tend to promote the randomization of the molecule.
The amino acid side chain interactions which promote protein folding and confer catalytic activity fall into two classes. The interactions may be caused by weak forces (e.g. hydrogen bonds) between the side chains of different amino acid residues. Alternatively, they may be caused by direct covalent bonding between the sulfhydryl groups of two cysteine amino acid residues. Such a bond is known as a "disulfide" bond.
When a protein is synthesized, any cysteine residues present contain free sulfhydryl groups (--SH). When two sulfhydryl groups in close proximity are mildly oxidized, disulfide bonds (--S--S--) may form, thereby crosslinking the polypeptide chain. The formation of this chemical bond is said to convert two "cysteine" residues into a "cystine" residue. Thus "cysteine" residues differ from a "cystine" residue in that the former molecules contain sulfur atoms which are covalently bonded to hydrogen, whereas the latter molecule contains a sulfur atom which is covalently bonded to a second sulfur atom.
A disulfide bond may stabilize the folded state of the protein relative to its unfolded state. The disulfide bond accomplishes such a stabilization by holding together the two cysteine residues in close proximity. Without the disulfide bond, these residues would be in close proximity in the unfolded state only a small fraction of the time. This restriction of the conformational entropy (disorder) of the unfolded state destabilizes the unfolded state and thus shifts the equilibrium to favor the folded state. The effect of the disulfide bond on the folded state is more difficult to predict. It could increase, decrease or have no effect on the free energy of the folded state. Increasing the free energy of the folded state may lead to a destabilization of the protein, which would tend to cause unfolding. Importantly, the cysteine residues which participate in a disulfide bond need not be located near to one another in a protein's primary amino acid sequence.
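The shift in equilibrium described above can be illustrated with a standard two-state model, in which the ratio of folded to unfolded molecules equals exp(ΔG/RT), where ΔG is the free energy of the unfolded state minus that of the folded state. The following Python sketch is illustrative only; the function names, the default temperature and the numerical penalties are hypothetical and are not measured values.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_folded(delta_g_unfold: float, temperature: float = 298.15) -> float:
    """Fraction of molecules folded in a two-state model.

    delta_g_unfold is G(unfolded) - G(folded) in kJ/mol; a positive
    value favors the folded state.
    """
    k_fold = math.exp(delta_g_unfold / (R * temperature))  # [folded]/[unfolded]
    return k_fold / (1.0 + k_fold)


def with_crosslink(delta_g_unfold: float,
                   unfolded_penalty: float,
                   folded_penalty: float = 0.0) -> float:
    """Adjust the stability for a crosslink.

    unfolded_penalty: free energy added to the unfolded state (lost entropy).
    folded_penalty:   free energy added to the folded state (e.g. strain).
    Both values are hypothetical inputs, in kJ/mol.
    """
    return delta_g_unfold + unfolded_penalty - folded_penalty
```

In this model a crosslink stabilizes the protein only when the penalty it imposes on the unfolded state exceeds any penalty it imposes on the folded state, which is why the effect on the folded state matters.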
One potential way of increasing the stability of a protein is to introduce new disulfide bonds into that protein. Thus, one potential application of recombinant DNA technology to the stabilization of proteins involves the introduction of cysteine residues to produce intraprotein disulfide bonds. There are two ways in which cysteine residues may be introduced into a protein: (1) through a replacement-exchange with one of the protein's normally occurring amino acid residues, or (2) an insertion of a cysteine between two existing amino acid residues.
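At the level of the primary sequence, the two ways of introducing a cysteine residue can be sketched as string operations. The following Python fragment is a minimal illustration only; the function names and the 1-based position convention are assumptions, and the fragment says nothing about how the corresponding change would be made in the DNA.

```python
def substitute_cysteine(sequence: str, position: int) -> str:
    """Replace the residue at 1-based `position` with cysteine (C)."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position out of range")
    i = position - 1
    return sequence[:i] + "C" + sequence[i + 1:]


def insert_cysteine(sequence: str, after_position: int) -> str:
    """Insert cysteine between residues `after_position` and `after_position + 1`."""
    if not 1 <= after_position < len(sequence):
        raise ValueError("insertion must fall between two existing residues")
    return sequence[:after_position] + "C" + sequence[after_position:]
```

For example, under these assumptions, replacing the third residue of the hypothetical sequence MKTAY gives MKCAY, whereas inserting after the third residue gives MKTCAY. A replacement leaves the length of the chain unchanged, whereas an insertion lengthens it by one residue.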
Recently, investigators have employed computers and computer graphics displays as an aid for assessing the appropriateness of potential linkage sites. See Perry, L. J., & Wetzel, R., Science, 226:555-557 (1984); Pabo, C. O., et al., Biochemistry, 25:5987-5991 (1986); Bott, R., et al., European Patent Application Serial Number 130,756; Perry, L. J., & Wetzel, R., Biochemistry, 25:733-739 (1986); and Wetzel, R. B., European Patent Application Serial Number 155,832; all of which are hereby incorporated by reference. The methods developed by Wetzel and co-workers permit one to project the three-dimensional conformation of a protein onto a computer screen and to simulate the effect which a disulfide bond might have on the protein's structure. Although these methods facilitate the design of more stable proteins, the researcher must still select the amino acid residues which are to be replaced by the cysteine residues of the disulfide bond. Hence, a substantial amount of guesswork and trial-and-error analysis is still required. A need, therefore, still exists for a method which will assist the user in selecting potential disulfide bond linkage sites.
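By way of illustration only, one simple form of computer assistance would be to screen residue pairs by the distance between their beta carbons in a known three-dimensional structure. The following Python sketch is not the method of the present application or of any reference cited above; the function name, the input format, the 4.5 angstrom cutoff and the minimum sequence separation are all hypothetical choices made for the example.

```python
import math
from itertools import combinations


def candidate_pairs(cb_coords: dict,
                    max_distance: float = 4.5,
                    min_separation: int = 2) -> list:
    """List residue pairs whose beta carbons lie within `max_distance`.

    cb_coords maps a residue number to the (x, y, z) position of its
    beta carbon in angstroms. The cutoff and separation defaults are
    illustrative assumptions, not values taken from the source.
    """
    hits = []
    for (i, a), (j, b) in combinations(sorted(cb_coords.items()), 2):
        if j - i < min_separation:
            continue  # skip residues adjacent in the primary sequence
        d = math.dist(a, b)
        if d <= max_distance:
            hits.append((i, j, round(d, 2)))
    return sorted(hits, key=lambda h: h[2])  # closest pairs first
```

A distance filter of this kind only narrows the list of pairs for the user to examine; it takes no account of bond angles, side chain orientation or the effect of the substitution on the folded state.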