Proteins are linear polymers of amino acids. Since the polymerization reactions which produce proteins result in the loss of one molecule of water from each amino acid, proteins are often said to be composed of amino acid "residues." Natural protein molecules may contain as many as 20 different types of amino acid residues, each of which contains a distinctive side chain. The sequence of amino acids in a protein defines the primary structure of the protein.
Proteins fold into a three-dimensional structure. The folding is determined by the sequence of amino acids and by the protein's environment. The remarkable properties of proteins depend directly on the protein's three-dimensional conformation. Thus, this conformation determines the activity or stability of enzymes, the capacity and specificity of binding proteins, and the structural attributes of receptor molecules.
The three-dimensional structure of a protein may be determined in a number of ways. Perhaps the best known way of determining protein structure is X-ray crystallography. An excellent general review of this technique can be found in Physical Biochemistry, Van Holde, K. E. (Prentice-Hall, N.J. (1971) pp. 221-239), which reference is herein incorporated by reference. Using this technique, it is possible to elucidate three-dimensional structure with remarkable precision. It is also possible to probe the three-dimensional structure of a protein by circular dichroism, by light scattering, or by measuring the absorption and emission of radiant energy (Van Holde, Physical Biochemistry, Prentice-Hall, N.J. (1971)). Additionally, protein structure may be determined by neutron diffraction or by nuclear magnetic resonance (Physical Chemistry, 4th Ed. Moore, W. J., Prentice-Hall, N.J. (1972), which reference is hereby incorporated by reference).
The examination of the three-dimensional structure of numerous natural proteins has revealed a number of recurring patterns. Alpha helices, parallel beta sheets, and anti-parallel beta sheets are the most common patterns observed. An excellent description of such protein patterns is provided by Dickerson, R. E., et al. In: The Structure and Action of Proteins, W. A. Benjamin, Inc., Calif. (1969). The assignment of each amino acid to one of these patterns defines the secondary structure of the protein. The helices, sheets and turns of a protein's secondary structure pack together to produce the three-dimensional structure of the protein. The three-dimensional structure of many proteins may be characterized as having internal surfaces (directed away from the aqueous environment in which the protein is normally found) and external surfaces (which are in close proximity to the aqueous environment). Through the study of many natural proteins, researchers have discovered that hydrophobic residues (such as tryptophan, phenylalanine, tyrosine, leucine, isoleucine, valine, or methionine) are most frequently found on the internal surface of protein molecules. In contrast, hydrophilic residues (such as aspartic acid, asparagine, glutamate, glutamine, lysine, arginine, histidine, serine, threonine, glycine, and proline) are most frequently found on the external protein surface. The amino acids alanine, glycine, serine and threonine are encountered with equal frequency on both the internal and external protein surfaces.
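The surface tendencies listed above can be sketched as a simple lookup. The following Python fragment is only an illustration: the function names and the labels "internal", "external", "either", and "unclassified" are hypothetical, and because glycine, serine, and threonine appear in both the hydrophilic list and the equal-frequency list, the sketch assumes that the equal-frequency grouping takes precedence.

```python
# Residue groupings copied from the lists in the text (one-letter codes).
HYDROPHOBIC = set("WFYLIVM")      # Trp, Phe, Tyr, Leu, Ile, Val, Met
HYDROPHILIC = set("DNEQKRHSTGP")  # Asp, Asn, Glu, Gln, Lys, Arg, His, Ser, Thr, Gly, Pro
# Residues reported with equal frequency on both surfaces. Gly, Ser, and Thr
# are also in the hydrophilic list; giving this set precedence is an
# assumption of this sketch, not a statement from the text.
EITHER_SURFACE = set("AGST")      # Ala, Gly, Ser, Thr


def classify_residue(code):
    """Return the surface on which a residue is most frequently found."""
    code = code.upper()
    if code in EITHER_SURFACE:
        return "either"
    if code in HYDROPHOBIC:
        return "internal"
    if code in HYDROPHILIC:
        return "external"
    # Anything not named in the text's lists (for example cysteine).
    return "unclassified"


def surface_tendencies(sequence):
    """Classify every residue of a one-letter amino acid sequence."""
    return [classify_residue(code) for code in sequence]
```

Such a lookup reflects only statistical tendencies observed across many proteins; it does not predict where any particular residue lies in a given structure.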
Proteins exist in a dynamic equilibrium between a folded, ordered state and an unfolded, disordered state. This equilibrium in part reflects, on the one hand, the short-range interactions between the different segments of the polypeptide chain, which tend to stabilize the protein's structure, and, on the other hand, those thermodynamic forces which tend to promote the randomization of the molecule.
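The equilibrium described above can be illustrated with a two-state model, a common textbook simplification that the text itself does not state. Under that assumption, the free energy of unfolding (free energy of the unfolded state minus that of the folded state) fixes an equilibrium constant K = exp(-ΔG/RT), and the folded fraction is 1/(1 + K). The function name and parameter names below are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # molar gas constant, J/(mol*K)


def fraction_folded(delta_g_unfold_j_per_mol, temperature_k):
    """Folded fraction of a protein population under a two-state model.

    delta_g_unfold_j_per_mol is G(unfolded) - G(folded) in J/mol, so a
    positive value means the folded state is favored. The two-state
    assumption is a simplification introduced for this sketch.
    """
    # Equilibrium constant for folded -> unfolded.
    k_unfold = math.exp(-delta_g_unfold_j_per_mol / (GAS_CONSTANT * temperature_k))
    return 1.0 / (1.0 + k_unfold)
```

When the free energy of unfolding is zero, the two states are equally populated; increasing it shifts the population toward the folded state.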
The largest class of naturally occurring proteins is made up of enzymes. Each enzyme generally catalyzes a different kind of chemical reaction, and is usually highly specific in its function. Enzymes have been studied to determine correlations between the three-dimensional structure of the enzyme and its activity or stability.
The amino acid sequence of an enzyme determines the characteristics of the enzyme, and the enzyme's amino acid sequence is specified by the nucleotide sequence of a gene coding for the enzyme. A change of the amino acid sequence of an enzyme may alter the enzyme's properties to varying degrees, or may even inactivate the enzyme, depending on the location, nature and/or magnitude of the change in the amino acid sequence.
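The dependence of the amino acid sequence on the nucleotide sequence can be sketched with the standard genetic code. The following Python fragment is an illustration only: the helper names translate and point_mutate are hypothetical, the input is assumed to be a coding-strand DNA sequence read in frame, and stop codons are rendered as "*".

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding DNA sequence into one-letter residues."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def point_mutate(dna, position, new_base):
    """Return a copy of dna with the base at a zero-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]
```

For example, the hypothetical sequence ATGGAA translates to Met-Glu. Replacing the base at position 4 with T changes the second residue to valine, replacing the base at position 5 with G leaves the residue sequence unchanged, and replacing the base at position 3 with T introduces a stop codon. This mirrors how the effect of a change depends on its location and nature.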
Although there may be slight variations in a distinct type of naturally occurring enzyme within a given species of organism, enzymes of a specific type produced by organisms of the same species generally are substantially identical with respect to substrate specificity, thermal stability, activity levels under various conditions (e.g., temperature and pH), oxidation stability, and the like. Such characteristics of a naturally occurring or "wild-type" enzyme are not necessarily optimized for utilization outside of the natural environment of the enzyme. It may thus be desirable to alter a natural characteristic of an enzyme to optimize a certain property of the enzyme for a specific use, or for use in a specific environment.