Protein Sequence-Structure Relationships. Common Wisdom:
A protein is a natural polymer of amino acids joined together by peptide bonds, which folds upon synthesis into a three-dimensional structure possessing some biological activity. The three-dimensional structure of a protein is dictated by its amino acid sequence (Anfinsen, 1973. Science 181, 223). The amino acid sequence contains all the relevant information needed to dictate the formation of three-dimensional structure by a protein chain. Changes in a protein's sequence, effected through natural introduction of mutations during evolution or by genetic engineering techniques, result in changes in the protein's structure. Such changes may be subtle or profound. If the changes are subtle, they generally involve only minor alterations in the microstructure of a particular region of the protein, manifesting as changes in the shape of a local cluster of residues (either buried within the protein, or located on its surface) in the neighborhood of the altered residue, without any profound or visible effect on the protein's overall shape or function, or on the trajectory that its peptide backbone takes through its three-dimensional structure. When the changes are profound, however, they alter the entire shape of the protein as well as the trajectory that the backbone takes through the protein's structure. Sometimes, profound changes effected by mutations can even cause the chain to lose the ability to fold stably into a particular three-dimensional structure (resulting in aggregation and precipitation). The effect of mutations on a protein's structure cannot always be correlated with, or calibrated to, the changes made in its sequence.
Although the effects of very limited changes can nowadays be modeled computationally, experimental exploration of the effects of sequence changes remains essential in all instances (regardless of whether these changes are subtle or profound), since not all of the parameters and physical forces involved in determining the effects of these changes can yet be modeled. What is well established today is that both profound and subtle alterations of sequence can lead to either profound or subtle alterations of structure, in ways that defy prediction.
Moreover, two proteins from two different organisms that are not evolutionarily related can sometimes be seen to have polypeptide backbones (although not amino acid residue side-chains) that are almost identical in their overall shape and folded structure, even though the two proteins have totally different amino acid sequences, with no similarity. However, this is the exception rather than the rule; generally, similarly sized proteins only tend to adopt substantially similar backbone structures if they have amino acid sequences that are somewhat similar (involving identity of at least 20% of all residues). The outer shape characteristics of such proteins, however, are quite different, because of the specific ‘decoration’ of the backbone of each protein by specific groups of interacting residues (side-chains) peculiar to that protein, in a manner determined by its specific amino acid sequence. Conservation of backbone structure thus correlates with broad conservation of function, whereas the precise thermodynamic and kinetic parameters of functionality are influenced almost entirely by the outer shape characteristics of the protein, which are determined by side-chains present on the protein's surface.
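The identity threshold mentioned above can be illustrated with a minimal sketch that computes percent identity between two sequences that have already been aligned. The function name percent_identity, the use of '-' as the gap character, and the choice to count only gap-free columns in the denominator are assumptions made for illustration; other conventions for the denominator exist and give somewhat different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free columns of an alignment.

    Both inputs must be pre-aligned strings of equal length. Columns in which
    either sequence has a gap are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns where both sequences have a residue
    identical = 0  # columns where those residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from any real protein).
    print(percent_identity("MKT-LLVA", "MKSALLIA"))
```

Under this convention, a value of roughly 20 or more would correspond to the level of identity at which similar backbone structures are generally expected, as stated in the text.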
To summarize the above discussion, the precise relationship between amino acid sequence and protein structure is very subtle; not all aspects of this relationship are yet understood or appreciated, and it is not yet possible to predict the effects of making particular changes in sequence on a protein's structure without doing the necessary experimentation, or without reference to a specific structural context. In this regard, and in specific relation to the reengineering of the surface of any protein through sequence changes, considerations of folding and stability play a role inasmuch as the engineering of the whole or part of a protein's surface affects its structure-forming ability and its structural stability. Generally, in all efforts to engineer proteins, whether in regard to their surfaces or their interiors, two different approaches may be taken: (i) a rational engineering approach based on structure-function analysis and the deliberate introduction of specific mutations, and (ii) a non-rational (combinatorics-based and directed evolution-based) approach which relies more on random processes such as gene shuffling, or screening of phages displaying randomly generated populations of variants, followed by selection based on a binding trait. Within the field of protein engineering, the rational approach was the first to be adopted. However, because of the unpredictability of the effects of the changes made, it proved to be less than satisfactory. Subsequently, newly available recombinant DNA techniques made combinatorics-based (combinatorial) approaches feasible as well, and the less-than-satisfactory results of the rational approaches led initially to a switchover to combinatorial approaches.
Eventually, however, the infeasibility of exploring even a small fraction of the sequence variants that can conceivably be made through an entirely random approach (20^n possible sequences for a chain of n residues) led to the adoption of hybrid approaches attempting to combine the best of both approaches. These hybrid approaches involve a rational selection of the residues or structural sites within proteins that are to be subjected to changes, and a non-rational (combinatorial search-based) exploration of the effects of making mutations at such sites.
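To give a sense of the scale involved, the following minimal sketch computes the size of the full sequence space (20 raised to the power n, assuming any of the 20 standard amino acids at each of n positions) and the fraction of it that a library of a given size could cover. The library size of one billion variants used in the example is a hypothetical figure chosen only for illustration, as are the function names.

```python
AMINO_ACIDS = 20  # standard amino acids allowed at each position (assumption)


def sequence_space(n: int) -> int:
    """Number of distinct sequences for a chain of n residues (20**n)."""
    return AMINO_ACIDS ** n


def fraction_screened(n: int, library_size: int) -> float:
    """Fraction of the sequence space covered by library_size distinct variants."""
    return library_size / sequence_space(n)


if __name__ == "__main__":
    hypothetical_library = 10 ** 9  # illustrative library size, not from the text
    for n in (5, 10, 100):
        print(
            f"n={n}: {sequence_space(n):.3e} sequences, "
            f"fraction screened = {fraction_screened(n, hypothetical_library):.3e}"
        )
```

Even for a short chain of 10 residues the space exceeds ten trillion sequences, which is the motivation for restricting randomization to a small number of rationally selected sites.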
Instances of Pure Rational Protein Engineering:
Rutter and coworkers introduced site-specific mutations at two positions in the active site of trypsin (following structural analysis of its active site) to reduce the catalytic rate but enhance the substrate specificity of the enzyme towards its natural substrate (Craik et al., 1985. Science 228, 291-297). Numerous other groups have subsequently introduced limited mutations, based on rational analysis of protein structures, to create variants with altered rate and/or affinity characteristics. Estell and coworkers engineered another protease, subtilisin, in respect of the electrostatics of the neighborhood of the enzyme's active site, to alter the preference for binding of substrates differing in their electrostatic characteristics (Wells et al., 1987, Proc. Natl. Acad. Sci. USA 84, 1219-1223). Perham and coworkers made rational mutations in glutathione reductase to leave its substrate specificity unaltered while changing its coenzyme specificity from NADP+ to NAD+ (Scrutton et al., 1990. Nature 343, 38-43). Several instances of such work followed during the 1990s, in all of which limited, rationally selected mutations were introduced to alter enzyme characteristics in respect of the protein-ligand interactions in the proximity of the active site. Subsequently, bolder engineering attempts have been made, which are described below. Benkovic and coworkers successfully designed a scytalone dehydratase-like enzyme using a structurally homologous protein scaffold of nuclear transport factor 2, demonstrating the efficacy of the rational engineering approach in developing new entities by redesigning major sections of the scaffold protein (Nixon et al., 1999, Proc. Natl. Acad. Sci. USA 96, 3568-3571).
Similarly, Hellinga and coworkers analyzed the structure of a ribose-binding protein, rationally selected sites for 18-22 site-directed mutations that could be expected to impart triosephosphate isomerase (TIM) activity to this protein, and demonstrated this to be the case experimentally (Dwyer et al., 2004, Science 304, 1967-1971). Subsequently, combinatorial approaches also achieved success.
Instances of Pure Non-Rational (Combinatorial) Protein Engineering:
Directed evolution consists of the low-frequency introduction of randomly distributed mutations in a gene of interest, followed by selection of the mutated (variant) proteins possessing the desired properties (Roberto et al., 2005, Current Opinion in Biotechnology 16, 378-384). Directed evolution has proven to be a powerful tool for the modification of proteins and has now become a widely used approach. It has been used mainly, however, in searching for temperature-sensitive and similar mutants, using error-prone PCR to introduce mutations randomly into protein sequences, and in evolving novel binding reagents (through phage-display combinatorial approaches involving sections of proteins randomized by degenerate oligonucleotide incorporation into encoding DNA). There are very few instances of purely non-rational approaches having been used to alter enzyme activity, presumably for two reasons: the mechanisms used to introduce mutations randomly cannot usually be controlled and restricted to a particular region of a protein's surface without some rational selection of sites, and the full exploration of the combinatorial space is impossible (there are too many variants that can be generated). With the human estrogen receptor alpha ligand binding domain, Zhao and coworkers used random mutagenesis and in vitro directed evolution to evolve a novel corticosterone activity (Chen et al., 2005, J. Mol. Biol. 348, 1273-1282).
Another notable example of the use of a purely random engineering approach (based, however, on a semi-rational selection of sites for randomization) was that of Bryan and coworkers who used directed coevolution to alter the stability and catalytic activity of calcium-free subtilisin (Strausberg et al., 2005, Biochemistry 44, 3272-3279). One example of the use of directed evolution to profoundly alter thermostability, but not activity, is that of Rao and coworkers who used this approach to develop a highly stable lipase (Acharya et al., 2004, J. Mol. Biol. 341, 1271-1281). By and large, the trend nowadays is to mix the rational and non-rational approaches, a few instances of which are cited below.
Instances of Hybrid (Rational-Combinatorial) Protein Engineering:
Some groups have successfully used a hybrid approach which combines rational and combinatorial components, as exemplified by the evolution of a new catalytic activity (β-lactamase activity) on the αβ/βα metallohydrolase scaffold of glyoxalase II by Kim and coworkers (Park et al., 2006, Science 311, 535-538). A second example of the success of this approach is that of Peimbert and Segovia, who evolved a β-lactamase activity on a D-Ala D-Ala transpeptidase fold (Peimbert and Segovia, 2003, Protein Eng. 16, 27-35). Yet another example is the use of this approach to alter the specificity of the NHR human estrogen receptor in favor of a synthetic ligand, 4,4′-dihydroxybenzil, relative to the natural ligand, 17β-estradiol (Chockalingam et al., 2005, Proc. Natl. Acad. Sci. USA 102, 5691-5696).
Instances of Protein Structural Stability Engineering Through Modifications of Salt Bridges or Disulfides:
In addition, although this aspect has not been dealt with in detail, it may be noted that protein engineering involving rational approaches has also been attempted to achieve structural stabilization of specific proteins through the introduction of, e.g., specific electrostatic interactions, or other additional bonds such as disulfide bonds. Such attempts have been based on the knowledge that surface salt bridges (Anderson et al., 1990. Biochemistry 29, 2403-2408) as well as disulfide bonds (Creighton, 1986, Methods Enzymol. 131, 83-106) provide additional stability to proteins. However, such rational attempts have met with little success, as exemplified by the work of Perham and coworkers (Scrutton et al., 1988, FEBS Letters 241, 46-50), who introduced a disulfide bond into glutathione reductase by design to try to improve its stability, and produced an active enzyme that formed the intended disulfide bond but showed no additional structural stability.
The only previous instances we have been able to find of protein constructs attempting to somehow combine the structural stability of one protein with the temperature regime of activity of another (related) protein have involved trial-and-error approaches in which whole domains composed of contiguous stretches of residues, sourced from two different homologous proteins, have been recombined to generate chimeric proteins. We have been able to find four instances of the making of such chimeric proteins: two involving β-glucosidases from the work of Hayashi and coworkers (Singh and Hayashi, 1995, J. Biol. Chem. 270, 21928-21933; Goyal et al., 2001. J. Mol. Catalysis B: Enzymatic 16, 43-51), one involving citrate synthase from the work of Danson and coworkers (Arnott et al., 2000, J. Mol. Biol. 304, 657-668), and one involving avidin (Hytonen et al., 2007, U.S. Pat. No. 7,268,216). In the first instance, chimeras of homologous β-glucosidases from Agrobacterium tumefaciens and Cellvibrio gilvus (~37% sequence identity; 40% sequence similarity) were made (Singh and Hayashi, 1995, op. cit.). In the second instance, chimeras of homologous β-glucosidases from Agrobacterium tumefaciens and Thermotoga maritima were made (Goyal et al., 2001, op. cit.). In the third instance, chimeras of homologous citrate synthases from Thermoplasma acidophilum and Pyrococcus furiosus were made (Arnott et al., 2000, op. cit.). In all three instances, the intention was to obtain chimeras with improved stability and altered temperature and pH of optimal function. In a fourth instance, which was found in the patent literature (Hytonen et al., 2007. op. cit.), the thermal stability of a chicken avidin protein was improved by replacing one of its structural domains, named beta 4, with the entire beta 4 domain of a different avidin-related (AVR) protein.
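The domain-swapping operation underlying such chimeras can be sketched at the sequence level as follows. This is only an illustrative sketch, not the procedure used in any of the cited works: the function name make_chimera, the zero-based half-open index convention, and the toy sequences are all assumptions, and in practice the corresponding domain boundaries in the two homologs would have to be identified from an alignment or from structural data.

```python
def make_chimera(
    host: str,
    donor: str,
    host_span: tuple[int, int],
    donor_span: tuple[int, int],
) -> str:
    """Replace a contiguous stretch of the host sequence with one from the donor.

    Spans are (start, end) pairs using zero-based, half-open indexing, so
    host[start:end] is removed and donor[start:end] is inserted in its place.
    """
    host_start, host_end = host_span
    donor_start, donor_end = donor_span

    # Reject spans that fall outside the sequences or are reversed.
    if not 0 <= host_start <= host_end <= len(host):
        raise ValueError("host_span is out of range")
    if not 0 <= donor_start <= donor_end <= len(donor):
        raise ValueError("donor_span is out of range")

    return host[:host_start] + donor[donor_start:donor_end] + host[host_end:]


if __name__ == "__main__":
    # Toy sequences (hypothetical): swap residues 2..4 of the host for the
    # corresponding residues of the donor.
    print(make_chimera("AAAAAAAA", "GGGGGGGG", (2, 5), (2, 5)))
```

Because whole contiguous stretches are exchanged, every residue in the swapped domain changes at once, whether it lies on the surface or in the interior of the protein.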
Disclosure of the Present Invention, Novelty of the Present Invention, and Differences Between the ‘Chimera’ Approach and the Approach Proposed in the Present Invention:
We have explored, in this invention, the micro-structural and macro-structural effects of sequence alterations on protein surfaces (drawn from evolutionary comparisons of proteins), with particular reference to using such alterations to create, by design (rather than through non-rational combinatorial approaches), novel proteins that combine the structural features of one protein with the functional features of another (homologous) protein sourced from a different organism. Therefore, in this invention, our emphasis is on protein surface reengineering, with specific reference to engineering of the physical parameters circumscribing protein enzymatic activity and/or other function (e.g., protein-protein interactions).