The present invention relates generally to the fields of protein chemistry, protein structural analysis and protein engineering.
In a general way, the forces governing the conformational stability of globular proteins have, for the most part, been identified (Kauzmann, 1959; Dill, 1990; Honig and Yang, 1995). However, their relative and often their absolute roles remain elusive. The interactions that stabilize the folded, native structure sum to a free energy in the range of hundreds of kcal/mole (Kauzmann, 1959; Dill, 1990). Because the sum of the destabilizing forces is within the same range, the net stabilizing free energy is rather small, −5 to −20 kcal/mole (Makhatadze and Privalov, 1995).
Proteins are molecules that primarily consist of a polypeptide chain. Proteins can be modified in various ways known to those in the art, e.g., proteins can have carbohydrate sidechains or be derivatized to include modified amino acids.
As a group, proteins have a wide variety of functions and activities and can have more than one function or activity. Examples of protein functions and activities include acting as a ligand, a binding receptor, a co-factor, a regulator of gene expression, a fluorophore, a chromophore, an ion pump, a transducer of energy from one form to another, a light energy harvester, and as a catalyst in many types of transformations of another molecule, called a substrate, or even of themselves. We make no effort to distinguish between a function and an activity, and thus use the terms interchangeably or together.
The subset of proteins that are catalytic proteins are referred to in the art, and herein, as enzymes. These are often the most commercially important proteins and function in processes that produce a product from a substrate in part or wholly through the action of one or more enzymes.
As used herein, the term protein is used to refer to all manner of polypeptide based molecules independent of their additional features or their natural or commercial functions. The term enzyme is reserved for that set of proteins that are catalytically active in the transformation of a substrate molecule or themselves.
When specifically referring to the proteins used in and produced by the methods of this invention, particular terms are used in this specification. The terms known protein or native protein are used to describe a protein having an amino acid sequence that will be altered in a method of this invention. A known protein is a protein known to one conducting a method of this invention. A known protein can be a wild-type protein, which is a protein the amino acid sequence of which has not been altered from that found in nature. The term known enzyme is used analogously to the term known protein. These terms are not used herein to mean that the protein used as the starting material must have an amino acid sequence which was never changed from one that is found in nature, although that will frequently be the case. The amino acid sequence of the starting protein can have been previously altered in a variety of ways. The starting protein will often be a protein the sequence of which was previously altered in a method of this invention. Previously altered proteins are considered known proteins for purposes of this specification.
The term mutant protein is used herein to refer to a protein produced through the application of a method of this invention to a known protein. However, in some contexts mutant protein is also used to describe a protein reported in the literature. In such cases the use will be understood from the context.
We use the term stability to refer to the ability of a protein to resist the effects of various conditions that can cause the protein to denature, i.e., to unfold partly or fully, or to become functionally impaired, non-functional, partially active or inactive. Many conditions can cause proteins to denature or can negatively impact the function or activity of a protein, including heat (temperatures above the temperature optimum of the known protein), cold (temperatures below the temperature optimum of the known protein), organic and inorganic solvents, co-solvents, co-solutes and pH. Solvents include non-water based liquids, e.g., ethanol. Co-solvents include mixtures of various proportions of solvents and water, e.g., a mixture of ethanol and water. Co-solutes are other molecules in solution along with the protein. Co-solutes can include ions, e.g., Na+, organic and inorganic compounds and their salts, e.g., detergents, urea and guanidine hydrochloride.
In the simplest case, a mutant protein created by a method according to this invention is considered more stable when compared to the known protein if the mutant protein functions or has activity under conditions where the known protein does not function or is inactive. However, many improvements seen in a mutant protein will be in degree rather than in kind. Therefore, it is also said that a mutant protein created by the application of methods of this invention is more stable when compared to the known protein if the mutant protein can function to a greater extent or have greater activity in the presence of a greater degree of a given condition than the known protein. For example, if the mutant protein functions or has greater activity than the known protein (1) at a given temperature, (2) in the presence of co-solvents or co-solutes or (3) under other conditions that negatively impact the function or activity of the known protein, then the mutant protein is said to be more stable when compared to the known protein.
The term flexibility is used to refer to the freedom of a protein to assume differing conformations. Often, the conformations that a protein can assume are not very different, but this is not always the case as large changes can occur on binding to other molecules.
As used herein, the terms "solvent accessible surface area" or "accessible surface area" are used when referring to atoms exposed on the surface of a protein in native or extended conformation. Solvent accessible surface area and accessible surface area are the locus of points mapped out in two dimensions when a probe, usually 1.4 angstroms in radius, is rolled, computationally, across the van der Waals surface of the molecule. It is thus larger than the van der Waals chemical radius of the molecule, and many parts of the van der Waals surface cannot be contacted by a probe of finite size, as described in Lee and Richards (1971). These terms may be abbreviated to "surface area" or even "area" but these terms are also used to refer to the surface area of atoms generally, i.e., the area of a buried atom is zero. In such cases one will understand the meaning from the context.
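By way of illustration only, the rolling-probe definition above can be approximated numerically. The following minimal sketch uses the Shrake-Rupley point-sampling approach, which is a different algorithm from the one described in Lee and Richards (1971) and is substituted here for simplicity. The function names, the number of sample points and the atom radii are assumptions made for illustration and are not taken from this specification.

```python
import math

def sphere_points(n):
    """Return n roughly evenly spaced points on the unit sphere (golden spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

def accessible_surface_areas(atoms, probe_radius=1.4, n_points=960):
    """Estimate per-atom accessible surface area in square angstroms.

    atoms: list of (x, y, z, vdw_radius) tuples, all in angstroms.
    probe_radius: 1.4 angstroms, the usual value named in the text.
    n_points: sampling density (an illustrative choice).
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        expanded = ri + probe_radius  # radius traced by the probe centre
        exposed = 0
        for ux, uy, uz in unit:
            px = xi + expanded * ux
            py = yi + expanded * uy
            pz = zi + expanded * uz
            blocked = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                cutoff = rj + probe_radius
                d2 = (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2
                if d2 < cutoff * cutoff:
                    blocked = True  # probe centre would overlap atom j
                    break
            if not blocked:
                exposed += 1
        # Fraction of unblocked sample points times the expanded sphere area.
        areas.append(4.0 * math.pi * expanded * expanded * exposed / n_points)
    return areas
```

In this sketch an isolated atom receives the full area of its probe-expanded sphere, and any neighbouring atom can only reduce that value. The cost grows with the square of the number of atoms, so a practical implementation would normally add a neighbour search.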
It is well established that amino acids are buried to various extents in various places in protein structures. This variety can be taken advantage of to tailor the effect of altering, by more or less, the amount of buried charged surface in a protein to achieve a desired result. For ease in discussing the range of burial of amino acids, particular terms are used herein to describe the degree of burial of amino acid residues in the native, or folded, conformation of a protein. Fully buried amino acids are those that are approximately 90% or more buried in the native conformation. Partially buried amino acids are those that are not fully exposed or fully buried. For ease of conceptualization, we often use partially buried to refer to amino acids that are between approximately 10% to approximately 90% buried in the native conformation. Amino acids that are substantially inaccessible to solvents are amino acids that are at least about 50% buried in the native conformation of a known protein as seen in the analysis taught herein.
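As a minimal sketch of the burial terminology above, the fraction buried can be taken as one minus the ratio of the accessible surface area in the native conformation to that in the extended conformation, and the approximate thresholds applied to that fraction. The function names and the label "exposed" for residues at or below the 10% level are assumptions made for illustration. Because the specification describes the thresholds as approximate, the exact handling of the boundaries is also an assumption.

```python
def fraction_buried(asa_native, asa_extended):
    """Fraction of the extended-state accessible area that is lost on folding."""
    if asa_extended <= 0:
        raise ValueError("extended-state area must be positive")
    return 1.0 - asa_native / asa_extended

def classify_burial(fraction, fully_buried=0.90, exposed=0.10):
    """Map a buried fraction onto the burial terms used in the text."""
    if fraction >= fully_buried:
        return "fully buried"
    if fraction > exposed:
        return "partially buried"
    return "exposed"  # label assumed here; at or below about 10% buried

def substantially_inaccessible(fraction, threshold=0.50):
    """True when a residue is at least about 50% buried."""
    return fraction >= threshold
```

For instance, a residue with a hypothetical area of 40 square angstroms in the native conformation and 100 square angstroms in the extended conformation is 60% buried, which this sketch classifies as partially buried and substantially inaccessible to solvents.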
In this specification we refer to amino acids that are outside of and not interacting with an active site or binding site of a protein. In using this terminology we mean amino acids that do not directly interact with a substrate in an active site and that do not interact with such directly interacting amino acids in such a way that changing them would negatively impact the activity of the protein. Similarly, we mean amino acids that do not directly interact with a moiety bound in the binding site of a protein and that do not interact with such directly interacting amino acids in such a way that changing them would negatively impact the binding of the moiety. A binding moiety can be another protein, another molecule of the same protein or a non-protein molecule.
By nucleic acid is meant any nucleic acid that can be used by proteins and/or enzymes to synthesize a protein either directly or after transcription. Non-natural nucleotides and internucleotide linkages can be used as desired.
A cell culture is any combination of cells, a medium appropriate for the cells and a suitable growth chamber or vessel. The cells can be bacterial, fungal, yeast, plant or mammalian.
An aspect of this invention is a method of analyzing a known protein and thereafter producing a mutant protein that has reduced flexibility. The reduced flexibility translates to an altered activity or functional profile for the mutant protein as compared to the known protein. Initially, one analyzes the known protein to identify fully or partially buried amino acids having a formal charge which are outside of and do not interact with either an active site or a binding site of the known protein. This is performed by comparing the solvent accessible surface area for each atom of the protein in the native conformation and a modeled fully extended conformation. The comparison enables one to determine if any given amino acid is substantially inaccessible to solvents in the native conformation. Thereafter, one synthesizes a mutant protein in which the amino acid having a formal charge is replaced with an amino acid having no formal charge.
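The analysis step of this aspect can be sketched as a simple filter over per-residue data. The following is an illustrative sketch only, under several assumptions not stated in this specification: accessible surface areas are supplied already summed per residue, the set of active site and binding site residues to exclude is supplied by the user, and the residues treated as having a formal charge are Asp, Glu, Lys and Arg (His is omitted because its charge depends on pH). The class name, function name and field names are hypothetical.

```python
from dataclasses import dataclass

# Assumption: residues treated as carrying a formal charge.
CHARGED = {"ASP", "GLU", "LYS", "ARG"}

@dataclass
class Residue:
    number: int          # position in the sequence
    name: str            # three-letter residue code
    asa_native: float    # accessible area in the native conformation
    asa_extended: float  # accessible area in the modeled extended conformation

def candidate_sites(residues, excluded_numbers, min_buried=0.50):
    """Return (number, name, buried_fraction) for buried charged residues
    that are not in the user-supplied active site or binding site set."""
    hits = []
    for res in residues:
        if res.name not in CHARGED:
            continue
        if res.number in excluded_numbers:
            continue
        if res.asa_extended <= 0:
            continue  # no usable reference area
        buried = 1.0 - res.asa_native / res.asa_extended
        if buried >= min_buried:
            hits.append((res.number, res.name, round(buried, 2)))
    return hits
```

The default cut-off of 0.50 corresponds to the "substantially inaccessible to solvents" level. Passing `min_buried=0.90` would restrict the result to fully buried residues.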
In an embodiment, the mutant protein is synthesized by obtaining a nucleic acid having a sequence that encodes the known protein and altering the sequence to encode the mutant protein by changing the codon for the amino acid having a formal charge to a codon for an amino acid having no formal charge. In a preferred embodiment, the protein is produced in a cell culture.
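The codon change of this embodiment can be illustrated at the sequence level. The sketch below is a minimal example under stated assumptions: the coding sequence is in frame starting at its first nucleotide, only the standard genetic code codons for Asp, Asn, Glu and Gln are tabulated, and the replacement codon is chosen to differ from the original at the fewest positions. The function name and the example pairings (Asp to Asn, Glu to Gln) are illustrative choices and are not prescribed by this specification.

```python
# Standard genetic code codons for the four residues used in this example.
CODONS = {
    "ASP": ("GAT", "GAC"),
    "ASN": ("AAT", "AAC"),
    "GLU": ("GAA", "GAG"),
    "GLN": ("CAA", "CAG"),
}

def replace_codon(coding_seq, residue_number, expected_aa, new_aa):
    """Return coding_seq with the codon at residue_number (1-based)
    changed from a codon for expected_aa to a codon for new_aa."""
    seq = coding_seq.upper()
    start = 3 * (residue_number - 1)
    old = seq[start:start + 3]
    if len(old) != 3:
        raise ValueError("residue number lies outside the coding sequence")
    if old not in CODONS[expected_aa]:
        raise ValueError(f"codon {old} does not encode {expected_aa}")
    # Choose the replacement codon with the fewest nucleotide changes.
    new = min(CODONS[new_aa],
              key=lambda c: sum(a != b for a, b in zip(c, old)))
    return seq[:start] + new + seq[start + 3:]
```

Checking the existing codon before editing guards against numbering errors between the protein structure and the nucleic acid sequence, which is a common source of mistakes in site-directed changes.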
In another embodiment the protein is an enzyme. In a preferred embodiment the mutant enzyme can have a greater activity than the known enzyme at a temperature above the temperature optimum of the known enzyme.
In another embodiment the mutant protein is more stable than the known protein under one of a variety of conditions that negatively impact the activity or function of the known protein. These conditions can include the presence of co-solutes, co-solvents, heat or pH.
In another embodiment, the mutant protein can have greater resistance to the effect of denaturants such as cold, heat, solvents, co-solvents, co-solutes and pH as compared to the known protein.
In another embodiment, the amino acid having a formal charge is replaced by an isosteric amino acid.
In other embodiments, the amino acid is substantially inaccessible to solvents or fully buried.
Another aspect of this invention is a method of analyzing a known protein and thereafter producing a mutant protein that has increased flexibility. The increased flexibility translates to an altered activity or functional profile for the mutant protein as compared to the known protein. Initially, one analyzes the known protein to identify fully or partially buried amino acids having no formal charge which are outside of and do not interact with either an active site or a binding site of the known protein. This is performed by comparing the solvent accessible surface area for each atom of the protein in the native conformation and a modeled fully extended conformation. The comparison enables one to determine if any given amino acid is substantially inaccessible to solvents in the native conformation. Thereafter, one synthesizes a mutant protein in which the amino acid having no formal charge is replaced with an amino acid having a formal charge.
In an embodiment of this aspect, the mutant protein is synthesized by obtaining a nucleic acid having a sequence that encodes the known protein and altering the sequence to encode the mutant protein by changing the codon for the amino acid having no formal charge to a codon for an amino acid having a formal charge. In a preferred embodiment, the protein is produced in a cell culture.
In another embodiment the protein is an enzyme. In a preferred embodiment the mutant enzyme can have a greater activity than the known enzyme at a temperature below the temperature optimum of the known enzyme.
In another embodiment, the amino acid having no formal charge is replaced by an isosteric amino acid.
In other embodiments, the amino acid is substantially inaccessible to solvents or fully buried.
Further features, embodiments and advantages of the invention will become more fully apparent from a consideration of the following description of the invention when taken together with the drawings.