Carbonic anhydrases (CA, EC 4.2.1.1) are a group of enzymes that catalyze the reversible reaction of carbon dioxide and water into bicarbonate and a proton according to:
CO2 + H2O ⇄ HCO3− + H+
Carbonic anhydrases are widely distributed throughout nature and are categorized in five distinct classes, the α-, β-, γ-, δ-, and ξ-class[1]. The α-class carbonic anhydrases can be found in vertebrates, bacteria, algae and green plants, whereas β-class carbonic anhydrases are found in bacteria, algae and chloroplasts. One carbonic anhydrase from each of the δ- and ξ-classes has been isolated from eukaryotic marine diatoms. The only γ-class carbonic anhydrase (Cam) isolated so far comes from the thermophilic archaeon Methanosarcina thermophila[2]. Since the five classes have evolved through convergent evolution, they differ significantly from each other with regard to amino acid sequence, structure and activity. The α-class carbonic anhydrases belong to a superfamily of homologous proteins, i.e. their genes have evolved from a common ancestral gene. Among the most effective carbonic anhydrases are the α-carbonic anhydrases from vertebrates, with a turnover number (kcat) of up to 1.4·10⁶ s⁻¹, which is 10⁷ times faster than the spontaneous reaction. Furthermore, the catalytic efficiency (kcat/Km) of e.g. human carbonic anhydrase II is 1.5·10⁸ M⁻¹ s⁻¹, which is close to that of a diffusion-controlled reaction. Since the natural function of the enzyme is e.g. to facilitate the removal of CO2 from the blood (human carbonic anhydrase II), it has been suggested that carbonic anhydrases can be used as biological catalysts in bioreactors designed for capturing CO2 from various gas streams. At this time there is a consensus view that the concentration of carbon dioxide in the atmosphere is the major contributor to increasing global warming, which has also been concluded by the Intergovernmental Panel on Climate Change (IPCC)[3].
Thus, several chemical methods have been suggested and tested for carbon capture and sequestration (CCS). However, most of these operate at extreme pressure or temperature, use harmful chemical compounds, and still consume large amounts of energy at low efficiency. If, instead, an enzyme-based bioreactor utilizing carbonic anhydrase as a catalyst could be used, this could solve the energy and environmental problems of chemical reactors. Several such bioreactors and processes have been suggested in e.g. WO 2006/089423, U.S. Pat. No. 6,524,842, WO 2004/007058, WO 2004/028667, U.S. 2004/0029257, U.S. Pat. No. 7,132,090, WO 2005/114417, U.S. Pat. No. 6,143,556, WO 2004/104160, US 2005/214936 and U.S. Pat. No. 7,892,814. The aforementioned processes generally operate by bringing carbonic anhydrase, either free in solution or immobilized, in contact with CO2 dissolved in the solution. However, since operational conditions such as temperature, pH and chemical composition of the solution can vary widely depending on the application, none of these processes is of any value unless the carbonic anhydrase catalyst is stable enough to function at the operational conditions and has a long enough lifetime to be economically viable.
Unfortunately, since there are no organisms living under the conditions that can prevail in a CO2-capturing bioreactor, nature has not provided us with a carbonic anhydrase with the desired stability or efficiency. Mammalian, plant and prokaryotic carbonic anhydrases have through natural evolution been selected to be stable at the physiological conditions of the respective organism. Thus, α- and β-class carbonic anhydrases are generally only stable at physiological conditions, i.e. approximately 37° C. or lower. The only heat-stable carbonic anhydrase has been found in Methanosarcina thermophila, which grows optimally at 55° C. and produces a γ-carbonic anhydrase (Cam) with a heat denaturation temperature (melting point, Tm) of about 70° C. However, this enzyme has a catalytic turnover that is approximately 10-fold slower than that of e.g. human carbonic anhydrase II (kcat of approx. 1.2·10⁵ s⁻¹ as compared to 1.4·10⁶ s⁻¹). Furthermore, its catalytic efficiency is approximately 20-fold lower (7.5·10⁶ M⁻¹·s⁻¹ as compared to 1.5·10⁸ M⁻¹·s⁻¹ for human carbonic anhydrase II)[4, 5]. Another feature of the γ-carbonic anhydrase from Methanosarcina thermophila that makes it less interesting as a catalyst for a bioreactor is that it is a homotrimeric protein, i.e. an enzyme built up from three identical polypeptide chains. Each of the polypeptides contains 213 amino acids and has a molecular weight of approx. 23 kD, giving a total of 639 amino acids and a molecular weight of 69.15 kD. This can be compared to human carbonic anhydrase II (HCA II), which is a monomeric protein of 259 amino acids with a molecular weight of 29.3 kD[6]. Thus, an advantage of HCA II, as compared to Cam, is that it cannot be inactivated by dissociation of polypeptides. Another problem associated with the use of the γ-carbonic anhydrase from Methanosarcina thermophila is that, to obtain the most active form of the enzyme (Fe2+-Cam), it needs to be produced anaerobically and protected from air during purification and use. If these prerequisites are not met, the naturally occurring Fe2+ in the active site is oxidized to Fe3+ and subsequently replaced by Zn2+, which lowers the activity a further 3-fold[6,7].
The conversion rate and efficiency are of course of great importance for the technical and economic feasibility of using carbonic anhydrases in any CO2-capturing process. Thus, if it were possible to use human carbonic anhydrase II, a bioreactor would require 10-20 times less enzyme (or, alternatively, be 10-20 times smaller with the same amount of enzyme) than a corresponding reactor using e.g. the γ-carbonic anhydrase from Methanosarcina thermophila.
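The 10-20 fold range follows directly from the kinetic constants quoted above. The short sketch below only reproduces that arithmetic; it assumes, as a simplification not stated in the text, that the amount of enzyme needed scales inversely with kcat at saturating CO2 and inversely with kcat/Km at low CO2 concentrations. The names `HCA_II`, `CAM` and `fold_advantage` are illustrative.

```python
# Kinetic constants quoted in the text:
# kcat in s^-1, kcat/Km in M^-1 s^-1.
HCA_II = {"kcat": 1.4e6, "kcat_over_km": 1.5e8}
CAM = {"kcat": 1.2e5, "kcat_over_km": 7.5e6}


def fold_advantage(fast, slow):
    """Return how many times larger each constant is for `fast` than for `slow`."""
    return {key: fast[key] / slow[key] for key in fast}


ratios = fold_advantage(HCA_II, CAM)
print(f"kcat ratio (saturating CO2): {ratios['kcat']:.1f}")
print(f"kcat/Km ratio (low CO2):     {ratios['kcat_over_km']:.1f}")
```

The two ratios bracket the stated range: roughly 12-fold from the turnover numbers and 20-fold from the catalytic efficiencies.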
Enzymes are protein macromolecules that function as highly effective biological catalysts and are fundamental to all biological life. They accelerate the chemical reactions of life without themselves being consumed in the reaction. Isolated enzymes are important in many industrial processes for treating biological substrates. Thus, enzymes for industrial and environmental applications have a large and increasing economic and ecological value.
One bottleneck in the application of enzymes in industrial processes is that, in order to be active, enzymes and other proteins must keep a highly ordered and folded structure. However, the highly ordered structure of proteins is only maintained if the proteins are stable at the prevailing conditions, i.e. pH, ionic strength, temperature, etc., within certain limits that are specific for each type of protein. In terms of natural selection of proteins during evolution, this stresses the fact that a protein molecule only makes structural sense when it exists under conditions similar to those for which it was selected, in its so-called native state. Protein stability can fundamentally be divided into chemical stability and physical stability. Chemical stability relates to changes in the activity of the enzyme in response to various chemical alterations, e.g. deamidation of asparagine to aspartate and oxidation of methionine. Changes in activity can arise because the amino acids involved in the enzymatic process are altered, or because the chemically modified enzyme loses its structure and hence its activity. Physical stability relates to the intrinsic ability of the protein to find and maintain its structure (and hence activity). Physical stability can be measured in several ways, e.g. as the thermodynamic stability, the thermal stability and the kinetic stability, which are all a function of the sum of interactions within the protein and between the protein and its surroundings.
Therefore, in the quest to design more stable proteins, it is important to understand the differences and benefits, as well as the underlying mechanisms, of each type of stability to be able to attain proteins with the desired increased stability.
Thermodynamic stability is a measure of the difference in free energy (ΔG) between the inactive unfolded state (U) and the folded state (F) in which the enzyme is active. Thermodynamic stability can be determined at equilibrium conditions if the protein is free to unfold and refold. This two-state model can be written as:
F ⇄ U
Thus, in this case the stability is simply the difference in free energy between the U and the F states (ΔG = GUnfolded − GFolded) and the stability is defined as ΔGFU, where
ΔGFU = −RT ln K.
Here R is the gas constant, T is the absolute temperature, and K represents the equilibrium constant between the unfolded and the folded state (K = [U]/[F]). Therefore, the more thermodynamically stable the protein is, the larger the difference in free energy (ΔG) is. This can also be represented graphically by plotting the difference in free energy between the unfolded and native state (see FIG. 1).
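To give a feel for the scale of ΔGFU = −RT ln K, the following sketch converts a few stability values into the equilibrium constant K = [U]/[F] and the fraction of folded molecules under the two-state model. The ΔGFU values used are purely illustrative and are not taken from any measured protein; the function names are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def unfolding_equilibrium_constant(delta_g_fu, temperature=298.15):
    """K = [U]/[F], obtained by inverting delta_g_fu = -R*T*ln(K) (delta_g_fu in J/mol)."""
    return math.exp(-delta_g_fu / (R * temperature))


def fraction_folded(delta_g_fu, temperature=298.15):
    """Two-state model: [F] / ([F] + [U]) = 1 / (1 + K)."""
    return 1.0 / (1.0 + unfolding_equilibrium_constant(delta_g_fu, temperature))


# Illustrative stabilities in kJ/mol (assumed values, not data).
for dg_kj in (0.0, 10.0, 20.0, 40.0):
    dg = dg_kj * 1000.0
    print(f"dG_FU = {dg_kj:5.1f} kJ/mol  "
          f"K = {unfolding_equilibrium_constant(dg):.3e}  "
          f"folded fraction = {fraction_folded(dg):.6f}")
```

At ΔGFU = 0 the two states are equally populated, and every increase in ΔGFU shifts the population further towards the folded state.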
Thus, simplified, the thermodynamic stability can be increased by either destabilizing the unfolded state (higher free energy of U) or stabilizing the native state (lower free energy of F) so as to maximize the difference in free energy (ΔGFU) between the two states. The change in free energy for the folding reaction (U to F) needs to be lower than zero (ΔG<0) for folding to be efficient, that is, to favor the native state of the protein. Since the difference in free energy is determined by its enthalpy (ΔH, interactions) and entropy (ΔS, disorder) according to ΔG=ΔH−TΔS, a favorable ΔG can be accomplished by strengthening the interactions of the folded state (e.g. hydrogen bonds, ionic bonds, better packing of the protein interior etc.), leading to lowered enthalpy. The same result, i.e. a larger difference in free energy between the unfolded and folded state, can be accomplished by destabilizing the unfolded state. Since the unfolded state can be assumed to be a random coil, this can be done by restraining its freedom, which lowers the entropy of the unfolded state and thereby raises its free energy.
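The entropic route to stabilization can be illustrated numerically with ΔG=ΔH−TΔS applied to the unfolding reaction (F to U). In the sketch below, all three thermodynamic parameters are hypothetical values chosen only to show the direction and rough size of the effect; the enthalpy of unfolding is deliberately left unchanged, mirroring the assumption that the folded state is unaffected.

```python
def delta_g(delta_h, delta_s, temperature=298.15):
    """Gibbs free energy change: dG = dH - T*dS (J/mol, J/(mol*K), K)."""
    return delta_h - temperature * delta_s


# Hypothetical unfolding parameters (F -> U), not measured data.
DH_UNFOLD = 200_000.0  # enthalpy of unfolding, J/mol
DS_UNFOLD = 600.0      # entropy of unfolding, J/(mol*K)
DS_LOSS = 30.0         # entropy removed from U by restraining the chain, J/(mol*K)

dg_wild_type = delta_g(DH_UNFOLD, DS_UNFOLD)
dg_restrained = delta_g(DH_UNFOLD, DS_UNFOLD - DS_LOSS)

print(f"dG_FU, unrestrained chain: {dg_wild_type / 1000:.2f} kJ/mol")
print(f"dG_FU, restrained chain:   {dg_restrained / 1000:.2f} kJ/mol")
print(f"gain in stability:         {(dg_restrained - dg_wild_type) / 1000:.2f} kJ/mol")
```

Because the gain equals T times the entropy removed from the unfolded state, it appears without any new interaction being added to the folded structure.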
The melting point (Tm) of a protein, i.e. the midpoint temperature of unfolding, is a measure of a protein's thermal stability. In industrial processes it is often desirable to use enzymes with a high melting point, since it is in many cases beneficial if the reaction can take place at an elevated temperature (higher rates of reaction, lower viscosity, less microbial growth, less fouling etc.). For this reason, the focus for proteins with a potential use in industrial, enzyme-based processes is often on high thermal stability (i.e. a high melting point).
It is, however, important to recognize that at standard temperature (25° C.) the ΔGFU values for a thermolabile protein are not necessarily lower than for a thermostable protein, i.e. a high thermal stability is not the same as a high thermodynamic stability at all temperatures[8]. Thus, it is not possible to deduce the melting point of a protein by simply determining its thermodynamic stability at ambient temperature, or vice versa. The melting temperature (Tm) is the temperature at which U and F are at equilibrium and equally populated. It is determined by the ΔGFU(T) function and occurs when the denaturing stress (here, temperature) is so high that ΔGFU=0. When ΔGFU is plotted as a function of temperature, the ΔGFU(T) function displays a skewed parabola that intersects the x-axis twice, i.e. both heat and cold denaturation occur (see FIG. 2).
FIG. 2 illustrates how the thermostability of a hypothetical protein can thus be increased by means other than increasing the thermodynamic stability (ΔGFU) of the protein at standard temperature.
Thus, thermal stability is related, but not equivalent, to thermodynamic stability. That is, at ambient temperatures a protein can have a relatively low thermodynamic stability and still prove to have a relatively high melting point.
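The distinction between thermal and thermodynamic stability can be illustrated with a commonly used form of the Gibbs-Helmholtz relation, which assumes a temperature-independent heat-capacity change on unfolding (ΔCp). This is a swapped-in textbook model, not a formula given in the text, and the two parameter sets below are hypothetical proteins, not measured data.

```python
import math


def delta_g_fu(temp_k, dh_m, t_m, dcp):
    """Stability curve dG_FU(T) in kJ/mol, assuming a constant dCp.

    dh_m: unfolding enthalpy at the melting point (kJ/mol)
    t_m:  melting temperature in K, where dG_FU = 0
    dcp:  heat-capacity change on unfolding (kJ/(mol*K))
    """
    return dh_m * (1.0 - temp_k / t_m) + dcp * (
        (temp_k - t_m) - temp_k * math.log(temp_k / t_m)
    )


def cold_denaturation_temp(dh_m, t_m, dcp, low=200.0):
    """Bisect for the low-temperature zero crossing of dG_FU(T)."""
    # Upper bracket: temperature of maximal stability (unfolding entropy = 0).
    high = t_m * math.exp(-dh_m / (t_m * dcp))
    for _ in range(100):
        mid = 0.5 * (low + high)
        if delta_g_fu(mid, dh_m, t_m, dcp) < 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Two hypothetical proteins: B melts 20 K higher than A.
PROTEIN_A = {"dh_m": 400.0, "t_m": 333.15, "dcp": 8.0}
PROTEIN_B = {"dh_m": 250.0, "t_m": 353.15, "dcp": 4.0}

for name, p in (("A", PROTEIN_A), ("B", PROTEIN_B)):
    dg_25 = delta_g_fu(298.15, **p)
    t_cold = cold_denaturation_temp(**p)
    print(f"protein {name}: Tm = {p['t_m'] - 273.15:.0f} C, "
          f"dG_FU(25 C) = {dg_25:.1f} kJ/mol, "
          f"cold denaturation near {t_cold - 273.15:.0f} C")
```

With these assumed parameters, protein B has the higher melting point but the lower ΔGFU at 25° C., and each curve crosses zero twice, once at Tm and once at a cold-denaturation temperature.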
Kinetic stability is a measure of the rate at which a protein unfolds (kU). This is especially important for proteins that unfold irreversibly, or under conditions that cause irreversible denaturation. A protein can denature irreversibly if, in the unfolded state, it rapidly undergoes some permanent change such as proteolytic degradation or aggregation (which is often the case with thermally denatured proteins).
In these cases it is not the difference in free energy between the folded and unfolded state that is important. That will only affect the equilibrium and this is not a true equilibrium process. Instead, for kinetic stability, the important thing is the difference in free energy between the folded state (F) and the transition state (ts#) on the unfolding pathway which determines the activation energy for unfolding (EA, unfolding). Hence, EA, unfolding determines the rate constant of unfolding (kU) and thereby at what rate an irreversible inactivation of the unfolded state can take place (See FIG. 3).
Thus, kinetic stability is in no way related to the thermodynamic stability (ΔGFU) or the thermal stability (Tm), and increasing it requires other means than those used to increase ΔGFU and Tm. In order to change the free energy of the transition state, the folding/unfolding mechanism of the protein needs to be affected. Simplified, when an ensemble of protein molecules folds, the molecules will mainly follow the fastest route, which produces folding intermediates and transition states of the lowest possible energy levels. However, if this route is no longer accessible, they will be forced to fold via an alternative route that has folding intermediates and transition states of higher energy. This in effect places the transition state at a higher level of free energy. In this case, since the folded state has the same energy level as before (it still needs to be in its highly ordered native fold to be active), EA, unfolding will have increased and thus provides a higher barrier to unfolding, leading to a lower unfolding rate constant (kU).
Thus, for a protein to be valuable for any application it needs a large free energy difference (ΔGFU) in favor of the folded state at the temperature of operation, so that the protein operates well below its melting point (Tm). Equally important is a high kinetic stability, so that the protein is maintained in the natively folded state and does not sample the unfolded state, which would render it irreversibly inactive. Hence, a high kinetic stability leads to slow unfolding and a long lifetime of the protein. This is true for all conditions and will, for example, increase the shelf life of the protein at ambient temperatures. The activation energy for unfolding (EA, unfolding) also provides a barrier to unfolding if the protein operates close to or even above its unfolding point (thermal or other), thus keeping the unfolding rate constant (kU) low and the lifetime long even at conditions that induce unfolding.
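How strongly the activation energy controls lifetime can be sketched with the Arrhenius equation, kU = A·exp(−EA/RT), combined with the half-life of a first-order irreversible process, ln 2/kU. The Arrhenius form, the pre-exponential factor and both activation energies below are assumptions made for illustration only; the text itself specifies none of them.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def unfolding_rate(e_a, temperature, prefactor=1.0e13):
    """Arrhenius estimate of k_U in s^-1; `prefactor` (s^-1) is an assumed value."""
    return prefactor * math.exp(-e_a / (R * temperature))


def half_life(k_u):
    """Half-life in seconds of a first-order irreversible process."""
    return math.log(2.0) / k_u


T_OPERATION = 298.15  # K
# Hypothetical activation energies for unfolding, J/mol.
E_A_ORIGINAL = 100_000.0
E_A_RAISED = 110_000.0

t_half_original = half_life(unfolding_rate(E_A_ORIGINAL, T_OPERATION))
t_half_raised = half_life(unfolding_rate(E_A_RAISED, T_OPERATION))

print(f"half-life, original barrier: {t_half_original:.3e} s")
print(f"half-life, raised barrier:   {t_half_raised:.3e} s")
print(f"lifetime gain:               {t_half_raised / t_half_original:.1f}-fold")
```

The absolute half-lives depend entirely on the assumed prefactor, but the fold-gain depends only on the increase in EA, unfolding and the temperature, which is why a modest rise in the barrier can translate into a much longer lifetime.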
There are numerous ways of stabilizing proteins[9], either by stabilizing the folded state or by destabilizing the unfolded state by different means. However, most methods to stabilize the folded state rely on strengthening local interactions that are only formed once the protein is folded, and few will substantially affect the folding route and hence the kinetic stability. Furthermore, because there are often hundreds of amino acids to vary and thousands of interactions within the protein and between the protein and its surroundings, it is very difficult to simply examine the structure and pinpoint what to change in order to increase the stability. This is also the reason why combinatorial methods like directed evolution have been developed. Since these methods produce thousands of variants of the protein “by chance”, which are subsequently tested for activity at different conditions, they circumvent the need for detailed knowledge of the protein structure or an understanding of protein stability. However, for those well acquainted with the art of protein stability and stabilization it is possible to design more stable proteins by knowledge-based protein engineering. One attractive way to stabilize a specific protein by knowledge-based protein engineering is to graft structural motifs that are known to be stabilizing from one protein homolog onto the protein homolog that is to be stabilized, of which there are numerous examples in the literature[10,11]. Two proteins are considered to be homologous if they have identical amino acid residues in a significant number of sequential positions along the polypeptide chain. However, as is textbook knowledge in protein chemistry, three-dimensional structure is much more conserved than sequence, and it is often found that proteins with very low sequence identity still have similar functions and similar three-dimensional structures[12].
Thus, members of such families are also considered to be homologous even though their polypeptide sequence identities are not statistically significant, only structurally or functionally significant. Furthermore, homologous proteins always contain a core region (structurally conserved regions) where the general folds of the peptide chains are very similar. That is, the scaffolds of even distantly related homologous proteins with low sequence identity have similar structures. It is these relationships that make it possible to transfer stabilizing amino acid combinations or motifs between structurally homologous proteins if three-dimensional structural data are available. Structural data can originate from X-ray crystallography, nuclear magnetic resonance spectroscopy or model building. If two such structures of homologous proteins are superimposed, one with the stabilizing interactions of interest (the template) and the other to be stabilized (the target), the structurally equivalent positions of the stabilizing amino acids to be changed can be identified in the target structure.
One way of reducing the freedom (i.e. entropy) of the unfolded state, and thus placing the unfolded state on a higher energy level, is to introduce covalent links between parts of the protein. This can be done by changing the original amino acids to cysteines, which are able to form covalent disulfide bridges (S-S) if the thiol groups of the two amino acid side chains are correctly placed in space. Designing such bridges is, however, not trivial, since the geometry of an unstrained -CH2-S-S-CH2- bridge in proteins is limited to rather narrow conformational constraints, and deviations from these geometrical constraints will introduce strain into the folded structure. However, because of the geometrical constraints, disulfide bridges are particularly amenable to homology modeling as a way to identify amino acid positions to alter to cysteines in order to introduce disulfide bridges in homologous proteins, of which there are numerous examples in the literature[13,14].
Although this method has a limited rate of success since the replacement of the wild type amino acid and the introduction of a disulfide bridge will often lead to loss of favorable interactions or strain in the folded state, it will lead to a larger thermodynamic stability (ΔGFU) if the folded state is unaffected (See FIG. 4).
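The geometric screening implied above can be sketched as a simple distance filter over residue pairs in a three-dimensional structure. The sketch below is only a first-pass illustration: the Cα-Cα and Cβ-Cβ distance windows are approximate ranges of the kind often quoted for native disulfides and are assumptions here, the coordinates are invented toy values rather than data from any real protein, and a realistic design would also check dihedral angles and strain.

```python
import math
from itertools import combinations

# Assumed screening windows in angstroms (illustrative, not authoritative).
CA_CA_RANGE = (4.4, 6.8)
CB_CB_RANGE = (3.4, 4.5)

# Toy structure: residue number -> {"CA": (x, y, z), "CB": (x, y, z)}.
# Hypothetical coordinates, not taken from any real protein.
TOY_STRUCTURE = {
    10: {"CA": (0.0, 0.0, 0.0), "CB": (1.0, 1.16, 0.0)},
    50: {"CA": (5.6, 0.0, 0.0), "CB": (4.6, 1.16, 0.0)},
    90: {"CA": (12.0, 0.0, 0.0), "CB": (13.0, 1.16, 0.0)},
}


def within(value, bounds):
    """True if `value` lies inside the closed interval `bounds`."""
    return bounds[0] <= value <= bounds[1]


def disulfide_candidates(structure):
    """Return residue pairs whose CA-CA and CB-CB distances both fit the windows."""
    pairs = []
    for res_i, res_j in combinations(sorted(structure), 2):
        d_ca = math.dist(structure[res_i]["CA"], structure[res_j]["CA"])
        d_cb = math.dist(structure[res_i]["CB"], structure[res_j]["CB"])
        if within(d_ca, CA_CA_RANGE) and within(d_cb, CB_CB_RANGE):
            pairs.append((res_i, res_j))
    return pairs


candidates = disulfide_candidates(TOY_STRUCTURE)
print("candidate cysteine pairs:", candidates)
```

In this toy structure the pair 50/90 satisfies the Cα-Cα window but fails on Cβ-Cβ, which shows why both side-chain orientation and backbone separation need to be checked before a pair is considered.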
Further, if the introduced disulfide bridge brings together parts of the protein that normally are in close contact during the early stages of the folding event, it will not affect the folding pathway and will thus only increase the thermodynamic stability and possibly the rate of folding (provided that the energy level of the folded state is unaffected). If, however, the introduced disulfide bridge brings together parts of the protein that do not interact early in the normal folding event, the protein will likely need to fold via an alternative route that has a transition state of higher free energy. Provided that the energy level of the folded state is unaffected, the activation energy for unfolding (EA, unfolding) will then become higher, the unfolding rate will be slower and the lifetime of the protein will be increased. If this can be accomplished, an ideal protein is constructed, with both a high thermodynamic stability (and possibly an increased melting temperature) and a high kinetic stability (see FIG. 5).
Besides potentially increasing both the thermodynamic and the kinetic stability of proteins, stabilization by incorporation of a covalent bond (disulfide bridge) is of entropic origin, since it restricts the freedom of the unfolded state. Thus, unlike enthalpic stabilizing interactions, stabilization by introduced disulfide bridges will not display a strong temperature dependence of the kind that can otherwise weaken or strengthen e.g. hydrogen bonds, salt bridges, ionic bonds or hydrophobic effects. In addition, this means that the stabilization will be less influenced by other characteristics of the surrounding media, such as polarity and ionic strength, and the relative increase in stability will be maintained in media other than buffered aqueous solutions.
From the above it might be presumed that, to increase the physical stability of a protein even more, one simply adds more disulfide bridges. However, this is not uncomplicated, for several reasons. Firstly, the introduction of even a single stabilizing disulfide bond is challenging, since what is gained in energy difference by decreased entropy of the unfolded state is often lost in enthalpic energy in the folded state, because of lost non-covalent interactions or strain introduced into the structure, so that the ΔGFU of the engineered protein is the same as or even less than that of the wild type protein (i.e. it is thermodynamically destabilized). Thus, introducing two or more disulfide bridges might increase or decrease the stability of the protein. Secondly, with two disulfide bridges present, the folding pathway of the protein could be blocked, so that the protein is no longer able to fold into its native active form. Thirdly, when more than two cysteines are introduced in a protein there is a high risk that the cysteines form disulfide bonds with the wrong partner during synthesis or folding. This will always lead to an inactive protein, as it will not be able to find its folded active conformation. This is especially important during production of heterologous (e.g. mammalian) proteins with multiple disulfide bonds in recombinant systems (e.g. bacteria), as the formation of correct or native disulfide bonds in such systems is very inefficient, often leading to a low yield of functional enzyme.