Proteolytic Enzymes
Proteins can be regarded as hetero-polymers that consist of amino acid building blocks connected by peptide bonds. The repetitive unit in proteins is the central alpha carbon atom with an amino group and a carboxyl group. Except for glycine, a so-called amino acid side chain substitutes one of the two remaining alpha carbon hydrogen atoms. The amino acid side chain renders the central alpha carbon asymmetric. In general, the L-enantiomer of the amino acid is found in proteins. The following terms describe the various types of polymerized amino acids. Peptides are short chains of amino acid residues with a defined sequence. Although there is no strict maximum to the number of residues, the term usually indicates a chain whose properties are mainly determined by its amino acid composition and which does not have a fixed three-dimensional conformation. The term polypeptide is usually used for longer chains, usually of defined sequence and length and in principle of the appropriate length to fold into a three-dimensional structure. The term protein is reserved for polypeptides that occur naturally and exhibit a defined three-dimensional structure. When the main function of a protein is to catalyze a chemical reaction, it is usually called an enzyme. Proteases are the enzymes that catalyze the hydrolysis of the peptide bond in (poly)peptides and proteins.
Under physiological conditions proteases catalyse the hydrolysis of the peptide bond. The International Union of Biochemistry and Molecular Biology (1984) has recommended the use of the term peptidase for the subset of peptide bond hydrolases (subclass EC 3.4). The terms protease and peptide hydrolase are synonymous with peptidase and may also be used here. Proteases comprise two classes of enzymes: the endo-peptidases, which cleave peptide bonds at points within the protein, and the exo-peptidases, which remove amino acids sequentially from either the N- or the C-terminus. Proteinase is used as a synonym for endo-peptidase. The peptide bond may occur in the context of di-, tri-, tetra-peptides, peptides, polypeptides or proteins. In general the amino acid composition of natural peptides and polypeptides comprises 20 different amino acids, which exhibit the L-configuration (except for glycine, which does not have a chiral centre). However, the proteolytic activity of proteases is not limited to peptides that contain only the 20 natural amino acids. Peptide bonds between so-called non-natural amino acids can be cleaved too, as well as peptide bonds between modified amino acids or amino acid analogues. Some proteases accept D-enantiomers of amino acids at certain positions. In general the remarkable stereoselectivity of proteases makes them very useful in the process of chemical resolution. Many proteases exhibit interesting side activities such as esterase activity, thiol esterase activity and (de)amidase activity. These side activities are usually not limited to amino acids and might turn out to be very useful in bioconversions in the area of fine chemicals.
There are a number of reasons why proteases of filamentous fungi, which are eukaryotic microorganisms, are of particular interest. The basic process of hydrolytic cleavage of peptide bonds in proteins appears costly and potentially detrimental to an organism if not properly controlled. The desired limits to proteolytic action are achieved through the specificity of proteinases, by compartmentalization of proteases and substrates within the cell, through modification of the substrates allowing recognition by the respective proteases, by regulation via zymogen activation, by the presence or absence of specific inhibitors, as well as through the regulation of protease gene expression. In fungi, proteases are also involved in other fundamental cellular processes, including intracellular protein turnover, processing, translocation, sporulation, germination and differentiation. In fact, Aspergillus nidulans and Neurospora crassa have been used as model organisms for analyzing the molecular basis of a range of physiological and developmental processes. Their genetics enable direct access to biochemical and genetic studies under defined nutrient and cultivation conditions. Furthermore, a large group of fungi pathogenic to humans, livestock and crops has been isolated, and proteolysis has been suggested to play a role in their pathogenicity (host penetration, countering host defense mechanisms and/or nutrition during infection). Proteases are also frequently used in laboratory, clinical and industrial processes; both microbial and non-microbial proteases are widely used in the food industry (baking, brewing, cheese manufacturing, meat tenderizing), in the tanning industry and in the manufacture of biological detergents (Aunstrup, 1980). The commercial interest in exploiting certain filamentous fungi, especially the Aspergilli, as hosts for the production of both homologous and heterologous proteins has also recently renewed interest in fungal proteases (van Brunt, 1986ab).
Proteases often cause problems in heterologous expression and homologous overexpression of proteins in fungi. In particular, heterologous expression is hampered by the proteolytic degradation of the expressed products by homologous proteases. These commercial interests have resulted in detailed studies of proteolytic spectra and construction of protease deficient strains and have improved the knowledge about protease expression and regulation in these organisms. Consequently there is a great need to identify and eliminate novel proteases in filamentous fungi.
Micro-organisms such as fungi are particularly useful in the large scale production of proteins, in particular when such proteins are secreted into the medium. Proteolytic enzymes play a role in these production processes. On the one hand, particular proteolytic enzymes are in general required for proper processing of the target protein and the metabolic well-being of the production host. On the other hand, proteolytic degradation may significantly decrease the yield of secreted proteins. Poor folding in the secretion pathway may lead to degradation by intracellular proteases. This might be a particular problem when producing heterologous proteins. The details of the proteolytic processes that are responsible for the degradation of proteins diverted from the secretory pathway in fungi are not exactly known. In eukaryotes the degradation of cellular proteins is achieved by the proteasome and usually involves ubiquitin labelling of the proteins to be degraded. In fungi, proteasomal and vacuolar proteases are also likely candidates for the proteolytic degradation of poorly folded secretory proteins. The proteolytic degradation is likely cytoplasmic, but endoplasmic reticulum resident proteases cannot be excluded. From the perspective of production host strain improvement, the proteolytic system may be an interesting target for genetic engineering. Additional copies of protease genes, over-expression of certain proteases, modification of transcriptional control, as well as knock-out procedures for deletion of protease genes may provide a more detailed insight into the function of a given protease. Deletion of protease encoding genes can be a valuable strategy for host strain improvement in order to improve production yield for homologous as well as heterologous proteins.
Eukaryotic microbial proteases have been reviewed by North (1982). More recently, Suarez Rendueles and Wolf (1988) have reviewed the S. cerevisiae proteases and their function.
Apart from the hydrolytic cleavage of bonds, proteases may also be applied in the formation of bonds. Bonds in this aspect comprise not only peptide and amide bonds but also ester bonds. Whether a protease catalyses the cleavage or the formation of a particular bond does in the first place depend on the thermodynamics of the reaction. An enzyme such as a protease does not affect the equilibrium of the reaction. The equilibrium is dependent on the particular conditions under which the reaction occurs. Under physiological conditions the thermodynamics of the reactions is in favour of the hydrolysis of the peptide due to the thermodynamically very stable structure of the zwitterionic product. By application of physical-chemical principles to influence the equilibrium, or by manipulating the concentrations or the nature of the reactants and products, or by exploiting the kinetic parameters of the enzyme reaction it is possible to apply proteases for the purpose of synthesis of peptide bonds. The addition of water miscible organic solvents decreases the extent of ionisation of the carboxyl component, thereby increasing the concentration of substrate available for the reaction. Biphasic systems, water mimetics, reverse micelles, anhydrous media, or modified amino and carboxyl groups to invoke precipitation of products are often employed to improve yields. When the proteases with the right properties are available the application of proteases for synthesis offers substantial advantages. As proteases are stereoselective as well as regio-selective, sensitive groups on the reactants do usually not need protection and reactants do not need to be optically pure. As conditions of enzymatic synthesis are mild, racemization and decomposition of labile reactants or products can be prevented. Apart from bonds between amino acids, also other compounds exhibiting a primary amino group, a thiol group or a carboxyl group may be linked by properly selected proteases. 
In addition, esters, thiol esters and amides may be synthesized by certain proteases. Proteases have been shown to exhibit regioselectivity in the acylation of mono-, di- and tri-saccharides, nucleosides, and riboflavin. Problems with stability under the sometimes harsh reaction conditions may be prevented by proper formulation. Encapsulation and immobilisation not only stabilise enzymes but also allow easy recovery and separation from the reaction medium. Extensive crosslinking, treatment with aldehydes, or covering the surface with certain polymers such as dextrans, polyethyleneglycol or polyimines may substantially extend the lifetime of the biocatalyst.
The Natural Roles of Proteases
Traditionally, proteases have been regarded as degrading enzymes, capable of cleaving proteins into small peptides and/or amino acids, and whose role it is to digest nutrient protein or to participate in the turnover of cellular proteins. In addition, it has been shown that proteases also play key roles in a wide range of cellular processes, via mechanisms of selective modification by limited proteolysis, and thus can have essential regulatory functions (Holzer and Tschensche 1979; Holzer and Heinrich, 1980). The specificity of a proteinase is assumed to be closely related to its physiological function and its mode of expression. With respect to the function of a particular protease, its localisation is often very important; for example, a lot of the vacuolar and periplasmic proteases are involved in protein degradation, while many of the membrane-bound proteases are important in protein processing (Suarez Rendueles and Wolf, 1988). The different roles of proteases in many cellular processes can be divided into four main functions of proteases: 1) protein degradation, 2) posttranslational processing and (in)activation of specific proteins, 3) morphogenesis, and 4) pathogenesis.
An obvious role for proteases in organisms which utilise protein as a nutrient source is in the hydrolysis of nutrients. In fungi, this would involve degradation outside the cells by extracellular proteases of broad specificity. Protein degradation is also important for rapid turnover of cellular proteins; it allows the cell to remove abnormal proteins and to adapt its complement of proteins to changing physiological conditions. Generally, proteases of rather broad specificity must be extremely well controlled in order to protect the cell from random degradation of proteins other than the correct targets.
Contrary to hydrolysis, the synthesis of polypeptides occurs in vivo by an ATP driven process on the ribosome. Ultimately the sequence in which the amino acids are linked is dictated by the information derived from the genome. This process is known as translation. Primary translation products are often longer than the final functional products, and after translation further processing of such precursor proteins by proteases is usually required. Proteases play a key role in the maturation of such precursor proteins to obtain the final functional protein. In contrast to the very controlled trimming and reshaping of proteins, proteases can also be very destructive and may completely degrade polypeptides into peptides and amino acids. In order to avoid proteolytic activity being unleashed before it is required, proteases are subject to extensive regulation. Many proteases are synthesized as larger precursors known as zymogens, which become activated when required. Remarkably this activation always occurs by proteolysis. Apart from direct involvement in processing, selective activation and inactivation of individual proteins are well-known phenomena catalyzed by specific proteases.
The selectivity of limited proteolysis appears to reside more directly in the proteinase-substrate interaction. Specificity may be derived from the proteolytic enzyme, which recognizes only specific amino acid target sequences. On the other hand, it may also be the result of selective exposure of the ‘processing site’ under certain conditions such as pH, ionic strength or secondary modifications, thus allowing an otherwise non-specific protease to catalyze a highly specific event. The activation of vacuolar zymogens by limited proteolysis is an example of the latter kind.
Morphogenesis or differentiation can be defined as a regulated series of events leading to changes from one state to another in an organism. Although direct relationships between proteases and morphological effects could not be established in many cases, the present evidence suggests a significant involvement of proteases in fungal morphogenesis; apart from the observed extensive protein turnover during differentiation, sporulation and spore germination, proteases are thought to be directly involved in normal processes such as hyphal tip branching and septum formation (Deshpande, 1992).
Species of Aspergillus, in particular A. fumigatus and A. flavus, have been implicated as the causative agents of a number of diseases in humans and animals collectively called aspergillosis (Bodey and Vartivarian, 1989). It has been repeatedly suggested that proteases are involved in the virulence of A. fumigatus and A. flavus, just as many studies link secreted proteases to the virulence of bacteria. In fact, most human infections due to Aspergillus species are characterised by an extensive degradation of the parenchyma of the lung, which is mainly composed of collagen and elastin (Campbell et al., 1994). Research has been focussed on the putative role of the secreted proteases in the virulence of A. fumigatus and A. flavus, which are the main human pathogens and are known to possess elastinolytic and collagenolytic activities (Kolattukudy et al., 1993). Elastinolytic activity measured in vitro was shown to correlate with infectivity in mice (Kothary et al., 1984). Two secreted proteases are known to be produced by A. fumigatus and A. flavus, an alkaline serine protease (ALP) and a neutral metallo protease (MEP). In A. fumigatus both the genes encoding these proteases were isolated, characterised and disrupted (Reicherd et al., 1990; Tang et al., 1992, 1993; Jaton-Ogay et al., 1994). However, alp mep double mutants showed no differences in pathogenicity when compared with wild type strains. Therefore, it must be concluded that the secreted A. fumigatus proteases identified in vitro are not essential factors for the invasion of tissue (Jaton-Ogay et al., 1994). Although A. fumigatus accounts for only a small proportion of the airborne mould spores, it is the fungus most frequently isolated from lung and sputum (Schmitt et al., 1991). An alternative explanation for the virulence of the fungus could be that the conditions in the bronchi (temperature and nutrients) are favourable for the parasitic growth of A. fumigatus.
As a consequence, invasive aspergillosis could be a circumstantial event, occurring when the host defences against pathogens have been weakened by immunosuppressive treatments or by diseases like AIDS.
Four major classes of proteases are known and are designated by the principal functional groups in their active site: the ‘serine’, the ‘thiol’ or ‘cysteine’, the ‘aspartic’ or ‘carboxyl’ and the ‘metallo’ proteases. A detailed state of the art review on these major classes of proteases, minor classes and unclassified proteases can be found in Methods in Enzymology, volumes 244 and 248 (A. J. Barrett ed., 1994 and 1995).
Specificity of Proteases
Apart from the catalytic machinery of proteases, another important aspect of proteolytic enzymes is their specificity. The specificity of a protease indicates which substrates the protease is likely to hydrolyze. The twenty natural amino acids offer a large number of possibilities to make up peptides. For example, with twenty amino acids one can already make up 400 different dipeptides and 8000 different tripeptides, and so on. With longer peptides the number of possibilities becomes almost unlimited. Certain proteases hydrolyze only particular sequences at a very specific position. The interaction of the protease with the peptide substrate may encompass from one up to ten amino acid residues of the peptide substrate. With large proteinaceous substrates there may be even more residues of the substrate that interact with the protease. However, this likely involves less specific interactions with protease residues outside the active site binding cleft. In general the specific recognition is restricted to the linear peptide that is bound in the active site of the protease.
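The number of distinct linear peptides of length n that can be built from the twenty natural amino acids is 20^n, because each position can be filled independently. The following is a minimal sketch of that arithmetic in Python; the names count_peptides and enumerate_peptides are illustrative only and are not taken from any library.

```python
from itertools import product

# One-letter codes of the 20 natural amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_peptides(length: int, alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Number of distinct linear peptide sequences of the given length."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return alphabet_size ** length


def enumerate_peptides(length: int):
    """Yield every sequence of the given length (practical only for short peptides)."""
    for residues in product(AMINO_ACIDS, repeat=length):
        yield "".join(residues)


if __name__ == "__main__":
    for n in (2, 3, 10):
        print(n, count_peptides(n))
```

Explicit enumeration quickly becomes impractical as the length grows, which is why the count is computed directly rather than by listing sequences.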
The nomenclature to describe the interaction of a substrate with a protease was introduced in 1967 by Schechter and Berger (Biochem. Biophys. Res. Commun., 1967, 27, 157-162) and is now widely used in the literature. In this system, the amino acid residues of the polypeptide substrate are considered to bind to so-called sub-sites in the active site. By convention, these sub-sites on the protease are called S (for sub-sites) and the corresponding amino acid residues are called P (for peptide). The amino acid residues on the N-terminal side of the scissile bond are numbered P3, P2, P1 and those on the C-terminal side are numbered P1′, P2′, P3′. The P1 and P1′ residues are the amino acid residues directly flanking the scissile bond. The substrate residues around the cleavage site can be numbered in this way up to P8. The corresponding sub-sites on the protease that complement the substrate binding residues are numbered S3, S2, S1, S1′, S2′, S3′, etc. The preferences of the sub-sites in the peptide binding site determine the preference of the protease for cleaving certain specific amino acid sequences at a particular spot. The amino acid sequence of the substrate should conform with the preferences exhibited by the sub-sites. The specificity towards a certain substrate clearly depends both on the binding affinity for the substrate and on the velocity at which the scissile bond is subsequently hydrolysed. Therefore the specificity of a protease for a certain substrate is usually indicated by its kcat/Km ratio, better known as the specificity constant. In this specificity constant kcat represents the turn-over number and Km is the Michaelis constant, which approximates the dissociation constant of the enzyme-substrate complex when substrate binding is fast relative to catalysis.
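The Schechter and Berger numbering can be sketched as a small Python helper that labels each substrate residue and its matching sub-site, counting outwards from the scissile bond. This is an illustration only: the function name label_subsites, the bond_index convention and the example sequence are assumptions made here, and an ASCII apostrophe stands in for the prime sign.

```python
def label_subsites(sequence: str, bond_index: int) -> list[tuple[str, str, str]]:
    """Label residues around a scissile bond (Schechter and Berger convention).

    bond_index is the number of residues on the N-terminal side of the
    scissile bond, so the bond lies between sequence[bond_index - 1] (P1)
    and sequence[bond_index] (P1').
    Returns (residue, substrate label, protease sub-site) tuples in
    N- to C-terminal order.
    """
    if not 0 < bond_index < len(sequence):
        raise ValueError("the scissile bond must lie between two residues")
    labels = []
    for i, residue in enumerate(sequence):
        if i < bond_index:
            # Non-primed side: numbers increase towards the N-terminus.
            n = bond_index - i
            labels.append((residue, f"P{n}", f"S{n}"))
        else:
            # Primed side: numbers increase towards the C-terminus.
            n = i - bond_index + 1
            labels.append((residue, f"P{n}'", f"S{n}'"))
    return labels


if __name__ == "__main__":
    # Arbitrary example peptide, cleaved between the 4th and 5th residue.
    for residue, p_label, s_label in label_subsites("AAPFGL", 4):
        print(residue, p_label, s_label)
```

In this example the phenylalanine occupies P1 and the glycine occupies P1', so the preferences of sub-sites S1 and S1' would decide whether the bond between them is cleaved.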
Apart from amino acid residues involved in catalysis and binding, proteases contain many other essential amino acid residues. Some residues are critical in folding, some residues maintain the overall three-dimensional architecture of the protease, some residues may be involved in regulation of the proteolytic activity and some residues may target the protease to a particular location. Many proteases contain one or more binding sites for metal ions outside the active site. These metal ions often play a role in stabilizing the structure. In addition, secreted eukaryotic microbial proteases may be extensively glycosylated. Both N- and O-linked glycosylation occur. Glycosylation may aid protein folding, may increase solubility, may prevent aggregation and as such may stabilize the mature protein. In addition, the extent of glycosylation may influence secretion as well as water binding by the protein.
Regulation of Proteolytic Activity
A substantial number of proteases are subject to extensive regulation of the proteolytic activity in order to avoid undesired proteolytic damage. To a certain extent this regulation takes place at transcription level. For example in fungi the transcription of secreted protease genes appears to be sensitive to external carbon and nitrogen sources, whereas genes encoding intracellular proteases are insensitive. The extracellular pH is sensed by fungi and some genes are regulated by pH. In this process transcriptional regulator proteins play a crucial role. Proteolytic processing of such regulator proteins is often the switch that turns the regulator proteins either on or off.
Proteases are subject to intra- as well as intermolecular regulation. This implies that certain amino acids in the proteolytic enzyme molecule are essential for such regulation. Proteases are typically synthesized as larger precursors known as zymogens, which are catalytically inactive. Usually the peptide chain extension rendering the precursor protease inactive is located at the amino terminus of the protease. The precursor is better known as the pro-protein. As many of the proteases processed in this way are secreted from the cells, they contain in addition a signal sequence (pre sequence), so that the complete precursor is synthesized as a pre-pro-protein. Apart from rendering the protease inactive, the pro-peptide is often essential for mediating productive folding. Examples of such proteases include serine proteases (alpha lytic protease, subtilisin, aqualysin, prohormone convertase), thiol proteases (cathepsin L and cruzain), aspartic proteases (proteinase A and cathepsin D) and metalloproteases. In addition, the pro-peptide might play a role in cellular transport, either alone or in conjunction with signal peptides. It may facilitate interaction with cellular chaperones or it may facilitate transport over the membrane. The size of the extension in the precursor pre-pro-protein may vary substantially, ranging from a short peptide fragment to a polypeptide that can exist as an autonomous folding unit. In particular these larger extensions are often observed to be strong inhibitors of the protease even after cleavage from the protease. It was observed that even after cleavage such pro-peptides could assist in proper folding of the proteases. As such, pro-peptides can be considered to function as molecular chaperones, and separate or additional co-expression of such pro-peptides could be advantageous for protease production.
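The pre-pro-protein layout described above (signal sequence, then pro-peptide, then mature protease, read from the amino terminus) can be sketched as a simple data structure. This is a hypothetical illustration: the class and function names, the example sequence and the segment lengths are placeholders chosen here, and the sketch assumes the common case of an N-terminal pro-peptide with known segment lengths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreProProtein:
    pre: str     # signal sequence, removed on secretion
    pro: str     # pro-peptide, keeps the zymogen inactive and may aid folding
    mature: str  # mature protease chain

    @property
    def pro_protein(self) -> str:
        """The zymogen that remains after removal of the signal sequence."""
        return self.pro + self.mature


def split_precursor(sequence: str, signal_length: int, pro_length: int) -> PreProProtein:
    """Split a precursor into its segments, assuming both extensions are N-terminal."""
    if signal_length < 0 or pro_length < 0:
        raise ValueError("segment lengths must not be negative")
    if signal_length + pro_length >= len(sequence):
        raise ValueError("no residues would be left for the mature protease")
    pro_end = signal_length + pro_length
    return PreProProtein(
        pre=sequence[:signal_length],
        pro=sequence[signal_length:pro_end],
        mature=sequence[pro_end:],
    )


if __name__ == "__main__":
    # Placeholder sequence and lengths, not a real protease precursor.
    precursor = split_precursor("MKLLAPTPEERSTAGY", signal_length=5, pro_length=5)
    print(precursor)
```

In practice the segment boundaries would have to come from experimental N-terminal sequencing or from prediction tools, which this sketch does not attempt.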
There is a substantial difference in the level of regulation between proteases that are secreted into the medium and proteases that remain intracellular. Proteases secreted into the medium are usually no longer subject to control after activation and are therefore usually relatively simple in their molecular architecture, consisting of one globular module. Intracellular proteases are necessarily subject to continuous control in order to avoid damage to the cells. In contrast with zymogens of secreted proteases, in more complex regulatory proteases very large polypeptide segments may be inserted between the signal and the zymogen activation domain of the proteolytic module. Structure-function studies indicate that such non-protease parts may be involved in interactions with macroscopic structures, membranes, cofactors, substrates, effectors, inhibitors and ions that regulate activity and activation of the proteolytic module(s) or its (their) zymogens. The non-proteolytic modules exhibit remarkable variation in size and structure. Many of the modules can exist independently from the proteolytic module. Therefore such modules can be considered to correspond to independent structural and functional units that are autonomous with respect to folding. The value of such a modular organization is that acquisition of new modules can endow the recipient protease with novel binding specificities and can lead to dramatic changes in its activity, regulation and targeting. The principle of modularly organized proteolytic enzymes may also be exploited by applying molecular biology tools in order to create novel interactions, regulation, specificity, and/or targeting by shuffling of modules. Although in general such additional modules are observed as N- or C-terminal extensions, large insertions within the exterior loops of the catalytic domain have also been observed.
It is believed that also in this case the principal fold of the protease represents still the essential topology to form a functional proteolytic entity and that the insertion can be regarded as substructure folded onto the surface of the proteolytic module.
Molecular Structure
In principle the modular organization of larger proteins is a general theme in nature. In particular within the larger multimodular frameworks typical proteolytic modules show sizes of 100 to 400 amino acids on the average. This corresponds with the average size of most of the globular proteolytic enzymes that are secreted into the medium. As discussed above polypeptide modules are polypeptide fragments, which can fold and function as independent entities. Another term for such modules is domains. However domain is used in a broader context than module. The term domain as used herein refers usually to a part of the polypeptide chain that depicts in the three-dimensional structure a typical folding topology. In a protein domains interact to varying extents, but less extensively than do the structural elements within domains. Other terms such as subdomain and folding unit are also used in literature. As such it is observed that many proteins that share a particular functionality may share the same domains. Such domains can be recognized from the primary structure that may show certain sequence patterns, which are typical for a particular domain. Typical examples are the mononucleotide binding fold, cellulose binding domains, helix-turn-helix DNA binding motif, zinc fingers, EF hands, membrane anchors. Modules refer to those domains which are expected to be able to fold and function autonomously. A person skilled in the art knows how to identify particular domains in a primary structure by applying commonly available computer software to said structure and homologous sequences from other organisms or species.
Although multimodular or multidomain proteins may appear as a string of beads, assemblies of substantially more complex architecture have been observed. When the various beads reside on the same polypeptide chain, the beads are generally called modules or domains. When the beads do not reside on one and the same polypeptide chain but form assemblies via non-covalent interactions, the term subunit is used to designate the bead. Subunits may be encoded by one and the same gene or by different genes. The multi-modular protein may become proteolytically processed after translation, leading to multiple subunits. Individual subunits may consist of multiple domains. Typically the smaller globular proteins of 100-300 amino acids consist of only one domain.
Molecular Classification of Proteolytic Enzymes
In general proteases are classified according to their molecular properties or according to their functional properties. The molecular classification is based on the primary structure of the protease. The primary structure of a protein represents its amino acid sequence, which can be derived from the nucleotide sequence of the corresponding gene. Extensive tracing of the similarities in the primary structures may reveal similarities in catalytic mechanism and other properties, which may even extend to functional properties. The term family is used to describe a group of proteases that show evolutionary relationship based on similarity between their primary structures. The members of such a family are believed to have arisen by divergent evolution from the same ancestor. Within a family, further sub-grouping of the primary structures based on more detailed refinement of sequence comparisons results in subfamilies. Classification according to the three-dimensional fold of the proteases may comprise secondary structure, tertiary structure and quaternary structure. In general the classification on secondary structure is limited to content and gross orientation of secondary structure elements. Similarities in tertiary structure have led to the recognition of superfamilies or clans. A superfamily or a clan is a group of families that are thought to have common ancestry as they show a common three-dimensional fold. In general tertiary structure is more conserved than the primary structure. As a consequence similarity of the primary structure does not always reflect similar functional properties. In fact functional properties may have diverged substantially, resulting in interesting new properties. At present quaternary structure has not been applied to classify various proteases. This might be due to a certain bias of the structural databases towards simple globular proteases.
Many proteolytic systems that are subject to activation, regulation, or complex reaction cascades are likely to consist of multiple domains or subunits. General themes in the structural organization of such protease systems may lead to new types of classification.
Classification According to Specificity
In the absence of sequence information proteases have been subject to various types of functional classification. The classification and naming of enzymes by reference to the reactions which are catalyzed is a general principle in enzyme nomenclature. This approach is also the underlying principle of the EC numbering of enzymes (Enzyme Nomenclature 1992 Academic Press, Orlando). Two types of proteases (EC 3.4) can be recognized within Enzyme Nomenclature 1992, the exo-peptidases (EC 3.4.11-19) and the endo-peptidases (EC 3.4.21-24, 3.4.99). Endo-peptidases cleave peptide bonds in the inner regions of the peptide chain, away from the termini. Exo-peptidases cleave residues only from the ends of the peptide chain. The exo-peptidases acting at the free N-terminus may liberate a single amino acid residue, a dipeptide or a tripeptide and are called respectively amino peptidases (EC 3.4.11), dipeptidyl peptidases (EC 3.4.14) and tripeptidyl peptidases (EC 3.4.14). Proteases that start peptide processing from the carboxyl terminus and liberate a single amino acid are called carboxy peptidases (EC 3.4.16-18). Peptidyl-dipeptidases (EC 3.4.15) remove a dipeptide from the carboxyl terminus. The dipeptidases (EC 3.4.13), which specifically cleave dipeptides into their two constituent amino acids, are exo- and endo-peptidases in one. Omega peptidases (EC 3.4.19) remove terminal residues that are either substituted, cyclic, or linked by isopeptide bonds.
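The EC 3.4 sub-subclasses listed above lend themselves to a simple lookup table. The following Python sketch maps the third field of an EC number to its peptidase type and to the exo/endo grouping; the function name classify_ec and the wording of the table entries are choices made here for illustration, and the table covers only the sub-subclasses of Enzyme Nomenclature 1992.

```python
# Third field of an EC 3.4.x number -> peptidase type (Enzyme Nomenclature 1992).
EC_SUBSUBCLASSES = {
    11: "aminopeptidase",
    13: "dipeptidase",
    14: "dipeptidyl- or tripeptidyl-peptidase",
    15: "peptidyl-dipeptidase",
    16: "serine-type carboxypeptidase",
    17: "metallocarboxypeptidase",
    18: "cysteine-type carboxypeptidase",
    19: "omega peptidase",
    21: "serine endopeptidase",
    22: "cysteine endopeptidase",
    23: "aspartic endopeptidase",
    24: "metalloendopeptidase",
    99: "endopeptidase of unknown catalytic mechanism",
}


def classify_ec(ec_number: str) -> tuple[str, str]:
    """Return (exo/endo group, peptidase type) for an EC 3.4.x or EC 3.4.x.y number."""
    parts = ec_number.replace("EC", "").strip().split(".")
    if len(parts) < 3 or parts[0] != "3" or parts[1] != "4":
        raise ValueError(f"{ec_number!r} is not a peptidase (EC 3.4) number")
    sub = int(parts[2])
    if sub not in EC_SUBSUBCLASSES:
        raise ValueError(f"unknown EC 3.4 sub-subclass: {sub}")
    # Sub-subclasses 11 to 19 are grouped as exo-peptidases, the rest as endo-peptidases.
    group = "exo-peptidase" if 11 <= sub <= 19 else "endo-peptidase"
    return group, EC_SUBSUBCLASSES[sub]


if __name__ == "__main__":
    for ec in ("EC 3.4.11", "3.4.15", "EC 3.4.21.62"):
        print(ec, classify_ec(ec))
```

A dictionary keeps the mapping explicit and easy to extend if later editions of the nomenclature add sub-subclasses.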
Apart from the position where the protease cleaves a peptide chain, for each type of protease a further division is possible based on the nature of the preferred amino acid residues in the substrate. In general one can distinguish proteases with broad, medium and narrow specificity. Some proteases are simply named after the specific proteins or polypeptides that they hydrolyze, e.g. keratinase, collagenase, elastase. A narrow specificity may come down to one particular amino acid that is removed or one particular sequence that is cleaved. When the protease shows a particular preference for one amino acid in the P1 or P1′ position, the name of this amino acid may be used as a qualifier. For example, prolyl amino peptidase removes proline from the amino terminus of a peptide (proline is the P1 residue). X-Pro or proline is used when the bond on the imino side of the proline is cleaved (proline is the P1′ residue), e.g. proline carboxypeptidase removes proline from the carboxyl terminus. Prolyl endopeptidase (or Pro-X) cleaves behind proline, while proline endopeptidase (X-Pro) cleaves in front of a proline. The amino acid residue in front of the scissile peptide bond refers to the amino acid residue that contributes the carboxyl group to the peptide bond. The amino acid residue behind the scissile peptide bond refers to the amino acid residue that contributes the amino group to the peptide bond. According to the general convention an amino acid chain runs from the amino terminus (the start) to the carboxyl terminus (the end) and is numbered accordingly. Endo-proteases may also show a clear preference for a particular amino acid in the P1 or P1′ position, e.g. glycyl endopeptidase, peptidyl-lysine endopeptidase, glutamyl endopeptidase. In addition, proteases may show a preference for a certain group of amino acids that share a certain resemblance.
Such a group of preferred amino acids may comprise the hydrophobic amino acids, only the bulky hydrophobic amino acids, the small hydrophobic amino acids, just the small amino acids, the large positively charged amino acids, etc. Apart from preferences for the P1 and P1′ residues, particular preferences or exclusions may also exist for residues bound by other subsites on the protease. Such multiple preferences can result in proteases that are very specific for only those sequences that satisfy multiple binding requirements at the same time. In general it should be realized that proteases are rather promiscuous enzymes. Even a very specific protease may cleave peptides that do not comply with its generally observed preference. In addition, environmental conditions such as pH, temperature, ionic strength, water activity, and the presence of solvents, competing substrates or inhibitors may influence the preferences of a protease. Environmental conditions may influence not only the protease but also the way the proteinaceous substrate is presented to the protease.
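The P1/P1′ convention described above can be made concrete with a short Python sketch. It is an idealized illustration under stated assumptions: the function names (cleavage_sites, digest) and the example peptide are hypothetical, sequences are given in one-letter code from amino terminus to carboxyl terminus, and only the P1 and P1′ residues are considered, so the additional subsite preferences and the promiscuity mentioned in the text are ignored.

```python
def cleavage_sites(sequence, p1=None, p1_prime=None):
    """Return indices i where the bond between sequence[i-1] (P1) and
    sequence[i] (P1') satisfies the given residue preferences.

    p1 and p1_prime are collections of one-letter codes, or None for "any residue".
    """
    if p1 is None and p1_prime is None:
        raise ValueError("specify a preference for P1, P1' or both")
    sites = []
    for i in range(1, len(sequence)):
        if p1 is not None and sequence[i - 1] not in p1:
            continue
        if p1_prime is not None and sequence[i] not in p1_prime:
            continue
        sites.append(i)
    return sites


def digest(sequence, p1=None, p1_prime=None):
    """Split the sequence at every matching bond (idealized complete digestion)."""
    cuts = [0] + cleavage_sites(sequence, p1, p1_prime) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]


peptide = "AAPGKPL"  # hypothetical example peptide
pro_x = digest(peptide, p1="P")        # cleaves behind proline (proline is P1)
x_pro = digest(peptide, p1_prime="P")  # cleaves in front of proline (proline is P1')
```

With this example peptide, the Pro-X digestion gives the fragments AAP, GKP and L, while the X-Pro digestion gives AA, PGK and PL.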
Classification by Catalytic Mechanism.
Proteases can be subdivided on the basis of their catalytic mechanism. It should be understood that within each catalytic mechanism the above classification based on specificity leads to further subdivision. Four major classes of proteases are known and are designated by the principal functional group in the active site: the serine proteases (EC 3.4.21 endopeptidase, EC 3.4.16 carboxypeptidase), the thiol or cysteine proteases (EC 3.4.22 endopeptidase, EC 3.4.18 carboxypeptidase), the carboxyl or aspartic proteases (EC 3.4.23 endopeptidase) and the metalloproteases (EC 3.4.24 endopeptidase, EC 3.4.17 carboxypeptidase). There are characteristic inhibitors for the members of each catalytic type of protease. These small inhibitors irreversibly modify an amino acid residue of the protease active site. For example, the serine proteases are inactivated by phenylmethanesulfonyl fluoride (PMSF) and diisopropyl fluorophosphate (DFP), which react with the active-site serine, whereas the chloromethylketone derivatives react with the histidine of the catalytic triad. Phosphoramidon and 1,10-phenanthroline typically inhibit metalloproteases. Inhibition by pepstatin generally indicates an aspartic protease. E64 specifically inhibits thiol proteases. Amastatin and bestatin inhibit various aminopeptidases. Substantial variations in the susceptibility of proteases to the inhibitors are observed, even within one catalytic class. To a certain extent this might be related to the specificity of the protease. If the binding site architecture prevents a mechanism-based inhibitor from approaching the catalytic site, the protease escapes inhibition, and the type of mechanism cannot be identified on the basis of inhibition. 
Chymostatin, for example, is a potent inhibitor of serine proteases with chymotrypsin-like specificity; elastatinal inhibits elastase-like serine proteases and does not react with trypsin or chymotrypsin; 4-amidino-PMSF (APMSF) inhibits only serine proteases with trypsin-like specificity. Extensive accounts of the use of inhibitors in the classification of proteases include Barrett and Salvesen, Proteinase Inhibitors, Elsevier, Amsterdam, 1986; Bond and Beynon (eds), Proteolytic Enzymes, A Practical Approach, IRL Press, Oxford, 1989; Methods in Enzymology, ed. A. J. Barrett, volume 244, 1994 and volume 248, 1995; E. Shaw, Cysteinyl proteinases and their selective inactivation, Adv Enzymol. 63:271-347 (1990).
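The diagnostic use of class-specific inhibitors described above could be sketched as a simple lookup. The Python fragment below is a minimal illustration, not a validated assay interpretation: the names INHIBITOR_CLASS and candidate_classes are hypothetical, and only the inhibitors that the text links to a catalytic class are included (amastatin and bestatin are omitted because the text relates them to aminopeptidases rather than to a mechanism).

```python
# Inhibitor-to-class associations as stated in the text; all names are hypothetical.
INHIBITOR_CLASS = {
    "PMSF": "serine",
    "DFP": "serine",
    "phosphoramidon": "metallo",
    "1,10-phenanthroline": "metallo",
    "pepstatin": "aspartic",
    "E64": "cysteine",
}


def candidate_classes(inhibited_by):
    """Return the catalytic classes suggested by the inhibitors that reduced activity.

    An empty result is NOT evidence of a fifth class: as noted in the text,
    a protease may escape inhibition when its binding site architecture
    keeps the inhibitor away from the catalytic site.
    """
    return {INHIBITOR_CLASS[name] for name in inhibited_by if name in INHIBITOR_CLASS}
```

A call such as candidate_classes({"PMSF", "DFP"}) suggests a serine protease, while more than one class in the result would point to a mixture of proteases or to non-specific inhibition and calls for further characterization.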
Classification According to Optimal Performance Conditions.
The catalytic mechanism of a protease and the requirement for its conformational integrity mainly determine the conditions under which the protease can be utilized. Finding the protease that performs optimally under application conditions is a major challenge. Often the conditions under which proteases have to perform are not optimal and represent a compromise between the ideal conditions for a particular application and the conditions that would suit the protease best. Apart from the particular properties of the protease, it should be realized that the presentation of a proteinaceous substrate also depends on the conditions, and as such also determines which conditions are most effective for proteolysis. Enzyme specifications that are relevant for application comprise, for example, the pH dependence, the temperature dependence, the sensitivity to or dependence on metal ions, ionic strength, salt concentration and solvent compatibility. Another factor of major importance is the specific activity of a protease. The higher the enzyme's specific activity, the less enzyme is needed for a specific conversion. Lower enzyme requirements imply lower costs and lower protein contamination levels.
The pH is a major parameter that determines protease performance in an application. Therefore pH dependence is an important parameter by which to group proteases. The major groups that are recognized are the acid proteases, the neutral proteases, the alkaline proteases and the high alkaline proteases. The optimum pH matches the proteolytic mechanism only to some extent, e.g. aspartic proteases often show an optimum at acidic pH, metalloproteases and thiol proteases often perform optimally around neutral to slightly alkaline pH, and serine peptidases are mainly active in the alkaline and high alkaline region. For each class exceptions are known. In addition the overall water activity of the system plays a role. The pH optimum of a protease is defined as the pH range where the protease exhibits an optimal hydrolysis rate for the majority of its substrates in a particular environment under particular conditions. This range can be narrow, e.g. one pH unit, or quite broad, e.g. 3-4 pH units. In general the pH optimum also depends on the nature of the proteinaceous substrate. Both the turnover rate and the specificity may vary as a function of pH. For a certain efficacy it can be desirable to use the protease far from its pH optimum because the production of less desired peptides is avoided. Less desired peptides might be, for example, very short peptides or peptides causing a bitter taste. In addition, a narrower specificity can be a reason to choose conditions that deviate from the optimal conditions with respect to turnover rate. Depending on the pH the specificity may be narrow, e.g. cleaving the peptide chain only in one particular position or before or after one particular amino acid, or broader, e.g. cleaving a chain at multiple positions or before or after several different types of amino acids. In fact the pH dependence might be an important tool to regulate the proteolytic activity in an application. 
In case the pH shifts during the process the proteolysis might cease spontaneously without the need for further treatment to inactivate the protease. In some cases the proteolysis itself may be the driver of the pH shift.
Crucial for the application of proteases is their handling and operating stability. As protease stability is strongly affected by the working temperature, stability is often also referred to as thermostability. In general the stability of a protease indicates how long a protease retains its proteolytic activity under particular conditions. Particular conditions may comprise fermentation conditions, conditions during isolation and downstream processing of the enzyme, storage conditions, formulation and operating or application conditions. Where the particular conditions encompass elevated temperatures, stability in general refers to thermostability. Apart from the general causes of enzyme inactivation such as chemical modification, unfolding, aggregation etc., the main problem with proteases is that they are readily subject to autodegradation. Especially for the utilization of proteases the temperature optimum is a relevant criterion by which to group proteases. Although there are different definitions, economically the most useful definition is the temperature or the temperature range in which the protease is most productive in a certain application. Protease productivity is a function of both the stability and the turnover rate. Whereas elevated temperature will in general increase the turnover rate, rapid inactivation will counteract this increase and ultimately lead to low productivity. The conformational stability of the protease under a given process condition will determine its maximum operating temperature. The temperature at which the protease loses its active conformation, often indicated as the unfolding or melting point, can be determined by various methods, for example NMR, circular dichroism spectroscopy, differential scanning calorimetry etc. For proteases, unfolding is usually accompanied by a tremendous increase in the autodegradation rate.
In applications where low temperatures are required, a protease may be selected with emphasis on a high intrinsic activity at low to moderate temperature. As inactivation is relatively slow under such conditions, activity might largely determine productivity. In processes where protease activity is required only for a short period, the stability of the protease might be used as a switch to turn the protease off. In such a case a more labile protease might be preferred over a very thermostable one.
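The trade-off between turnover rate and inactivation described above can be illustrated with a toy model. The Python sketch below is not taken from the source: it assumes Arrhenius temperature dependence for both the turnover rate and a first-order inactivation rate, and every parameter value (activation energies, reference temperature, reference rates) is a hypothetical placeholder chosen only to show the shape of the effect. Under these assumptions the product formed after a process time t is (k_cat / k_inact) * (1 - exp(-k_inact * t)).

```python
import math

R = 8.314       # gas constant, J/(mol*K)
T_REF = 333.15  # reference temperature (60 degrees C); an assumed value


def rate_at(temp_c, rate_ref, activation_energy):
    """Arrhenius scaling of a rate constant from T_REF to temp_c (degrees C)."""
    temp_k = temp_c + 273.15
    return rate_ref * math.exp(-activation_energy / R * (1.0 / temp_k - 1.0 / T_REF))


def productivity(temp_c, hours, ea_turnover=50e3, ea_inactivation=250e3,
                 turnover_ref=1.0, inactivation_ref=1.0):
    """Relative amount of product formed in `hours` at temp_c.

    All default parameters are hypothetical; rates are per hour at T_REF.
    """
    k_cat = rate_at(temp_c, turnover_ref, ea_turnover)
    k_inact = rate_at(temp_c, inactivation_ref, ea_inactivation)
    # Integral of k_cat * exp(-k_inact * t) from t = 0 to t = hours.
    return (k_cat / k_inact) * (-math.expm1(-k_inact * hours))


def most_productive_temperature(hours, temps=range(20, 91)):
    """Temperature (degrees C, 1-degree grid) giving the highest productivity."""
    return max(temps, key=lambda t: productivity(t, hours))
```

In this toy model the most productive temperature for a long process lies well below the temperature at which the initial rate is highest, and it shifts upward when the process time is shortened, which is consistent with the qualitative argument in the text.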
Another environmental parameter that may play a role in selecting the appropriate protease is its sensitivity to salts. The compatibility with metal ions, which are frequently found at low concentrations in various natural materials, can be crucial for certain applications. In particular with metalloproteases, certain ions may replace the catalytic metal ion and reduce or even completely abolish activity. In some applications metal ions have to be added on purpose in order to prevent the washout of the metal ions coordinated to the protease. It is well known that, for the sake of enzyme stability and lifetime, calcium ions have to be supplied in order to prevent dissociation of protein-bound calcium.
Most microorganisms show a certain tolerance in adapting to changes in environmental conditions. As a consequence, the proteolytic spectrum that the organism is able to produce is likely to show at least similar tolerances. Such a proteolytic spectrum might be covered by many proteases that together cover the whole spectrum, or by only a few proteases with a broad spectrum. When considering the whole proteolytic spectrum of a microorganism, it can be very important to take the cellular location of the proteases into account.
Cellular Localisation and Characterization of Proteolytic Processing and Degradation
From an industrial point of view, the proteases that are excreted from the cell have specific advantages with respect to producibility at a large scale and stress tolerance, as they have to survive without the protection of the cell. The large group of cellular proteases can be further subdivided into soluble and membrane-bound proteases. Membrane-bound proteases may be located at the inside as well as the outside of the membrane. Intracellular soluble proteases may be subdivided further according to the specific compartments of the cell in which they occur. As the cell shields the proteases to some extent from the environment and controls the conditions in the cell, intracellular proteases might be more sensitive to large environmental changes, and their optima might correlate better with the specific intracellular conditions. Knowing the conditions of the cellular compartment where a protease resides might indicate its preferences. Whereas extracellular proteases in general do not require any further regulation once excreted from the cell, intracellular proteases are often subject to more complicated control and regulation.
With respect to the function of a particular protease, its localisation is often very important; for example, a lot of the vacuolar and periplasmic proteases are involved in protein degradation, while many of the membrane-bound proteases are important in protein processing (Suarez Rendueles and Wolf, 1988).
A comprehensive review on the biological properties and evolution of proteases has been published in van den Hombergh: Thesis Landbouwuniversiteit Wageningen: An analysis of the proteolytic system in Aspergillus in order to improve protein production ISBN 90-5485-545-2, which is hereby incorporated by reference herein.
The Protease Problem
An important reason for the interest in microbial proteases is the protease-related expression problems observed in several expression hosts used in the bioprocess industry. The increasing use of heterologous hosts for the production of proteins by recombinant DNA technology has recently brought this problem into focus, since it seems that heterologous proteins are more prone to proteolysis (Archer et al., 1992; van den Hombergh et al., 1996b).
In S. cerevisiae, the protease problem was recognised as early as the early eighties, as was the involvement of several proteases, which complicates targeted gene disruption approaches to overcome this problem. During secretion a protein is exposed to several proteolytic activities residing in the secretory pathway. Additionally, in a prototrophic microorganism such as Aspergillus, secreted proteins can be exposed to several extracellular proteolytic activities.
The problem of degradation of heterologously expressed proteins is well documented in Aspergillus (van den Hombergh Thesis Landbouwuniversiteit Wageningen: An analysis of the proteolytic system in Aspergillus in order to improve protein production ISBN 90-5485-545-2) and has been reported in the expression of cow prochymosin, human interferon α-2, tPA, GMCSF, IL6, lactoferrin, chicken egg-white lysozyme, porcine pIA2, A. niger pectin lyase B, E. coli enterotoxin B and β-glucuronidase, and Erwinia carotovora pectate lyase 3.
The problem of proteolysis may be addressed at several stages in protein production. Bioprocess engineers may address the problem of proteolysis by downstream processing at low temperatures, by early separation of product and protease(s) or by use of protease inhibitors. These may all lead to successful reduction of the problem. However it is certainly not eliminated, because much of the degradation occurs in vivo during the production of the protein.
In understanding how proteolysis is controlled in the cell, a major question concerns the recognition mechanism by which proteolysis is triggered. To what extent are proteolytically susceptible (heterologous) proteins recognised as aberrant because of misfolding or, if correctly folded, as ‘foreign’ because they do not possess features essential for stability that are specific to the host? Various types of stress can cause the overall proteolysis in a cell to increase significantly. Factors known to increase the rate of proteolysis include nutrient starvation and various other types of stress (e.g. elevation of temperature, osmotic stress, toxic substances and expression of certain heterologous proteins). To deal with proteolysis-related expression problems in vivo, several approaches have proven successful, as will be discussed below. However, we have to keep in mind that true ‘non-proteolytic cells’ cannot exist, since proteolysis by intracellular proteases is involved in many essential metabolic and ‘housekeeping’ reactions. Reducing proteolysis will therefore always be a process in which the changed genetic background that results in decreased proteolytic activity has to be analysed for potential secondary effects that could lead to reduced protein production (e.g. reduced growth rate or sporulation).
Disruption of Proteases in Filamentous Fungal Expression Hosts
Berka and coworkers (1990) describe the cloning and disruption of the A. awamori pepA gene. More recently, three disrupted aspartyl proteases in A. niger have been described. Disruptants for both of the major extracellular aspartyl proteases and for the major vacuolar aspartyl protease were described. Double and triple disruptants were generated via recombination and tested for protease spectra and for expression and secretion of the A. niger pectin lyase PELB protein, which is very susceptible to proteolytic degradation (van den Hombergh et al., 1995). Disruption of pepA and pepB both resulted in a reduction of extracellular protease activities, by 80% and 6%, respectively. In the ΔpepE disruptant other (vacuolar) protease activities were also severely affected, caused by inactivation of the proteolytic cascade for other vacuolar proteases. Reduced extracellular activities correlated with reduced in vitro degradation of PELB and improved in vivo expression of pelB (van den Hombergh et al., 1996f).
Protease Deficient (prt) Mutants of Filamentous Fungi
Several Aspergillus protease deficient mutants have been studied to determine whether protein production is improved. Archer and coworkers describe the reduced proteolysis of hen egg white lysozyme in supernatants of an A. niger double prt mutant generated by Mattern and coworkers (1992) and conclude that although the degradation is not absent, it is significantly reduced. Van den Hombergh et al. (1995) show that the in vitro degradation of A. niger PELB is reduced in all seven prt complementation groups they have isolated. Virtually no degradation is observed in the prtB, prtF and prtG mutants. Recently, the expression of the pelB gene was shown to be improved in six complementation groups tested (prtA-F) and highest expression levels were observed in the prtB, prtF and prtG mutants. In addition to the single mutants, which contained residual extracellular proteolytic activities varying from 2-80% compared to wild type activity, double mutants were generated both by recombination and by additional rounds of mutagenesis. Via this approach several double prt mutants were selected and further characterised, which showed a further reduction of PELB degradation compared to their parental strains.
Instead of elimination of protease activities via disruption or mutagenesis, reduced proteolysis can also be achieved via down-regulation of the interfering proteolytic activities. This may be achieved by genetically altering the promoter or other regulatory sequences of the gene. As shown by Fraissinet-Tachet and coworkers (1996), the extracellular proteases in A. niger are all regulated by carbon catabolite repression and nitrogen metabolite repression. Nutrient starvation also causes the overall proteolysis rate in a cell to increase strongly, which makes sense for a cell that lacks nutrients but possesses proteins that, under starvation conditions, are not needed or are needed only in smaller amounts. In expression strategies which allow high expression on media containing high glucose and ammonium concentrations, reduced proteolysis has been reported. Several constitutive glycolytic promoters (gpd and pkiA) are highly expressed under these conditions and can also be used to drive (heterologous) gene expression in continuous fermentations. The type of nutrient starvation imposed can influence different proteases to varying extents, which means that the importance of nutrient conditions in a given process depends on the type of proteolysis that is involved. Specific proteolysis may therefore be induced by conditions of substrate limitation, which are frequently used in many large-scale fermentation processes.
The protease problem can nowadays be addressed in part by one or more of the above strategies. However, the residual proteolytic activity of yet unidentified proteolytic enzymes still constitutes a major problem in the art. In order to further reduce the level of unwanted proteolysis, there is a great need in the art to identify novel proteases responsible for degradation of homologously and heterologously expressed proteins. This invention provides such novel protease gene sequences encoding novel proteases. Once the primary sequence of a novel protease gene is known, one or more of the above recombinant DNA strategies may be employed to produce (knock-out) mutants with reduced proteolytic activity.
Despite the widespread applications of proteases in a great number of industrial processes, current enzymes also have significant shortcomings with respect to at least one of the following properties.
When added to animal feed, current proteases are not sufficiently resistant to digestive enzymes present in the gastrointestinal (GI) tract of e.g. pigs and poultry.
With respect to another aspect, the currently available enzymes are not sufficiently resistant to specific (high) temperatures and (high) pressure conditions that are applied during extrusion or pelleting operations.
Also, the current enzymes are not sufficiently active in a pH range of 3-7, conditions prevailing in many food and beverage products as well as in the GI tract of most animals.
According to yet another aspect the specificity of the currently available proteases is very limited which results in the inability of the existing enzymes to degrade or to dissolve certain “protease resistant” proteins thus resulting in low peptide or amino acid yields. Moreover proteases with new specificities allow the synthesis of new peptides.
Yet another drawback of the currently available enzymes is their low specific activity.
It is therefore clear that for a large number of applications a strong desire exists for proteases that are more resistant to digestive enzymes, high temperature and/or pressure and which exhibit novel specificities regarding their sites of hydrolysis. The present invention provides such enzymes.