Chemical reactions in biological systems are almost always facilitated by the action of one or more catalysts. Enzymes, which are proteins that catalyze biological reactions, are known for their catalytic efficiency and specificity. Enzymes typically accelerate reactions by factors of 1 million or more. Many reactions in biological systems do not occur at perceptible rates in the absence of enzymes.
Enzymes are highly specific in the type of reaction catalyzed as well as in the particular substrates which are acted upon. One broad category of enzymes includes the proteolytic enzymes which catalyze the hydrolysis of peptide bonds. Proteolytic enzymes, also known as proteases, vary significantly in their degree of specificity. For example, subtilisin, which comes from certain bacteria, will cleave peptide bonds regardless of the nature of the side chains adjacent to the bond. Trypsin is quite specific in that it splits peptide bonds on the carboxyl side of lysine and arginine residues only. Thrombin, an enzyme participating in blood clotting, is even more specific than trypsin. Thrombin only cleaves between arginine and glycine residues. These are only a very few examples of proteases; many other proteases are known. There are several general categories of proteases. These categories include serine, cysteine, aspartic, and metalloproteases. This classification is based on the most prominent functional group at the active site of the proteases. The serine proteases are of particular interest relative to the current invention.
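The difference in specificity between trypsin and thrombin described above can be illustrated with a small in-silico digestion. This is only a sketch under simplifying assumptions: sequences are given in one-letter code, the function names are hypothetical, and the rules are exactly the ones stated in the text (cleavage after every lysine or arginine for trypsin; cleavage only between arginine and glycine for thrombin). Real enzymes show further context dependence that is not modeled here.

```python
from typing import List


def trypsin_like_cleave(sequence: str) -> List[str]:
    """Split a one-letter peptide sequence after every K (Lys) or R (Arg)."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            # Cleave on the carboxyl side of the Lys/Arg residue.
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def thrombin_like_cleave(sequence: str) -> List[str]:
    """Split only between R (Arg) and a following G (Gly)."""
    fragments = []
    start = 0
    for i in range(len(sequence) - 1):
        if sequence[i] == "R" and sequence[i + 1] == "G":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    peptide = "AKGRGLF"  # hypothetical test peptide
    print(trypsin_like_cleave(peptide))
    print(thrombin_like_cleave(peptide))
```

With the hypothetical peptide shown, the trypsin-like rule yields three fragments while the thrombin-like rule yields two, which mirrors the narrower specificity attributed to thrombin.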
Much information now exists on the molecular structure and function of many serine proteases from diverse species. The majority of these enzymes consist of a single polypeptide chain of molecular weight 25,000-30,000. Chymotrypsin and subtilisin are both members of the serine protease family. Like other proteases, serine proteases cleave peptide bonds within a polypeptide to produce two smaller peptides. The cleavage reaction will typically proceed through an intermediate transition state which is facilitated by the presence of the protease. For serine proteases, the formation of an acyl-enzyme intermediate involving a reactive serine residue is the first step in the hydrolysis reaction. Deacylation of the acyl-enzyme intermediate is the second step in the hydrolysis. Like other proteases, serine proteases achieve their catalytic activity by lowering the activation energy for a specific hydrolysis reaction.
Proteases can be obtained from a wide variety of sources including fungi, bacteria, and eukaryotic cells. Although proteases have been obtained from many bacteria, relatively few proteases have been identified from bacteria which are known to live in extremely hot environments. Bacteria capable of growing at or above 80° C.-100° C. are generally known as extreme thermophiles or hyperthermophiles. Such highly thermophilic microorganisms have been the object of considerable scrutiny by researchers attempting to gain insight into the biochemical mechanism which enables these microbes to survive under such extreme conditions.
A number of microorganisms have been isolated from extremely hot environments. These microorganisms have been studied and certain useful compounds have been identified. For example, thermostable DNA polymerases have been obtained from Thermus aquaticus. Proteases have been isolated from thermophiles including T. aquaticus, Desulfurococcus species, Pyrococcus furiosus, Sulfolobus acidocaldarius, Thermococcus stetteri, and Pyrobaculum aerophilum. However, difficulties in culturing extremophiles have limited the number of these microbes which have been characterized as well as the number of useful compounds isolated therefrom (Brennan, Chemical and Engineering News, Oct. 14, 1996).
Stetter, et al. identified microorganisms from the hot springs of Vulcano Island, Italy, that flourish at temperatures exceeding 100° C. (Stetter, K. O. “Microbial Life in Hyperthermal Environments,” ASM News 61:285-290, 1995; Stetter, K. O., Fiala, G., Huber, R. and Segerer, A. “Hyperthermophilic Microorganisms,” FEMS Microbiol. Rev. 75:117-124, 1990). While thermophilic organisms that grow optimally at 60° C. have been known for many years, the hyperthermophilic (or extremely thermophilic) microorganisms belong to a new evolutionary class called Archaea (Woese, C. R., Kandler, O. and Wheelis, M. L. “Towards a Natural System of Organisms: Proposal for the Domains Archaea, Bacteria, and Eucarya,” Proc. Natl. Acad. Sci. USA 87:4576-4579, 1990). The Archaea are believed to have originated over a billion years ago during the epoch when the Earth was cooling. Consequently their evolutionary development was set in motion within the environment of hot springs and deep sea hydrothermal vents. One member of this new group is Pyrococcus furiosus which grows optimally at 100° C.-110° C. (Fiala, G. and Stetter, K. O. “Pyrococcus furiosus sp. nov. Represents a Novel Genus of Marine Heterotrophic Archaebacteria Growing Optimally at 100° C.,” Arch. Microbiol. 145:56-61, 1986). Pyrococcus furiosus is an obligate heterotroph that can be grown on polymeric substrates including protein and starch at temperatures of up to about 103° C. Preparations containing proteolytic enzymes prepared from Pyrococcus furiosus have been previously described in U.S. Pat. Nos. 5,242,817 and 5,391,489. These patents do not describe the enzymes identified by the current applicant. Other publications describing proteases from P. furiosus also do not describe the current enzymes.
See, for example, Blumentals, Ilse I., Robinson, Anne S., and Kelly, Robert M., “Characterization of Sodium Dodecyl Sulfate-Resistant Proteolytic Activity in the Hyperthermophilic Archaebacterium Pyrococcus furiosus.” Applied and Environmental Microbiology, 56,7:1992-1998, (1990); Eggen, Rik, Geerling, Ans, Watts, Jennifer and de Vos, Willem M., “Characterization of pyrolysin, a hyperthermoactive serine protease from the archaebacterium Pyrococcus furiosus.” FEMS Microbiology Letters, 71:17-20 (1990); Voorhorst, Wilfried G. B., Eggen, Rik I. L., Geerling, Ans C. M., Platteeuw, Christ, Siezen, Roland J., de Vos, Willem M., “Isolation and Characterization of the Hyperthermostable Serine Protease, Pyrolysin, and Its Gene from the Hyperthermophilic Archaeon Pyrococcus furiosus.” Journal of Biological Chemistry, 271,34: 20426-20431 (1996).
The use of proteolytic enzymes for selective peptide bond synthesis has been previously investigated. The majority of studies so far on protease-mediated peptide synthesis have utilized what has been called “semi-synthesis”. In these reactions, the acyl donor is a substrate for the enzyme (amide or ester). The substrate is utilized to acylate the enzyme (e.g., a serine or thiol protease), followed by deacylation by a C-terminally blocked amino acid or peptide. (See Nakatsuka, T., Sasaki, T., and Kaiser E. T. “Peptide Segment Coupling Catalyzed by the Semisynthetic Enzyme Thiolsubtilisin.” J. Am. Chem. Soc. 109:3808-3810, 1987; Abrahmsen, L., Tom, J., Burnier, J., Butcher, K. A., Kossiakoff, A., and Wells, J. A., “Engineering Subtilisin and its Substrates for Efficient Ligation of Peptide Bonds in Aqueous Solution.” Biochemistry 30:4151-4159, 1991; Christensen, U., Drohse, H. B., and Molgaard, L., “Mechanism of Carboxypeptidase-Y-catalyzed Peptide Semisynthesis” Eur. J. Biochem., 210:467-473, 1992.)
The ability to synthesize peptides and ligate polypeptides in aqueous solution under controlled conditions would be highly advantageous. Current protein synthesis methodologies generate large quantities of toxic reactant and solvent waste, which must be disposed of.
In one embodiment, the subject invention provides new proteases useful in the efficient hydrolysis of peptide bonds. Advantageously, these proteases have been found to be active both as endo- and exopeptidases. Therefore, these enzymes can be used in a wide variety of applications where it is necessary to remove amino acids from the end of a polypeptide, or to cleave the polypeptide at an internal site.
In a preferred embodiment, the proteases of the subject invention have a molecular weight of about 81 kD and are serine proteases which retain enzymatic activity at about 100° C. In a specific embodiment a protease of the subject invention can be obtained from the extreme thermophile Pyrococcus furiosus.
A further embodiment of the subject invention concerns nucleotide sequences which encode the proteases of the subject invention. These sequences, which can be obtained from, for example, P. furiosus, can be used to express the enzymes of the subject invention. These sequences, and portions thereof, are also useful as nucleotide probes to identify and characterize other related sequences. The nucleotide sequences of the subject invention can also be used as primers in PCR procedures used to obtain or characterize additional nucleotide sequences of the subject invention.
A further aspect of the subject invention concerns antibodies to the proteases described herein. These antibodies can be used to identify and/or characterize the proteases of the subject invention.
A further aspect of the subject invention pertains to the use of the proteases described herein in polypeptide synthesis procedures. These enzymes can be used to facilitate highly specific and efficient peptide synthesis. The enzymes of the subject invention can be used to ligate two or more peptides (reversal of endopeptidase activity), or successively add single amino acids to a peptide chain (reversal of carboxypeptidase activity). The enzymes of the subject invention can be used to synthesize peptide bonds at high temperatures with high yields. The synthesis of peptide bonds occurs, according to the subject invention, at equilibrium. The enzyme-catalyzed peptide syntheses according to the subject invention are stereospecific, require little if any side chain protection, and are devoid of racemization problems. Also, the ability to carry out these reactions in an aqueous solution is advantageous compared to current peptide synthesis procedures, which result in the production of substantial quantities of toxic solvent waste.
A further aspect of the subject invention concerns methods for identifying thermostable proteases. These methods involve the identification of the formation of protein or peptide synthesis products produced by the ligation of substrates when a composition containing these known substrates is heated. The formation of polypeptides from the known substrates is indicative of the thermostable proteases present in the mixture.
In one embodiment, the subject invention pertains to novel serine proteases which can be obtained from extremely thermophilic microorganisms. The enzymes of the subject invention are catalytically active at temperatures above 60° C. and, therefore, are useful in a variety of industrial processes.
Specifically exemplified herein is a novel serine protease which can be obtained from the extreme thermophile Pyrococcus furiosus. This enzyme has an apparent molecular weight of about 81 kDa as determined by SDS gel electrophoresis. Those skilled in the art will recognize that the apparent molecular weight of a protein as determined by gel electrophoresis will sometimes differ from the true molecular weight. Therefore, reference herein to the 81 kDa enzyme of the subject invention is understood to refer to proteins which migrate on a gel, as described herein, in a manner which is consistent with a protein of approximately that size, even if the true molecular weight is somewhat different.
The serine protease specifically exemplified herein is a carboxypeptidase enzyme. Thus, it belongs to the class of enzymes known as serine carboxypeptidases. The exemplified enzyme can act as an amidase, anilidase, and esterase. The enzyme recognizes both arginine and aromatic residues such as phenylalanine in the P1 position (nomenclature of Schechter and Berger) (Schechter, I., and Berger, A. “On the Size of the Active Site in Proteases. I. Papain.” Biochem. Biophys. Res. Commun. 27:157-162, 1967). The enzyme is also an endopeptidase since it yields prophe+argpNA from PPANA (D-pro-phe-arg-pNA).
Certain of the properties of the serine protease specifically exemplified herein are unique: 1) the enzyme is both an endopeptidase and a carboxypeptidase, 2) the enzyme displays intense product inhibition toward several synthetic peptide substrates, and 3) it is able to catalyze high-yield peptide synthesis.
The broad proteolytic activity of the enzymes of the subject invention as well as their thermal stability make these enzymes useful in a variety of protease applications. The high temperature proteolysis carried out using the enzymes of the subject invention is useful for many industrial applications including the food processing industry and waste removal.
The enzymes of the subject invention can also be used in peptide and protein synthesis. For this use, peptides (or polypeptides) can be efficiently joined in the presence of the enzymes of the subject invention by increasing the temperature of the reaction mixture until the thermodynamics favor the formation of peptide bonds and, thus, the synthesis of a longer polypeptide from peptide fragments. This use of the enzymes of the subject invention is made possible by the enzymes' retention of enzymatic activity at elevated temperatures.
Thus, in addition to their utility as proteases, the enzymes of the subject invention are capable of synthesizing peptide bonds with high yields. The utilization of these enzymes in protein synthesis has many advantages over current protein synthesis methods, which are based on semi-synthesis. One of the major practical problems associated with “semi-synthesis” is that it must be kinetically monitored, or controlled. That is, the synthetic reaction must be terminated at or near the time when synthetic yield is at a maximum. Otherwise, proteolysis of the synthetic product will supervene and it will be driven thermodynamically to essentially complete hydrolysis. Equilibrium peptide synthesis according to the subject invention does not suffer this disadvantage. Also, use of these enzymes in protein synthesis is particularly advantageous because stereospecificity is preserved. Furthermore, group protection and toxic solvents are unnecessary when polypeptide synthesis is carried out according to the subject invention. Unlike previously known procedures, the peptide synthesis carried out according to the subject invention can be done without the use of harmful organic solvents.
The subject invention further provides methods for identifying thermostable enzymes. In one embodiment crude cellular preparations (or other compositions which may contain a thermostable enzyme) can be assayed for the presence of thermostable enzymes. In this embodiment, peptide and/or polypeptide substrates can be added to the crude preparation. The composition can then be heated and analyzed for the presence of ligated peptides or polypeptides. In this embodiment, thermostable enzymes will catalyze the synthesis of polypeptides from the peptide or polypeptide substrates. Thus, the presence of thermostable enzymes can be identified by the formation of ligated polypeptides after heat treatment. The enzyme(s) responsible for the activity can then be identified through sequential isolation steps which remove inactive compounds and result in the isolation of the thermostable enzymes. The enzymes can then be purified and characterized according to standard procedures. The subject invention includes the enzymes obtained according to this assay procedure.
The new proteins provided here are defined according to several parameters. One critical characteristic of the proteins described herein is thermostable enzymatic activity. In a specific embodiment, these proteins are serine proteases. The enzymes and genes of the subject invention can be further defined by their amino acid and nucleotide sequences. The sequences of the molecules can be defined in terms of homology to certain exemplified sequences as well as in terms of the ability to hybridize with certain exemplified sequences. The enzymes provided herein can also be identified based on their immunoreactivity with certain antibodies.
The polynucleotide sequences and enzymes useful according to the subject invention include not only the full length sequences disclosed herein but also fragments of these sequences, as well as variants, mutants, and fusion proteins which retain the characteristic enzymatic activity of the proteins specifically exemplified herein. As used herein, the terms “variants” or “variations” of genes refer to nucleotide sequences which encode the same enzyme or which encode equivalent enzymes having proteolytic activity. As used herein, the term “equivalent enzymes” refers to enzymes having the same or essentially the same biological activity as the exemplified enzymes, albeit with different specificity.
It would be apparent to a person skilled in this art that genes encoding active enzymes can be identified and obtained through several means. The gene encoding the specific enzyme exemplified herein may be obtained from the specific isolate described herein. This gene, or portions or variants thereof, may also be constructed synthetically, for example, by use of a gene synthesizer. Variations of genes may be readily constructed using standard techniques for making point mutations. Also, fragments of these genes can be made using commercially available exonucleases or endonucleases according to standard procedures. For example, enzymes such as Bal31 or site-directed mutagenesis can be used to systematically cut off nucleotides from the ends of these genes. Also, genes which encode active fragments may be obtained using a variety of restriction enzymes. Proteases may be used to directly obtain active fragments of these enzymes.
Equivalent enzymes and/or genes encoding these equivalent enzymes can be derived from extreme thermophile isolates and/or DNA libraries using the teachings provided herein. There are a number of methods for obtaining the enzymes of the instant invention. For example, antibodies to the specific enzyme disclosed and claimed herein can be used to identify and isolate other such enzymes from a mixture of proteins. Specifically, antibodies may be raised to the portions of the enzyme which are most distinct from other enzymes. These antibodies can then be used to specifically identify equivalent enzymes with the characteristic activity by immunoprecipitation, enzyme linked immunosorbent assay (ELISA), or western blotting. Antibodies to the enzyme disclosed herein, or to equivalent enzymes, or fragments of these enzymes, can readily be prepared using standard procedures in this art. The genes which encode these enzymes can then be obtained from the host cell.
The subject invention concerns not only the polynucleotide sequences which encode these enzymes but also the use of these polynucleotide sequences to produce recombinant hosts which express the enzymes. The enzyme-encoding genes of the subject invention can be introduced into a wide variety of microbial or plant hosts. Expression of the gene results, directly or indirectly, in the intracellular production and maintenance of the enzyme.
Fragments and equivalents which retain the enzymatic activity of the exemplified proteins would be within the scope of the subject invention. Also, because of the redundancy of the genetic code, a variety of different DNA sequences can encode the amino acid sequences disclosed herein. It is well within the skill of a person trained in the art to create these alternative DNA sequences encoding the same, or essentially the same, proteins. These variant DNA sequences are within the scope of the subject invention. As used herein, reference to “essentially the same” sequence refers to sequences which have amino acid substitutions, deletions, additions, or insertions which do not materially affect enzymatic activity. Fragments retaining enzymatic activity are also included in this definition.
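The redundancy of the genetic code mentioned above can be made concrete by counting how many distinct DNA sequences encode a given peptide. The following is a minimal sketch that assumes the standard genetic code (61 sense codons); the function name and the example peptides are hypothetical and are not taken from the sequences of the subject invention.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are not included).
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_encoding_sequences(peptide: str) -> int:
    """Return the number of distinct DNA sequences that encode the peptide."""
    return prod(CODON_DEGENERACY[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Hypothetical tripeptide Met-Lys-Arg: 1 x 2 x 6 possibilities.
    print(count_encoding_sequences("MKR"))
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why a large family of variant DNA sequences can encode the same enzyme.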
A further method for identifying the proteins and genes of the subject invention is through the use of oligonucleotide probes. These probes are detectable nucleotide sequences. These sequences may be detectable by virtue of an appropriate label or may be made inherently fluorescent as described in International Application No. WO93/16094. As is well known in the art, if the probe molecule and nucleic acid sample hybridize by forming a strong bond between the two molecules, it can be reasonably assumed that the probe and sample have substantial homology. Preferably, hybridization is conducted under stringent conditions by techniques well-known in the art, as described, for example, in Keller, G. H., M. M. Manak (1987) DNA Probes, Stockton Press, New York, N.Y., pp. 169-170.
As used herein, “stringent” conditions for hybridization refers to conditions which are able to distinguish genes encoding heat stable serine proteases from unrelated genes. Specifically, hybridization of immobilized DNA on Southern blots with 32P-labeled gene-specific probes can be performed by standard methods (Maniatis et al.). For double-stranded DNA gene probes, hybridization can be carried out overnight at 20-25° C. below the melting temperature (Tm) of the DNA hybrid in 6×SSPE, 5×Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the following formula (Beltz, G. A., K. A. Jacobs, T. H. Eickbush, P. T. Cherbas, and F. C. Kafatos [1983] Methods in Enzymology, R. Wu, L. Grossman and K. Moldave [eds.] Academic Press, New York 100:266-285).
Tm = 81.5° C. + 16.6 Log[Na+] + 0.41(%G+C) − 0.61(%formamide) − 600/(length of duplex in base pairs).
Washes are typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1×SSPE, 0.1% SDS (low stringency wash).
(2) Once at Tm − 20° C. for 15 minutes in 0.2×SSPE, 0.1% SDS (moderate stringency wash).
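As a worked illustration of the melting temperature formula for double-stranded gene probes, the calculation can be sketched as follows. The function and parameter names are hypothetical, the logarithm is assumed to be base 10, the sodium concentration is assumed to be expressed in mol/L, and the input values in the example are arbitrary rather than measured properties of any probe described herein.

```python
from math import log10
from typing import Tuple


def duplex_tm(na_molar: float, gc_percent: float,
              formamide_percent: float, length_bp: int) -> float:
    """Melting temperature (deg C) of a DNA duplex per the formula above."""
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    return (81.5
            + 16.6 * log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / length_bp)


def hybridization_window(tm: float) -> Tuple[float, float]:
    """Overnight hybridization range: 20-25 deg C below Tm (low, high)."""
    return (tm - 25.0, tm - 20.0)


def moderate_wash_temperature(tm: float) -> float:
    """Temperature of the moderate stringency wash: Tm minus 20 deg C."""
    return tm - 20.0


if __name__ == "__main__":
    # Arbitrary example: 1 M Na+, 50% G+C, no formamide, 600 bp duplex.
    tm = duplex_tm(na_molar=1.0, gc_percent=50.0,
                   formamide_percent=0.0, length_bp=600)
    print(round(tm, 1), hybridization_window(tm), moderate_wash_temperature(tm))
```

For the arbitrary inputs shown, the terms are 81.5 + 0 + 20.5 − 0 − 1, giving a Tm of about 101° C., so hybridization would fall at roughly 76-81° C. under the stated rule.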
For oligonucleotide probes, hybridization can be carried out overnight at 10-20° C. below the melting temperature (Tm) of the hybrid in 6×SSPE, 5×Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. Tm for oligonucleotide probes can be determined by the following formula:
Tm (° C.) = 2(number T/A base pairs) + 4(number G/C base pairs)
(Suggs, S. V., T. Miyake, E. H. Kawashime, M. J. Johnson, K. Itakura, and R. B. Wallace [1981] ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D. D. Brown [ed.], Academic Press, New York, 23:683-693).
Washes can be typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1×SSPE, 0.1% SDS (low stringency wash).
(2) Once at the hybridization temperature for 15 minutes in 1×SSPE, 0.1% SDS (moderate stringency wash).
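The oligonucleotide formula above lends itself to a short calculation. The sketch below uses hypothetical function names and an arbitrary 18-mer chosen only for illustration; it assumes the probe is given as an unambiguous DNA sequence containing only A, C, G, and T.

```python
from typing import Tuple


def oligo_tm(probe: str) -> int:
    """Tm (deg C) = 2 x (number of A/T) + 4 x (number of G/C)."""
    probe = probe.upper()
    if set(probe) - set("ACGT"):
        raise ValueError("probe must contain only A, C, G, and T")
    at_count = probe.count("A") + probe.count("T")
    gc_count = probe.count("G") + probe.count("C")
    return 2 * at_count + 4 * gc_count


def oligo_hybridization_window(tm: int) -> Tuple[int, int]:
    """Overnight hybridization range: 10-20 deg C below Tm (low, high)."""
    return (tm - 20, tm - 10)


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCAT"  # arbitrary 18-mer: 10 A/T, 8 G/C
    tm = oligo_tm(probe)
    print(tm, oligo_hybridization_window(tm))
```

For the arbitrary 18-mer shown, the formula gives 2(10) + 4(8) = 52° C., so hybridization would be carried out at about 32-42° C., and the moderate stringency wash would be performed at whichever hybridization temperature was used.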
With the teachings provided herein, one skilled in the art could readily produce and use the various enzymes and polynucleotide sequences of the novel enzymes described herein.
Detection of the probe provides a means for determining in a known manner whether hybridization has occurred. Such a probe analysis provides a rapid method for identifying enzyme-encoding genes of the subject invention. The nucleotide segments which are used as probes according to the invention can be synthesized using a DNA synthesizer and standard procedures. These nucleotide sequences can also be used as PCR primers to amplify genes of the subject invention.
Certain enzymes of the subject invention have been specifically exemplified herein. Since these enzymes are merely exemplary of the enzymes of the subject invention, it should be readily apparent that the subject invention comprises variant or equivalent enzymes (and nucleotide sequences coding for equivalent enzymes) having the same or similar enzymatic activity of the exemplified serine protease. Equivalent enzymes will have amino acid homology with the exemplified enzyme. This amino acid homology will typically be greater than 60%, preferably be greater than 75%, more preferably greater than 80%, more preferably greater than 90%, and can be greater than 95%. The amino acid homology will be highest in critical regions of the enzyme which account for biological activity or are involved in the determination of three-dimensional configuration which ultimately is responsible for the biological activity. In this regard, certain amino acid substitutions are acceptable and can be expected if these substitutions are in regions which are not critical to activity or are conservative amino acid substitutions which do not affect the three-dimensional configuration of the molecule. For example, amino acids may be placed in the following classes: non-polar, uncharged polar, basic, and acidic. Conservative substitutions whereby an amino acid of one class is replaced with another amino acid of the same type fall within the scope of the subject invention so long as the substitution does not materially alter the biological activity of the compound. Table 1 provides a listing of examples of amino acids belonging to each class.
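The homology thresholds and the notion of a conservative substitution can be illustrated with a simple comparison of two pre-aligned sequences. This is a sketch under several assumptions that are not taken from the specification: homology is approximated as percent identity over aligned positions of equal length, the function names are hypothetical, and the class membership below is a commonly used grouping supplied only for illustration (Table 1 is not reproduced here and may differ).

```python
# Assumed, illustrative grouping of amino acids (one-letter codes).
# This is NOT Table 1 of the specification.
AMINO_ACID_CLASSES = {
    "non-polar": set("AVLIPMFW"),
    "uncharged polar": set("GSTCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def amino_acid_class(residue: str) -> str:
    """Return the name of the class that contains the residue."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same class."""
    return amino_acid_class(original) == amino_acid_class(replacement)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    reference = "MKTAYIAK"  # hypothetical sequences, for illustration only
    candidate = "MKSAYLAK"
    print(percent_identity(reference, candidate))
    print([is_conservative(a, b)
           for a, b in zip(reference, candidate) if a != b])
```

In this hypothetical pair, six of eight positions are identical, and both differences (T to S, I to L) stay within a single assumed class, which is the kind of substitution the text describes as conservative. Whether a given substitution actually preserves activity would still have to be confirmed experimentally.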
In some instances, non-conservative substitutions can also be made. The critical factor is that these substitutions must not significantly detract from the biological activity of the enzyme.