The field of the present invention is the area of enzymes which degrade plant cell walls, and certain other substrates, in particular, the phenolic acid esterases, feruloyl esterases and/or coumaroyl esterases, nucleotide sequences encoding them and recombinant host cells and methods for producing them.
Plant cell wall material is one of the largest sources of renewable energy on earth. Plant cell walls are composed mainly of cellulose, hemicelluloses, lignin and pectin. Arabinoxylan is one of the main constituents of hemicelluloses. It is composed of a chain of β(1→4) linked xylose units that are substituted by arabinose, acetate, and glucuronic acid. The arabinose has ester linked ferulic and p-coumaric acids [Borneman et al. (1993) In: Hemicellulose and Hemicellulases, Coughlan and Hazlewood, Eds., pp. 85-102]. Ferulic acid has been shown to link hemicellulose and lignin [Ralph et al. (1995) Carbohydrate Research 275:167-178]. Feruloyl esterases are involved in breaking the bond between the arabinose and ferulic acid, thus releasing the covalently bound lignin from hemicelluloses. Feruloyl esterases have been found in many bacteria as well as fungi, but have not been extensively studied nor is there much sequence data available [Christov and Prior (1993) Enzyme. Microb. Technol. 15(6):460-75].
Clostridium thermocellum is a gram-positive bacterium that produces a multienzymatic structure termed the cellulosome. The cellulosome is one of the most active cellulose degrading complexes described to date. The cellulosome has a multi-polypeptide structure, including a scaffolding subunit which has nine cohesins binding to nine catalytic subunits, a dockerin domain for attachment to the cell wall, and a cellulose binding domain [Felix and Ljungdahl (1993) Annu. Rev. Microbiol. 47:791-819]. The catalytic subunits include endoglucanase, cellobiohydrolase, lichenase, and xylanase, many of which have been cloned and sequenced. They all have multidomain structures that include at least a dockerin domain for binding to the scaffolding domain, a linker, and a catalytic domain. They may also contain cellulose binding domains and fibronectin-like domains. There are reports that some enzymatic components may have more than one catalytic domain. Two of these are xylanase Y [XynY, Fontes et al. (1995) Biochem. J. 307:151-158] and xylanase Z [XynZ, Grépinet et al. (1988) J. Bacteriol. 170(10):4582-8]. XynY has a C-terminal domain, and XynZ an N-terminal domain, for which no function has been determined. Although enzymes with dual catalytic domains (xylanase and β-glucanase) have been found in other bacteria [Flint et al. (1993) J. Bacteriol. 175:2943-2951], neither phenolic acid esterases nor bifunctional enzymes have been found in C. thermocellum.
There is a need in the art for phenolic acid esterases, feruloyl esterases and/or coumaroyl esterases in pure form which degrade plant cell wall materials, and certain other substrates, and for DNA encoding these enzymes to enable methods of producing ferulic acid and/or coumaric acid as well as facilitating degradation of plant cell wall materials.
The present invention provides novel phenolic acid esterases, having feruloyl esterase and coumaroyl esterase activities, and coding sequences for same.
One phenolic acid esterase of the present invention corresponds to a domain of previously unknown function from xylanase Y of Clostridium thermocellum. The recombinantly expressed domain polypeptide is active and has an amino acid sequence as given in FIG. 1 as “XynY_Clotm.” The nucleotide sequence encoding the esterase polypeptide is given in Table 5, nucleotides 2383-3219, exclusive of translation start and stop signals. See also SEQ ID NOs:11 and 12.
A second phenolic acid esterase of the present invention corresponds to a domain of previously unknown function of xylanase Z from C. thermocellum. The amino acid sequence of the esterase domain, which also is active when expressed as a recombinant polypeptide, is given in FIG. 1 as “XynZ_Clotm.” The nucleotide sequence encoding this polypeptide is given in Table 6, nucleotides 58-858. The present invention further provides a phenolic acid esterase polypeptide further comprising a cellulose binding domain. A specifically identified cellulose binding domain has an amino acid sequence as given in Table 6, amino acids 289-400, with a corresponding coding sequence as given in Table 6, nucleotides 867-1200. See also SEQ ID NOs:13 and 14.
An additional object of the present invention is a phenolic acid esterase (i.e., a feruloyl esterase) derived from a previously uncharacterized portion of a Ruminococcus xylanase (See FIG. 1). The coding (nucleotides 2164-2895, exclusive of translation start and stop signals) and deduced amino acid sequences (amino acids 546-789) are given in Table 10. See also SEQ ID NOs: 15 and 16.
The present invention further provides a feruloyl (phenolic acid) esterase from the anaerobic fungus Orpinomyces PC-2. The coding sequence and deduced amino acid sequences of the mature esterase protein are given in Table 9, and the purification of the Orpinomyces enzyme is described herein below. See also SEQ ID NOs: 17 and 18.
A further aspect of the present invention is methods for the recombinant production of the phenolic (especially ferulic) acid esterases of the present invention. Escherichia coli, Bacillus subtilis, Streptomyces sp., Saccharomyces cerevisiae, Aureobasidium pullulans, Pichia pastoris, Trichoderma, Aspergillus nidulans or any other host cell suitable for the production of a heterologous protein can be transfected or transformed with an expression vector appropriate for the chosen host. Compatible combinations of vectors and host cells are well known in the art, as are appropriate promoters to be used to direct the expression of a particular coding sequence of interest. The recombinant host cells are cultured under conditions suitable for growth and expression of the phenolic acid esterase, and either the recombinant esterase or the recombinant host cells in which the esterase has been produced are then collected. If secretion of the esterase into the culture medium is desired, the coding sequence of the esterase can be operably linked to a nucleotide sequence encoding a signal peptide which is known in the art and functional in the desired host cell. In that case, the culture medium serves as the source of esterase after growth of the host cells.
It is recognized by those skilled in the art that the DNA sequences may vary due to the degeneracy of the genetic code and codon usage. All DNA sequences which encode a phenolic acid esterase polypeptide having a specifically exemplified amino acid sequence are included in this invention, including DNA sequences encoding them having an ATG preceding the coding region for the mature protein and a translation termination codon (TAA, TGA or TAG) after the coding sequence.
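By way of illustration only, the degeneracy described above can be sketched in Python using the standard genetic code. The two DNA sequences and the helper names `translate` and `add_start_and_stop` are hypothetical and are not taken from the sequences of this specification; the sketch merely shows that distinct DNA sequences can encode the same polypeptide and how an ATG and a termination codon may be placed around a mature coding region.

```python
# Illustrative sketch only. The toy sequences below are hypothetical and are
# not sequences disclosed in this specification.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

STOP_CODONS = {"TAA", "TGA", "TAG"}


def translate(dna):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def add_start_and_stop(mature_coding, stop="TAA"):
    """Place an ATG before, and a termination codon after, a coding region."""
    if stop not in STOP_CODONS:
        raise ValueError("stop must be one of TAA, TGA or TAG")
    return "ATG" + mature_coding.upper() + stop


if __name__ == "__main__":
    variant_a = "ATGTCTAAATAA"  # hypothetical
    variant_b = "ATGAGCAAGTGA"  # hypothetical, different codons throughout
    print(translate(variant_a), translate(variant_b))
    print(add_start_and_stop("TCTAAA", stop="TGA"))
```

Both toy variants differ at the nucleotide level yet translate to the same short peptide, which is the sense in which all DNA sequences encoding an exemplified amino acid sequence are treated as equivalent.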
Additionally, it will be recognized by those skilled in the art that allelic variations may occur in the phenolic acid esterase polypeptide coding sequences which will not significantly change activity of the amino acid sequences of the polypeptides which the DNA sequences encode. All such equivalent DNA sequences are included within the scope of this invention and the definition of a phenolic acid esterase. The skilled artisan will understand that the amino acid sequence of an exemplified phenolic acid esterase polypeptide and signal peptide(s) can be used to identify and isolate additional, nonexemplified nucleotide sequences which will encode functional equivalents to the polypeptides defined by the amino acid sequences given herein or an amino acid sequence of greater than 40% identity thereto and having equivalent biological activity. All integer percents between 40 and 100 are encompassed by the present invention. DNA sequences having at least about 75% homology to any of the ferulic acid esterase coding sequences presented herein and encoding polypeptides with the same function are considered equivalent thereto and are included in the definition of “DNA encoding a phenolic acid esterase.” Following the teachings herein, the skilled worker will be able to make a large number of operative embodiments having equivalent DNA sequences to those listed herein.
The present invention encompasses feruloyl esterase proteins which are characterized by at least a portion having at least about 40% amino acid sequence identity with an amino acid sequence as given in SEQ ID NO:18, amino acids 227 to 440 (within the feruloyl esterase protein of Orpinomyces PC-2 of the present invention). All integer percent identities between 40 and 100% are also within the scope of the present invention. Similarly, the present invention encompasses feruloyl esterase proteins having from about 40% to about 100% identity with an amino acid sequence from the group comprising amino acids 581 to 789 of SEQ ID NO:16, amino acids 845 to 1075 of SEQ ID NO:12, amino acids 69 to 286 of SEQ ID NO:14, amino acids 69 to 307 of SEQ ID NO:14, and amino acids 69 to 421 of SEQ ID NO:14. Specifically exemplified feruloyl esterases of the present invention are characterized by amino acid sequences from the group comprising amino acids 227 to 440 of SEQ ID NO:18, amino acids 581 to 789 of SEQ ID NO:16, amino acids 845 to 1075 of SEQ ID NO:12, amino acids 69 to 286 of SEQ ID NO:14, amino acids 69 to 307 of SEQ ID NO:14, and amino acids 69 to 421 of SEQ ID NO:14. Feruloyl esterase proteins of the present invention include those having the following amino acid sequences: SEQ ID NO:18, amino acids 1 to 530; SEQ ID NO:12, amino acids 795 to 1077; SEQ ID NO:16, amino acids 546 to 789; SEQ ID NO:14, amino acids 20 to 286; SEQ ID NO:14, amino acids 20 to 307; and SEQ ID NO:14, amino acids 20 to 421.
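As a non-limiting sketch, percent amino acid sequence identity between two sequences that have already been aligned may be computed as follows. The specification does not fix an alignment method or the denominator of the calculation; the sketch below assumes, for illustration only, that identity is the number of identical residue columns divided by the total number of alignment columns. The function names and the toy aligned sequences are hypothetical.

```python
# Illustrative sketch only. Assumes the two sequences were aligned beforehand
# (equal length, "-" for gaps) and that percent identity is taken over all
# alignment columns; other conventions exist.

def percent_identity(aligned_a, aligned_b):
    """Return percent identity of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=40.0):
    """True when the identity is at or above the threshold (default 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAK"  # hypothetical aligned fragment
    candidate = "MKSAY-AK"  # hypothetical aligned fragment with one gap
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate))
```

In practice the alignment itself would be produced by a sequence alignment program, and the resulting percentage compared against the 40% lower bound recited above.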
Specifically exemplified nucleotide sequences encoding the feruloyl esterase proteins of the present invention include the following: SEQ ID NO:17, nucleotides 1 to 1590; SEQ ID NO:11, nucleotides 2582-3430; SEQ ID NO:15, nucleotides 2164 to 2895; SEQ ID NO:13, nucleotides 158 to 958; SEQ ID NO:13, nucleotides 158 to 1021; SEQ ID NO:13, nucleotides 158 to 1363.
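The nucleotide ranges above can be checked arithmetically against the amino acid ranges recited for the corresponding proteins: a 1-based, inclusive range of N nucleotides encodes N/3 residues. The following sketch performs that check. The pairing of each nucleotide range with an amino acid range follows the order in which the ranges are listed and is an assumption made for illustration; the helper names are hypothetical.

```python
# Illustrative sketch only. Coordinates are 1-based and inclusive, as in the
# text. The pairing of nucleotide and amino acid ranges is assumed from the
# order in which they are listed.

def codon_count(nt_start, nt_end):
    """Number of codons spanned by an inclusive nucleotide range."""
    length = nt_end - nt_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


def extract_region(sequence, start, end):
    """Return the 1-based, inclusive region of a sequence string."""
    return sequence[start - 1:end]


# (coding SEQ ID NO, nt start, nt end, protein SEQ ID NO, aa start, aa end)
RANGES = [
    (17, 1, 1590, 18, 1, 530),
    (11, 2582, 3430, 12, 795, 1077),
    (15, 2164, 2895, 16, 546, 789),
    (13, 158, 958, 14, 20, 286),
    (13, 158, 1021, 14, 20, 307),
    (13, 158, 1363, 14, 20, 421),
]


def ranges_are_consistent(ranges):
    """True when every nucleotide range encodes its paired residue count."""
    return all(
        codon_count(nt_start, nt_end) == aa_end - aa_start + 1
        for _, nt_start, nt_end, _, aa_start, aa_end in ranges
    )


if __name__ == "__main__":
    for nt_id, nt_start, nt_end, aa_id, aa_start, aa_end in RANGES:
        print(nt_id, codon_count(nt_start, nt_end), aa_id, aa_end - aa_start + 1)
    print(ranges_are_consistent(RANGES))
```

The same `extract_region` helper could be applied to a full sequence string from the sequence listing to recover the coding region named by a given range.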
The phenolic acid esterase coding sequences of this invention, including or excluding the sequence encoding a signal peptide, can be used to express a phenolic acid esterase of the present invention in recombinant fungal host cells as well as in bacteria, including without limitation, Bacillus spp., Streptomyces sp. and Escherichia coli. Any host cell in which the signal sequence is expressed and processed may be used. Preferred host cells are Aureobasidium species, Aspergillus species, Trichoderma species and Saccharomyces cerevisiae, as well as other yeasts known to the art for fermentation, including Pichia pastoris [See, e.g., Sreekrishna, K. (1993) In: Industrial Microorganisms: Basic and Applied Molecular Genetics, Baltz, R. H., et al. (Eds.) ASM Press, Washington, D.C. 119-126]. Filamentous fungi such as Aspergillus, Trichoderma, Penicillium, etc. are also useful host organisms for expression of the DNA of this invention [Van den Handel, C. et al. (1991) In: Bennett, J. W. and Lasure, L. L. (eds.), More Gene Manipulations in Fungi, Academy Press, Inc., New York, 397-428].