Carbohydrates and glycoconjugates are substrates for glycosyl transferases (GTs) and glycoside hydrolases (GHs). Structures of glycoside hydrolases began to be solved in the 1980s. At the same time, new GH proteins were discovered and their amino acid sequences determined. Two main observations emerged from the new data. 1) The classical E.C. nomenclature system for naming enzyme families was not precise enough to classify the increasing number of enzymes that had different structures yet performed the same enzymatic reaction. 2) Enzymes related by homology could have different enzymatic activities, which also made the E.C. nomenclature system confusing for these related enzymes. A new family-based classification system, based on amino acid sequence similarities, was proposed by Bernard Henrissat in 1991 (Henrissat B., A classification of glycosyl hydrolases based on amino-acid sequence similarities. Biochem. J. 280:309-316 (1991); Henrissat B., Bairoch A. New families in the classification of glycosyl hydrolases based on amino-acid sequence similarities. Biochem. J. 293:781-788 (1993); Henrissat B., Bairoch A. Updating the sequence-based classification of glycosyl hydrolases. Biochem. J. 316:695-696 (1996) and Davies G., Henrissat B. Structures and mechanisms of glycosyl hydrolases. Structure 3:853-859 (1995)). The classification of glycoside hydrolases into families based on amino acid sequence similarities was introduced because there is a direct relationship between sequence similarity and folding similarity, and such a classification is expected to:
(i) reflect the structural features of the enzymes, which cannot be reflected by the substrate specificity alone,
(ii) help to reveal the evolutionary relationships between the enzymes, and
(iii) provide a convenient tool to derive mechanistic information.
Grouping an amino acid sequence into a particular GH family by virtue of its similarity to members of that family can give an indication of the activity of a new hypothetical protein. Some amino acid sequences that were grouped in a GH family by homology have only later been suggested to have a certain enzymatic activity. In short, grouping a new amino acid sequence in a GH family does not by itself indicate the exact enzymatic activity. The enzymatic activity must be demonstrated by an activity assay of the cloned or purified protein. If the assay is difficult, the actual function of the protein can remain unknown for years.
Publicly available information on the GH-61 family presently comprises only 6 nucleotide sequences of unknown function. One document, however, suggests that one of these sequences (SwissProt sequence O14405) encodes an endoglucanase enzyme (Saloheimo M., Nakari-Setaelae T., Tenkanen M., Penttilae M., 1997 “cDNA cloning of a Trichoderma reesei cellulase and demonstration of endoglucanase activity by expression in yeast”; Eur. J. Biochem. 249:584-591). Later work by the same group confirmed that the purified enzyme, Cel61A, showed only very weak cellulase activity. The group itself acknowledged that, since the activity was three orders of magnitude lower than that of normal cellulases, cellulose was perhaps not the correct native substrate for the enzyme. The group also tested the purified enzyme exhaustively in all other known carbohydrate assays (mannanase, galactanase, etc.) and found that the enzyme had no activity on these substrates. The authors conclude in their discussion that: “It is therefore unlikely that the fungus would produce Cel61A for its endoglucanase activity when it is already producing more efficient endoglucanases . . . . It is possible that both TrCel61A and AbCel61A are active against specific parts of more complex natural cellulosic substrate. However, further studies are needed to reveal the function of the glycoside hydrolase 61 enzymes” (page 6505).
Presently, the CAZy website lists the GH-61 family as unclassified, meaning that properties such as mechanism, catalytic nucleophile/base, catalytic proton donor, and 3-D structure are not known for enzymes belonging to this family. The only known activity listed there is endoglucanase activity.
Despite extensive screening of Trichoderma reesei recombinant yeast libraries for cellulases, and despite recovering many cellulases from other GH families, we have not unambiguously identified any endoglucanase belonging to the GH-61 family from Trichoderma reesei. This also indicates that GH-61 enzymes, if that family does include cellulases, have only very weak activity and thus cannot be detected by the normally very sensitive recombinant yeast activity screening. Hence, the 6 presently publicly disclosed nucleotide sequences belonging to the GH-61 family are either unknown open reading frames sharing homology to SwissProt sequence O14405 or sequences cloned based on purification and sequencing of a cellulose-induced gene (“Isolation and characterization of a cellulose-growth-specific gene from Agaricus bisporus”; Gene 119:183-190 (1992)). Besides the Cel61A protein mentioned above, there is in the art no knowledge of the function and properties of any protein or peptide belonging to the GH-61 family, nor has any such enzyme or its function been reliably demonstrated.