Isoprenoid compounds are organic molecules produced by a wide range of organisms (e.g., plants, bacteria, and fungi). To date, over 23,000 individual isoprenoid molecules have been characterized, with tens to hundreds of new structures identified each year. These molecules can fulfill a variety of roles. For example, monoterpenes can be used as fragrances and flavors. Sesquiterpenes and diterpenes can serve as pheromones, defensive agents, visual pigments, antitumor drugs, and components of signal transduction pathways. Triterpenes can serve important functions as membrane constituents and precursors of steroid hormones and bile acids. Polyprenols function as photoreceptive agents and cofactor side chains, and can also exist as natural polymers.
The diverse molecular compounds produced by the isoprenoid pathway are created from diphosphate esters of monounsaturated isoprene units. Isoprenes are added together in multiples of 2, 3, or 4 by prenyl transferases to make C10, C15, and C20 units, respectively. The C10, C15, and C20 molecules, named geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), respectively, serve as substrates for terpene synthases.
Terpene synthases catalyze the production of isoprenoid compounds via one of the most complex reactions known in chemistry or biology. In general, terpene synthases are moderately sized enzymes having molecular weights of about 40 to 100 kD. As enzymes, terpene synthases can be classified as having low to moderate turnover rates coupled with exquisite reaction specificity and preservation of chirality. Turnover comprises binding of substrate to the enzyme, establishment of substrate conformation, conversion of substrate to product, and product release. Reactions can be performed in vitro in aqueous solvents and typically require magnesium ions as cofactors; the resulting products, which are often highly hydrophobic, can be recovered by partitioning into an organic solvent.
Terpene synthase genes are found in a variety of organisms including bacteria, fungi and plants. Swapping regions approximating exons between different terpene synthases has identified functional domains responsible for terminal enzymatic steps. For example, work performed on 5-epi-aristolochene synthase (TEAS) from Nicotiana tabacum (tobacco) and Hyoscyamus muticus vetispiradiene synthase (HVS) from henbane revealed that exon 4 and exon 6, respectively, were responsible for reaction product specificity. Combining functional domains resulted in novel enzymes capable of synthesizing new reaction products (U.S. Pat. No. 5,824,774).
Studies have led to proposed reaction mechanisms for isoprenoid production; see, e.g., Cane et al., 1985, Bioorg. Chem., 13:246-265; Wheeler and Croteau, 1987, Proc. Natl. Acad. Sci. USA, 84:4856-4859; and Pyun et al., 1994, Arch. Biochem. Biophys., 308:488-496. The studies used substrate analogs and suicide inhibitors (Croteau, 1994, Arch. Biochem. Biophys., 251:777-782; Cane et al., 1995, Biochemistry, 34:2471-2479; and Croteau et al., 1993, Arch. Biochem. Biophys., 307:397-404), as well as chemical-modifying reagents and site-directed mutagenesis in efforts to identify amino acids essential for catalysis (Cane et al., 1995, Biochemistry, 34:2480-2488; Rajaonarivony et al., 1992, Arch. Biochem. Biophys., 296:49-57; and Rajaonarivony et al., 1992, Arch. Biochem. Biophys., 299:77-82). However, these studies have resulted in limited success in defining the active site due to inherent limitations with these techniques.
The invention describes a method of identifying alpha-carbon atoms found in the active site of a terpene synthase and describes these atoms in three-dimensional space as well as the spatial relationships among them. The present invention also describes R-groups associated with such alpha-carbons and methods of altering these R-groups in order to create novel terpene synthases capable of generating novel reaction products.
Until the invention taught in this present application, the active site of synthase proteins, the amino acid residues located therein, the amino acid residues involved in catalysis, and the configuration of α-carbons and R-groups within the active site have not been known. The current invention now teaches the structure of synthases, as well as provides the means of making and using the information obtained therefrom to develop and produce new and novel synthases having new and novel synthetic capabilities. The data generated using the methods described herein are useful for creation and production of synthase mutants that can use a variety of isoprenoid substrates and produce a variety of isoprenoid products.
In one embodiment, the invention features an isolated terpene synthase having about 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2. Such a synthase comprises nine α-carbons having interatomic distances in Angstroms between the α-carbons that are within ±2.3 Angstroms of the interatomic distances shown in Table 6. The center point of each α-carbon is positioned within a sphere having a radius of 2.3 Angstroms. The center point of each such sphere has the structural coordinates given in Table 5. Each α-carbon has an associated R-group, and the synthase has an ordered arrangement of R-groups associated with each α-carbon other than the ordered arrangements of R-groups shown in Table 9. The synthase can have about 25% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, or about 35% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2. Such a synthase can catalyse the formation of a terpenoid product from a monoterpene substrate, a sesquiterpene substrate, or a diterpene substrate. The product can be a cyclic terpenoid hydrocarbon or an acyclic terpenoid hydrocarbon. Either type of product can be hydroxylated or non-hydroxylated. The R-group associated with α-carbon 1 can be selected from one of the following groups: the group consisting of Cys, Ser, and Thr, the group consisting of Phe, Tyr and Trp, the group consisting of Pro, Gly, and Ala, the group consisting of Glu and Asp, the group consisting of Met, Ile, Val and Leu, the group consisting of Arg and Lys, and the group consisting of Gln, Asn and His. R-groups associated with α-carbons 2 to 9 can be any amino acid except those having the ordered arrangements of Table 9.
Similarly, the R-group associated with each of α-carbons 2-9 can be selected independently from the group consisting of Cys, Ser and Thr, the group consisting of Phe, Tyr and Trp, the group consisting of Pro, Gly, and Ala, the group consisting of Glu and Asp, the group consisting of Met, Ile, Val and Leu, the group consisting of Arg and Lys, and the group consisting of Gln, Asn and His. In these embodiments, R-groups associated with the remaining eight α-carbons can be any amino acid except those having the ordered arrangements of Table 9.
In some embodiments, the ordered arrangement of R-groups associated with α-carbons 1 to 9 is Trp, Ile, Thr, Thr, Tyr, Leu, Cys, Thr and Phe, respectively; Ser, Ile, Thr, Thr, Tyr, Leu, Cys, Thr and Tyr, respectively; Trp, Ile, Thr, Thr, Tyr, Leu, Trp, Thr and Tyr, respectively; Ser, Ile, Thr, Thr, Tyr, Leu, Trp, Thr and Tyr, respectively; or Glu, Ile, Thr, Thr, Tyr, Leu, Cys, Thr and Tyr, respectively.
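The geometric criteria of this embodiment (each α-carbon lying within a 2.3 Angstrom sphere around a reference center, and each pairwise distance lying within 2.3 Angstroms of a reference distance) can be illustrated with a minimal sketch. The coordinates below are hypothetical placeholders and are not the values of Tables 5 and 6; only three atoms are used for brevity where the embodiment uses nine, and the function names are illustrative assumptions.

```python
import math
from itertools import combinations

TOLERANCE = 2.3  # Angstroms, the tolerance stated in the text


def pairwise_distances(coords):
    """Return {(i, j): distance} for every pair of points, i < j."""
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


def within_spheres(candidate, centers, radius=TOLERANCE):
    """True if each candidate atom lies inside the sphere around its center."""
    return all(math.dist(c, r) <= radius for c, r in zip(candidate, centers))


def distances_match(candidate, reference_distances, tol=TOLERANCE):
    """True if every pairwise distance is within tol of the reference value."""
    observed = pairwise_distances(candidate)
    return all(
        abs(observed[pair] - reference_distances[pair]) <= tol
        for pair in reference_distances
    )


# Hypothetical placeholder sphere centers (NOT the coordinates of Table 5).
centers = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 6.0, 0.0)]
# Hypothetical reference distances (NOT the values of Table 6).
reference_distances = pairwise_distances(centers)

# Hypothetical alpha-carbon positions from a candidate structure.
candidate = [(0.5, 0.5, 0.0), (5.5, -0.5, 0.5), (1.0, 6.0, 1.0)]

satisfies = within_spheres(candidate, centers) and distances_match(
    candidate, reference_distances
)
print(satisfies)
```

In practice the candidate coordinates would first have to be superimposed onto the reference frame of the tabulated coordinates; that step is outside the scope of this sketch.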
The invention also features a terpene synthase made by aligning the primary amino acid sequence of a preselected terpene synthase polypeptide to the amino acid sequence of residues 265 to 535 of SEQ ID NO: 2, mutating a nucleic acid encoding the preselected polypeptide at one or more codons for nine amino acid residues in a region of the polypeptide primary amino acid sequence having about 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, the nine residues in the polypeptide aligning with residues 273, 294, 402, 403, 404, 407, 440, 519 and 520 of SEQ ID NO: 2; and expressing the mutated nucleic acid so that a mutated terpene synthase is made.
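The alignment step described above amounts to translating residue numbers of the reference sequence into residue numbers of the preselected polypeptide. The following is a minimal sketch of that bookkeeping, assuming a gapped pairwise alignment has already been produced by some alignment tool; the function name `map_positions` and the toy sequences are hypothetical and are not taken from SEQ ID NO: 2.

```python
def map_positions(ref_aligned, target_aligned, ref_positions):
    """Map 1-based reference residue numbers to (target number, residue).

    Both inputs are rows of the same gapped pairwise alignment, with '-'
    marking gaps. A reference position aligned to a gap maps to None.
    """
    if len(ref_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have equal length")

    wanted = set(ref_positions)
    mapping = {}
    ref_num = 0
    target_num = 0
    for ref_char, target_char in zip(ref_aligned, target_aligned):
        if ref_char != "-":
            ref_num += 1
        if target_char != "-":
            target_num += 1
        if ref_char != "-" and ref_num in wanted:
            if target_char == "-":
                mapping[ref_num] = None
            else:
                mapping[ref_num] = (target_num, target_char)
    return mapping


# Toy alignment (hypothetical sequences, for illustration only).
ref_row = "MKT-AYL"
target_row = "M-TGAFL"
print(map_positions(ref_row, target_row, [2, 3, 5]))
```

With real sequences, `ref_positions` would be the nine reference positions 273, 294, 402, 403, 404, 407, 440, 519 and 520, and the returned target positions would identify the codons to mutate.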
The invention also features an isolated terpene synthase having about 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, the synthase comprising sixteen α-carbons having interatomic distances in Angstroms between the α-carbons that are within ±2.3 Angstroms of the interatomic distances given in Table 4. The center point of each α-carbon is positioned within a sphere having a radius of 2.3 Angstroms. The center point of each of the spheres has the structural coordinates given in Table 3. Each α-carbon has an associated R-group, and the synthase has an ordered arrangement of R-groups other than the ordered arrangements of R-groups given in Table 8. The synthase can have about 25% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, or about 35% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2. The synthase can catalyse the formation of a terpenoid product from a monoterpene substrate, a sesquiterpene substrate, or a diterpene substrate. The product can be, for example, a cyclic terpenoid hydrocarbon. The ordered arrangement of R-groups in the synthase associated with α-carbons 1 to 16 can be Cys, Trp, Ile, Ile, Ser, Thr, Thr, Tyr, Leu, Cys, Val, Thr, Tyr, Asp, Phe and Thr, respectively.
The invention also features an isolated terpene synthase having about 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, the synthase comprising nineteen α-carbons having interatomic distances in Angstroms between the α-carbons that are within ±2.3 Angstroms of the interatomic distances given in Table 2. The center point of each α-carbon is positioned within a sphere having a radius of 2.3 Angstroms. The center point of each sphere has the structural coordinates given in Table 1. Each α-carbon has an associated R-group, and the synthase has an ordered arrangement of the R-groups other than the ordered arrangements of R-groups given in Table 7. The synthase can have about 25% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, or about 35% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2. The synthase can catalyse the formation of a terpenoid product from a monoterpene substrate, a sesquiterpene substrate, or a diterpene substrate. The product can be, for example, a cyclic terpenoid hydrocarbon.
The invention also features an isolated protein comprising a first domain having an amino terminal end and a carboxyl terminal end. The first domain comprises amino acids that align structurally in three-dimensional space with a glycosyl hydrolase catalytic core, the glycosyl hydrolase catalytic core selected from the group consisting of amino acids 36 to 230 of glucoamylase protein databank (PDB) code 3GLY of Aspergillus awamori and amino acids 36 to 230 of endoglucanase CelD PDB code 1CLC. The isolated protein also comprises a second domain having an amino terminal end and carboxyl terminal end. The second domain comprises amino acids that align structurally in three-dimensional space with avian FPP synthase. The carboxyl terminal end of the first domain is linked to the amino terminal end of the second domain. The second domain has about 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, and comprises nine α-carbons having interatomic distances in Angstroms between the α-carbons that are within ±2.3 Angstroms of the interatomic distances given in Table 6. The center point of each α-carbon is positioned within a sphere having a radius of 2.3 Angstroms, the center point of each sphere having the structural coordinates given in Table 5. Each α-carbon has an associated R-group, and the synthase has an ordered arrangement of R-groups other than the ordered arrangements of R-groups given in Table 9. The protein can have about 25% or greater sequence identity to SEQ ID NO: 2, or about 35% or greater sequence identity to SEQ ID NO: 2. The synthase can catalyse the formation of a terpenoid product from a monoterpene substrate, a sesquiterpene substrate, or a diterpene substrate. The product can be, for example, a cyclic terpenoid hydrocarbon.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 343 to 606 of SEQ ID NO: 20, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 348, 351, 372, 375, 376, 454, 479, 480, 481, 482, 485, 519, 523, 597, 600, 601, 605, 607 and 608 of SEQ ID NO: 20 are residues other than amino acids Y, L, C, I, T, Y, S, C, G, H, S, L, G, F, G, Y, D, Y and S, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 316 to 586 of SEQ ID NO: 22, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 321, 324, 345, 348, 349, 427, 452, 453, 454, 455, 458, 492, 496, 569, 572, 573, 577, 579 and 580 of SEQ ID NO: 22 are residues other than amino acids C, W, N, I, T, Y, S, I, S, G, M, L, D, A, M, Y, D, H and G, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 352 to 622 of SEQ ID NO: 58, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 357, 360, 381, 384, 385, 463, 487, 488, 489, 490, 493, 528, 532, 606, 609, 610, 614, 616 and 617 of SEQ ID NO: 58 are residues other than amino acids Y, M, C, V, T, F, V, S, S, G, I, L, G, F, V, Y, D, Y and T, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to amino acid residues 272 to 540 encoded by SEQ ID NO: 33, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 277, 280, 301, 304, 305, 383, 408, 409, 410, 411, 414, 448, 452, 524, 527, 528, 532, 534 and 535 encoded by SEQ ID NO: 33 are residues other than amino acids G, W, I, A, S, Y, T, S, G, Y, L, C, D, M, L, Y, D, Y and T, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 319 to 571 of SEQ ID NO: 42, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 324, 327, 348, 351, 352, 430, 455, 456, 457, 458, 461, 495, 499, 571, 574, 575, 579, 581 and 582 of SEQ ID NO: 42 are residues other than amino acids I, W, V, I, S, Y, T, T, G, L, V, I, N, T, S, Y, D, Y, and T, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 579 to 847 of SEQ ID NO: 44, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 584, 587, 606, 609, 610, 688, 713, 714, 715, 716, 719, 753, 757, 831, 834, 835, 839, 841 and 842 of SEQ ID NO: 44 are residues other than amino acids V, S, G, Q, V, Y, S, V, G, L, C, W, N, V, F, Y, D, Y and G, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 495 to 767 of SEQ ID NO: 46, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 500, 503, 524, 527, 528, 606, 631, 632, 633, 634, 637, 674, 678, 751, 754, 755, 759, 761 and 762 of SEQ ID NO: 46 are residues other than amino acids F, L, A, Q, T, Y, S, I, G, Q, L, S, D, T, I, F, D, F and G, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 295 to 564 of SEQ ID NO: 48, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 300, 303, 324, 327, 328, 406, 431, 432, 433, 434, 437, 471, 475, 548, 551, 552, 556, 558 and 559 of SEQ ID NO: 48 are residues other than amino acids Y, W, A, C, T, Y, S, S, G, M, L, G, D, L, , Y, D, L and Y, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 307 to 578 of SEQ ID NO: 50, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 312, 315, 336, 339, 340, 419, 444, 445, 446, 447, 450, 484, 488, 562, 565, 566, 570, 572 and 573 of SEQ ID NO: 50 are residues other than amino acids F, W, A, M, T, Y, N, T, G, M, L, S, D, I, M, Y, D, F and S, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 264 to 533 of SEQ ID NO: 52, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 269, 272, 293, 296, 297, 375, 401, 402, 403, 404, 407, 441, 445, 517, 520, 521, 525, 527 and 528 of SEQ ID NO: 52 are residues other than amino acids C, W, L, T, S, Y, S, A, G, Y, I, A, N, A, L, Y, D, Y and S, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 585 to 853 of SEQ ID NO: 56, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 590, 593, 614, 617, 618, 696, 721, 722, 723, 724, 727, 761, 765, 837, 840, 841, 845, 847 and 848 of SEQ ID NO: 56 are residues other than amino acids I, S, S, T, V, Y, S, I, A, L, V, G, N, M, F, Y, D, L and T, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 307 to 574 of SEQ ID NO: 54, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 312, 315, 336, 339, 340, 418, 443, 444, 445, 446, 449, 483, 487, 560, 563, 564, 566, 568 and 569 of SEQ ID NO: 54 are residues other than amino acids C, W, I, I, T, Y, S, I, S, A, I, L, D, A, I, Y, D, D and G, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 309 to 577 of SEQ ID NO: 24, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 314, 317, 338, 341, 342, 420, 446, 447, 448, 449, 452, 485, 489, 560, 563, 564, 569, 571 and 572 of SEQ ID NO: 24 are residues other than amino acids C, W, N, V, T, Y, I, G, G, I, L, L, D, A, I, Y, D, F and G, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 315 to 584 of SEQ ID NO: 26, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 320, 323, 344, 347, 348, 426, 451, 452, 453, 454, 457, 492, 496, 568, 571, 572, 576, 578 and 579 of SEQ ID NO: 26 are residues other than amino acids S, W, I, A, T, Y, S, V, A, S, I, L, D, A, I, Y, D, F, and G, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 265 to 536 of SEQ ID NO: 28, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 270, 273, 294, 297, 298, 376, 401, 402, 403, 404, 407, 440, 444, 518, 521, 522, 528, 530 and 531 of SEQ ID NO: 28 are residues other than amino acids A, W, V, C, G, F, T, S, C, I, M, G, N, C, S, Y, D, Y and S, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 342 to 612 of SEQ ID NO: 30, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 347, 350, 371, 374, 375, 453, 478, 479, 480, 481, 483, 518, 522, 596, 599, 600, 604, 606 and 607 of SEQ ID NO: 30 are residues other than amino acids F, L, C, V, T, Y, S, S, A, Y, V, L, G, L, L, Y, D, F and S, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more of the ordered arrangements of residues as given in Table 7 are not found in such a synthase.
The invention also features an isolated synthase having a region with about 40% or greater sequence identity to residues 273 to 541 of SEQ ID NO: 32, wherein one or more amino acid residues of the synthase that align with amino acid residues at positions 278, 281, 302, 305, 306, 384, 409, 410, 411, 412, 415, 448, 452, 524, 527, 528, 533, 535 and 536 of SEQ ID NO: 32 are residues other than amino acids C, W, I, I, S, Y, T, S, T, Y, L, C, D, I, T, Y, D, Y and T, respectively. In some embodiments, the sequence identity can be about 20% or greater, 25% or greater, or 35% or greater. In some embodiments, one or more ordered arrangements of residues as given in Table 7 are not found in such a synthase.
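The "residues other than" condition in the embodiments above can be read as a position-by-position comparison against the listed residues. As a minimal sketch, the following uses the nineteen positions and residues recited for SEQ ID NO: 32; the candidate string and its single substitution are hypothetical, and the candidate residues are assumed to have already been read off an alignment in the same position order.

```python
# Positions and residues as recited in the text for SEQ ID NO: 32.
POSITIONS = [278, 281, 302, 305, 306, 384, 409, 410, 411, 412,
             415, 448, 452, 524, 527, 528, 533, 535, 536]
LISTED = "CWIISYTSTYLCDITYDYT"


def differing_positions(candidate_residues):
    """Return reference positions where the candidate differs from the list.

    candidate_residues holds one residue per entry of POSITIONS, in order.
    """
    if len(candidate_residues) != len(POSITIONS):
        raise ValueError("expected one residue per listed position")
    return [
        pos
        for pos, listed, found in zip(POSITIONS, LISTED, candidate_residues)
        if found != listed
    ]


# Hypothetical candidate with Trp in place of Cys at the position aligning
# with residue 448 (an illustrative substitution, not one from the source).
candidate = "CWIISYTSTYLWDITYDYT"
changed = differing_positions(candidate)
print(changed, len(changed) >= 1)
```

A non-empty result corresponds to "one or more" aligned residues differing from the listed ones.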
The invention also features a method for making a terpene synthase, comprising identifying, in a preselected polypeptide having a region with 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, nine amino acid residues whose α-carbons have interatomic distances in Angstroms between the α-carbons that are within ±2.3 Angstroms of the interatomic distances given in Table 6. The center point of each α-carbon is positioned within a sphere having a radius of 2.3 Angstroms. The center point of each sphere has the structural coordinates given in Table 5. The method then comprises synthesizing a polypeptide that is modified from the preselected polypeptide. The modified polypeptide has one or more R-groups associated with the nine α-carbons other than the R-groups associated with the α-carbons in the preselected polypeptide. The synthesizing step can comprise the formation of a nucleic acid encoding the preselected polypeptide in which the coding sequence for one or more amino acids corresponding to the nine α-carbons is replaced by a coding sequence that codes for an amino acid different from the amino acid present in the preselected polypeptide. The preselected polypeptide can be, for example, any one of the polypeptides given in SEQ ID NOS: 2, 4, 6, 8, 10, 12, 20, 22, 24, 26, 28, 30, 32, 34-40, 42, 44, 46, 48, 50, 52, 54, 56, or 58.
The invention also features a method of using a terpene synthase, comprising identifying, in a preselected polypeptide having a region with 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, amino acid residues at nine positions that align with amino acid residues 273, 294, 402, 403, 404, 407, 440, 519 and 520 of SEQ ID NO: 2; and synthesizing a polypeptide that is modified from the preselected polypeptide. The novel polypeptide is modified by having amino acid residues at one or more of the nine positions other than the amino acid residues present in the preselected polypeptide. In some embodiments, the identifying step can comprise identifying sixteen amino acid residues in the preselected polypeptide that align with amino acid residues 270, 273, 294, 297, 298, 402, 403, 404, 407, 440, 516, 519, 520, 525, 527 and 528 of SEQ ID NO: 2, and the synthesizing step can comprise synthesizing a polypeptide that is modified from the preselected polypeptide, the modified polypeptide having amino acid residues at one or more of the sixteen positions other than the amino acid residues present in the preselected polypeptide. In some embodiments, the identifying step can comprise identifying nineteen amino acid residues in the preselected polypeptide that align with amino acid residues 270, 273, 294, 297, 298, 376, 401, 402, 403, 404, 407, 440, 444, 516, 519, 520, 525, 527 and 528 of SEQ ID NO: 2, and the synthesizing step can comprise synthesizing a polypeptide that is modified from the preselected polypeptide, the modified polypeptide having amino acid residues at one or more of the nineteen positions other than the amino acid residues present in the preselected polypeptide. 
The synthesizing step can comprise the formation of a nucleic acid encoding the preselected polypeptide in which the coding sequence in the nucleic acid coding for one or more of the identified amino acid residues is replaced by a coding sequence that encodes an amino acid different from the amino acid present in the preselected polypeptide. The preselected polypeptide can be, for example, any one of the polypeptides given in SEQ ID NOS: 2, 4, 6, 8, 10, 12, 20, 22, 24, 26, 28, 30, 32, 34-40, 42, 44, 46, 48, 50, 52, 54, 56, or 58. The method can further comprise: contacting the modified polypeptide with an isoprenoid substrate under conditions effective for the compound to bind the polypeptide; and measuring the ability of the modified polypeptide to catalyze the formation of a reaction product from the isoprenoid substrate. The isoprenoid substrate can be a monoterpene, a sesquiterpene, or a diterpene.
The invention also features a method of making a terpene synthase, comprising creating a population of nucleic acid molecules that encode polypeptides, the population having members that differ from one another at one or more of nine codons specifying amino acids of a preselected terpene synthase having a region with about 20% or greater sequence identity to residues 265 to 535 of SEQ ID NO: 2, α-carbons of the nine amino acids having interatomic distances in Angstroms between the α-carbons that are within ±2.3 Angstroms of the interatomic distances given in Table 6. The center point of each α-carbon is positioned within a sphere having a radius of 2.3 Angstroms, and the center point of each sphere has the structural coordinates given in Table 5. In some embodiments, the codons specify amino acids as described in Tables 1-2 or 3-4 of a preselected terpene synthase. A portion, or all, of the nucleic acid population is expressed so that a population of polypeptides is made. At least one member of the population of polypeptides is a mutant terpene synthase. The expressing step can comprise in vitro transcription and in vitro translation of the nucleic acid population. In some embodiments, the expressing step comprises cloning members of the nucleic acid population into an expression vector; introducing the expression vector into host cells; and expressing the cloned nucleic acid population members in the host cells so that the population of polypeptides is made. The preselected terpene synthase polypeptide can be a monoterpene synthase, a sesquiterpene synthase, or a diterpene synthase. The host cells can be prokaryotic cells or eukaryotic cells, including, without limitation, bacterial cells, fungal cells, and animal cells, e.g., mammalian cells or insect cells.
The host cells can also be plant cells, e.g., a cell from a Graminaceae plant, a cell from a Legumineae plant, a cell from a Solanaceae plant, a cell from a Brassicaceae plant or a cell from a Conifereae plant.
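The idea of a population whose members differ from one another at selected positions can be sketched at the protein level as a combinatorial enumeration. The example below is a minimal illustration under stated assumptions: the template sequence, the chosen positions, and the allowed residues are all hypothetical, and the function name `variant_library` is not from the source.

```python
from itertools import product


def variant_library(template, choices):
    """Yield every sequence obtained by substituting the allowed residues.

    template: the starting amino acid sequence (one-letter codes).
    choices:  dict mapping a 1-based position to the residues allowed there.
    """
    positions = sorted(choices)
    for combo in product(*(choices[pos] for pos in positions)):
        residues = list(template)
        for pos, residue in zip(positions, combo):
            residues[pos - 1] = residue
        yield "".join(residues)


# Hypothetical toy template and allowed substitutions (illustration only).
template = "MKTAYLW"
choices = {3: "TS", 5: "YFW"}

library = list(variant_library(template, choices))
print(len(library))
print(library)
```

The library size is the product of the number of residues allowed at each varied position, so allowing all twenty amino acids at nine positions would give 20^9 = 512,000,000,000 protein sequences, which is why restricting each position to a subset of residues may be preferable in practice.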
The invention also features a nucleic acid encoding a synthase as described herein, and a host cell containing such a nucleic acid. The invention also features a transgenic plant containing such a nucleic acid, or a transgenic animal cell culture containing such a nucleic acid.
In some embodiments, a synthase polypeptide of the invention comprises a domain that contains an active site comprised of nine α-carbon atoms having the coordinates of Table 5, and interatomic distances between the α-carbons within ±2.3 Angstroms of the distances given in Table 6. The α-carbon atoms align structurally in three-dimensional space, in the presence or absence of bound substrate or substrate analogue, with avian FPP synthase. In another embodiment, a synthase of this invention comprises the following: (i) a first domain containing amino acid residues that align in three-dimensional space (in solution or crystal form, and either having a bound or unbound substrate) with a glycosyl hydrolase catalytic core selected from the group consisting of (a) amino acids 36-230 of glycosyl hydrolase (PDB code 3GLY) of Aspergillus awamori, and (b) amino acids 36-230 of endoglucanase CelD (PDB code 1CLC), and (ii) a second domain that aligns structurally in three-dimensional space, with or without substrate or substrate analogues bound in the active site, with avian FPP synthase. The second domain contains an active site comprised of nine, sixteen or nineteen α-carbon atoms having the structural coordinates and interatomic distances of Tables 1-2, 3-4 or 5-6. These α-carbon atoms have R-groups attached thereto that can interact, either directly or indirectly, with an isoprenoid substrate.
The invention also features a method for generating mutant terpene synthases possessing catalytic activity. The method comprises the steps of (a) providing a crystallographic model of a preselected catalytically active terpene synthase having an active site, and (b) using the model to design a terpene synthase having at least one altered R-group in the active site relative to the preselected synthase. The invention also features terpene synthases having altered substrate specificity, methods of making the same, and procedures for generating three-dimensional structures thereof.
Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety.
Other aspects, embodiments, advantages, and features of the present invention will become apparent from the specification.