The invention relates to transacylase enzymes and methods of using such enzymes to produce Taxol™ and related taxoids.
The complex diterpenoid Taxol™ (paclitaxel) (Wani et al., J. Am. Chem. Soc. 93: 2325-2327, 1971) is a potent antimitotic agent with excellent activity against a wide range of cancers, including ovarian and breast cancer (Arbuck and Blaylock, Taxol™: Science and Applications, CRC Press, Boca Raton, 397-415, 1995; Holmes et al., ACS Symposium Series 583: 31-57, 1995). Taxol™ was originally isolated from the bark of the Pacific yew (Taxus brevifolia). For a number of years, Taxol™ was obtained exclusively from yew bark, but the low yield of this compound from the natural source, coupled with the destructive nature of the harvest, prompted the development of new methods of Taxol™ production. Taxol™ is currently produced primarily by semisynthesis from advanced taxane metabolites (Holton et al., Taxol™: Science and Applications, CRC Press, Boca Raton, 97-121, 1995) that are present in the needles (a renewable resource) of various Taxus species. However, because of the increasing demand for this drug (both for use earlier in the course of cancer intervention and for new therapeutic applications) (Goldspiel, Pharmacotherapy 17: 110S-125S, 1997), availability and cost remain important issues. Total chemical synthesis of Taxol™ is not economically feasible. Hence, biological production of the drug and its immediate precursors will remain the method of choice for the foreseeable future. Such biological production may rely upon intact Taxus plants, Taxus cell cultures (Ketchum et al., Biotechnol. Bioeng. 62: 97-105, 1999), or, potentially, microbial systems (Stierle et al., J. Nat. Prod. 58: 1315-1324, 1995). In all cases, improving the biological production yields of Taxol™ depends upon a detailed understanding of the biosynthetic pathway, the enzymes catalyzing the sequence of reactions, especially the rate-limiting steps, and the genes encoding these proteins.
Isolation of genes encoding enzymes involved in the pathway is a particularly important goal, since overexpression of these genes in a producing organism can be expected to markedly improve yields of the drug.
The Taxol™ biosynthetic pathway is considered to involve more than 12 distinct steps (Floss and Mocek, Taxol™: Science and Applications, CRC Press, Boca Raton, 191-208, 1995; Croteau et al., Curr. Top. Plant Physiol. 15: 94-104, 1996); however, very few of the enzymatic reactions and intermediates of this complex pathway have been defined. The first committed enzyme of the Taxol™ pathway is taxadiene synthase (Koepp et al., J. Biol. Chem. 270: 8686-8690, 1995), which cyclizes the common precursor geranylgeranyl diphosphate (Hefner et al., Arch. Biochem. Biophys. 360: 62-74, 1998) to taxadiene (FIG. 1). The cyclized intermediate subsequently undergoes modification involving at least eight oxygenation steps, a dehydrogenation, an epoxide rearrangement to an oxetane, and several acylations (Floss and Mocek, Taxol™: Science and Applications, CRC Press, Boca Raton, 191-208, 1995; Croteau et al., Curr. Top. Plant Physiol. 15: 94-104, 1996). Taxadiene synthase has been isolated from T. brevifolia and characterized (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995), its mechanism of action has been defined (Lin et al., Biochemistry 35: 2968-2977, 1996), and the corresponding cDNA clone has been isolated and expressed (Wildung and Croteau, J. Biol. Chem. 271: 9201-9204, 1996).
The second specific step of Taxol™ biosynthesis is an oxygenation reaction catalyzed by taxadiene-5α-hydroxylase (FIG. 1). The enzyme, characterized as a cytochrome P450, has been demonstrated in Taxus microsome preparations to catalyze the stereospecific hydroxylation of taxa-4(5),11(12)-diene, with double bond rearrangement, to taxa-4(20),11(12)-dien-5α-ol (Hefner et al., Chem. Biol. 3:479-489, 1996).
The third specific step of Taxol™ biosynthesis appears to be the acetylation of taxa-4(20),11(12)-dien-5α-ol to taxa-4(20),11(12)-dien-5α-yl acetate by an acetyl CoA-dependent transacetylase (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999), since the resulting acetate ester is then further efficiently oxygenated to a series of advanced polyhydroxylated Taxol™ metabolites in microsomal preparations that have been optimized for cytochrome P450 reactions (FIG. 1). The enzyme has been isolated from induced yew cell cultures (Taxus canadensis and Taxus cuspidata), and the operationally soluble enzyme was partially purified by a combination of anion exchange, hydrophobic interaction, and affinity chromatography on immobilized coenzyme A resin. This acetyl transacylase has a pI of 4.7, a pH optimum of 9.0, and a molecular weight of about 50,000 as determined by gel-permeation chromatography. The enzyme shows high selectivity and high affinity for both cosubstrates, with Km values of 4.2 μM and 5.5 μM for taxadienol and acetyl CoA, respectively. The enzyme does not acetylate the more advanced Taxol™ precursors, 10-deacetylbaccatin III or baccatin III. This acetyl transacylase is insensitive to monovalent and divalent metal ions, is only weakly inhibited by thiol-directed reagents and coenzyme A, and in general displays properties similar to those of other O-acetyl transacylases. This acetyl CoA:taxadien-5α-ol O-acetyl transacylase from Taxus (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999) appears to be substantially different in size, substrate selectivity, and kinetics from an acetyl CoA:10-hydroxytaxane O-acetyl transacylase recently isolated and described from Taxus chinensis (Menhard and Zenk, Phytochemistry 50:763-774, 1999).
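As a rough illustration of what the reported Km values imply, the following Python sketch evaluates the single-substrate Michaelis-Menten relation v/Vmax = [S]/(Km + [S]) for each cosubstrate. It assumes that the other cosubstrate is saturating, which is a simplification for a two-substrate enzyme. The function name and the example concentrations are hypothetical; only the two Km values are taken from the text.

```python
# Km values reported for the taxadienol O-acetyl transacylase (micromolar).
KM_TAXADIENOL_UM = 4.2
KM_ACETYL_COA_UM = 5.5


def fractional_velocity(substrate_um: float, km_um: float) -> float:
    """Return v/Vmax for one substrate under the Michaelis-Menten relation.

    Assumption: the second cosubstrate is saturating, so the enzyme can be
    treated with the single-substrate rate law.
    """
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    if km_um <= 0:
        raise ValueError("Km must be positive")
    return substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    # Hypothetical substrate concentrations, chosen only for illustration.
    for conc in (1.0, 4.2, 5.5, 50.0):
        v_taxadienol = fractional_velocity(conc, KM_TAXADIENOL_UM)
        v_acetyl_coa = fractional_velocity(conc, KM_ACETYL_COA_UM)
        print(f"[S] = {conc:5.1f} uM  taxadienol: {v_taxadienol:.2f}  "
              f"acetyl CoA: {v_acetyl_coa:.2f}")
```

By definition, the enzyme runs at half of its maximal rate when the substrate concentration equals Km, so low micromolar Km values indicate that the enzyme approaches saturation at low substrate concentrations.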
Acquisition of the gene encoding the acetyl CoA:taxa-4(20),11(12)-dien-5α-ol O-acetyl transacylase that catalyzes the first acylation step of Taxol™ biosynthesis and genes encoding other acyl transfer steps would represent an important advance in efforts to increase Taxol™ yields by genetic engineering and in vitro synthesis.
The invention stems from the discovery of twelve amplicons (regions of DNA amplified by a pair of primers using the polymerase chain reaction (PCR)). These amplicons can be used to identify transacylases, for example, the transacylases shown in SEQ ID NOs: 26, 28, 45, 50, 52, 54, 56, and 58 that are encoded by the nucleic acid sequences shown in SEQ ID NOs: 25, 27, 44, 49, 51, 53, 55, and 57. These sequences are isolated from the Taxus genus, and the respective transacylases are useful for the synthetic production of Taxol™ and related taxoids, as well as intermediates within the Taxol™ biosynthetic pathway. The sequences can also be used for the creation of transgenic organisms that either produce the transacylases for subsequent in vitro use, or produce the transacylases in vivo so as to alter the level of Taxol™ and taxoid production within the transgenic organism.
Another aspect of the invention provides the nucleic acid sequences shown in SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, and 23 and the corresponding amino acid sequences shown in SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24, respectively, as well as fragments of the nucleic acid and the amino acid sequences. These sequences are useful for isolating the nucleic acid and amino acid sequences corresponding to full-length transacylases. These amino acid sequences and nucleic acid sequences are also useful for creating specific binding agents that recognize the corresponding transacylases.
Accordingly, another aspect of the invention provides for the identification of transacylases and fragments of transacylases that have amino acid and nucleic acid sequences that vary from the disclosed sequences. For example, the invention provides transacylase amino acid sequences that vary by one or more conservative amino acid substitutions, or that share at least 50% sequence identity with the amino acid sequences provided while maintaining transacylase activity.
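The 50% sequence identity threshold can be made concrete with a small calculation. The Python sketch below is a minimal illustration that assumes the two amino acid sequences have already been aligned to equal length, with "-" marking gaps, and that identity is expressed as identical residues divided by the total number of alignment columns. Other conventions for the denominator exist, and the text does not state which one is intended. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical non-gap columns / alignment length.
    Gap columns ("-") count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True when the aligned pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy one-letter-code sequences, not taken from the sequence listing.
    reference = "MKTAYIAK"
    variant = "MKSAY-AR"
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

Sequence identity alone does not establish that a variant retains transacylase activity; that property has to be confirmed experimentally.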
The nucleic acid sequences encoding the transacylases and fragments of the transacylases can be cloned, using standard molecular biology techniques, into vectors. These vectors can then be used to transform host cells. Thus, a host cell can be modified to express either increased levels of transacylase or decreased levels of transacylase.
Another aspect of the invention provides methods for isolating nucleic acid sequences encoding full-length transacylases. The methods involve hybridizing at least ten contiguous nucleotides of any of the nucleic acid sequences shown in SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57 to a second nucleic acid sequence, wherein the second nucleic acid sequence encodes a transacylase. This method can be practiced in the context of, for example, Northern blots, Southern blots, and the polymerase chain reaction (PCR). Hence, the invention also provides the transacylases identified by this method.
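As an in silico analogy to the hybridization step, the Python sketch below checks whether a probe contains a stretch of at least ten contiguous nucleotides that is perfectly complementary to some region of a target sequence. This is only a simplification: real hybridization depends on stringency conditions and tolerates mismatches, whereas this check requires exact Watson-Crick pairing. The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_complementary_stretch(probe: str, target: str,
                                 min_len: int = 10) -> bool:
    """True if `min_len` contiguous probe nucleotides can pair with `target`.

    A probe segment can base-pair with a target segment exactly when the
    probe segment equals the reverse complement of that target segment, so
    the search is done against the reverse complement of the target.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    probe = probe.upper()
    rc_target = reverse_complement(target)
    for start in range(len(probe) - min_len + 1):
        if probe[start:start + min_len] in rc_target:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical 16-nucleotide probe.
    probe = "ATGGCTAGCTTAGGCA"
    # Hypothetical target carrying the complement of probe positions 2..13.
    target = "CC" + reverse_complement(probe[2:14]) + "GG"
    print(shares_complementary_stretch(probe, target))
```

The `min_len` default of 10 mirrors the "at least ten contiguous nucleotides" wording; a longer window would make the exact-match test more selective.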
Yet another aspect of the invention involves methods of adding at least one acyl group to at least one taxoid. These methods can be practiced in vivo or in vitro, and can be used to add acyl groups to various intermediates in the Taxol™ biosynthetic pathway, and to add acyl groups to related taxoids that are not necessarily in a Taxol™ biosynthetic pathway.
The nucleic and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and three-letter code for amino acids. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
SEQ ID NO: 1 is the nucleotide sequence of Probe 1.
SEQ ID NO: 2 is the deduced amino acid sequence of Probe 1.
SEQ ID NO: 3 is the nucleotide sequence of Probe 2.
SEQ ID NO: 4 is the deduced amino acid sequence of Probe 2.
SEQ ID NO: 5 is the nucleotide sequence of Probe 3.
SEQ ID NO: 6 is the deduced amino acid sequence of Probe 3.
SEQ ID NO: 7 is the nucleotide sequence of Probe 4.
SEQ ID NO: 8 is the deduced amino acid sequence of Probe 4.
SEQ ID NO: 9 is the nucleotide sequence of Probe 5.
SEQ ID NO: 10 is the deduced amino acid sequence of Probe 5.
SEQ ID NO: 11 is the nucleotide sequence of Probe 6.
SEQ ID NO: 12 is the deduced amino acid sequence of Probe 6.
SEQ ID NO: 13 is the nucleotide sequence of Probe 7.
SEQ ID NO: 14 is the deduced amino acid sequence of Probe 7.
SEQ ID NO: 15 is the nucleotide sequence of Probe 8.
SEQ ID NO: 16 is the deduced amino acid sequence of Probe 8.
SEQ ID NO: 17 is the nucleotide sequence of Probe 9.
SEQ ID NO: 18 is the deduced amino acid sequence of Probe 9.
SEQ ID NO: 19 is the nucleotide sequence of Probe 10.
SEQ ID NO: 20 is the deduced amino acid sequence of Probe 10.
SEQ ID NO: 21 is the nucleotide sequence of Probe 11.
SEQ ID NO: 22 is the deduced amino acid sequence of Probe 11.
SEQ ID NO: 23 is the nucleotide sequence of Probe 12.
SEQ ID NO: 24 is the deduced amino acid sequence of Probe 12.
SEQ ID NO: 25 is the nucleotide sequence of the full-length acyltransferase clone TAX2.
SEQ ID NO: 26 is the deduced amino acid sequence of the full-length acyltransferase clone TAX2.
SEQ ID NO: 27 is the nucleotide sequence of the full-length acyltransferase clone TAX1.
SEQ ID NO: 28 is the deduced amino acid sequence of the full-length acyltransferase clone TAX1.
SEQ ID NO: 29 is the amino acid sequence of a transacylase peptide fragment.
SEQ ID NO: 30 is the amino acid sequence of a transacylase peptide fragment.
SEQ ID NO: 31 is the amino acid sequence of a transacylase peptide fragment.
SEQ ID NO: 32 is the amino acid sequence of a transacylase peptide fragment.
SEQ ID NO: 33 is the amino acid sequence of a transacylase peptide fragment.
SEQ ID NO: 34 is the AT-FOR1 PCR primer.
SEQ ID NO: 35 is the AT-FOR2 PCR primer.
SEQ ID NO: 36 is the AT-FOR3 PCR primer.
SEQ ID NO: 37 is the AT-FOR4 PCR primer.
SEQ ID NO: 38 is the AT-REV1 PCR primer.
SEQ ID NO: 39 is an amino acid sequence variant that allowed for the design of the AT-FOR3 PCR primer.
SEQ ID NO: 40 is an amino acid sequence variant that allowed for the design of the AT-FOR4 PCR primer.
SEQ ID NO: 41 is a consensus amino acid sequence that allowed for the design of the AT-REV1 PCR primer.
SEQ ID NO: 42 is a PCR primer, useful for identifying transacylases.
SEQ ID NO: 43 is a PCR primer, useful for identifying transacylases.
SEQ ID NO: 44 is the nucleotide sequence of the full-length O-acetyltransferase clone TAX6.
SEQ ID NO: 45 is the deduced amino acid sequence of the full-length O-acetyltransferase clone TAX6.
SEQ ID NO: 49 is the nucleotide sequence of the full-length acyltransferase clone TAX5.
SEQ ID NO: 50 is the deduced amino acid sequence of the full-length acyltransferase clone TAX5.
SEQ ID NO: 51 is the nucleotide sequence of the full-length acyltransferase clone TAX7.
SEQ ID NO: 52 is the deduced amino acid sequence of the full-length acyltransferase clone TAX7.
SEQ ID NO: 53 is the nucleotide sequence of the full-length acyltransferase clone TAX10.
SEQ ID NO: 54 is the deduced amino acid sequence of the full-length acyltransferase clone TAX10.
SEQ ID NO: 55 is the nucleotide sequence of the full-length acyltransferase clone TAX12.
SEQ ID NO: 56 is the deduced amino acid sequence of the full-length acyltransferase clone TAX12.
SEQ ID NO: 57 is the nucleotide sequence of the full-length acyltransferase clone TAX13.
SEQ ID NO: 58 is the deduced amino acid sequence of the full-length acyltransferase clone TAX13.
SEQ ID NO: 59 is the amino acid sequence identified as “aab61522” in FIG. 6A-6N.
SEQ ID NO: 60 is the amino acid sequence identified as “cab10319” in FIG. 6A-6N.
SEQ ID NO: 61 is the amino acid sequence identified as “cab40761” in FIG. 6A-6N.
SEQ ID NO: 62 is the amino acid sequence identified as “aab61523” in FIG. 6A-6N.
SEQ ID NO: 63 is the amino acid sequence identified as “aab95283” in FIG. 6A-6N.
SEQ ID NO: 64 is the amino acid sequence identified as “aab97723” in FIG. 6A-6N.
SEQ ID NO: 65 is the amino acid sequence identified as “aac17079” in FIG. 6A-6N.
SEQ ID NO: 66 is the amino acid sequence identified as “aac18062” in FIG. 6A-6N.
SEQ ID NO: 67 is the amino acid sequence identified as “aac27152” in FIG. 6A-6N.
SEQ ID NO: 68 is the amino acid sequence identified as “aac99311” in FIG. 6A-6N.
SEQ ID NO: 69 is the amino acid sequence identified as “aad12025” in FIG. 6A-6N.
SEQ ID NO: 70 is the amino acid sequence identified as “caa20531” in FIG. 6A-6N.
SEQ ID NO: 71 is the amino acid sequence identified as “caa64636” in FIG. 6A-6N.
SEQ ID NO: 72 is the amino acid sequence identified as “caa94432” in FIG. 6A-6N.
SEQ ID NO: 73 is the amino acid sequence identified as “cab06427” in FIG. 6A-6N.
SEQ ID NO: 74 is the amino acid sequence identified as “cab10318” in FIG. 6A-6N.