Cells synthesize both primary and secondary metabolites. Primary metabolites are necessary for basal growth and maintenance of the cell and include certain nucleic acids, amino acids, proteins, fats, and carbohydrates. In contrast, secondary metabolites are not necessary for basal function, but often confer highly desirable traits to an organism. These metabolites are a chemically diverse group of compounds that includes alkaloid compounds (e.g., terpenoid indole alkaloids and indole alkaloids), phenolic compounds (e.g., quinones, lignans and flavonoids), and terpenoid compounds (e.g. monoterpenoids, iridoids, sesquiterpenoids, diterpenoids and triterpenoids).
Plant secondary metabolites have great value as pharmaceuticals, food colors, flavors and fragrances. Plant pharmaceuticals include taxol, digoxin, colchicine, codeine, morphine, quinine, shikonin, ajmalicine and vinblastine. Examples of secondary metabolites that are useful as food additives include anthocyanins, vanillin, and a wide variety of other fruit and vegetable flavors and texture modifying agents. In addition, some plant secondary metabolites are part of the plant's defense system, conferring protection against UV light, herbivores, pathogens, microbes, insects and nematodes, as well as the ability to grow at low light intensity.
A particularly valuable secondary metabolite class is the terpenoid class. Plant terpenoids represent a very diverse class of chemicals, comprising about 30,000 different molecules. They play a central role in plant biology, for example, in defense against pathogens and herbivores, and in attracting pollinators. Their physical and chemical properties are quite diverse. Terpenoids range from large polymers such as rubber to small volatile molecules such as menthol, and include many valuable chemicals used to make medicines and fine chemicals. Worldwide sales of plant terpenoid-derived drugs alone amount to over $10 billion yearly.
In many cases, a key limiting factor to commercial production of secondary metabolites is the rate at which plants synthesize them. Problematically, only very small or variable amounts of these compounds are present in plants. The recovery of useful metabolites from their natural sources is thus in many instances difficult due to the enormous amounts of source material that may be required for the isolation of utilizable quantities of the desired products. Extraction is both costly and tedious, requiring large quantities of raw material and extensive use of chromatographic fractionation procedures. homology to known transcription factors. By design, this screening method excludes identification of many potentially useful transcription factors, such as those structurally unrelated to transcription factors already implicated in biosynthetic pathways. Furthermore, this method does not identify transcription factors that may act in combination, in particular, ones that may act synergistically to effect gene expression.
Therefore, there is a need for a high-throughput method to identify transcription factors that regulate metabolite biosynthesis in plants. A desirable approach would be to express a pool of transcription factors in cells and to measure the effect on expression of a biosynthetic pathway gene. This invention fulfills this and other needs.
In one aspect, the present invention provides a high-throughput method for determining whether a polynucleotide encodes a transcription factor for a pathway gene. The method entails determining whether a member of a pool of test transcription factor polynucleotides encodes a pathway transcription factor. A nucleic acid comprising a pathway gene promoter operably linked to a reporter gene and a pool of nucleic acid members comprising test transcription factor polynucleotides are introduced into a cell and expression from the pathway gene promoter in the cell is detected. Thereby it is determined whether a member of the test transcription factor polynucleotide pool encodes a pathway transcription factor.
The method can also be used for high-throughput screening to determine functional interactions between multiple test transcription factors and multiple pathway gene promoters simultaneously. Preferably, the methods of this invention are directed towards identification of transcription factors for genes in pathways relating to metabolite biosynthesis or environmental stresses (biotic or abiotic). With respect to metabolite biosynthesis, the invention is preferably directed to the pathway for the biosynthesis of terpenoids or alkaloids. Preferred terpenoids include, but are not limited to, monoterpenes, diterpenes, and sesquiterpenes. The genes from which promoters may be derived include, but are not limited to, genes from Nicotiana, Mentha, and Taxus. In addition, these genes include, but are not limited to, 5-epi-aristolochene synthase, limonene synthase, and taxadiene synthase.
In another embodiment, a pool of known or putative promoters may be screened. In another embodiment, polynucleotides encoding the test transcription factors are preferably expressed transiently in the plant cell by methods including, but not limited to, Agrobacterium-mediated expression. In yet another embodiment, the expression level of the pathway gene is determined using a promoter of the gene under study operably linked to a reporter gene, such as GUS. In a further embodiment, the expression level of the genes is determined indirectly by measurement of metabolite accumulation in a plant cell or a whole plant regenerated from a cell. In yet a further embodiment, the expression level is directly measured by quantitation of RNA levels in the plant cell or plant.
In a further embodiment, the method may further entail deconvoluting the pool of nucleic acid members to identify the minimum number of test transcription factor polynucleotides necessary to detect expression from said pathway gene promoter.
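One way to picture the deconvolution step is as a leave-one-out reduction over a pool that already gives a reporter signal. The following is a minimal sketch only, not the procedure of the invention: the function name `deconvolute`, the member labels, and the `activates` callback (standing in for a wet-lab reporter assay on a given subset) are all hypothetical. The greedy reduction returns a subset from which no single member can be removed without losing the signal; it is not guaranteed to be the smallest possible subset.

```python
from typing import Callable, FrozenSet, Iterable, List


def deconvolute(
    pool: Iterable[str],
    activates: Callable[[FrozenSet[str]], bool],
) -> List[str]:
    """Greedy leave-one-out reduction of a pool of test polynucleotides.

    `activates` is a hypothetical stand-in for a reporter assay: it returns
    True when the given subset drives expression from the pathway promoter.
    """
    members = list(pool)
    if not activates(frozenset(members)):
        return []  # the full pool gives no signal; nothing to deconvolute

    kept = list(members)
    for candidate in members:
        trial = [m for m in kept if m != candidate]
        # Drop the candidate only if the signal survives without it.
        if trial and activates(frozenset(trial)):
            kept = trial
    return kept


# Hypothetical assay: the signal requires two members acting together.
def example_assay(subset: FrozenSet[str]) -> bool:
    return {"TF_A", "TF_B"} <= subset


minimal = deconvolute(["TF_A", "TF_B", "TF_C", "TF_D"], example_assay)
print(minimal)  # ['TF_A', 'TF_B']
```

Because each trial removes one member at a time, a pair of factors that only act synergistically is retained together rather than being discarded individually.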
In another aspect, and if the method is employed to identify test transcription factors for a metabolite pathway, the method may entail introducing into a cell a pool of nucleic acid members comprising test transcription factor polynucleotides and detecting accumulation of metabolites, such as terpenoids, in the cell.
In yet another aspect, the present invention also comprises biosynthetic pathway transcription factors disclosed as SEQ ID NOs: 2, 4, 6, 8 and nucleic acids encoding them or related biosynthetic pathway transcription factors and a transgenic plant or plant cell comprising a nucleic acid encoding a pathway transcription factor identified by the methods provided.
Definitions
As used herein, the term “transcription factor” refers to any polypeptide that may act by itself or in combination with at least one other polypeptide to regulate gene expression levels and the term is not limited to polypeptides that directly bind DNA sequences. The transcription factor typically increases expression levels. However, in some cases it may be desirable to suppress expression of a particular pathway. The transcription factor may be a transcription factor identified by sequence analysis or a naturally occurring reading frame sequence that has not been previously characterized as a transcription factor. The polypeptide may also be an artificially generated or chemically or enzymatically modified polypeptide. A given nucleic acid sequence may be modified, e.g., according to standard mutagenesis or artificial evolution or domain swapping methods to produce modified sequences. Accelerated evolution methods are described, e.g., by Stemmer (1994) Nature 370:389-391, and Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Chemical or enzymatic alteration of expressed nucleic acids and polypeptides can be performed by standard methods. For example, a sequence can be modified by addition of phosphate groups, methyl groups, lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified nucleotides or amino acids, or the like. Further, the transcription factor may be derived from a collection of transcripts, such as a cDNA library, and the sequence of the transcript may be unknown.
The phrase “test transcription factor” refers to a polypeptide that is being tested for its ability to act as a transcription factor to regulate a pathway gene, for example, a biosynthetic pathway gene, an environmental (biotic or abiotic) stress gene or the like. Test transcription factors used in assays of this invention may be selected from a pool on the basis of structural similarity to known transcription factors for one or more pathways under investigation. Test transcription factors may also be selected based on expression patterns in cells or plants that coincide with the expression of pathway genes. Test transcription factors may also be selected randomly or without bias.
As used herein, the term “pool” refers to a collection of transcription factors. The pool may comprise at least two transcription factors, at least three transcription factors, at least four transcription factors, at least five transcription factors and including additional one transcription factor increments up to 40, 80, 100, 500, 1000, 2000, 3000 or more transcription factors. The pool may be subdivided into subpools, each of which is introduced into a single cell when the screening is performed. Preferably, any given subpool may comprise between 2 and 20 transcription factors, more preferably between 4 and 16 transcription factors. Therefore, if a total of 2000 transcription factors are screened and 4 transcription factor polynucleotides are transformed simultaneously into each cell (or subpool), then 500 cells would be tested for expression from at least one promoter.
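The subpool arithmetic in the preceding definition can be illustrated with a short sketch. The function name `split_into_subpools` and the placeholder labels are hypothetical and serve only to show how a pool of 2000 members divided into subpools of 4 yields 500 subpools, each corresponding to one transformed cell.

```python
from typing import List, Sequence


def split_into_subpools(pool: Sequence[str], subpool_size: int) -> List[List[str]]:
    """Partition a pool into consecutive subpools of at most `subpool_size` members."""
    if subpool_size < 1:
        raise ValueError("subpool_size must be at least 1")
    return [
        list(pool[start:start + subpool_size])
        for start in range(0, len(pool), subpool_size)
    ]


# Placeholder labels for 2000 test transcription factor polynucleotides.
pool = [f"TF{i:04d}" for i in range(1, 2001)]
subpools = split_into_subpools(pool, 4)

print(len(subpools))  # 500 subpools, i.e. 500 cells to assay
print(subpools[0])    # ['TF0001', 'TF0002', 'TF0003', 'TF0004']
```

If the pool size is not an exact multiple of the subpool size, the last subpool in this sketch is simply smaller than the others.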
The term “secondary metabolite” refers to any compound that is not essential to the basal function of a cell. Typical secondary metabolites include alkaloid compounds, phenolic compounds, and terpenoid compounds.
A “polynucleotide” is a nucleic acid sequence comprising a plurality of polymerized nucleotide residues, e.g., at least about 15 consecutive polymerized nucleotide residues, optionally at least about 30 consecutive nucleotides, at least about 50 consecutive nucleotides. In many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5′ or 3′ untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be single stranded or double stranded DNA or RNA. The polynucleotide optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise a sequence in either sense or antisense orientations.
The term “promoter” refers to a region or sequence located upstream and/or downstream from the start of transcription that is involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. The promoter may be of a known or unknown sequence and may be known to drive expression of a particular gene or may be a putative promoter. A “plant promoter” is a promoter capable of initiating transcription in plant cells.
The term “cell” refers to a cell from any organism, including plants, bacteria, fungi or animals.
The term “plant” includes whole plants, shoot vegetative organs/structures (e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g., bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like), and progeny of same. The class of plants that can be used in the method of the invention is generally as broad as the class of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, and multicellular algae.
The phrase “structural similarity” refers to a polynucleotide or polypeptide having a minimal level of sequence identity to another polynucleotide or polypeptide. The minimal level of sequence identity may be as low as 20% to 30% over any segment of a sequence.
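To make “sequence identity over a segment” concrete, the sketch below computes percent identity for two sequences that are assumed to be already aligned, ungapped, and of equal length, and then reports the best-scoring window of a chosen size. This is an illustrative simplification: the function names and the two short sequences are invented for the example, and practical comparisons would normally rely on a dedicated alignment program rather than position-by-position matching.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_segment_identity(a: str, b: str, window: int) -> float:
    """Highest percent identity found over any window of `window` aligned positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be of equal length")
    if not 1 <= window <= len(a):
        raise ValueError("window must be between 1 and the sequence length")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


# Invented sequences, used only to exercise the functions.
seq_a = "MKTAYIAKQR"
seq_b = "MKTAYLLLLL"

print(percent_identity(seq_a, seq_b))          # 50.0 over the full length
print(best_segment_identity(seq_a, seq_b, 5))  # 100.0 over the best 5-residue segment
```

The example shows why the segment matters: two sequences with modest overall identity can still share a highly similar region.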
A “transiently transfected” cell expresses a desired polynucleotide, but only for a limited period of time.
The term “high-value secondary metabolites” refers to those secondary metabolites that have valuable commercial applications.
As used herein, the term “transgenic” refers to a plant cell or plant where a nonendogenous nucleic acid has been introduced into the plant by any means. Examples of means by which this can be accomplished are described below, and include Agrobacterium-mediated transformation, biolistic methods, electroporation, and the like.
SEQ ID NO: 1 is the polynucleotide sequence of G993, a clone that activates transcription of the taxadiene synthase gene. SEQ ID NO: 2 is the corresponding polypeptide.
SEQ ID NO: 3 is the polynucleotide sequence of G1845, a clone that activates transcription of the taxadiene synthase gene. SEQ ID NO: 4 is the corresponding polypeptide.
SEQ ID NO: 5 is the polynucleotide sequence of G1386, a clone that activates transcription of the taxadiene synthase gene or the limonene synthase gene. SEQ ID NO: 6 is the corresponding polypeptide.
SEQ ID NO: 7 is the polynucleotide sequence of G872, a clone that activates transcription of the taxadiene synthase gene. SEQ ID NO: 8 is the corresponding polypeptide.