This invention relates to an isolated Douglas-fir 2S seed-storage protein promoter sequence, and methods for its use.
2S Seed-Storage Proteins
“2S” seed-storage proteins are proteins produced in the seeds of a variety of different plants. Genes encoding 2S seed-storage proteins have been characterized in several plants, including Arabidopsis (Krebbers et al., Plant Physiol. 87:859-866, 1988; and Guerche et al., Plant Cell 2:469-478, 1990), canola (Baszczynski and Fallis, Plant Mol. Biol. 14:633-635, 1990; Jefferson et al., EMBO J. 6:3901-3907, 1987; and Scofield and Crouch, J. Biol. Chem. 262:12202-12208, 1987), radish (Raynal et al., Gene 99:77-86, 1991), Brazil nut (Gander et al., Plant Mol. Biol. 16:437-448, 1991), sunflower (Allen et al., Mol. Gen. Genet. 210:211-218, 1987), and rice (Adachi et al., Plant Mol. Biol. 21:239-248, 1993). In all of these plants, the 2S seed-storage proteins are encoded by multigene families with copy numbers in the range of 4-20. Most of the previously characterized 2S seed-storage proteins are produced from genes lacking introns; however, the BE2S1 and BE2S2 genes from Brazil nut and the HaG5 gene from sunflower contain a single intron.
The expression of storage-protein genes is restricted both spatially (to the tissues of the embryo and/or the endosperm) and temporally (to maturation and late embryogenesis) during seed development (Goldberg et al., Cell 56:149-160, 1989). The expression characteristics of the storage-protein genes parallel those desirable for the expression of heterologous proteins in seed. Such expression offers the possibility of, for example, producing large quantities of easily harvested polypeptides and expressing proteins that improve grain quality. Discussions of this concept can be found in U.S. Pat. No. 5,714,474 issued to VanOoijen et al.
The strict spatial and temporal expression of genes encoding seed-storage proteins is believed to be controlled primarily at the transcriptional level (Goldberg et al., Cell 56:149-160, 1989; and Stålberg et al., Plant Mol. Biol. 23:671-683, 1993). An increase in mRNA accumulation occurs simultaneously with an increase in transcriptional activity (DeLisle and Crouch, Plant Physiol. 91:617-623, 1989). Similarly, mRNA levels peak and decline as seed development proceeds, in parallel with peaks and declines in transcription levels (Gatehouse and Shirsat, Control of Plant Gene Expression, CRC Press, Boca Raton, Florida, pp. 357-372, 1993). A number of studies have identified regulatory promoter elements associated with the expression level, developmental stage, and tissue-specific gene expression of genes for 2S seed-storage proteins (De Clercq et al., Plant Physiol. 92:899-907, 1990; Grossi de Sa et al., Plant Sci. 103:189-198, 1994; Radke et al., Theor. Appl. Genet. 75:685-694, 1988; and Stålberg et al., Plant Mol. Biol. 23:671-683, 1993).
It is generally accepted that transcriptional gene regulation depends on the interaction between promoter elements and transcription factors. For storage-protein genes, a number of nuclear factors have been identified that bind specifically to conserved cis-regulatory sequences and may activate gene transcription (Morton et al., Regulation of Seed Storage Protein Gene Expression, Kigel and Galili (eds.) Marcel Dekker, Inc., New York, pp. 103-138, 1995). The most studied plant-protein factors are those interacting with the G-box and related elements. Several cDNA clones encoding G-box binding factors have been isolated from a number of plant species using the G-box as a probe. These include the wheat Em-binding protein EmBP-1 (Guiltinan et al., Science 250:267-271, 1990), G-box binding factors GBF-1, GBF-2, and GBF-3 from Arabidopsis (Schindler et al., EMBO J. 11:1275-1289, 1992), and tobacco TAF-1 (Oeda et al., EMBO J. 10:1798-1802, 1991). Most of these proteins belong to the basic leucine zipper (bZIP) family of transcription factors, which are characterized by a bipartite DNA-binding domain consisting of a basic region involved in sequence-specific binding and a leucine zipper region required for dimerization (de Vetten and Ferl, Int. J. of Biochem. 26:1055-1068, 1994). However, the proteins differ in their overall structure and in their binding-site preferences for the ACGT-core sequence (SEQ ID NO: 4; Schindler et al., Plant Cell 4:1309-1319, 1992), the TGACG (SEQ ID NO: 20) sequence (Schindler et al., EMBO J. 11:1275-1289, 1992), or the CANNTG motif (SEQ ID NO: 1; Kawagoe and Murai, Plant J. 2:927-936, 1992). Some G-box binding factors can recognize deviant sequences (Foster et al., Plant J. 8:192-200, 1994), which allows the factors to out-compete other bZIP proteins and, in some cases, to negatively regulate gene transcription (Chern et al., Plant Cell 8:305-321, 1996; and Chern et al., Plant J. 10:135-148, 1996).
Opaque2 gene products in monocotyledonous plants also contain a bZIP domain that recognizes the ACGT-core-containing DNA sequence (Schmidt et al., Plant Cell 4:689-700, 1992; Williams et al., Plant Cell 4:485-496, 1992; and Vettore et al., Plant Mol. Biol. 36:249-263, 1998). The gene products are notable for their role in regulation of storage-protein (zein) gene expression in maize endosperm (Unger et al., Plant Cell 5:831-841, 1993).
Recently, several cDNAs have been isolated from seed-specific cDNA expression libraries using cis-elements derived from genes for seed-storage proteins as probes. These cDNAs encode proteins containing bZIP, zinc-finger, RING-finger, or basic helix-loop-helix DNA-binding domains and may represent novel types of trans-acting factors (Kawagoe and Murai, Plant Sci. 116:47-57, 1996; and Wohlfarth et al., J. Plant Physiol. 152:600-606, 1998).
The invention provides a promoter for a seed-storage protein from Douglas-fir (Pseudotsuga menziesii). The seed-storage protein promoter (termed herein the df2SSP promoter) is capable of driving the expression of a transgene as well as the endogenous seed-storage protein gene. The promoter and variants of the promoter are useful for expressing heterologous proteins either transiently in host cells or transgenically in stably transformed cells. The df2SSP promoter (SEQ ID NO: 17) can allow for tissue-specific expression of genes that are placed under its control.
One aspect of the invention provides the df2SSP promoter (SEQ ID NO: 17), fragments/deletion mutants thereof, and variants thereof. The variant df2SSP promoters are characterized by their retention of at least 50% sequence identity with the disclosed promoter sequence (SEQ ID NO: 17), or by their retention of at least 20, 30, 40, 50, or 60 consecutive nucleic acid residues of the disclosed promoter sequence (SEQ ID NO: 17). In each case these promoters, at a minimum, retain promoter activity. In some cases these promoters retain native df2SSP promoter activity.
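By way of illustration only, the consecutive-residue criterion described above can be checked computationally. The following Python sketch is not part of the disclosed method; the function names `longest_shared_run` and `meets_run_threshold` are hypothetical, and the sketch assumes that both sequences are supplied as plain strings on the same strand (reverse complements are not considered). It does not attempt the percent-identity criterion, which would require a sequence alignment.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Return the length of the longest stretch of consecutive residues
    that occurs in both sequences (same strand, case-insensitive)."""
    a, b = candidate.upper(), reference.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one residue earlier in both.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def meets_run_threshold(candidate: str, reference: str, min_run: int = 20) -> bool:
    """True if the candidate retains at least `min_run` consecutive
    residues of the reference sequence."""
    return longest_shared_run(candidate, reference) >= min_run
```

In use, `reference` would hold the disclosed promoter sequence and `min_run` would be set to 20, 30, 40, 50, or 60 as appropriate. Whether a candidate that satisfies the threshold also retains promoter activity must still be determined experimentally.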
It is contemplated that promoters such as the CaMV 35S promoter may be altered through the introduction of sequences found in the df2SSP promoter. The resulting promoter also will be characterized by its retention of at least 20, 30, 40, 50, or 60 consecutive nucleic acid residues of the disclosed promoter sequence (SEQ ID NO: 17).
Another aspect of the invention provides vectors containing the above-described promoters and variants thereof. The vectors can be transformed into host cells. In some cases the resulting host cell can give rise to a transgenic plant.
The invention also provides transgenes. These transgenes include one of the above-described promoter sequences operably linked to one or more open reading frames (ORFs). The transgenes can be cloned into vectors and subsequently used to transform host cells, such as bacterial, insect, mammalian, fungal, yeast, or plant cells.
Accordingly, the invention also provides transgenic plants such as maize, wheat, rice, millet, tobacco, sorghum, rye, barley, brassica, sunflower, seaweeds, lemna, oat, soybean, cotton, legumes, rape/canola, alfalfa, flax, safflower, peanut, and clover; lettuce, tomato, cucurbits, cassava, potato, carrot, radish, pea, lentil, cabbage, cauliflower, broccoli, Brussels sprouts, peppers and other vegetables; citrus, apples, pears, peaches, apricots, walnuts, and other fruit trees; orchids, carnations, roses, and other flowers; cacao; poplar, elms, and other deciduous trees; pine, Douglas-fir, spruce, and other conifers; turf grasses; and rubber trees and other members of the genus Hevea.
In yet another embodiment, the invention provides methods for expressing proteins in host cells, such as plant host cells. Such methods involve operably linking a promoter, such as those described above, to at least one ORF to produce a transgene and introducing the transgene into a plant. Accordingly, the invention also provides proteins that are produced by these methods.
An alternative method for characterizing promoters includes analyzing the various promoter elements found within the promoter sequence. Hence, the invention also provides promoters that maintain promoter activity and include at least eight promoter elements selected from the group consisting of E-box motifs (SEQ ID NO: 1), RY-repeat elements (SEQ ID NO: 2), AT-rich regions (SEQ ID NO: 3), ACGT-core elements (SEQ ID NO: 4), opaque-2-like elements (SEQ ID NO: 5), and conserved gymnosperm-like regions (SEQ ID NOs: 6 and 7) and duplicates thereof, wherein the promoter displays promoter activity. The invention also provides promoters that contain all the following promoter elements in the following orientation: 3′-E-box motif (SEQ ID NO: 1); ACGT-core element (SEQ ID NO: 4); E-box motif (SEQ ID NO: 1); E-box motif (SEQ ID NO: 1); E-box motif (SEQ ID NO: 1); E-box motif (SEQ ID NO: 1); ACGT-core element (SEQ ID NO: 4); ACGT-core element (SEQ ID NO: 4); ACGT-core element (SEQ ID NO: 4); AT-rich region (SEQ ID NO: 3); ACGT-core element (SEQ ID NO: 4); ACGT-core element (SEQ ID NO: 4); gymnosperm-like region (SEQ ID NOs: 6 or 7); ACGT-core element (SEQ ID NO: 4); E-box motif (SEQ ID NO: 1); opaque-2-like element (SEQ ID NO: 5); gymnosperm-like region (SEQ ID NOs: 6 or 7); ACGT-core element (SEQ ID NO: 4); opaque-2-like element (SEQ ID NO: 5); gymnosperm-like region (SEQ ID NOs: 6 or 7); ACGT-core element (SEQ ID NO: 4); RY-repeat element (SEQ ID NO: 2); E-box motif (SEQ ID NO: 1); and opaque-2-like element (SEQ ID NO: 5)-5′.
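By way of illustration only, the element-based characterization described above can be sketched as a simple motif scan. The following Python sketch is an assumption-laden example rather than the method used to analyze the disclosed promoter. It covers only the two elements whose sequences are stated in this section, namely the E-box motif CANNTG and the ACGT core; the remaining elements (RY-repeat, AT-rich, opaque-2-like, and gymnosperm-like regions) are defined in the sequence listing and are deliberately omitted here. The names `MOTIFS`, `motif_to_regex`, `find_elements`, and `has_minimum_elements` are hypothetical.

```python
import re

# Only motifs spelled out in the text; "N" stands for any nucleotide.
MOTIFS = {
    "E-box": "CANNTG",
    "ACGT-core": "ACGT",
}


def motif_to_regex(motif: str) -> str:
    """Translate a motif containing the wildcard N into a regular expression."""
    return "".join("[ACGT]" if base == "N" else base for base in motif)


def find_elements(sequence: str, motifs: dict = MOTIFS) -> list:
    """Return (position, element name, matched text) for every occurrence,
    sorted by 0-based position. Overlapping matches are reported."""
    seq = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        # A lookahead lets the scan report overlapping occurrences.
        pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
        for match in pattern.finditer(seq):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


def has_minimum_elements(sequence: str, minimum: int = 8) -> bool:
    """True if at least `minimum` occurrences of the listed elements are found."""
    return len(find_elements(sequence)) >= minimum
```

Both motifs used here read the same on either strand, so scanning a single strand is sufficient for them; that would not hold for non-palindromic elements added to `MOTIFS`. The sorted output gives the order of elements along the sequence, which can then be compared with a stated arrangement. A scan of this kind identifies candidate elements only and says nothing about promoter activity.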
Finally, the invention also provides vectors, host cells, and transgenic plants that include the promoters that are described above.
These and other aspects of the invention will become readily apparent from the following detailed description.