Through the use of recombinant DNA technology and genetic engineering, it has become possible to introduce foreign DNA sequences into plant cells to allow for the expression of proteins of interest. However, obtaining desired levels of expression remains a challenge. Expressing agronomically important transgenes in crops at desired levels requires the ability to control the regulatory mechanisms governing expression in plants, and this in turn requires suitable regulatory sequences that can function with the desired transgenes.
A given project may require use of several different expression elements, for example one set to drive a selectable marker or reporter gene and another to drive the gene of interest. The selectable marker may not require the same expression level or pattern as that required for the gene of interest. Depending upon the particular project, there may be a need for constitutive expression, which directs transcription in most or all tissues at all times, or there may be a need for tissue-specific expression.
Cells use a number of regulatory mechanisms to control which genes are expressed and the level at which they are expressed. Regulation can be transcriptional or post-transcriptional and can include, for example, mechanisms to enhance, limit, or prevent transcription of the DNA, as well as mechanisms that limit the life span of the mRNA after it is produced. The DNA sequences involved in these regulatory processes can be located upstream of, downstream of, or even internal to the structural DNA sequences encoding the protein product of a gene.
Initiation of transcription of a gene is regulated by the promoter sequence located upstream (5′) of the coding sequence. Eukaryotic promoters generally contain a sequence with homology to the consensus TATA box about 10–35 base pairs (bp) upstream of the transcription start (CAP) site. Most maize genes have a TATA box about 29 to 34 base pairs upstream of the CAP site. In most instances the TATA box is required for accurate transcription initiation. Further upstream, often between −80 and −100, there can be a promoter element with homology to the consensus sequence CCAAT. This sequence is not well conserved in many species including maize. However, genes having this sequence appear to be efficiently expressed. In plants, the CCAAT “box” is sometimes replaced by the AGGA “box”. Other sequences conferring tissue specificity, response to environmental signals or maximum efficiency of transcription may be found interspersed with these promoter elements or found further in the 5′ direction from the CAP site. Such sequences are often found within 400 bp of the CAP site, but may extend as far as 1000 bp or more.
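The positional relationships described above can be illustrated with a minimal Python sketch that reports where TATA-like and CCAAT motifs fall relative to a known transcription start site. The function name `find_promoter_motifs`, the simplified pattern `TATA[AT]A`, and the test sequence are assumptions made for illustration only; they are not taken from the cited work, and real promoter analysis relies on position weight matrices and experimental confirmation.

```python
import re

def find_promoter_motifs(seq, tss_index):
    """Locate simplified TATA and CCAAT motifs upstream of a start site.

    seq       -- DNA sequence of the coding strand, written 5' to 3'
    tss_index -- 0-based index of the transcription start (CAP) site
    Returns (motif_name, position) pairs, where position is the offset of
    the motif's first base from the start site (negative = upstream).
    """
    seq = seq.upper()
    # Simplified consensus patterns (illustrative assumptions).
    motifs = {
        "TATA": re.compile(r"TATA[AT]A"),
        "CCAAT": re.compile(r"CCAAT"),
    }
    hits = []
    for name, pattern in motifs.items():
        # Search only the region upstream of the start site.
        for match in pattern.finditer(seq, 0, tss_index):
            hits.append((name, match.start() - tss_index))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical 100 bp upstream region: CCAAT at -90 and TATAAA at -30.
upstream = "G" * 10 + "CCAAT" + "G" * 55 + "TATAAA" + "G" * 24
example = upstream + "ATGGCC"
print(find_promoter_motifs(example, len(upstream)))
```

For this constructed sequence the sketch reports the CCAAT motif at −90 and the TATA motif at −30, consistent with the ranges described above.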
Promoters can be classified into two general categories. “Constitutive” promoters are expressed in most tissues most of the time. Expression from a constitutive promoter is more or less at a steady state level throughout development. Genes encoding proteins with housekeeping functions are often driven by constitutive promoters. Examples of constitutively expressed genes in maize include actin and ubiquitin. Wilmink et al. (1995), Plant Molecular Biology 28:949–955. “Regulated” promoters are typically expressed in only certain tissue types (tissue-specific promoters) or at certain times during development (temporal promoters). Examples of tissue-specific genes in maize include the zeins, which are abundant storage proteins found only in the endosperm of seed. Kriz, A. L. et al. (1987), Molecular and General Genetics 207:90–98. Many genes in maize are regulated by promoters that are both tissue specific and temporal.
It has been demonstrated that promoters can be used to control expression of foreign gene sequences in transgenic plants in a manner similar to the expression pattern of the gene from which the promoter was originally derived. The most thoroughly characterized promoter tested with recombinant genes in plants has been the 35S promoter from the Cauliflower Mosaic Virus (CaMV) and its derivatives. U.S. Pat. No. 5,352,065; Wilmink, et al. (1995); Datla, R. S. S. et al. (1993), Plant Science 94:139–149. Elegant studies conducted by Benfey, et al. (1989, 1990) reveal that the CaMV 35S promoter is modular in nature with regard to binding of transcription activators. U.S. Pat. No. 5,097,025; Benfey P. N., L. Ren and N.-H. Chua. (1989), EMBO Journal 8:2195–2202; Benfey, P. N., and Nam-Hai Chua. (1990), Science 250:959–966. Two independent domains result in the transcriptional activation that has been described by many as constitutive. The 35S promoter is very efficiently expressed in most dicots and is moderately expressed in monocots. The addition of enhancer elements to this promoter has increased expression levels in maize and other monocots. Constitutive promoters of monocot origin have not been as thoroughly studied to date and include the polyubiquitin-1 promoter and the rice actin-1 promoter. Wilmink, et al. (1995). In addition, a recombinant promoter, Emu, has been constructed and shown to drive expression in monocots in a constitutive manner. Wilmink, et al. (1995).
DNA sequences called enhancer sequences have been identified that enhance gene expression when placed proximal to the promoter. Such sequences have been identified from viral, bacterial, mammalian, and plant gene sources. An example of a well characterized enhancer sequence is the ocs sequence from the octopine synthase gene in Agrobacterium tumefaciens. This short (40 bp) sequence has been shown to increase gene expression significantly in both dicots and monocots, including maize. Tandem repeats of this enhancer have been shown to increase expression of the GUS gene eight-fold in maize. It remains unclear how these enhancer sequences function. Presumably enhancers bind activator proteins and thereby facilitate the binding of RNA polymerase II to the TATA box. Grunstein, M. (1992), Scientific American, October 68–74. PCT Published Application WO95/14098 describes testing of various combinations of the ocs enhancer and the mas (mannopine synthase) enhancer, which resulted in a several-hundred-fold increase in expression of the GUS gene in transgenic tobacco callus.
The 5′ untranslated leader sequence of mRNA, introns, and the 3′ untranslated region of mRNA affect expression through their effect on post-transcriptional events, for example by facilitating translation or stabilizing mRNA.
Expression of heterologous plant genes has also been improved by optimization of the non-translated leader sequence, i.e., the 5′ end of the mRNA extending from the 5′ CAP site to the AUG translation initiation codon of the mRNA. The leader plays a critical role in translation initiation and in regulation of gene expression. For most eukaryotic mRNAs, translation initiates with the binding of the CAP binding protein to the mRNA CAP. This is followed by the binding of several other translation factors, as well as the 43S ribosome pre-initiation complex. This complex travels down the mRNA molecule scanning for an AUG initiation codon in an appropriate sequence context. Once the codon is located, a 60S ribosomal subunit binds the complex to create the complete 80S ribosomal complex that initiates mRNA translation and protein synthesis. Pain (1986), Biochem. J., 235:625–637; Kozak (1986), Cell 44:283–292. Optimization of the leader sequence for binding to the ribosome complex has been shown to increase gene expression as a direct result of improved translation initiation efficiency. Significant increases in gene expression have been produced by addition of leader sequences from plant viruses or heat shock genes. Datla, R. S. S. et al. (1993), Plant Science 94:139–149.
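The scanning step described above can be sketched in Python as a search for the first AUG from the 5′ end, together with a check of its flanking bases. This is a simplification offered for illustration: the function name `scan_for_start_codon` is hypothetical, and the context rule used (a purine at position −3 and a G at position +4 relative to the A of the AUG) is the commonly quoted rule of thumb from vertebrate studies, assumed here rather than drawn from the text; preferred contexts in plants can differ.

```python
def scan_for_start_codon(mrna):
    """Return the first AUG found when scanning an mRNA from its 5' end.

    The result gives the 0-based position of the AUG (which equals the
    length of the untranslated leader) and whether the codon sits in a
    'strong' context: purine at -3 and G at +4, counting the A of AUG
    as +1. Returns None if the sequence contains no AUG.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA or RNA letters
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] == "AUG":
            purine_at_minus3 = i >= 3 and mrna[i - 3] in "AG"
            g_at_plus4 = i + 3 < len(mrna) and mrna[i + 3] == "G"
            return {
                "position": i,
                "leader_length": i,
                "strong_context": purine_at_minus3 and g_at_plus4,
            }
    return None

# Hypothetical leaders: the first has a favorable context, the second does not.
print(scan_for_start_codon("GGCACCAUGGCU"))
print(scan_for_start_codon("CCUUUUAUGCCC"))
```

The sketch deliberately stops at the first AUG; it does not model leaky scanning, secondary structure in the leader, or upstream open reading frames, all of which can influence initiation in practice.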
The 3′ end of the mRNA can also have a large effect on expression, and is believed to interact with the 5′ CAP. Sullivan, M. L. and P. Green (1993), Plant Molecular Biology 23:1091–1104. The 3′ untranslated region (3′ UTR) has been shown to have a significant role in gene expression of several maize genes. Specifically, a 200 base pair 3′ UTR has been shown to be responsible for suppression of light induction of the maize small m3 subunit of the ribulose-1,5-bisphosphate carboxylase gene (rbc/m3) in mesophyll cells. Viret, J.-F. et al. (1994), Proc. Natl. Acad. Sci. 91:8577–8581. Some 3′ UTRs have been shown to contain elements that appear to be involved in instability of the transcript. Sullivan, et al. (1993). The 3′ UTRs of most eukaryotic genes contain consensus sequences for polyadenylation. In plants, especially maize, this sequence is not very well conserved. The 3′ UTR, including a polyadenylation signal, derived from a nopaline synthase gene (3′ nos) is frequently used in plant genetic engineering. Few examples of heterologous 3′ UTR testing in maize have been published.
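Because the polyadenylation consensus is poorly conserved in plants, a search for exact matches alone can miss candidate signals. The following Python sketch scans a 3′ UTR for the canonical hexamer AATAAA (AAUAAA in the mRNA) and for variants differing at up to a chosen number of positions. The function name `find_polya_signals`, the one-mismatch default, and the example sequences are assumptions made for illustration and do not describe any method from the cited references.

```python
def find_polya_signals(utr, consensus="AATAAA", max_mismatches=1):
    """List hexamers in a 3' UTR that resemble a polyadenylation signal.

    utr            -- 3' UTR sequence (DNA or RNA letters), 5' to 3'
    consensus      -- signal to compare against (canonical AATAAA)
    max_mismatches -- number of differing positions tolerated
    Returns (position, hexamer, mismatches) tuples in sequence order,
    with position given as a 0-based index into the UTR.
    """
    utr = utr.upper().replace("U", "T")
    k = len(consensus)
    hits = []
    for i in range(len(utr) - k + 1):
        window = utr[i:i + k]
        # Hamming distance between the window and the consensus.
        mismatches = sum(a != b for a, b in zip(window, consensus))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits

# Hypothetical UTR fragments: one exact signal, one single-base variant.
print(find_polya_signals("GCGCAATAAAGCGCGC"))
print(find_polya_signals("GGGAATGAAGGG"))
```

Allowing mismatches trades specificity for sensitivity, so hits from a sketch like this are candidates only and would need to be confirmed by mapping actual polyadenylation sites.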
Important aspects of the present invention are based on the discovery that a 3′ UTR derived from a constitutive maize lipase gene, viviparous 1 (Vp1) described by Paek, et al. (1998), Mol. Cells 8(3):336–342, and a 3′ UTR of the maize general regulatory factor-1 (GRF1) gene described by de Vetten et al. (1994), Plant Physiol. 106(4):1593–1604, are exceptionally useful for stabilizing recombinant transgene mRNAs in plants.