Gene expression encompasses a number of steps leading from the DNA template to the final protein or protein product. Control and regulation of gene expression can occur through numerous mechanisms. The initiation of transcription of a gene is generally thought of as the predominant control of gene expression. The transcriptional controls (or promoters) are generally relegated to relatively short sequences embedded in the 5'-flanking or upstream region of the transcribed gene. There are DNA sequences which affect gene expression in response to environmental stimuli, nutrient availability, or adverse conditions including heat shock, anaerobiosis or the presence of heavy metals. There are also DNA sequences which control gene expression during development or in a tissue- or organ-specific fashion.
Promoters contain the signals for RNA polymerase to begin transcription so that protein synthesis can proceed. DNA-binding nuclear proteins interact specifically with these cognate promoter DNA sequences to promote the formation of the transcriptional complex and eventually initiate the gene expression process.
One of the most common sequence motifs present in the promoters of genes transcribed by the eukaryotic RNA polymerase II (polII) system is the "TATA" element, which resides upstream of the start of transcription. Eukaryotic promoters are complex and comprise components which include a TATA box consensus sequence at about 35 base pairs 5' relative to the transcription start site or cap site, which is defined as +1. The TATA motif is the site where the TATA-binding protein (TBP), as part of a complex of several polypeptides (the TFIID complex), binds and productively interacts (directly or indirectly) with factors bound to other sequence elements of the promoter. This TFIID complex in turn recruits the RNA polymerase II complex, positioning it for the start of transcription generally 25 to 30 base pairs downstream of the TATA element, and promotes elongation, thus producing RNA molecules. The sequences around the start of transcription (designated INR) of some polII genes seem to provide an alternate binding site for factors that also recruit members of the TFIID complex and thus "activate" transcription. These INR sequences are particularly relevant in promoters that lack functional TATA elements, where they provide the core promoter binding sites for eventual transcription. It has been proposed that promoters containing both a functional TATA and INR motif are the most efficient in transcriptional activity (Zenzie-Gregory et al., 1992, J. Biol. Chem. 267:2823-2830).
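The positional convention described above (start site defined as +1, TATA element some tens of base pairs upstream) can be illustrated with a short Python sketch. It is an illustration only, not a method taken from the cited literature. The pattern TATA[AT]A[AT] is a simplified, assumed form of the TATA consensus; the search window of -40 to -20, the function names, and the toy sequence are all hypothetical.

```python
import re

# Assumption: a simplified TATA-box pattern (TATA, then A/T, A, A/T).
# The lookahead lets overlapping candidates be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def to_relative(index, tss_index):
    """Convert a 0-based string index to a position relative to the
    transcription start site, which is +1 (there is no position 0)."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def find_tata(seq, tss_index, upstream_limit=-40, downstream_limit=-20):
    """Return (relative_position, motif) for each TATA-like match whose
    first base falls inside the assumed upstream window."""
    hits = []
    for match in TATA_PATTERN.finditer(seq.upper()):
        pos = to_relative(match.start(1), tss_index)
        if upstream_limit <= pos <= downstream_limit:
            hits.append((pos, match.group(1)))
    return hits


# Toy sequence (made up): 50 upstream bases with TATAAAA starting at -30,
# followed by the start site base and a short transcribed stretch.
upstream = "G" * 20 + "TATAAAA" + "C" * 23
seq = upstream + "A" + "G" * 10
tss_index = len(upstream)  # 0-based index of the +1 base

print(find_tata(seq, tss_index))
```

A pattern match of this kind only flags candidate sites; whether a candidate is a functional TATA element is an experimental question.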
In most instances sequence elements other than the TATA motif are required for accurate transcription. Such elements are often located upstream of the TATA motif and a subset may have homology to the consensus sequence CCAAT.
Other DNA sequences have been found to elevate the overall level of expression of nearby genes. One of the more common classes of such elements resides far upstream from the initiation site and seems to act independently of position and orientation. These far upstream elements have been designated enhancers.
A less common class of elements, by virtue of their specificity, comprises sequences that interact with specific DNA-binding factors. These sequence motifs are collectively known as upstream elements and are usually position- and orientation-dependent.
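At the level of sequence searching, the distinction between orientation-independent and orientation-dependent elements can be sketched in Python as follows. An orientation-independent element is looked for in both the forward sequence and its reverse complement, while an orientation-dependent element is looked for in the forward sequence only. This models sequence matching only and says nothing about biological activity. The function names and the toy sequence are hypothetical, and the CCAAT motif is used merely as a convenient example.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif, either_orientation):
    """Return sorted (index, strand) hits for the motif.

    Strand '+' means the motif reads 5' to 3' on the given strand;
    strand '-' means its reverse complement was found instead, i.e. the
    motif lies on the opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    targets = [(motif, "+")]
    rc = reverse_complement(motif)
    if either_orientation and rc != motif:
        targets.append((rc, "-"))
    hits = []
    for pattern, strand in targets:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Toy sequence (made up): CCAAT in the forward orientation at index 2
# and in the inverted orientation (ATTGG) at index 11.
toy = "GGCCAATGGGGATTGGCC"
print(find_motif(toy, "CCAAT", either_orientation=True))
print(find_motif(toy, "CCAAT", either_orientation=False))
```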
Many upstream elements have been identified in a number of plant promoters, based initially on function and secondarily on sequence homologies. These promoter upstream elements range widely in type of control: from environmental responses (temperature, moisture, wounding, etc.) and developmental cues (germination, seed maturation, flowering, etc.) to spatial information (tissue specificity). These elements also seem to exhibit modularity in that they may be exchanged with other elements while maintaining their characteristic control over gene expression.
Promoters are usually positioned 5' or upstream relative to the start of the coding region of the corresponding gene, and the entire region containing all the ancillary elements affecting regulation or absolute levels of transcription may comprise fewer than 100 base pairs or as many as 1 kilobase pair.
A number of promoters which are active in plant cells have been described in the literature. These include the nopaline synthase (NOS) and octopine synthase (OCS) promoters (which are carried on tumor-inducing plasmids of Agrobacterium tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 35S promoters, the light-inducible promoter from the small subunit of ribulose bisphosphate carboxylase (ssRUBISCO, a very abundant plant polypeptide), and the sucrose synthase promoter. All of these promoters have been used to create various types of DNA constructs which have been expressed in plants. (See, for example, PCT publication WO84/02913, Rogers et al.)
Two promoters that have been widely used in plant cell transformations are those of the genes encoding alcohol dehydrogenase, AdhI and AdhII. Both genes are induced after the onset of anaerobiosis. Maize AdhI has been cloned and sequenced, as has AdhII. Formation of an AdhI chimeric gene, ADH-CAT, comprising the AdhI promoter linked to the chloramphenicol acetyltransferase (CAT) coding sequences and the nopaline synthase (NOS) 3' signal, caused CAT expression at approximately 4-fold higher levels at low oxygen concentrations than under control conditions. Sequence elements necessary for anaerobic induction of the ADH-CAT chimeric gene have also been identified. An anaerobic regulatory element (ARE) exists between positions -140 and -99 of the maize AdhI promoter. It is composed of at least two sequence elements, at positions -133 to -124 and positions -113 to -99, both of which have been found to be necessary and sufficient for low oxygen expression of ADH-CAT gene activity. The Adh promoter, however, responds to anaerobiosis and is not a constitutive promoter, which drastically limits its effectiveness.
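The relative coordinates quoted for the ARE can be mapped onto a sequence string with a small Python sketch. It assumes that the start site is +1 with no position 0, and it reads the end of the second sub-element as -99, consistent with the stated ARE boundary of -140 to -99. The function name is hypothetical and the sequence is a made-up placeholder, not the maize AdhI promoter.

```python
def slice_relative(seq, tss_index, start, end):
    """Return the bases from relative position start to end, inclusive.

    Both positions must be upstream (negative). tss_index is the 0-based
    string index of the +1 base, so position -1 sits at tss_index - 1.
    """
    if not (start <= end < 0):
        raise ValueError("this sketch handles upstream (negative) ranges only")
    lo = tss_index + start
    if lo < 0:
        raise ValueError("sequence does not extend far enough upstream")
    return seq[lo:tss_index + end + 1]


# Placeholder sequence (not real AdhI data): 200 upstream bases, then
# the +1 base and a short downstream stretch.
placeholder = "ACGT" * 50 + "A" + "G" * 10
tss_index = 200

are = slice_relative(placeholder, tss_index, -140, -99)
element_1 = slice_relative(placeholder, tss_index, -133, -124)
element_2 = slice_relative(placeholder, tss_index, -113, -99)

# Lengths implied by the quoted coordinates.
print(len(are), len(element_1), len(element_2))
```

Under these assumptions the ARE spans 42 base pairs and the two sub-elements span 10 and 15 base pairs, with the second sub-element ending at the downstream boundary of the ARE.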
Another commonly used promoter is the 35S promoter of Cauliflower Mosaic Virus (CaMV). The CaMV 35S promoter is a dicot virus promoter; however, it directs expression of genes introduced into protoplasts of both dicots and monocots. The 35S promoter is a very strong promoter, which accounts for its widespread use for high-level expression of traits in transgenic plants. The CaMV 35S promoter, however, has also demonstrated relatively low activity in several agriculturally significant graminaceous plants such as wheat. While these promoters all give high expression in dicots, few give high levels of expression in monocots. A need exists for synthetic promoters and other elements that induce expression in transformed monocot protoplast cells.