For transcription of a gene to occur, RNA polymerase must bind to the gene promoter and initiate transcription. In general, RNA polymerase I transcribes genes encoding ribosomal RNA; RNA polymerase II transcribes genes encoding messenger RNA, some small nuclear RNAs and microRNAs; while RNA polymerase III transcribes genes encoding transfer RNAs and other small RNAs.
Transcription is regulated to control when it occurs and how much RNA is produced. Transcription of a gene by RNA polymerase can be regulated by at least five mechanisms:
(i) Specificity factors alter the specificity of RNA polymerase for a given promoter or set of promoters, making it more or less likely to bind to them (e.g. sigma factors used in prokaryotic transcription).
(ii) Repressors bind to the operator (non-coding sequences on the DNA strand that are close to or overlapping the promoter region), impeding RNA polymerase's progress along the strand and thus the expression of the gene.
(iii) Transcription factors position RNA polymerase at the start of a protein-coding sequence and then release the polymerase to transcribe the mRNA.
(iv) Activators enhance the interaction between RNA polymerase and a particular promoter, encouraging the expression of the gene. Enhancers are sites on the DNA helix that are bound by activators in order to loop the DNA, bringing a specific promoter to the initiation complex.
(v) Silencers are regions of DNA sequences that, when bound by particular transcription factors, can silence expression of the gene.
A typical mammalian promoter consists of a 50-100 base pair core region to which the basic transcription machinery binds, and an enhancer region to which one or more transcriptional activator proteins (transactivators) may bind. The number and type of transactivators that are able to bind at an enhancer depend on which specific binding sites are present. The rate of initiation is governed by the number and type of transactivators actually bound at a promoter's enhancer. In addition to enhancers, there are silencers which, when bound by different transcription factors, lower gene expression.
Core promoters are made up of elements that fall into two categories: canonical and non-canonical. Canonical core promoter elements include the TATA box, the initiator (Inr), the TFIIB recognition element (BRE), the downstream promoter element (DPE) and the downstream core element (DCE). These elements may be found within the core promoters of many but not all protein-coding genes. The TATA box (sequence TATAA) is usually found 20-30 bp upstream of the transcription start site (TSS) and acts as a binding site for the TFIID general transcription factor. When the Inr element (consensus sequence YYAN(T/A)YY, where Y is a pyrimidine and N is any base) is present, it encompasses the TSS, with the first A of the consensus being the first base of the transcript. BRE elements can be found either upstream of the TATA box (BREu, consensus G/C G/C G/A CGCC) or downstream of it (BREd, consensus G/A T T/A T/G T/G T/G T/G). Although not technically a canonical core promoter element, the CCAAT box (located between 50 and 100 bp upstream of the TSS) is often included in this category. The CCAAT box also contributes to general transcription factor (TF) binding.
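The consensus sequences quoted above can be searched for mechanically. The following is a minimal sketch, assuming the consensus strings are rewritten with IUPAC ambiguity codes (Y = C/T, R = A/G, S = G/C, W = T/A, N = any base), so that the Inr consensus becomes YYANWYY and the BREu consensus becomes SSRCGCC. The function name `find_motif` and the sequence `toy` are hypothetical, and the toy sequence is shortened, so the spacing between its elements is not biologically realistic.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def find_motif(sequence, consensus):
    """Return 0-based start positions of all matches, including overlapping ones."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    # A lookahead matches without consuming characters, so overlapping hits are reported.
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical toy sequence: BREu, then a TATA box, then a spacer, then an Inr.
toy = "GGGCGCCTATAAAGGCGGCGGTCAGTCC"

bre_u = find_motif(toy, "SSRCGCC")   # BREu consensus in IUPAC form
tata = find_motif(toy, "TATAA")      # TATA box sequence as quoted in the text
inr = find_motif(toy, "YYANWYY")     # Inr consensus in IUPAC form

print(bre_u, tata, inr)  # [0] [7] [21]

# The first A of the Inr match (offset 2 within the motif) would be the
# first base of the transcript, i.e. the TSS.
tss = inr[0] + 2
print(tss)  # 23
```

A plain consensus match of this kind only flags candidate sites; it says nothing about whether a given site is functional in a cell.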
Non-canonical core promoter elements include the CpG island, the ATG desert and the transcription initiation platform (TIP). CpG islands generally span a 500-2000 bp stretch of DNA that contains a relatively high proportion of CpG dinucleotides. Elsewhere in the genome, CpG dinucleotides are normally methylated on the C residue, which reduces transcription, but within CpG islands they remain unmethylated, which promotes transcription. An ATG desert is a region of DNA with a lower frequency of ATG trinucleotides than surrounding regions. ATG deserts extend approximately 1000 bp upstream and downstream of the TSS and are generally associated with promoters that do not contain TATA boxes.
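Both of these properties are statistical descriptions of sequence composition, so they can be illustrated with simple counts. The sketch below assumes one common way of quantifying CpG enrichment, the ratio of observed CpG dinucleotides to the number expected from the C and G content alone, and expresses ATG frequency per 1000 bp. The function names `cpg_stats` and `atg_per_kb` are hypothetical, and the text above does not specify any particular thresholds.

```python
def cpg_stats(sequence):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")          # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n           # CpG count expected if C and G were placed independently
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio

def atg_per_kb(sequence):
    """Return the number of ATG trinucleotides per 1000 bp."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return seq.count("ATG") / len(seq) * 1000

print(cpg_stats("CGCGCGCG"))   # (1.0, 2.0)
print(cpg_stats("ATATATAT"))   # (0.0, 0.0)
print(atg_per_kb("ATGATGCC"))  # 250.0
```

In practice such statistics would be computed in a sliding window along the genome and compared between the region around the TSS and its surroundings; the window size is a choice left open here.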
A strong promoter is one that initiates transcription with a high frequency, which makes it a useful tool. In biochemistry, for example, strong promoters can be used to study transcription processes or to drive the production of recombinant proteins. Strong promoters are also useful in genetics: for example, they can be used to drive shRNA expression for gene knockdowns or cDNA overexpression to deduce a protein's function.
A particularly potent promoter could also have medical applications: in a recombinant virus vaccine, for example, higher antigen expression can result in a better immune response.
The strongest promoters used in mammalian systems generally come from either constitutively expressed cellular genes or viral genes. However, the promoters that give the highest expression levels of an associated gene are often the most cell-type dependent, that is, they work only in a limited range of cell types.