Transcription of any given eukaryotic gene is carried out by one of three RNA polymerase enzymes, each of which acts on a particular subset of genes. While transcription of large ribosomal RNAs is performed by RNA polymerase I and small ribosomal RNA and tRNA are transcribed by RNA polymerase III, protein-coding DNA sequences and most small nuclear RNAs are transcribed by RNA polymerase II. For each type of gene, transcription requires interaction of the appropriate polymerase with the gene's promoter sequences and formation of a stable transcription initiation complex. In general, transcription by any of the three polymerases also requires interaction of some binding factor with the promoter sequence and recognition of the binding factor by a second factor, which thereby permits polymerase interaction with the gene sequence. While this mechanism is the minimum requirement for transcription with RNA polymerases I and III, the process leading to transcription with RNA polymerase II is more intricate.
Presumably due to the vast array of gene sequences transcribed by RNA polymerase II and the fact that the regulation patterns for these genes are highly variable within the same cell and from cell to cell, transcription by RNA polymerase II is affected by binding of numerous transcription factors in the initiation complex in addition to the interactions of other binding proteins with regulatory DNA sequences other than the promoter. These other binding proteins can serve to activate transcription beyond a basal level or repress transcription altogether. Repressor binding can also be viewed as a means to prevent activation in view of observations that basal transcription in higher eukaryotes is normally very low. Activation, on the other hand, is ordinarily the end response to some physiological signal and requires either removal of repressor binding proteins or alteration in chromatin structure in order to permit formation of an active transcription initiation complex.
At the core of transcription complex formation, and a prerequisite for basal levels of gene expression, is the promoter sequence called the "TATA" box, which is located upstream from the polymerase II transcription start site. The TATA box is the binding site for a ubiquitous transcription factor designated TFIID, but, as mentioned above, transcription from the promoter sequence in most genes is strongly influenced by additional regulatory DNA sequences which can either enhance or repress gene transcription. DNA elements of this type are found in variable positions with respect to coding sequences in a gene and with respect to the TATA box. These additional transcriptional regulatory elements often function in a tissue- or cell-specific manner.
In expression of recombinant proteins, it is particularly important to select regulatory DNA which includes the promoter TATA sequence and additional regulatory elements compatible with the host cell's transcriptional machinery. For this reason, regulatory DNA endogenous to the host cell of choice is generally preferred. Alternatively, considerable success has been achieved using regulatory DNA derived from viral genomic sequences in view of the broad host range of viruses in general and the demonstrated activity of viral regulatory DNA in different cell types. Well known and routinely utilized viral regulatory DNAs for recombinant protein expression include, for example, the SV40 early gene promoter/enhancer [Dijkema, et al., EMBO J. 4:761 (1985)], Rous sarcoma virus long terminal repeat DNA [Gorman, et al., Proc. Natl. Acad. Sci. (USA) 79:6777 (1982b)], bovine papillomavirus fragments [Sarver, et al., Mol. Cell. Biol. 1:486 (1981)] and human cytomegalovirus promoter/enhancer elements [Boshart et al., Cell 41:521 (1985)]. Despite the broad range of cell types in which viral regulatory DNAs have been demonstrated to be functional, it is possible that non-viral promoter/enhancer DNA elements exist which permit elevated transcription of recombinant proteins in specific cell lines through more efficient use of host cell transcriptional machinery.
Thus there exists a need in the art to identify promoter/enhancer regulatory DNA sequences which function in homologous and heterologous cell types to increase recombinant protein expression and provide a high yield of a desired protein product. Of particular importance is the need to identify such promoter/enhancer regulatory DNA which can be utilized most efficiently in mammalian cells in order to increase production of recombinant proteins in vitro which are glycosylated in a manner akin to glycosylation patterns which result from in vivo protein expression. Proteins expressed in this manner and administered therapeutically or prophylactically are less likely to be antigenic and more likely to be physiologically active. Regulatory DNA sequences of this type are also amenable to being inserted into host cells in order to increase expression of genes endogenous to the host cells or genes previously introduced into the genome of the host cell by techniques well known and routinely practiced in the art.