The level of production of a protein in a host cell is determined by three major factors: the number of copies of its structural gene within the cell, the efficiency with which the structural gene copies are transcribed, and the efficiency with which the resulting messenger RNA ("mRNA") is translated. The transcription and translation efficiencies are, in turn, dependent on nucleotide sequences which are normally situated ahead of the desired structural gene or coding sequence. These nucleotide sequences (expression control sequences) define, inter alia, the location at which the RNA polymerase binds to initiate transcription (the promoter sequence; see also EMBO J. 5, 2995-3000 [1986]) and the location at which the ribosomes bind and interact with the mRNA (the product of transcription) to initiate translation.
Not all expression control sequences have the same efficiency. It is therefore often advantageous to separate the specific coding sequence for a desired protein from its adjacent nucleotide sequences and to link it with other expression control sequences to achieve a higher expression rate. After this has been accomplished, the newly combined DNA fragment can be inserted into a plasmid having a high copy number, or into a derivative of a bacteriophage, to increase the number of structural gene copies within the cell, which at the same time improves the yield of the desired protein.
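The interplay of the three factors named above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the factors combine multiplicatively; the function name `relative_expression` and all numerical values are hypothetical and are not taken from the cited literature.

```python
def relative_expression(copy_number: float,
                        transcription_efficiency: float,
                        translation_efficiency: float) -> float:
    """Toy estimate of relative protein production.

    Assumption (not from the source): the three factors are independent
    and combine as a simple product.
    """
    factors = (copy_number, transcription_efficiency, translation_efficiency)
    if min(factors) < 0:
        raise ValueError("all factors must be non-negative")
    return copy_number * transcription_efficiency * translation_efficiency


# Hypothetical values: a single gene copy under its native control sequences.
native = relative_expression(copy_number=1,
                             transcription_efficiency=1.0,
                             translation_efficiency=1.0)

# Hypothetical values: the same coding sequence linked to stronger expression
# control sequences and carried on a high copy number plasmid.
engineered = relative_expression(copy_number=50,
                                 transcription_efficiency=4.0,
                                 translation_efficiency=2.0)

fold_change = engineered / native
print(f"relative improvement: {fold_change:.0f}-fold")
```

Under this simplified model, improvements in copy number and in the control sequences multiply, which is one way to read why exchanging the expression control sequences and raising the copy number are described as being combined.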
Since the overproduction of even a normally nontoxic gene product is often harmful to the host cells and lowers the stability of a specific host cell-vector system, an expression control sequence should, in addition to improving the transcription and translation efficiency of a cloned gene, be regulatable, so that expression can be controlled during the growth of the microorganisms. Some regulatable expression control sequences can be switched off during the growth of the host cells and then switched on again at a desired point in time to favour the expression of large amounts of the desired protein.
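The benefit of switching expression on only after a growth phase can be sketched with a toy simulation. This is a deliberately crude model under stated assumptions: biomass multiplies by a fixed factor per time step, that factor is lower while the gene is expressed (representing the burden of overproduction), and product accumulates in proportion to the biomass present. The function `simulate` and every parameter value are hypothetical.

```python
def simulate(induce_at: int,
             steps: int = 10,
             growth_off: float = 2.0,
             growth_on: float = 1.1,
             yield_per_biomass: float = 1.0) -> float:
    """Return total product accumulated over `steps` time steps.

    Assumptions (illustrative only):
    - expression is off before step `induce_at` and on from then onward;
    - biomass grows by `growth_off` per step while expression is off and
      by the smaller factor `growth_on` while it is on;
    - product formed per step is proportional to the current biomass.
    """
    biomass = 1.0
    product = 0.0
    for t in range(steps):
        if t >= induce_at:
            # Expression switched on: product forms, growth is slowed.
            product += yield_per_biomass * biomass
            biomass *= growth_on
        else:
            # Expression switched off: cells grow unburdened.
            biomass *= growth_off
    return product


always_on = simulate(induce_at=0)       # expressed throughout growth
late_induction = simulate(induce_at=7)  # grown first, then switched on

print(f"always on:      {always_on:.1f}")
print(f"late induction: {late_induction:.1f}")
```

With these assumed parameters the late-induction run accumulates more product than the always-on run, because the culture first reaches a high biomass before bearing the cost of expression. Other parameter choices can change the outcome, so the sketch illustrates the reasoning rather than predicting any real system.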
Various expression control sequences which fulfill the previously mentioned conditions have been used for the expression of DNA sequences and genes which code for desired proteins. Such expression control sequences are known, for example, from Science 198, 1056-1063 (1977) (Itakura et al.), Proc. Natl. Acad. Sci. U.S.A. 76, 106-110 (1979) (Goeddel et al.), Nature 283, 171-174 (1980) (Emtage et al.), Science 205, 602-607 (1979) (Martial et al.), Gene 5, 59-76 (1979) (Bernard et al.), Gene 25, 167-178 (1983) (Amann et al.), Proc. Natl. Acad. Sci. U.S.A. 80, 21-25 (1983) (de Boer et al.) and from European Patent Applications Publication Nos. 41767 and 186069.