This invention relates to the application of recombinant DNA technology to the production of polypeptide in vertebrate cell cultures. More specifically, this invention relates to utilizing the coding sequence for a secondary control polypeptide as a tool in controlling production of a foreign polypeptide by the vertebrate cell culture.
The general principle of utilizing a host cell for the production of a heterologous protein--i.e., a protein which is ordinarily not produced by this cell--is well known. However, there are many technical difficulties in obtaining reasonable quantities of the heterologous protein from vertebrate host cells, which are desirable by virtue of their properties with regard to handling the protein formed. There have been a number of successful examples of incorporating genetic material coding for heterologous proteins into bacteria and obtaining expression thereof. For example, human interferon, desacetyl-thymosin alpha-1, somatostatin, and human growth hormone have been thus produced. Recently, it has been possible to utilize non-bacterial hosts such as yeast cells (see, e.g., co-pending application, U.S. Ser. No. 237,913, filed Feb. 25, 1981) and vertebrate cell cultures (U.S. application Ser. No. 298,235, filed Aug. 31, 1981) as hosts. The use of vertebrate cell cultures as hosts in the production of mammalian proteins is advantageous because such systems have additional capabilities for modification, glycosylation, addition of transport sequences, and other subsequent treatment of the resulting peptide produced in the cell. For example, while bacteria may be successfully transfected and caused to express "alpha thymosin", the polypeptide produced lacks the N-acetyl group of the "natural" alpha thymosin found in mammalian systems.
In general, the genetic engineering techniques designed to enable host cells to produce heterologous proteins include preparation of an "expression vector", which is a DNA sequence containing:
(1) a "promoter", i.e., a sequence of nucleotides controlling and permitting the expression of a coding sequence;
(2) a sequence providing mRNA with a ribosome binding site;
(3) a "coding region", i.e., a sequence of nucleotides which codes for the desired polypeptide;
(4) a "termination sequence" which permits transcription to be terminated when the entire code for the desired protein has been read; and
(5) if the vector is not directly inserted into the genome, a "replicon" or origin of replication which permits the entire vector to be reproduced once it is within the cell.
In the construction of vectors in the present invention, the same promoter controls two coding sequences, one for a desired protein, and the other for a secondary protein. Transcription termination is also shared by these sequences. However, the proteins are produced in discrete form because they are separated by a stop and start translational signal.
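The arrangement just described, one transcript carrying two coding sequences separated by stop and start translational signals, can be illustrated with a minimal sketch. This is a toy model only, not the method of the invention: the function name `translate_cistrons` and the example sequence are hypothetical, the sequence is written in DNA-sense letters (T rather than U), and the model simply assumes that translation begins at each ATG found after the previous stop codon, ignoring the real mechanics of ribosome initiation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_cistrons(transcript: str) -> list[str]:
    """Return one polypeptide per start...stop unit found along the transcript.

    Assumes an uppercase A/C/G/T string. Scanning resumes after each stop
    codon, so adjacent coding sequences yield discrete polypeptides.
    """
    proteins = []
    scan_from = 0
    while True:
        start = transcript.find("ATG", scan_from)
        if start == -1:
            break
        peptide = []
        pos = start
        terminated = False
        while pos + 3 <= len(transcript):
            residue = CODON_TABLE[transcript[pos:pos + 3]]
            pos += 3
            if residue == "*":  # stop signal ends this polypeptide
                terminated = True
                break
            peptide.append(residue)
        if not terminated:
            break  # ran off the end without a stop; discard the fragment
        proteins.append("".join(peptide))
        scan_from = pos  # look for the next start signal downstream
    return proteins


# Hypothetical two-cistron transcript: leader, ORF 1, spacer, ORF 2, trailer.
example = "GCC" + "ATGGCTTGGTAA" + "CCGG" + "ATGAAAGATTAG" + "TT"
print(translate_cistrons(example))
```

In this sketch the two short reading frames are reported as separate entries rather than as one fused chain, which is the point of placing a stop signal and a fresh start signal between the sequences.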
Ordinarily, the genetic expression vectors are in the form of plasmids, which are extrachromosomal loops of double-stranded DNA. These are found in natural form in bacteria, often in multiple copies per cell. However, artificial plasmids can also be constructed (and these, of course, are the most useful) by splicing together the four essential elements outlined above in proper sequence using appropriate "restriction enzymes". Restriction enzymes are nucleases whose catalytic activity is limited to cleaving at a particular base sequence, each base sequence being characteristic for a particular restriction enzyme. By artful construction of the terminal ends of the elements outlined above (or fractions thereof), restriction enzymes may be found which permit these elements to be spliced together to form a finished genetic expression vector.
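The sequence specificity of restriction enzymes can be sketched as follows. The example is illustrative only and makes several assumptions not found in the text above: it uses EcoRI, which recognizes GAATTC and cuts between the G and the first A, as the example enzyme; it treats the DNA as linear; and it follows only one strand. The function name `digest` and the test sequence are hypothetical.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear, single-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    fragment_start = 0
    idx = sequence.find(site)
    while idx != -1:
        cut = idx + cut_offset
        fragments.append(sequence[fragment_start:cut])
        fragment_start = cut
        idx = sequence.find(site, idx + 1)
    fragments.append(sequence[fragment_start:])  # remainder after the last cut
    return fragments


# Hypothetical sequence containing two recognition sites.
print(digest("AAGAATTCTTGAATTCGG"))
```

Because every fragment produced by the same enzyme carries the same characteristic end, elements prepared with matching ends can be joined in a defined order, which is what the construction of the terminal ends described above relies on.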
It then remains to induce the host cell to incorporate the vector (transfection), and to grow the host cells in such a way as to effect the synthesis of the polypeptide desired as a concomitant of normal growth.
Two typical problems are associated with the above-outlined procedure. First, it is desirable to have in the vector, in addition to the four essential elements outlined above, a marker which will permit a straightforward selection for those cells which have, in fact, accepted the genetic expression vector. In using bacterial cells as hosts, frequently used markers are resistance to an antibiotic such as tetracycline or ampicillin. Only those cells which are drug resistant will grow in cultures containing the antibiotic. Therefore, if the cell culture which has been sought to be transfected is grown on a medium containing the antibiotic, only the cells actually transfected will appear as colonies. As the frequency of transformation is quite low (approximately 1 cell in 10^6 being transfected under ideal conditions) this is almost an essential prerequisite as a practical matter.
For vertebrate cells as hosts, the transformation rate achieved is more efficient (about 1 cell in 10^3). However, facile selection remains important in obtaining the desired transfected cells. Selection is rendered important, also, because the rate of cell division is about fifty times lower than in bacterial cells--i.e., although E. coli divide once in about every 20-30 minutes, human tissue culture cells divide only once in every 12 to 24 hours.
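The figures quoted above can be put into a short worked calculation. This is a back-of-the-envelope sketch: the function names and the number of cells plated (10^7) are assumptions chosen for illustration, while the frequencies and division times are those stated in the text.

```python
def expected_transfectants(cells_plated: int, one_in: int) -> float:
    """Expected number of transfected cells at a frequency of 1 in `one_in`."""
    return cells_plated / one_in


def division_time_ratio(slow_hours: float, fast_minutes: float) -> float:
    """How many times longer the slow division time is than the fast one."""
    return (slow_hours * 60) / fast_minutes


cells = 10**7  # hypothetical number of cells exposed to the vector

# Bacterial hosts: about 1 cell in 10^6 under ideal conditions.
print(expected_transfectants(cells, 10**6))

# Vertebrate hosts: about 1 cell in 10^3.
print(expected_transfectants(cells, 10**3))

# Division times: E. coli 20-30 minutes, human culture cells 12-24 hours.
print(division_time_ratio(12, 30), division_time_ratio(24, 20))
```

Under these assumptions the ratio of division times runs from 24 to 72, which brackets the "about fifty times" stated above. In either host the untransfected cells vastly outnumber the transfected ones, so a selectable marker is still needed to recover the latter.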
The present invention, in one aspect, addresses the problem of selecting for vertebrate cells which have taken up the genetic expression vector for the desired protein by utilizing expression of the coding sequence for a secondary protein such as, for example, an essential enzyme in which the host cell is deficient. For example, dihydrofolate reductase (DHFR) may be used as a marker using host cells deficient in DHFR.
A second problem attendant on production of polypeptides in a foreign host is recovery of satisfactory quantities of protein. It would be desirable to have some mechanism to regulate, and preferably enhance, the production of the desired heterologous polypeptide. In a second aspect of the invention, a secondary coding sequence which can be affected by externally controlled parameters is utilized to allow control of expression by control of these parameters. Furthermore, provision of both sequences on a polycistron in itself permits selection of transformants with high expression levels of the primary sequence.
It has been shown that DHFR coding sequences can be introduced into, expressed in, and amplified in mammalian cells. Genomic DNA from methotrexate resistant Chinese Hamster Ovary (CHO) cells has been introduced into mouse cells and results in transformants which are also resistant to methotrexate (1). The mechanism by which methotrexate (MTX) resistance in mouse cells is developed appears to be threefold: through gene amplification of the DHFR coding sequence (2, 3, 4); through decrease in uptake of MTX (5, 6); and through reduction in affinity of the DHFR produced for MTX (7).
It appears that amplification of the DHFR gene through MTX exposure can result in a concomitant amplification of a cotransfected gene sequence. It has also been shown that mouse fibroblasts, transfected with both a plasmid containing hepatitis B DNA sequences, and genomic DNA from a hamster cell line containing a mutant gene for MTX-resistant DHFR, secrete increased amounts of hepatitis B surface antigen (HBsAg) into the medium when MTX is employed to stimulate DHFR sequence amplification (8). Further, mRNA coding for the E. coli protein XGPRT is amplified in the presence of MTX in CHO cells co-transfected with the DHFR and XGPRT gene sequences under control by independent promoters (9). Finally, increased expression of a sequence endogenous to the promoter in a DHFR/SV40 plasmid combination in the presence of MTX has been demonstrated (10).