Alteration of the genetic makeup of a host cell is the modus operandi of current biotechnology. By suitable modification, such host cells may be caused to produce protein sequences normally unavailable in large quantity, for example, the fibroblast or leukocyte interferons; or their metabolism may be altered so as to induce them to perform some unaccustomed function such as, for example, the conversion of starch to simple sugar. A number of techniques for altering the genetic makeup of a host organism are known, including transduction and mutation. By far the most useful technique for specific control of such genetic alteration, however, is transformation of a host cell with a suitable recombinant vector, typically a plasmid. The transformation conditions used vary with the host cell. Such conditions do result in transformed cells; however, the frequency of transformation may be relatively low, i.e., about 1 cell per 10.sup.3 for procaryotic hosts, and from one cell per 10.sup.7 to one cell per 10.sup.2 for eucaryotic hosts. Therefore, it is essential that a selection procedure be available to screen the population of host cells for the relatively few transformants. Typically this is achieved by including a sequence encoding one or more markers on the transforming vector. Such a marker confers a phenotypic characteristic which permits successful transformants to grow under conditions which are not capable of supporting the growth of the non-transformed cells.
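The need for selection follows directly from the frequencies quoted above, and the arithmetic can be sketched as follows. This is an illustration only: the function name and the figure of 10.sup.8 treated cells are assumptions chosen for the example and do not come from any cited work. Frequencies are written as "one cell per N" and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expected_transformants(cells_treated: int, one_in: int) -> Fraction:
    """Expected number of transformants when one cell per `one_in` is transformed."""
    return Fraction(cells_treated, one_in)

# Hypothetical population size, chosen only for illustration.
CELLS_TREATED = 10**8

# Frequencies quoted in the text, expressed as "one cell per N".
procaryotic = expected_transformants(CELLS_TREATED, 10**3)
eucaryotic_low = expected_transformants(CELLS_TREATED, 10**7)
eucaryotic_high = expected_transformants(CELLS_TREATED, 10**2)

for label, n in [("procaryotic", procaryotic),
                 ("eucaryotic (low)", eucaryotic_low),
                 ("eucaryotic (high)", eucaryotic_high)]:
    untransformed = CELLS_TREATED - n
    print(f"{label}: {n} transformants among {untransformed} untransformed cells")
```

At the low end of the eucaryotic range, only about ten transformants would be expected in a background of roughly 10.sup.8 untransformed cells, which is why screening without a selectable phenotype is impractical.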
Markers useful in procaryotic systems are well known and their use has become substantially routine. Such markers include Amp.sup.R, a DNA sequence which encodes .beta.-lactamase, an enzyme capable of degrading ampicillin, thus permitting the organism to grow in media containing the antibiotic; Tet.sup.R, an analogous sequence for protection against tetracycline; and gene sequences encoding proteins which confer resistance to chloramphenicol, neomycin, and a number of other antibiotics. Plant cell transformations have often relied on infection with vectors such as those associated with Agrobacterium tumefaciens which confer selectable (but undesirable) traits on the transformants. These traits include overproduction and premature production of plant hormones, so that tumors result.
Selectable markers which are useful in other eucaryotic cells are not quite so well established. Some well known markers include the LEU2 gene which encodes the protein, .beta.-isopropyl malate dehydrogenase, and thus permits a host organism which is normally incapable of synthesizing leucine (most commonly a (LEU2.sup.-) yeast mutant) to grow in the absence of leucine; a herpes virus thymidine kinase marker (TK) which is a sequence encoding this enzyme essential for DNA synthesis and thus permits the growth of mutant (tk.sup.-) organisms otherwise deficient in it; a sequence which encodes dihydrofolate reductase (DHFR) which permits growth in DHFR deficient strains; and xanthine-guanosine ribosyl transferase (XGRT), which similarly replaces a deficiency in this enzyme. Such markers were employed in the processes for transforming eucaryotic cells disclosed by Axel, et al, in U.S. Pat. No. 4,399,216.
It will immediately be noted that eucaryotic markers typically exert their effect by replacing a deficiency in the host, in the case of mammalian or yeast cells, or by conferring undesirable characteristics in the case of plants. Thus, these markers are not suitable for selecting transformants from a population of wild-type eucaryotic cells. This is clearly disadvantageous, as it confines recombinant manipulations to laboratory strains of eucaryotes and cultures of mammalian cells which are in some way abnormal. It precludes their use in industrial eucaryotes, such as ordinary baker's yeast, and forces recombinant production of protein in mammalian or plant cells either to employ hosts which have (often undesirable) abnormalities, such as a malignancy, or to confer undesirable characteristics on the transformed host.
This deficiency has particular impact with respect to attempts to utilize recombinant techniques in industrial strains of yeast. There is a reservoir of commercial experience in handling these strains, and they have been developed to have desirable fermentation properties, e.g., high level ethanol production, formation of desirable secondary metabolic products; to have desirable physical properties, e.g., flocculation; and to grow on inexpensive minimal nutrients. The lack of a dominant (to wild type) selectable marker operable in cells confines the yeast recombinant technology to the more sensitive and fastidious laboratory strains and effectively constitutes a bar to commercial utilization of recombinant yeast.
In summary, the recombinant techniques which have been, in the past decade, used successfully to transform procaryotic hosts with expression systems for desired protein products or desired metabolic characteristics, and which have relied on selection of transformants using co-transforming markers, could be extended to eucaryotic systems, including plants, mammals, and industrial yeasts, if an effective marker permitting selection of successfully transformed wild type hosts were available. Such a marker would permit selection of eucaryotic transformed hosts capable of, for example, producing any desired proteins. Desired proteins would include, among others, the interferons, hormones, enzymes, growth factors such as PDGF or CSF, toxin intermediates, or antigenic determinants for the manufacture of vaccines.
Recently, a selectable marker which is effective in conferring a phenotype useful to distinguish transformed wild-type from untransformed wild-type cells has been studied. An aminoglycoside antibiotic, G418, is toxic not only to bacteria but to yeast, plant, and mammalian cells. Thus, a wide range of cells is unable to grow in its presence, absent a protective enzyme. The antibiotic is inactivated (via phosphorylation) by the enzyme activity referred to as aminoglycoside phosphotransferase (APH). Two different enzymes with this activity are known, APH-I and -II. The coding sequences for these enzymes are located on transposons Tn601 (also known as Tn903) (Sharp, P. A., et al, J Mol Biol (1973) 75:235), and on Tn5 (Jorgensen, R. A., et al, Mol Gen Genet (1979) 177:65), respectively. These two enzymes are unrelated except for their ability to inactivate G418, which they, in fact, effect with differing efficiencies; APH-I is approximately four times as effective as APH-II. In view of the toxicity of G418 to a wide variety of cells, the coding sequence for either of these enzymes becomes a candidate dominant selectable marker for use in a wide variety of hosts. The coding sequence for APH-I is, in fact, known and a 271 amino acid sequence for the enzyme has been deduced from it (Oka, A. et al, J Mol Biol (1981) 147:217).
Indeed, a number of workers have cloned systems which permit the expression of these sequences in non-bacterial hosts. Southern, P. J., et al, J Mol Appl Genetics (1982) 1:327 showed that mammalian cells transformed with vectors containing the coding sequence for APH-II, presumably under control of an SV40 viral promoter, acquired resistance to G418. Colbere-Garapin, F., et al, J Mol Biol (1981) 150:1 disclosed expression of the APH-II coding sequence in tk.sup.- cells and in monkey and human cell lines. The vectors used for transformation contained the TK promoter, presumably in a position to affect expression levels. However, the bacterial promoter was also retained when tk.sup.- cells were used as hosts (though not when normal cells were used); transformation efficiency was extremely low in non-tk.sup.- cells.
Jimenez, A., et al, Nature (1980) 287:869 presumably achieved some expression of the APH-I gene in yeast by co-transforming the Leu.sup.- host Saccharomyces cerevisiae with a mixture of pYE13 containing a LEU2 marker and a ColEI derivative plasmid carrying the desired APH-I gene sequence, presumably under control of sequences indigenous either to ColEI or to the enzyme encoding segment. Expression was shown only following selection for Leu.sup.+. Thus, Jimenez did not disclose a method to utilize the APH-I gene as a selection tool. Whether or not selection could have been made using G418 resistance as a criterion is unclear. No effort was made by Jimenez to place the coding sequence under the control of a yeast promoter. It would thus be unlikely that the gene expression levels would be sufficient to provide amounts of protein effective against selection pressure (as opposed to the ability of selected cultures to exhibit G418 resistance).
Fraley, R. T., et al, Proc Natl Acad Sci (U.S.A.) (1983) 80:4803 describe the utilization of the coding sequence from the APH-I gene under control of a promoter derived from a bacterium capable of infecting plants, and terminated with a 3' untranslated terminating sequence also active in plants. Presumably under such control, expression of the gene for this enzyme, and also for the related APH-II analog, was achieved in petunia cells. Finally, U.K. patent application No. GB2100738A, published Jan. 6, 1983, discloses expression of both the gene encoding APH-I and that encoding a protein which confers resistance to hygromycin B under the control of an SV40 promoter. Expression was achieved both in yeast and in mammalian mouse Ltk- cells.
Webster, T. D., et al, Gene (1983) 26:243 disclose the use of native APH-I as a selection marker useable after 18 hours from transformation in laboratory yeast strains.
In all of the above cases, the coding sequence for the desired G418 resistance gene was preceded by an uncontrolled number of nucleotides between the natural ATG start codon and the control sequences and, indeed, the nearest upstream convenient restriction site which might permit manipulation with respect to such control sequences. As a result, random occurrences of ATG codons and TGA, TAA, or TAG termination codons in various reading frames in this preceding sequence interfere with a precise reproducible translation of the desired sequence. In addition, because the start codon is not immediately downstream from a convenient restriction site, it has not been possible to synthesize fusion proteins which contain the APH-I sequences at the C-terminal end. Thus, it has not been possible to use such a construction as a fusion flag whereby, analogous to the situation with .beta.-galactosidase, a selectable characteristic is conferred on a desired fusion protein. Such selectable fusion proteins are useful in optimizing the expression of the coding sequence for desired proteins fused to the N-terminal end of the flag, as well as in stabilizing peptide products which may be heterogeneous and unfamiliar to the host organism. If the techniques and vectors for so utilizing it were available, the employment of a marker such as APH-I in a fusion sequence would permit it to be used as an extremely convenient "affinity signal" for the purification of the complete fusion protein from the rest of the cellular milieu. In addition, such a fusion sequence may be used to confer added immunogenicity on the desired short peptide immediately preceding it.
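The interference described above, from stray ATG and termination codons in the sequence preceding the natural start codon, can be illustrated with a short scan of such a leader in all three reading frames. This is a minimal sketch under stated assumptions: the function name is hypothetical and the example leader is an invented string, not a sequence from Tn601, Tn5, or any cited vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def scan_leader(leader: str) -> dict:
    """Find ATG and stop codons in the sequence preceding the natural start codon.

    `leader` is the DNA between the control sequences and the natural ATG
    (the ATG itself is excluded). Each hit is reported as (offset, frame),
    where frame 0 means the triplet is in register with the natural ATG
    that begins immediately after the leader.
    """
    seq = leader.upper()
    hits = {"ATG": [], "stop": []}
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        # Distance from this triplet to the natural ATG, modulo 3.
        frame = (len(seq) - i) % 3
        if codon == "ATG":
            hits["ATG"].append((i, frame))
        elif codon in STOP_CODONS:
            hits["stop"].append((i, frame))
    return hits

# Invented 11-nucleotide leader, for illustration only.
example = scan_leader("CCATGGTAACC")
print(example)
```

In this invented leader the scan reports an upstream ATG in frame 0, which could start translation ahead of the intended start codon, and a TAA in another frame. A cassette with a restriction site immediately preceding the start codon leaves no such leader to scan.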
Thus, what is desirable over and above the constructions now known in the art is a dominant selectable marker cassette which has restriction sites immediately preceding its start codon so as to permit, first, efficient construction of expression vectors and, second, production of fusion products. These capabilities will permit efficient selection of transformants in wild type eucaryotic systems, enable precise control of the expression of genes for desired sequences (as N-terminal sequences of the fusion proteins) in both procaryotic and eucaryotic hosts and (also as fusion proteins) will provide desired proteins with stabilization, immunogenicity, and desirable characteristics for purification.