A. The CPD Glycosylase DNA Repair Enzymes
DNA repair is a function essential to every living organism. A DNA repair enzyme is a protein that contributes to the restoration of damaged DNA to its native state. See Ronen and Glickman, 2001; Sancar et al., 2004; J. A. Nickoloff and M. F. Hoekstra, eds., DNA Damage and Repair, Volumes I and II, 1998. The types of DNA repair enzymes found in human and mammalian cells generally are found in other organisms. Additional types of DNA repair enzymes, such as photolyases, may be found in non-mammalian organisms.
Particularly important DNA repair enzymes are the cyclobutane pyrimidine dimer glycosylases, known as CPD glycosylases. For example, the CPD glycosylase product of the bacteriophage T4 denV gene, specifically known as T4 endonuclease V, has become an important protein with wide application in research and in emerging pharmaceutical products (see U.S. Pat. Nos. '211, '231, '389).
The CPD glycosylases have a specificity for cyclobutane pyrimidine dimers in DNA in which, as currently understood in the art, the glycosylase action is mediated via an imino intermediate between the C1′ of the sugar of the DNA and an amino group in the glycosylase, followed typically by a β-elimination reaction resulting in cleavage of the phosphodiester bond of the DNA. As also currently understood in the art, the protein releases the 5′ pyrimidine of the cyclobutane pyrimidine dimer from the sugar of the DNA (hence the term glycosylase), which creates an apyrimidinic site in the DNA which is very sensitive to hydrolysis under alkaline conditions. CPD glycosylases have been identified in many different organisms, in some cases by purification, in other cases by detecting the unique activity of this protein in an extract, and in still other cases, the presence of the protein has been deduced from the close homology of the putative amino acid sequence of the protein coded by a nucleotide sequence to known CPD glycosylase proteins.
Attempts to express the CPD glycosylase, T4 endonuclease V, in large amounts have been disappointing. After bacteriophage T4 DNA enters its E. coli host, the denV gene is transcribed to produce T4 endonuclease V only within the first 2 minutes of infection. The first DNA clones containing the denV gene were unstable (Lloyd and Hanawalt, 1981). Stable cloning of the entire intact gene has proved to be impossible because the nucleotide sequences in this region of the phage genome are lethal to the host, and scientists have speculated that this is likely because the endogenous promoter is too strong for constitutive expression in E. coli (Valerie et al., 1986a; Chenevert, 1986). In fact, sequence analysis of the early gene promoters, including the denV promoter, shows that their consensus sequence is significantly different from the host consensus promoter sequence derived from analysis of 112 E. coli genes (Liebig and Rüger, 1989). These authors note that “strong promoters can be cloned stably only when one or several strong terminators of transcription are inserted downstream from the cloning site” (p532).
Ultimately the native denV structural gene was cloned by piecing together fragments of the gene without its native promoter (Valerie et al., 1984; Radany et al., 1984). The sequence is listed in the National Center for Biotechnology Information Gene database as GeneID 1258606, and it is the native structural gene sequence without the promoter that we refer to herein as the native denV gene.
Once cloned, the native denV gene was placed downstream of various promoters. In one case, the gene was positioned downstream of the λ phage leftward operator and the rightward promoter (OLPR; Recinos et al., 1986). Expression from the unrepressed promoter in a variety of E. coli host strains resulted in T4 endonuclease V levels no more than 0.2% of total cellular protein. Growth temperatures above 25° C. and glucose levels above 0.05% inhibited denV gene expression, although these conditions are ordinarily beneficial to the host cell growth.
In a second case, the native denV gene was placed under the control of the E. coli TAC promoter, whose expression is induced by the lactose analog isopropyl-thiogalactoside (IPTG; Chenevert et al., 1986). Induction of T4 endonuclease V from this expression vector in E. coli produced filamentation and cell death. Chenevert et al. report that the denV protein was 10% of the total cell protein based on a single gel (Chenevert et al., 1986: FIG. 5). In connection with the development of the present invention, more than a dozen fermentations at commercial scale showed that this expression vector is only able to produce T4 endonuclease V at about 1% of total protein (see Table 8 of Example 4 below).
Chenevert et al. also cloned the native denV gene under the control of the yeast GAL1 and ADH promoters; however, they reported neither the efficiency of expression of T4 endonuclease V nor the effect on cell growth in induced yeast (Chenevert et al., 1986). Significantly, they reported that the construct made wild-type yeast more sensitive to UV, not more resistant as would be expected if the expressed enzyme were compatible with cell viability. Valerie et al. (1986b) cloned the native denV gene under the control of the yeast AAH5 promoter, and they found that the induced construct did not change the growth characteristics or cell morphology of the yeast host. They reported that the T4 endonuclease was estimated to be “several percent” of total protein by scanning of a single gel (Valerie et al. 1986b: FIG. 2), but this may be an overestimate since the gel is clearly overloaded with protein. They found that the construct in UV-sensitive yeast recombinants increased UV resistance.
The native denV gene has also been placed under the control of a CaMV 35S promoter and transferred into tobacco plants (Lapointe et al., 1996). The amount of T4 endonuclease V produced in the plant was not reported. As in the case of some yeast systems, the construct increased, rather than decreased, the sensitivity of the plants to UV and alkylating agents, i.e., the construct lacked DNA repair activity.
Attempts to improve T4 endonuclease V by changing the native denV gene nucleic acid sequence, resulting in a change in the amino acid sequence, have not met with success. For over a decade, the Lloyd laboratory modified the T4 endonuclease V amino acid sequence either by recombinant selection or by site-directed mutagenesis. Dodson and Lloyd (1989) summarized the effect of changes in the amino acid sequence on T4 endonuclease V activity. None of the dozens of recombinants that were reviewed showed greater activity than the native denV sequence, and many proved to be unstable in the cell.
Other laboratories, such as the Henderson laboratory (Green et al., 1993) and the Ohtsuka laboratory (Ishida et al., 1990; Doi et al., 1992; Hori et al., 1992) made numerous changes in the sequence of the native denV gene, resulting in substitutions in the amino acid sequence of the enzyme, all without significantly increasing the enzyme activity. The single exception from the Lloyd laboratory was a change in tyrosine at position 129 to a less polar aromatic amino acid, such as phenylalanine (see U.S. Pat. No. 5,308,762). This resulted in an increase in specific activity, but only under conditions of low salt. As these are not physiological conditions, this recombinant enzyme has no therapeutic value.
Attempts have been made to change the nucleotide sequence of the native denV gene without altering the amino acid sequence, but they have been unsuccessful. The first attempt was to change two AUA isoleucine codons (nucleotides 103-105, 137-139) to the AUC triplet, which did not change expression levels (Recinos et al., 1986). The Ohtsuka laboratory constructed a synthetic denV gene. They did not change the amino acid sequence, but did change the nucleotide sequence for the purpose, in their words, “to introduce extra restriction sites and to facilitate enzymatic rejoining with DNA ligase by avoiding self-complementary joining sites” (Inaoka et al., 1989). The gene was cloned under control of the trp promoter, and after induction the amount of T4 endonuclease V was reported to be about 15% of the total protein.
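By way of illustration only, a change in nucleotide sequence that leaves the amino acid sequence unaltered, such as the AUA-to-AUC substitution described above, can be sketched as follows. The toy sequence and the helper names `translate` and `recode` are hypothetical; the sequence is not the denV sequence, and the sketch does not represent the method used in any of the cited work. Only the codon assignments, which are taken from the standard genetic code, are factual.

```python
# Partial codon table from the standard genetic code; it holds only the
# codons needed for this illustration.
CODON_TABLE = {
    "AUA": "Ile", "AUC": "Ile", "AUU": "Ile",
    "AUG": "Met", "GCU": "Ala", "AAA": "Lys",
}


def split_codons(mrna):
    """Split an in-frame mRNA string into a list of three-base codons."""
    return [mrna[i:i + 3] for i in range(0, len(mrna), 3)]


def translate(mrna):
    """Translate an mRNA string into a list of amino acid names."""
    return [CODON_TABLE[codon] for codon in split_codons(mrna)]


def recode(mrna, old, new):
    """Replace every in-frame occurrence of codon `old` with codon `new`."""
    return "".join(new if codon == old else codon for codon in split_codons(mrna))


# Hypothetical toy sequence for illustration; NOT the denV sequence.
original = "AUGAUAGCUAUAAAA"
recoded = recode(original, "AUA", "AUC")

# The nucleotide sequence differs, but the encoded amino acids do not.
assert original != recoded
assert translate(original) == translate(recoded)
```

Because AUA and AUC both encode isoleucine, the recoded sequence specifies the same protein; whether such a substitution affects expression level is a separate, empirical question.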
Standard expression systems for T4 endonuclease V or even Inaoka et al.'s system (collectively, systems which produce T4 endonuclease V at a level less than or equal to 15 percent of total protein) are unsatisfactory for commercial purposes, e.g., pharmaceutical purposes. Host E. coli cells contain the HU-α protein, which is very similar in size and charge to T4 endonuclease V, and this protein contaminates preparations of T4 endonuclease V. This contaminant is not sufficiently removed by standard purification methods used in industrial scale purification, such as size exclusion or ion exchange. The resulting preparations inevitably contain >10% HU-α contamination.
The commercial practitioner is thus faced with a dilemma: it can increase purity by reducing the yield, but this makes the product too costly for commercialization; or it can increase the yield by reducing the purity, but this makes the product unacceptably contaminated for pharmaceutical use.
B. High Expression and the Production of Inclusion Bodies
In addition to the foregoing difficulties, high expression of proteins in host cells causes the proteins to be expressed as inclusion bodies. These are precipitated or coagulated accumulations of inactive proteins that are insoluble. Inclusion bodies therefore are lost to further purification, and the increased production of a protein from an expression vector is of no value if the proteins are not recovered and/or they are not in active form. Inclusion bodies are a significant problem for high expression vector systems.
Prior attempts to deal with the inclusion body problem have included the use of high hydrostatic pressure to refold protein aggregates (St. John et al., 1999; Hesterberg et al., 2005). This process is unsatisfactory for three main reasons: (1) the process uses volumes that are fixed by the size of the vessel, which is especially problematic in scaling up to commercial volumes, because each increase in scale requires that a new vessel be built and validated; (2) the high pressure of the hydrostatic pressure method (29,000 psi) is more difficult and costly to achieve and maintain than lower pressures; and (3) the hydrostatic method still requires that the cells be lysed before processing.
As recently discussed by Kathy Liszewski in the Oct. 15, 2003 edition of Genetic Engineering News, there is no universal solution to the inclusion body problem. The current state-of-the-art approach is to recover the inclusion bodies and attempt to refold them. Liszewski quotes Paul Haney, Ph.D., senior research scientist at Pierce Biotechnology, as saying, “Refolding proteins can be a cumbersome and time-consuming task, since refolding conditions have to be optimized for each protein in order to promote formation of the native fold and to prevent protein aggregation. There is no universal refolding buffer system.” The solution thus must be found for each individual protein, and there is no guarantee that a solution can be found.
C. Summary of the State of the Art
As shown by the history of the cloning of T4 endonuclease V set forth above and as discussed by Dr. Claes Gustafsson in Genetic Engineering News, 2005, the design of expression vectors is not at all an exact or routine undertaking. Consequently, high expression vectors for CPD glycosylases and, in particular, for T4 endonuclease V (i.e., expression vectors capable of producing the protein at levels equal to or greater than 25% of cellular protein) have not existed in the art. Alterations in the amino acid sequence have failed to increase yield or activity of T4 endonuclease V. Similarly, alterations in the nucleotide sequence coding for T4 endonuclease V have not produced significantly increased yield or activity.
The real and perceived barriers to high expression and recovery of CPD glycosylases have been:
(1) The art has not recognized that nucleotide sequences for CPD glycosylases have been optimized by evolution for expression at low levels consistent with the other functions of the cell. For example, the denV gene sequence has been optimized by evolution for the fastest and highest level of expression only during the early phase of phage T4 infection and for shutdown thereafter.
(2) The art has erroneously believed that high levels of CPD glycosylases, e.g., T4 endonuclease V, cannot be achieved because supposedly such levels are lethal to the cell, e.g., E. coli. 
(3) Proteins expressed at high levels typically form inclusion bodies, which are insoluble and inactive, and the art has not developed efficient and economical techniques for recovering activity from proteins in inclusion bodies, and, in particular, recovering DNA repair activity from CPD glycosylase proteins in inclusion bodies.