This invention relates to new DNA-glycosylases, in particular new cytosine-, thymine- and uracil-DNA glycosylases, and their use for mutagenesis, for DNA modification and cell killing.
Damage to DNA arises continually throughout the cell cycle and must be recognised and repaired prior to the next round of replication to maintain the genomic integrity of the cell. DNA base damage can be recognised and excised by the ATP-dependent nucleotide excision repair systems or by base excision repair systems exemplified by the DNA glycosylases.
DNA glycosylases are enzymes that occur normally in cells. They release bases from DNA by cleaving the bond between deoxyribose and the base in DNA. Naturally occurring glycosylases remove damaged or incorrectly placed bases. This base excision repair pathway is the major cellular defence mechanism against spontaneous DNA damage.
DNA glycosylases which have been identified are directed to specific bases or modified bases. An example of a DNA glycosylase which recognizes an unmodified base is uracil DNA glycosylase (UDG), which specifically recognises uracil in DNA and initiates base excision repair by hydrolysing the N-C1′ glycosylic bond linking the uracil base to the deoxyribose sugar. This creates an abasic site that is removed by a 5′-acting apurinic/apyrimidinic (AP) endonuclease and a deoxyribophosphodiesterase, leaving a gap which is filled by DNA polymerase and closed by DNA ligase.
The activity of UDG serves to remove uracil which arises in DNA as a result of incorporation of dUMP instead of dTMP during replication or from the spontaneous deamination of cytosine. Deamination of cytosine to uracil creates a premutagenic U:G mismatch that, unless repaired, will cause a GC→AT transition mutation.
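The route from a deaminated cytosine to a fixed GC→AT transition may be traced in a minimal Python sketch. It assumes simple Watson-Crick pairing, treats uracil as templating adenine during replication, and ignores repair altogether; the function names are illustrative only and form no part of the described methods.

```python
# Illustrative sketch only: models base pairing, not enzyme kinetics or repair.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}  # U templates A


def deaminate(base):
    """Spontaneous deamination converts cytosine to uracil; other bases are unchanged."""
    return "U" if base == "C" else base


def replicate(template_base):
    """Return the base incorporated opposite the given template base."""
    return PAIR[template_base]


def trace_transition(top="C"):
    """Follow one base pair through deamination and two rounds of replication."""
    bottom = PAIR[top]                    # original pair, C:G
    damaged = deaminate(top)              # premutagenic U:G mismatch
    daughter = replicate(damaged)         # round 1: A incorporated opposite U
    granddaughter = replicate(daughter)   # round 2: T incorporated opposite A
    return (top, bottom), (damaged, bottom), (granddaughter, daughter)


original, mismatch, fixed = trace_transition()
print(original, mismatch, fixed)
```

In this simplified model the lineage descending from the deaminated strand carries a T:A pair where a C:G pair originally stood, which corresponds to the GC→AT transition referred to above.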
In vivo, UDGs specifically recognise and remove uracil from within DNA and cleave the glycosylic bond to initiate the uracil excision pathway. In vitro, UDGs can recognise and remove uracil from both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) substrates.
UDGs are ubiquitous enzymes and have been isolated from a number of sources. Amino acid sequencing reveals that the enzymes are conserved throughout evolution with greater than about 55% amino acid identity between human and bacterial proteins. A cDNA for human UDG has been cloned and the corresponding gene has been named UNG (Olsen et al. (1989) EMBO J., 8: 3121-3125).
The crystal structures of the human enzyme (Mol et al., (1995) Cell, 80: 869-878) and the herpes simplex virus enzymes (Sava et al. (1995) Nature, 373: 487-493) have recently been determined and reveal that uracil binds in a rigid pocket at the base of the DNA binding groove of human UDG. The absolute specificity of the enzyme for uracil over the structurally related DNA bases thymine and cytosine is conferred by shape complementarity, as well as main chain and side chain hydrogen bonds.
Although UDGs do not have activity against other bases as a result of the afore-mentioned specific spatial and charge characteristics of the active site, other glycosylases with different activities have been identified, which may or may not be restricted to single substrates.
A naturally-occurring thymine-DNA glycosylase has been identified which in addition to releasing thymine also releases uracil (Nedderman and Jiricny (1993) J. Biol. Chem., 268: 21218-21114; Nedderman and Jiricny (1994) Proc. Natl. Acad. Sci. U.S.A., 91: 1642-1646). This thymine-DNA glycosylase, however, has activity in respect of only certain substrates: it has an absolute requirement for a mismatched U or T opposite a G in a double-stranded substrate and will not recognise T or U in T(U):A matches or in a single-stranded substrate. DNA glycosylases which recognize and release unmodified bases other than uracil and thymine (in certain substrates, as mentioned above) have not been identified.
A DNA glycosylase recognizing unmodified cytosine has not been reported, although a 5-hydroxymethylcytosine-DNA glycosylase activity was detected in mammalian cells (Cannon et al. (1988) Biochem. Biophys. Res. Comm., 151: 1173-1179). The sequences of the afore-mentioned thymine and 5-hydroxymethylcytosine DNA glycosylases have not yet been reported and it is unknown whether their active site may be structurally related to UDG.
It has now surprisingly been found that the substitution of certain of the UDG amino acids has a profound effect on the substrate specificity of the glycosylase. In particular, the replacement of Asn204 by Asp204 results in the production of a mutant enzyme which has acquired cytosine-DNA glycosylase (CDG) activity, while retaining some UDG-activity. Alternatively, replacing Tyr147 with Ala147 allows for binding of thymine, resulting in an enzyme that has acquired thymine-DNA glycosylase (TDG) activity.
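By way of illustration only, a single-residue replacement of the kind described (e.g. Asn204 to Asp, or Tyr147 to Ala) may be represented in Python as below. The sketch assumes 1-based residue numbering and one-letter amino acid codes; the ten-residue sequence is a hypothetical toy example and is not the UDG sequence, and the function name is likewise an assumption.

```python
def substitute(protein, position, expected, replacement):
    """Return a copy of `protein` with the residue at 1-based `position` replaced.

    Raises ValueError if the position is out of range or if the residue found
    there is not the expected one, which guards against numbering mistakes.
    """
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside 1..{len(protein)}")
    found = protein[position - 1]
    if found != expected:
        raise ValueError(f"expected {expected} at {position}, found {found}")
    return protein[:position - 1] + replacement + protein[position:]


# Hypothetical toy sequence (NOT UDG): Tyr at position 5, Asn at position 7.
toy = "MGQKYLNSAE"
cdg_like = substitute(toy, 7, "N", "D")  # analogous to Asn204 -> Asp204
tdg_like = substitute(toy, 5, "Y", "A")  # analogous to Tyr147 -> Ala147
print(cdg_like, tdg_like)
```

Checking the expected residue before substituting is a design choice intended to catch off-by-one errors between numbering schemes, for instance when the numbering of a full-length protein differs from that of a truncated catalytic domain.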
These new DNA glycosylases are not product-inhibited by added uracil, in contrast to UDG and other UDG-mutants. Compared with the efficiency of wild type UDG in removal of uracil, the activity of the new DNA glycosylases that remove normal pyrimidines in DNA is low, but distinct and easily detectable. However, it should be noted that the very high turnover of UDG appears to be unique among DNA glycosylases and turnover numbers of other DNA glycosylases may be as low, or even lower than those of the engineered glycosylases CDG and TDG. This may result from the narrow substrate specificity of UDG.
Furthermore, an additional new UDG has been identified. The complete sequence of the UNG gene was recently published (Haug et al., 1996, Genomics, 36, p408-416). As mentioned previously, cDNA to this UNG gene has been identified by Olsen et al., 1989, supra (hereinafter referred to as UNG1 cDNA and the expressed protein referred to as UNG1). Other workers have described the location, gene structure and recombinant expression of UNG1 (Slupphaug et al., 1993, Nucl. Acids Res., Vol. 21, No. 11, p. 2579-2584; Haug et al., 1994, FEBS Letters, 353, p. 180-184; Slupphaug et al., 1995, Biochemistry, 34, p. 128-138, respectively). It has now surprisingly been found that alternative splicing of the genomic DNA (UNG) with an exon located 5′ of exon 1 which was not previously recognized results in a new distinct cDNA with an open reading frame of 313 amino acids. The new UNG cDNA is referred to hereinafter as UNG2 cDNA, and the product which it encodes, UNG2. The latter protein has a predicted size of 36 kDa.
UNG2 differs from the previously known form (UNG1, ORF 304 amino acid residues) in the 44 amino acids of the N-terminal presequence, which is not necessary for catalytic activity. The rest of the presequence and the catalytic domain, altogether 269 amino acids, are identical. The alternative presequence in UNG2 arises by splicing of a previously unrecognized exon (exon 1A) into a consensus splice site after codon 35 in exon 1B (previously designated exon 1). The UNG1 presequence starts at codon 1 in exon 1B and thus has 35 amino acids not present in UNG2. Coupled transcription/translation in rabbit reticulocyte lysates demonstrated that both proteins are catalytically active. Similar forms of UNG1 and UNG2 are expressed in mouse which has an identical organization of the homologous gene. Furthermore, the presequence of a putative Xiphophorus UNG2 protein predicted from the gene structure is homologous to mammalian UNG2, but much shorter, suggesting a very high degree of conservation from fish to man.
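The residue counts stated above can be checked for internal consistency with a short Python calculation. The variable names are illustrative; the figures are those given in the text (44 residues unique to UNG2, 35 residues unique to UNG1, and 269 residues common to both).

```python
# Residue counts taken from the description; names are illustrative only.
SHARED = 269        # rest of the presequence plus catalytic domain, identical in both
UNG2_UNIQUE = 44    # N-terminal residues encoded by exon 1A, present only in UNG2
UNG1_UNIQUE = 35    # residues from codons 1-35 of exon 1B, present only in UNG1

ung2_length = UNG2_UNIQUE + SHARED
ung1_length = UNG1_UNIQUE + SHARED

print(f"UNG2 ORF: {ung2_length} residues, UNG1 ORF: {ung1_length} residues")
```

The totals agree with the open reading frames of 313 amino acids (UNG2) and 304 amino acids (UNG1) stated in the description.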
The invention therefore provides a DNA glycosylase capable of releasing cytosine bases from single stranded (ss) DNA and/or double stranded (ds) DNA or thymine bases from both single stranded (ss) DNA and double stranded (ds) DNA or from single stranded (ss) DNA or uracil bases from single stranded (ss) DNA and/or double stranded (ds) DNA, wherein said uracil-DNA glycosylase is encoded by a nucleic acid molecule comprising the sequence (SEQUENCE I.D. No 1):
or a fragment thereof encoding a catalytically active product comprising at least nucleotides 121 to 130, preferably 71 to 202 in addition to the catalytic domain, or a sequence which is degenerate, substantially homologous with or which hybridizes with at least nucleotides 121 to 130, preferably 71 to 202 of any such aforesaid sequence.
In particular, viewed from one aspect, the invention can be seen as providing a cytosine-DNA glycosylase (CDG) capable of releasing cytosine bases from ssDNA and/or dsDNA.
A further aspect of the invention provides a cytosine-DNA glycosylase (CDG) capable of releasing both cytosine and uracil bases from ssDNA and/or dsDNA.
Preferably, the cytosine-DNA glycosylase is one derived from a UDG and especially from the human UDG protein which has Asn at amino acid position 204. In particular, the novel CDG of the invention is preferably derived from human UDG and has an amino acid substitution or modification at position 204. Modification of UDG from other species at an equivalent residue is similarly preferred. Especially preferably, the glycosylase is human UDG having an aspartic acid residue (Asp) at position 204.
Another aspect of the invention provides a thymine-DNA glycosylase (TDG) capable of releasing thymine bases from both ssDNA and dsDNA.
A further aspect of the invention provides a thymine-DNA glycosylase (TDG) capable of releasing both thymine and uracil bases from both ssDNA and dsDNA.
Yet further aspects of the invention provide a thymine-DNA glycosylase (TDG) capable of releasing thymine bases from A:T DNA pairs and a thymine-DNA glycosylase (TDG) capable of releasing thymine bases from single stranded DNA.
Preferably, the thymine-DNA glycosylase is one derived from a UDG, and especially from the human UDG protein which has Tyr at amino acid position 147. In particular, the novel TDG of the invention is preferably derived from human UDG and has an amino acid substitution or modification at position 147. Modification of UDG from other species at an equivalent residue is similarly preferred. Especially preferably, the glycosylase is human UDG having an alanine residue (Ala) at position 147.
A yet further aspect of the invention provides a uracil-DNA glycosylase encoded by a nucleic acid molecule comprising the sequence (SEQUENCE I.D. No 1):
or a fragment thereof encoding a catalytically active product comprising at least nucleotides 121 to 130, preferably 71 to 202 in addition to the catalytic domain, or a sequence which is degenerate, substantially homologous with or which hybridizes with at least nucleotides 121 to 130, preferably 71 to 202 of any such aforesaid sequence. Preferably such degeneracy, homology or hybridization applies to the entire sequence.
The above nucleic acid molecule encodes a protein having the amino acid sequence as indicated below (SEQUENCE I.D. Nos 1 and 2):
“Catalytically active product” as used herein refers to any product encoded by said sequence which exhibits uracil DNA glycosylase activity.
“Substantially homologous” as used herein includes those sequences having a sequence homology of approximately 60% or more, eg. 70% or 80% or more, and also functionally-equivalent allelic variants and related sequences modified by single or multiple base substitution, addition and/or deletion. By “functionally equivalent” in this sense is meant nucleotide sequences which encode catalytically active polypeptides, ie. having uracil DNA glycosylase activity.
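One simple way of applying a percentage threshold of this kind is sketched below in Python. The description does not specify how sequence homology is to be computed, so the sketch makes several assumptions: homology is taken as percent identity, the two sequences are supplied already aligned and of equal length, and columns containing a gap character ("-") are skipped. The function names and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over the gap-free columns of two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def substantially_homologous(a, b, threshold=60.0):
    """Apply the approximately-60% criterion as a hard cut-off (an assumption)."""
    return percent_identity(a, b) >= threshold


# Hypothetical aligned sequences differing at 2 of 10 positions.
identity = percent_identity("ATGCCGTAAG", "ATGACGTTAG")
print(identity, substantially_homologous("ATGCCGTAAG", "ATGACGTTAG"))
```

Other conventions (for example counting gap columns as mismatches, or scoring conservative substitutions) would give different figures, so any reported percentage is meaningful only together with the method used.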
Sequences which “hybridize” are those sequences binding under non-stringent conditions (eg. 6×SSC, 50% formamide at room temperature) and washed under conditions of low stringency (eg. 2×SSC, room temperature, more preferably 2×SSC, 42° C.) or conditions of higher stringency (eg. 2×SSC, 65° C.) (where SSC=0.15M NaCl, 0.015M sodium citrate, pH 7.2). Generally speaking, sequences which hybridize under conditions of high stringency are included within the scope of the invention, as are sequences which, but for the degeneracy of the code, would hybridize under high stringency conditions.
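The salt concentrations implied by the SSC multiples quoted above follow directly from the stated definition (SSC = 0.15 M NaCl, 0.015 M sodium citrate) and may be tabulated with a short Python sketch. The function name is illustrative, and exact fractions are used merely to avoid floating-point rounding in the printed values.

```python
from fractions import Fraction

# 1x SSC as defined in the text, in mol/L.
SSC_1X = {
    "NaCl": Fraction(15, 100),
    "sodium citrate": Fraction(15, 1000),
}


def ssc(fold):
    """Return molar concentrations of each component at the given SSC multiple."""
    return {name: float(conc * fold) for name, conc in SSC_1X.items()}


for fold in (2, 6):
    print(f"{fold}x SSC:", ssc(fold))
```

On this definition 2×SSC corresponds to 0.3 M NaCl with 0.03 M sodium citrate, and 6×SSC to 0.9 M NaCl with 0.09 M sodium citrate.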
The significance of the UNG1 and UNG2 presequences has also been investigated in the present invention, by the use of constructs that express fusion products of UNG1 or UNG2 and enhanced green fluorescent protein (EGFP). Surprisingly, significant effects on subcellular targeting were observed and after transient transfection of HeLa cells, the pUNG1-EGFP-N1 product co-localized with mitochondria whereas the pUNG2-EGFP-N1 product targeted exclusively to nuclei. Whilst not wishing to be bound by theory, it appears that these sequences may be instrumental in the localization of the enzymes. The putative nuclear signal was identified as RKRH which also appears in the catalytic domain of both UNG1 and UNG2. Whilst it was recognized previously by Slupphaug et al., 1993, Nucl. Acids Res., 21(11), p2579-2584, that the signal for mitochondrial translocation resides in the UNG1 presequence, it was believed that the signal for nuclear import lay within the mature protein as in the absence of the presequence, UNG1 was transported to the nucleus. However, UNG2 has now been identified which has a presequence and which localizes to the nucleus. These presequences thus have utility for directing the subcellular localization of molecules attached to them.
Thus, viewed from a further aspect, the invention provides nuclear localization peptides encoded by a nucleic acid molecule comprising the sequence (SEQUENCE I.D. Nos 3 and 4):
or a fragment thereof encoding a functional equivalent or a sequence which is degenerate, substantially homologous with or which hybridizes with any such aforesaid sequence.
Functionally equivalent fragments refer to products which may serve as appropriate localization peptides. Especially preferred nuclear localizing peptides are those which include the amino acid sequence RKRH.
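Whether a candidate peptide includes the RKRH sequence may be checked by a simple string search, as in the Python sketch below. The function name and the two test peptides are hypothetical and are not sequences of the invention; the presence of the motif is, of course, only an indication and does not by itself demonstrate nuclear localization.

```python
MOTIF = "RKRH"  # putative nuclear localization signal named in the text


def motif_positions(peptide, motif=MOTIF):
    """Return 1-based start positions of every occurrence of `motif` in `peptide`."""
    peptide = peptide.upper()
    width = len(motif)
    return [
        i + 1
        for i in range(len(peptide) - width + 1)
        if peptide[i:i + width] == motif
    ]


# Hypothetical peptides for illustration only.
with_motif = motif_positions("MAARKRHLE")
without_motif = motif_positions("MAAKKLE")
print(with_motif, without_motif)
```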
A further preferred feature of the invention comprises DNA glycosylases of the invention which additionally comprise at least one of the aforesaid nuclear localization peptide sequences or at least one mitochondrial localization peptide sequence encoded by a nucleic acid molecule comprising the sequence (SEQUENCE I.D. Nos 5 and 6):
or a fragment thereof encoding a functional equivalent or a sequence which is degenerate, substantially homologous with or which hybridizes with any such aforesaid sequence, e.g. CDG or TDG with a localization peptide. Such a composite may be prepared for example by appropriate modification of UNG1 or UNG2.
The novel DNA glycosylases of the invention conveniently may be obtained by modification of existing DNA glycosylase enzymes, such as the human UDG mentioned above. Such modification, for example by replacement, addition or deletion of one or more amino acid residues, or indeed chemical modification of amino acid residues, may readily be achieved using methods well known in the art and include modifications both at the protein level and also at the level of the encoding nucleic acid. For example, site-directed mutagenesis techniques are widely described in the literature. Other conventional mutagenesis treatments which may be used to obtain enzymes according to the invention include random or regional random mutagenesis by chemical agents, such as N-nitroso compounds, or physical agents, such as ultraviolet light, as well as random or regional random mutagenesis by polymerase chain reaction (PCR) methods. Regional random mutagenesis may be carried out by subcloning one or more relevant DNA sequences encoding segments of the starting protein e.g. UDG, followed by random mutagenesis on this fragment or fragments. After the fragments have been mutagenized they may be reinserted into a DNA sequence encoding the starting protein e.g. UDG. Screening of individual colonies for novel DNA glycosylases of the invention may then be performed using assay methods described herein.
Alternatively, the novel DNA glycosylases of the invention may be obtained by other techniques, for example polypeptide synthesis, construction of fusion proteins etc.
DNA glycosylase activity may readily be assayed according to techniques well known in the art, see for example Slupphaug et al. (1995) Biochemistry, 34: 128-138, and Nedderman and Jiricny, supra. Assays for DNA glycosylase may be used for identifying enzymes according to the invention. The enzymes may be naturally occurring or formed as the result of manipulations of naturally occurring gene sequences or products. Thus, for example, a cell-free extract may be assayed using a thymine or cytosine-containing substrate to identify enzymes which perform excision of one or more of the bases. For the purposes of assessment, the cytosine and thymine bases in the substrates are conveniently labelled, for example fluorescently labelled or radiolabelled, e.g. with ³H. Suitable substrates may be prepared by methods known in the art e.g. by nick translation, random priming, PCR or chemical synthesis. To ascertain if the enzymes are also capable of excising uracil, substrates including uracil may also be used. Conveniently, the uracil bases should be labelled to allow detection. Assays for the excision of different bases are preferably performed independently.
Thus, viewed from a yet further aspect, the invention provides an assay for the identification of DNA glycosylases of the invention in a sample, in which said assay comprises at least the step of assaying for activity in the sample which is capable of excising thymine or cytosine and optionally also uracil from an introduced ssDNA and/or dsDNA substrate. Optionally, the moiety responsible for such activity may be isolated. Suitable assays are described herein and are also known in the art.
DNA glycosylases of the invention include modifications of human UDG by amino acid replacement, as mentioned above, especially at positions 204 and 147. Such amino acid-substituted mutants of human UDG may also comprise additional modifications, for example truncation from the N- and/or C-terminal, or chemical derivation of amino acid residues and/or addition, deletion or mutation of constituent residues which do not affect the overall specificity of the enzyme.
Derivatives of UDG or other DNA glycosylase enzymes from other genera or species, having the CDG or TDG functional activity mentioned above, are also included within the scope of the invention. It will be appreciated that appropriate modification of such enzymes would be performed on comparable residues to those in the human enzyme which form part of the active site and which could be identified by methods known in the art, e.g. by sequence comparison to human UDG and/or by mutation of residues which are identified as potentially conferring specificity to the enzyme and subsequent substrate specificity analyses of the mutant enzymes thus obtained.
The novel DNA glycosylases of the invention may have a number of uses, for example as tools in molecular biology procedures, most notably in mutagenesis, both in vitro and in vivo, but also in other areas such as cell killing, removal of contaminating DNA, random degradation of DNA, enzymatic DNA sequencing etc.
In light of the identification of mitochondrial and nuclear localizing peptides, it is now possible to direct human uracil-DNA glycosylase either to nuclei or to mitochondria by making constructs containing either a nuclear localization signal, such as in UNG2, or a mitochondrial localization signal, such as in UNG1, as mentioned above. Whilst this alone may be used to mutate DNA in the cells, this is particularly useful in combination with site directed mutations that give rise to mutants that have either TDG activity or CDG activity because it allows for selective mutagenesis of nuclear DNA or mitochondrial DNA. Furthermore, it is useful in a system where either nuclear or mitochondrial DNA is the target for degradation for the purpose of killing cells, eg. cancer cells.
As mentioned above, DNA glycosylases according to the invention may be used in a mutagenesis system both in vitro and in vivo. These proteins have numerous advantages over typical chemical mutagens, particularly regarding their ease of use. Small molecular mutagens, such as methylnitrosourea (MNU), methylmethanesulfonate (MMS) or methylnitrosoguanidine (MNNG) are very toxic on contact with eyes, skin or mucosal membranes and may decompose to explosive and volatile toxic compounds. Other mutagens, such as dimethylnitrosamine and benzo(a)pyrene require metabolic activation by special enzymes that are only present in some cells. They can therefore only be used under certain experimental conditions and will often require the addition of a fraction containing activating enzymes. All these chemical mutagens therefore require specialised precautions in order to protect the user. One major advantage of DNA glycosylases according to the invention is that they are not volatile and are not harmful to the user, for example, by mere skin contact.
Mutagenesis in vitro may be performed on a complex sample, e.g. a cell-free extract, a partially refined sample, e.g. nucleic-acid enriched or purified sample or on a single population of nucleic acid material, e.g. amplified nucleic acid material. Random mutation may be performed using selected DNA glycosylases of the invention (possibly in combination with one another and/or with known DNA glycosylases), to release particular bases or combinations of bases from the nucleic acid substrate. Removal of the resulting abasic site and replacement of the removed base with another base may be performed by provision of appropriate enzymes and bases.
Specific mutagenesis may be performed in a number of ways. Depending on the specificity of the DNA glycosylase for ssDNA or dsDNA, either one or the other type of DNA may be targeted. One application of such a method may be to introduce labelled bases into the target DNA to identify its presence or amount in the total nucleic acid material. Alternatively, the substrate which is uniquely recognizable (e.g. dsDNA) may be made sensitive to digestion or degradation after release of the appropriate base by DNA glycosylase activity when replacement of the base has not been performed. This may then be used to remove certain ss- or ds-DNA from a sample. Such an application is discussed in more detail hereinafter.
Another application involves the introduction of selected bases after release of the specific bases recognized by the DNA glycosylase. In this way, replacement of specific bases by specific other bases may be performed. It is known from the art that the human UDG has sequence specificity for uracil excision in the sequence surrounding the uracil base (Slupphaug et al., 1995, supra). Appropriate selection of enzyme concentrations and other determinants may be employed to excise specific bases from known sequences or alternatively, by replacement with appropriately labelled bases, to determine the presence of such sequences in nucleic acid samples.
For mutagenesis in vivo, e.g. in a cell, a nucleotide sequence encoding a DNA glycosylase according to the invention under the control of a suitable expression vector may be introduced into the cell by any suitable means, for example, by transformation or through the use of liposomes.
A further aspect of the present invention thus provides a nucleic acid molecule comprising a nucleotide sequence which encodes a DNA glycosylase and/or nuclear localizing peptide of the invention as defined above. Such nucleic acid molecules may readily be prepared using conventional techniques well known in the art. Thus, for example, as already mentioned above, known gene sequences coding for DNA glycosylases, e.g. the UNG gene mentioned above, may be modified e.g. by nucleic acid substitution using standard techniques such as site-directed mutagenesis.
In further aspects the invention also provides an expression vector containing a nucleic acid molecule of the invention, and transformed or transfected host cells carrying a nucleic acid molecule of the invention.
The expression vector may be any conventional expression vector known in the art or described in the literature, including both phage and plasmid vectors. In general, these will comprise suitable regulatory sequences e.g. a promoter and/or enhancer operably connected to a gene expressing the enzyme. Suitable promoters include SV40 early or late promoter, e.g. pSVL vector, cytomegalovirus (CMV) promoter and mouse mammary tumour virus long terminal repeat, although preferably inducible promoters are used, e.g. mouse metallothionein I promoter. The vector preferably includes a suitable marker such as a gene for dihydrofolate reductase or glutamine synthetase. The expression vector may for example be an inducible vector, such as the E. coli vector pTrc99A (see Slupphaug, supra) inducible with isopropyl β-D-thiogalactopyranoside (IPTG). Other suitable expression vectors include any vector carrying an inducible promoter, such as lac, or bacteriophage lambda λPL, in which the promoter is under the control of a temperature sensitive repressor (cI). Examples of such vectors are pKK223-2 and pPL-Lambda Inducible (from Pharmacia). The DNA glycosylases of the invention may also be expressed as fusion proteins. The expression of such fusion proteins may facilitate purification e.g. by using a system such as the GST-gene fusion systems, exemplified by the pGEX® vector systems (Pharmacia) or the fusion proteins with peptide sequences that are recognized by specific antibodies, exemplified by the FLAG® Expression vectors (Kodak).
The host cell may likewise be any suitable host cell known in the art, including both eukaryotic e.g. yeast, mammalian and plant cells, and prokaryotic cells, e.g. bacteria.
Transfection and transformation techniques are also well known in the art, as described for example in Sambrook et al. (1989), Molecular Cloning: A laboratory manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., as indeed are other techniques for introducing nucleic acids into cells, for example using calcium phosphate, DEAE dextran, polybrene, protoplast fusion, liposomes, electroporation, direct microinjection, gene cannon etc.
Expression of the DNA glycosylase according to the invention results in the release of C or T from the cellular DNA, which may lead to transition mutations upon replication.
Mutagenesis of cells, e.g. mammalian cells, may also be performed by introduction of the DNA glycosylase protein of the invention into the cell. This may be performed using for example liposomes or other appropriate techniques known in the art.
TDG or CDG may also be used to specifically induce mutations either in the cell nucleus or mitochondria of eukaryotic cells. This may be carried out by expressing cDNA with the complete open reading frame of UNG2, but with a site directed mutation in codon 204 (preferably Asn204Asp) or in codon 147 (preferably Tyr147Ala), in which the N-terminal amino acid sequence contains a nuclear localization signal, as described previously, to obtain mutations in the nuclear DNA, or by expressing a cDNA expressing the complete reading frame of UNG1, in which the N-terminal amino acid sequence contains a mitochondrial localization signal, as described previously, with similar site directed mutations to those mentioned above, to specifically obtain mitochondrial mutations. For this purpose any expression vector applicable to eukaryotic cells may be used, but preferably the vector system should be inducible. To introduce the expression vectors into the cells, any method for transfection may be used. Alternatively, the same proteins may be expressed and purified and then introduced into the cells by liposome technology or other appropriate techniques in the art as mentioned previously.
Combined in vitro/in vivo mutagenesis may also be performed. For example, an isolated restriction fragment of interest (or possibly the whole plasmid) may be treated with limited amounts of cytosine-DNA glycosylase or thymine-DNA glycosylase. Subsequently, the treated fragment may be reinserted into a vector and transformed into E. coli cells (the cells may also be pre-treated with a DNA damaging agent to ensure an error-prone SOS-repair). As a result of the mutagenicity of AP-sites, this should yield random mutations. DNA glycosylases of the prior art were limited in their usefulness in mutagenesis due to their ability to achieve site-directed mutation only (see for example WO93/18175).
The Examples below describe the induction of mutations in bacterial cells by the expression within such cells of a DNA glycosylase ie. a CDG or TDG according to the present invention. Expression of the DNA glycosylases of the invention in the transformed cells causes an increase in mutation frequencies. Similar results may be obtained with other cells. To enhance mutagenesis, strains may be used, including both prokaryotic and eukaryotic strains, which are defective in the repair of AP-sites or are otherwise hypermutatable e.g. bacterial mutants that are defective in endonuclease IV or exonuclease III, or both, or other mutants that similarly enhance the yield of mutations.
Thus, the use of one or more DNA glycosylases according to the invention in in vitro and/or in vivo mutagenesis systems provides yet further aspects of this invention.
Another use of DNA glycosylases of the invention involves DNA modification. By treating any type of DNA (single or double-stranded) in vitro with a DNA glycosylase according to the invention, naturally-occurring C or T will be released, thus leaving an apyrimidinic site (AP-site). Subsequent treatment of this DNA with alkaline solutions or enzymes such as apurinic/apyrimidinic-site endonucleases (AP-endonucleases) recognising AP-sites will cause breaks in the DNA at the AP-sites. This method may therefore be used for the random cleavage of DNA. The number of cleaved sites will depend on the amount of the DNA glycosylases according to the invention used, thus allowing the number of AP-sites and hence breaks to be controlled. Uses of such methods include the removal of possible contaminating DNA prior to PCR amplification and for the enzymatic sequencing of DNA. The random cleavage of DNA can also be used for producing randomly fragmented DNA of defined size ranges for different purposes, for example for efficient hybridization of DNA, for preparing genomic libraries or for removal of high-molecular weight viscous DNA.
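The dependence of fragment number on the extent of base release may be illustrated with a toy Python simulation of a single strand. It is a sketch under stated assumptions, not a model of the actual enzymology: each C or T is released independently with a fixed probability `p_excise` (standing in for the amount of enzyme used), every resulting AP-site is assumed to be cleaved, and the nucleotide at the break is simply dropped. The function and parameter names are hypothetical.

```python
import random


def cleave(strand, targets="CT", p_excise=0.1, seed=0):
    """Split `strand` at randomly chosen target bases and return the fragments.

    Each base in `targets` is excised with probability `p_excise`; the strand
    is then broken at every excised position and that base is lost.
    """
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    fragments = []
    start = 0
    for i, base in enumerate(strand):
        if base in targets and rng.random() < p_excise:
            if i > start:
                fragments.append(strand[start:i])
            start = i + 1  # resume after the AP-site
    if start < len(strand):
        fragments.append(strand[start:])
    return fragments


strand = "AACGTTAGC"
untreated = cleave(strand, p_excise=0.0)   # no base released
exhaustive = cleave(strand, p_excise=1.0)  # every C and T released
print(untreated, exhaustive)
```

Intermediate values of `p_excise` give fragment populations between these two extremes, which corresponds to the control over the number of breaks described above.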
One advantage of using a DNA glycosylase according to the invention in such methods is that in contrast to nucleases, DNA glycosylases do not require divalent cations and this is advantageous when buffers containing divalent cations are not desirable. A further advantage is that the DNA glycosylase may be inactivated by heating the reaction mixture to 80° C. for 15 minutes, thus eliminating or substantially reducing its activity.
Uracil-DNA glycosylase has previously been shown to be efficient in removing contaminating DNA prior to PCR amplification (see for example EP-A-624643). This method has the disadvantage that only DNA containing uracil could be removed, which meant that uracil-containing DNA had to be prepared using appropriate uracil-containing primers to obtain DNA which could be removed prior to amplification. One advantage of the DNA glycosylases according to the present invention is that they do not have this requirement as any contaminating DNA would be likely to contain cytosine or thymine bases. Thus, CDG and/or TDG according to the invention may be added to a reaction mix and allowed to digest contaminating DNA. After treatment the enzyme/s are inactivated prior to the addition of the DNA sample and amplification to avoid degradation of the template or product.
Thus a further aspect of the invention provides the use of one or more DNA glycosylases according to the present invention for removing contaminating DNA prior to PCR amplification. The use of one or more DNA glycosylases according to the invention in DNA modification provides a further aspect of the invention. The term “modification” as used herein refers to all forms of modifying or manipulating DNA, including cleavage, base substitution or insertion etc.
A DNA glycosylase according to the present invention may also be used in a method for the killing of cells. A DNA glycosylase according to the present invention may be introduced into specific target cells by means of known transformation techniques, liposomes, specific targeting systems such as ligands that bind to specific receptors, or any other suitable techniques. The DNA glycosylase may be expressed in a tissue-specific manner by placing a tissue-specific promoter upstream of the DNA sequence encoding a DNA glycosylase according to the present invention. Examples of such tissue-specific promoters are well known and are for example found in genes for a number of liver-specific proteins such as albumin, blood clotting factors and apolipoproteins; several hormones, such as human growth hormone from the pituitary gland and insulin from the islets of Langerhans in the pancreas, as well as aromatase involved in the estrogen biosynthetic pathway; porphobilinogen deaminase, which is the third enzyme in the heme biosynthetic pathway; glycoprotein IIb/IIIa, which is expressed in maturing megakaryocytes; the Zeta subunit of T-cell antigen receptor (TCR), which is expressed in T-cells; CD14 expressed in monocytes and macrophages; villin expressed in certain epithelial tissues; and tyrosinase expressed in melanocytes and melanomas. In some cases abnormal expression from tissue-specific promoters has been observed in tumour cells, and this may be exploited by using constructs of novel DNA glycosylases and the relevant tissue-specific promoter.
When the DNA glycosylase is expressed it may fragment the DNA in the cell and therefore kill the cell. Specific cells may also be targeted through the use of promoters containing other control elements, for example promoters which are controlled in a cell-cycle or temporal manner, or those possessing regulatory elements responsive to internal or external factors. Examples of the latter are promoters activatable by specific inducers, e.g. the inducer IPTG, which induces the lac promoter or lac derivatives such as trc; by certain metals (e.g. the metallothionein promoter); by certain hormones such as dexamethasone or androgens (acting, for example, on the promoter of the gene for prostate specific antigen, which is tissue specific); by retinoic acid; and by certain cytokines.
Conceivably, where enzymes of the invention exhibit specific substrate requirements in the sequence surrounding the base for excision, this specificity may be employed by appropriate low level expression of the DNA glycosylase such that only DNA with the specific sequence is made susceptible to degradation.
Thus a further aspect of the invention provides a method of killing cells, comprising the steps of introducing a DNA glycosylase according to the present invention into a cell and expressing said DNA glycosylase in the cell to an extent which results in the killing of that cell. Preferably, the DNA glycosylase according to the present invention is contained within an expression vector, most preferably, a tissue-specific expression vector.
A further use of DNA glycosylases of the invention is for performing enzymatic DNA sequencing. This may be performed in a manner analogous to the chemical sequencing method of Maxam and Gilbert (Maxam and Gilbert (1980) Methods in Enzymology, 65: 499). However, the Maxam-Gilbert procedure involves the use of several very toxic chemicals, such as dimethylsulfate (DMS) and hydrazine (the latter is also explosive), and use of the glycosylases of the invention presents a considerable advantage. Enzymatic sequencing may be performed for example by end-labelling the sample DNA fragment appropriately, for example with 32P, 33P or 35S. For identifying the positions of cytosines and thymines in the DNA, the DNA is treated with limiting amounts of cytosine-DNA glycosylase and thymine-DNA glycosylase according to the invention, respectively. The resulting AP-sites are then cleaved, e.g. by alkaline solution (pyridine) or by an AP-endonuclease. The resulting end-labelled fragments are subsequently separated, e.g. by electrophoresis, and the position of fragments of varying length identified appropriately, e.g. by autoradiography. Ideally, the positions of adenines and guanines should be determined in the same way using adenine- or guanine-DNA glycosylases. At the present time such enzymes are not available. However, the E. coli DNA repair enzymes Tag and AlkA recognize adenine alkylated in the 3-position (Tag, AlkA) and guanine alkylated in the 3-position (AlkA). Thus, one way of determining the positions of adenines and guanines may be after alkylation of DNA with DMS, followed by treatment with AlkA and Tag. Subsequent experimental procedure may be performed as for determining the C and T positions.
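The read-out logic of such a sequencing experiment may be illustrated by the following sketch, which is offered as an aid to understanding only. The function names `labelled_fragment_lengths` and `call_bases` are hypothetical. The sketch assumes an idealised experiment in which the strand is labelled at its 5' end, each molecule is cleaved at most once, and cleavage at an AP-site leaves a labelled fragment consisting of the nucleotides on the 5' side of the excised base.

```python
def labelled_fragment_lengths(sequence, target_base):
    """Lengths of the 5'-end-labelled fragments expected in one lane.

    Assumption (idealised model): limiting enzyme excises `target_base`
    at one position per molecule, and cleavage of the AP-site leaves a
    labelled fragment made up of the nucleotides 5' of that position.
    A target base at the very first position gives length 0, which would
    not be visible as a band in practice.
    """
    return [index for index, base in enumerate(sequence) if base == target_base]


def call_bases(lanes, total_length):
    """Reconstruct the base calls from the fragment lengths in each lane.

    `lanes` maps a base (e.g. "C" or "T") to the fragment lengths seen
    in the corresponding lane. Positions not called by any lane are
    reported as "N" (here, the purines).
    """
    calls = ["N"] * total_length
    for base, lengths in lanes.items():
        for length in lengths:
            # A fragment of n nucleotides places the base at index n (0-based).
            calls[length] = base
    return "".join(calls)


if __name__ == "__main__":
    sample = "ACGTC"  # hypothetical test strand, 5' to 3'
    lanes = {
        "C": labelled_fragment_lengths(sample, "C"),  # CDG-treated lane
        "T": labelled_fragment_lengths(sample, "T"),  # TDG-treated lane
    }
    print(lanes)
    print(call_bases(lanes, len(sample)))
```

In this idealised model the CDG lane and the TDG lane together identify every pyrimidine position, while the purine positions remain undetermined, consistent with the need for a separate approach for adenines and guanines.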
Thus, a further aspect of the invention provides a method of performing enzymatic DNA sequencing to determine the position of cytosine and/or thymine bases by treating said DNA with at least one CDG and/or TDG of the invention.