As is common knowledge, mutations can arise in sections of the nucleotide strands of genomic DNA either by deletion or insertion of part or all of a nucleotide base sequence or by an alteration of one or more nucleotide bases. Transcription from such mutated DNA sections may then lead to defective protein products. Such defective protein products can be a cause of many genetic diseases or disorders. Defective protein products that arise in fermentation processes in biotechnology may also be dysfunctional or harmful.
The ability to detect mutations in coding and non-coding DNA is important for the diagnosis of inherited disease. Nucleotide changes in a normal (i.e. wild-type) gene sequence are called gene mutations and can be either harmful or harmless. For example, a harmful gene mutation can be a single base pair change in a gene encoding an essential protein. A single base pair change or small insertion or deletion can result in a frame shift, a stop codon, or a non-conservative amino acid substitution, each of which can result in an inactive protein. A harmless gene mutation can be a gene polymorphism which results in a protein product with no detectable change in normal function. Mutation in non-coding DNA can also lead to disease, for example through mutations in non-coding splice sites (found in certain cases of cystic fibrosis) or mutations in transcriptional regulatory elements (found in certain defects of β-globin genes).
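The three outcomes named above (a non-conservative substitution, a stop codon and a frame shift) can be illustrated with a minimal sketch. The short coding sequence and the chosen mutation positions are hypothetical and serve only as an illustration; the codon table is the standard genetic code.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_strand: str) -> str:
    """Translate a coding strand codon by codon, stopping at the first stop codon.

    A trailing incomplete codon is ignored.
    """
    protein = []
    for start in range(0, len(coding_strand) - 2, 3):
        amino_acid = CODON_TABLE[coding_strand[start:start + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical wild-type sequence: ATG GAG TGG CTT TAA
wild_type = "ATGGAGTGGCTTTAA"

# Single base change GAG -> GTG (Glu -> Val): non-conservative substitution.
missense = wild_type[:4] + "T" + wild_type[5:]

# Single base change TGG -> TGA: premature stop codon.
nonsense = wild_type[:8] + "A" + wild_type[9:]

# Deletion of one base after the start codon: frame shift.
frameshift = wild_type[:3] + wild_type[4:]

for name, seq in [("wild type", wild_type), ("missense", missense),
                  ("nonsense", nonsense), ("frame shift", frameshift)]:
    print(f"{name:12s} {seq:16s} {translate(seq)}")
```

In this sketch the single-base deletion changes every downstream codon, whereas each substitution alters only the codon in which it occurs.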
It is possible to form four sets of nucleotide mismatches when a mutant and a normal DNA segment are annealed together. These sets are: G:A/C:T, C:C/G:G, A:A/T:T, and C:A/G:T. Together, these four sets represent the eight possible single base pair mismatches which could be found in a DNA heteroduplex. However, DNA:RNA and RNA:RNA heteroduplexes can also be formed. Where a heteroduplex includes RNA, 9 single base pair mismatch sets are possible. DNA:DNA, DNA:RNA and RNA:RNA heteroduplexes can also be created by insertion or deletion of nucleotides in the mutant nucleic acid strand.
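The four DNA mismatch sets listed above can be derived by enumeration. The following is a minimal sketch, assuming that each single base substitution in a wild-type duplex is cross-annealed with the wild-type strands so that two heteroduplexes are formed; the function and variable names are illustrative only.

```python
from itertools import permutations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_set(wild_type_base: str, mutant_base: str) -> frozenset:
    """Return the two mismatches formed when wild-type and mutant duplexes
    are denatured and cross-annealed.

    Each mismatch is an unordered pair of opposed bases, stored as a sorted tuple.
    """
    # Wild-type top strand annealed to the mutant bottom strand.
    first = tuple(sorted((wild_type_base, COMPLEMENT[mutant_base])))
    # Mutant top strand annealed to the wild-type bottom strand.
    second = tuple(sorted((mutant_base, COMPLEMENT[wild_type_base])))
    return frozenset({first, second})


# All 12 possible single base substitutions collapse into four sets.
all_sets = {mismatch_set(w, m) for w, m in permutations("ACGT", 2)}
all_mismatches = set().union(*all_sets)

for s in sorted(all_sets, key=sorted):
    print("/".join(f"{a}:{b}" for a, b in sorted(s)))
print(len(all_sets), "sets,", len(all_mismatches), "distinct mismatches")
```

Under these assumptions the enumeration yields four sets containing eight distinct mismatches, in agreement with the sets given in the text.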
One example of a harmful mutation is provided by the well-known case of sickle cell anaemia, where a point mutation involving an alteration of but a single nucleotide base, namely the substitution of a specific adenine by thymine in the genomic DNA, is responsible for the defect. In the case of cystic fibrosis the disease can arise from the presence of any one of a number of possible point mutations or small insertions or deletions that have been identified in different parts of the cystic fibrosis transmembrane conductance regulator (CFTR) gene. Mutations in genomic DNA encoding oncogenes and tumour suppressors are also believed to be responsible for the cell proliferation that causes many cancers.
Various methods of testing and detecting mutations in nucleic acids are known, many of which use, for example, a preliminary stage of polymerase chain reaction (PCR) amplification. Many of the existing methods are limited to cases where the precise nature and location of the mutation or molecular change being sought is already known and/or is of a particular kind. However, in many instances of disease-causing mutations the precise nature and location of the mutations are not known. A number of known methods of detecting unknown mutations in nucleic acids, such as SSCP, heteroduplex analysis, RNase protection and chemical cleavage of heteroduplexes, are discussed in a review article by Markus Grompe entitled "The rapid detection of unknown mutations in nucleic acids" published in 1993 in Nature Genetics, 5, 111-117. As yet there is no universal ideal method available and there is a need for more sensitive, rapid and efficient methods of detecting mutations in DNA. Such methods should also be capable of locating in a target nucleic acid or gene the position of point mutations or small mutations involving only a few bases.
When a mutation occurs in normal double stranded DNA of a living organism, initially this will generally affect one strand only of the duplex molecule, causing a mismatch in the base pairing. For example, with the occurrence of a point mutation a nucleotide cytosine base (C) in one strand may be changed so that the complementary guanine base (G) in the other strand becomes opposed to an adenine (A) or to a thymine base (T), producing a base-pair mismatch in the duplex molecule. However, certain proteins and enzymes, referred to as "proof-reading" proteins or enzymes or "DNA mismatch repair proteins or enzymes", are usually present in most living organisms and these enzymes act to detect such base-pair mismatches and to initiate a repair process in the mutated region before the molecules replicate and pass on the defect to subsequent copies of the DNA. One example of a set of such mismatch "repair enzymes", believed to be present in all living organisms, is provided by the Mut series of proteins and homologues thereof. A very well characterised system of these Mut repair enzymes, occurring for example in the bacterium E. coli, has been described by Paul Modrich and colleagues (for a review article, see for example Modrich, P. (1991), Annual Rev. Genet., 25, 229-253), a set of three proteins having been identified and purified which are termed MutS, MutH and MutL. These detect errors in DNA replication by interacting with double-stranded nucleic acid molecules containing mismatched base pairs that arise when errors and new mutations occur. The DNA-repair protein MutS in particular is a highly conserved protein which has the ability to detect and bind to the sites of mismatched bases (other than C:C) or deletions or insertions of up to four bases (see for example the paper by Shin-San Su and Paul Modrich, "Escherichia coli mutS-encoded protein binds to mismatched DNA base pairs" (1986) Proc. Natl. Acad. Sci. U.S.A., 83, 5057-5061).
MutS then recruits the MutH and MutL enzymes to create a nick at a GATC sequence near the mutation. Other enzymes then repair the region between the mutation and the nick (see also R. S. Lahue et al, "DNA Mismatch Correction in a Defined System" (1989) Science 245, 160-164).
The ability of various known mismatch repair enzymes to seek out and interact with mismatch regions in duplex DNA molecules has already led to some proposals for using such enzymes in assays to detect mutations responsible for such mismatches. Thus, in WO 93/02216 (Upstate Biotechnology, Inc., Lake Placid, N.Y., U.S.A.) a method for detecting mutations such as a single base change or an addition or deletion of about one to four base pairs in duplex nucleic acid molecules or polynucleotides is disclosed which is based on the use of a DNA mismatch-binding protein such as MutS in conjunction with a method using antibody reagents or the like for specifically recognizing or detecting the presence of said protein bound to nucleic acid or polynucleotide molecules. WO 93/20233 (University of Maryland at Baltimore) discloses a method for detecting single base pair mismatches at a preselected site in nucleic acids such as genomic DNA which is also based on the use of enzymes that repair mismatches in nucleic acids. In this latter method, however, most emphasis is given to the detection of base pair mismatches at particular sites using mismatch repair enzymes having endonuclease activity that specifically cleave nucleic acid strands near to mismatches. Neither of the methods disclosed in the two above-mentioned patent publications, however, appears to be well-suited for determining the location of unknown mutations in nucleic acids, and in WO 93/02216 in particular the method appears to be concerned specifically with determining whether a particular sample of DNA includes a mismatch mutation without consideration of locating any such mutation that may be detected.
In a technique developed by Schmitz and Galas [Schmitz, A. and Galas, D. J. (1978) "DNaseI footprinting: A simple method for the detection of protein-DNA binding specificity." Nucleic Acids Research, 5, 3157-3170] for the study of sequence-specific binding of proteins to DNA, a DNA fragment is exposed to a sequence-specific DNA binding protein. After such time as to allow for binding, the protein-DNA complex is treated with DNaseI. The bound protein shields that region of DNA to which it is bound from digestion by DNaseI, and after separation of the reaction products by gel electrophoresis, the protected region is seen as a gap in the otherwise continuous background of digestion products.
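The reading of such a footprint can be sketched as follows. This is a simplified illustration only, assuming an idealised end-labelled fragment for which every unprotected position yields a labelled product whose length equals the cut position; the function name and the example numbers are hypothetical.

```python
def longest_gap(observed_lengths, total_length):
    """Return (first, last) of the longest run of fragment lengths that are
    absent from the ladder of digestion products, or None if nothing is missing.

    observed_lengths: lengths of labelled products seen on the gel.
    total_length: length of the undigested end-labelled fragment.
    """
    observed = set(observed_lengths)
    best = None
    run_start = None
    # The final iteration (length == total_length) acts as a sentinel that
    # closes any run still open at the end of the ladder.
    for length in range(1, total_length + 1):
        missing = length < total_length and length not in observed
        if missing and run_start is None:
            run_start = length
        elif not missing and run_start is not None:
            run = (run_start, length - 1)
            if best is None or run[1] - run[0] > best[1] - best[0]:
                best = run
            run_start = None
    return best


# Hypothetical 50-base fragment with a protein protecting positions 20 to 29;
# position 5 is also missing, representing an isolated weak cleavage site.
ladder = [n for n in range(1, 50) if not 20 <= n <= 29 and n != 5]
print(longest_gap(ladder, 50))
```

Taking the longest run of missing products, rather than any missing product, is one simple way to distinguish a protected region from isolated positions that are cut poorly.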
Footprint analysis has also been accomplished using DNA digesting enzymes which are processive, i.e. act on the DNA in a 3' to 5' fashion. The prototype enzyme which has been used for this type of assay is Exonuclease III. The procedure for such exonuclease footprinting involves binding of a sequence-specific DNA binding protein to the DNA followed by exonuclease digestion. As with the above-mentioned technique using DNaseI, the bound protein protects the DNA from digestion. However, due to the processive nature of the exonuclease, the reaction products do not consist of a background of randomly cleaved DNA fragments; rather, they consist of two single stranded species which overlap and form double stranded DNA only in the region of the DNA-protein interaction. Thus, after separation by gel electrophoresis, the region of DNA-protein interaction may be deduced from the lengths of the two single stranded reaction products.
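The deduction from the two product lengths can be written out as a minimal sketch. It assumes an idealised digestion in which the exonuclease stops exactly at each edge of the bound protein, and it expresses positions as offsets from the 5' end of the top strand; the function name and the example numbers are hypothetical.

```python
def exonuclease_footprint(duplex_length, top_product_length, bottom_product_length):
    """Return (start, end) of the protected region in top-strand coordinates.

    Each strand is digested from its 3' end until the bound protein is reached:
    the top strand survives from position 0 to the far edge of the protein, and
    the bottom strand survives from the near edge of the protein to the end of
    the duplex. The region where the two products overlap is the footprint.
    """
    start = duplex_length - bottom_product_length
    end = top_product_length
    if end <= start:
        raise ValueError("products do not overlap; no protected region")
    return start, end


# Hypothetical 100 bp duplex giving single stranded products of 60 and 70 bases.
start, end = exonuclease_footprint(100, 60, 70)
print(start, end, end - start)
```

The length of the overlap equals the sum of the two product lengths minus the length of the duplex, so a longer protected region gives longer products on both strands.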
It is to be noted, however, that in none of the afore-mentioned prior art has there been any disclosure of a footprinting or similar technique involving endonucleases or exonucleases being used to detect mutations in DNA.