As understanding of the role of genetic alterations in the basis of disease has increased, methods have been developed to facilitate identification of disease-causing mutations. Recently, a method of genomic mismatch scanning (GMS) has been used to identify regions of DNA that are “identical by descent” (“IBD”) in individuals who are distantly related, that is, regions of DNA that are inherited from a common ancestor (see, e.g., U.S. Pat. No. 5,376,526 to Brown et al.; Nelson, S. F., Electrophoresis 16:279-285 (1995); Cheung, V. G. et al., Nature Genetics 18:225-230 (1998); and Cheung, V. G. and Nelson, S. F., Genomics 47:1-6 (1998)). DNA that is IBD between individuals who are distantly related, and who are affected by a disease, may contain a genetic mutation within a shared IBD region that contributes to or causes the disease (see, e.g., Mirzayans, F. et al., Am. J. Hum. Genet. 61:111-119 (1997)). GMS methods employ a series of steps, including use of a panel of methyl-directed mismatch repair enzymes (mutH, mutL and mutS from Escherichia coli), and binding of mismatched fragments to benzoylated, naphthylated DEAE cellulose (BNDC). Difficulties associated with GMS methods include optimization of the mutHLS enzymes, unreliability of BNDC binding, and high background interference (see Kruglyak, L. and McAllister, L., Nature Genetics 18:200-202 (1998)). Easier and more accurate methods of identifying IBD regions of DNA would enhance the ability to identify mutations associated with disease.
The present invention is drawn to methods of identifying DNA fragments containing DNA that is identical by descent for two individuals. In the methods, a sample of genomic DNA is obtained from each of the two individuals. The samples are initially digested with a restriction endonuclease, such as PstI; one of the samples is then methylated, while the other is preserved as a non-methylated sample. The digested, methylated and non-methylated samples are then mixed and incubated under conditions which allow denaturation of the DNA (e.g., heating of the mixture). The denatured mixture is then allowed to re-anneal, preferably in the presence of a phenol-based emulsion (e.g., by the formamide phenol emulsion reassociation technique), forming a mixture containing homohybrid DNA fragments (both methylated homohybrids and non-methylated homohybrids), and heterohybrid DNA fragments (“hemi-methylated”). The re-annealed sample is then contacted with restriction endonucleases which digest the homohybrid DNA fragments (e.g., DpnI, which digests methylated, double-stranded DNA, and MboI, which digests unmethylated, double-stranded DNA). The sample is then contacted with an endonuclease(s) which cleaves both strands of a mismatch-containing heterohybrid DNA fragment (e.g., T7 endonuclease I, which yields 5′ overhangs which are exonuclease III sensitive, and can thus be digested by ExoIII). As a result of this process, the sample mixture contains and is enriched in perfectly-matched heterohybrid DNA fragments which comprise DNA fragments that are identical by descent for the two individuals. If desired, the mixture can be further processed by contact with an exonuclease which digests small fragments generated by the endonuclease(s) which cleaves both strands of a mismatch-containing heterohybrid DNA fragment.
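The selection logic of the steps described above can be illustrated with a minimal sketch. It is a toy model only, not part of the described methods: the `Duplex` record, its field names, and the functions `survives_selection` and `enrich` are hypothetical names introduced for illustration, and each enzyme is reduced to a simple yes/no rule on the methylation state and mismatch state of a re-annealed fragment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Duplex:
    """A re-annealed double-stranded fragment (hypothetical toy record)."""
    locus: str
    top_methylated: bool      # strand derived from the methylated sample?
    bottom_methylated: bool   # strand derived from the methylated sample?
    mismatched: bool          # does the duplex contain a base mismatch?


def survives_selection(d: Duplex) -> bool:
    """Return True if the duplex would remain after the enzymatic steps."""
    # DpnI step: methylated homohybrids (both strands methylated) are digested.
    if d.top_methylated and d.bottom_methylated:
        return False
    # MboI step: non-methylated homohybrids (neither strand methylated) are digested.
    if not d.top_methylated and not d.bottom_methylated:
        return False
    # T7 endonuclease I step: mismatch-containing heterohybrids are cleaved,
    # and the cleavage products are then removed by ExoIII.
    if d.mismatched:
        return False
    # What remains is a perfectly-matched, hemi-methylated heterohybrid.
    return True


def enrich(duplexes: List[Duplex]) -> List[Duplex]:
    """Keep only the duplexes that survive every selection step."""
    return [d for d in duplexes if survives_selection(d)]


pool = [
    Duplex("A", True, True, False),    # methylated homohybrid
    Duplex("B", False, False, False),  # non-methylated homohybrid
    Duplex("C", True, False, True),    # heterohybrid with a mismatch
    Duplex("D", True, False, False),   # perfectly-matched heterohybrid
]
print([d.locus for d in enrich(pool)])  # prints ['D']
```

In this sketch only fragment "D", the perfectly-matched hemi-methylated heterohybrid, remains, which corresponds to the fraction the process is intended to enrich. Incomplete digestion and other sources of background are deliberately not modeled.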
As the fragments may contain many repetitive sequences, the mixture can be incubated with repeat-rich DNA (e.g., COT-1) to hybridize and remove the repetitive sequences. In addition, the perfectly-matched heterohybrid DNA fragments can be ligated to nucleic acid adaptors and amplified by polymerase chain reaction (PCR) or long-range polymerase chain reaction (LR-PCR). To reduce background interference further, the process described above can be repeated, one or more times, using perfectly-matched heterohybrid DNA fragments as a first sample of genomic DNA, and a sample of genomic DNA from a third (or fourth, etc.) individual as a second sample of genomic DNA. The methods of the invention provide a simple and efficient means of identifying regions of genomic DNA that are identical by descent, utilizing enzymes which substantially eliminate background interference without requiring extensive optimization of conditions.
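The repetition of the process with a third (or further) individual can likewise be sketched as an iterated filter. This is an illustrative simplification under stated assumptions: each individual is represented as a hypothetical mapping from a locus name to a single sequence (a haploid simplification), a round keeps a locus only when the two sequences are identical, and sequence identity is used as a stand-in for a perfectly-matched heterohybrid. Matching sequences in such a model do not by themselves establish identity by descent, and the function names `selection_round` and `repeated_rounds` are not taken from the source.

```python
from typing import Dict, List

Fragments = Dict[str, str]  # hypothetical: locus name -> sequence


def selection_round(first: Fragments, second: Fragments) -> Fragments:
    """One round: keep loci whose sequences match perfectly in both samples."""
    return {
        locus: seq
        for locus, seq in first.items()
        if second.get(locus) == seq
    }


def repeated_rounds(individuals: List[Fragments]) -> Fragments:
    """Use the output of each round as the first sample of the next round."""
    enriched = individuals[0]
    for other in individuals[1:]:
        enriched = selection_round(enriched, other)
    return enriched


person_1 = {"L1": "ACGT", "L2": "GGCC", "L3": "TTAA"}
person_2 = {"L1": "ACGT", "L2": "GGCA", "L3": "TTAA"}
person_3 = {"L1": "ACGT", "L2": "GGCC", "L3": "TTAG"}

print(repeated_rounds([person_1, person_2]))            # prints {'L1': 'ACGT', 'L3': 'TTAA'}
print(repeated_rounds([person_1, person_2, person_3]))  # prints {'L1': 'ACGT'}
```

Each additional round can only retain or remove loci, never add them, so fragments that passed an earlier round by chance have a further opportunity to be eliminated.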