DNA is often methylated in normal mammalian cells. For example, DNA is methylated to determine whether a given gene will be expressed and whether the maternal or the paternal allele of that gene will be expressed. See Melissa Little et al., Methylation and p16: Suppressing the Suppressor, 1 NATURE MEDICINE 633 (1995). While methylation has long been known to occur at CpG sequences, only recent studies indicate that CpNpG sequences may also be methylated. Susan J. Clark et al., CpNpG Methylation in Mammalian Cells, 10 NATURE GENETICS 20, 20 (1995). Methylation at CpG sites has been much more widely studied and is better understood.
Methylation occurs by enzymatic recognition of CpG and CpNpG sequences followed by placement of a methyl (CH3) group on the fifth carbon atom of a cytosine base. The enzyme that mediates methylation of CpG dinucleotides, 5-cytosine methyltransferase, is essential for embryonic development; without it, embryos die soon after gastrulation. It is not yet clear whether this enzyme methylates CpNpG sites. Peter W. Laird et al., DNA Methylation and Cancer, 3 HUMAN MOLECULAR GENETICS 1487, 1488 (1994).
When a gene has many methylated cytosines it is less likely to be expressed. K. Willson, 7 TRENDS GENET. 107-109 (1991). Hence, if a maternally-inherited gene is more highly methylated than the paternally-inherited gene, the paternally-inherited gene will give rise to more gene product. Similarly, when a gene is expressed in a tissue-specific manner, that gene will often be unmethylated in the tissues where it is active, but will be highly methylated in the tissues where it is inactive. Incorrect methylation is thought to be the cause of some diseases including Beckwith-Wiedemann syndrome and Prader-Willi syndrome. I. Henry et al., 351 NATURE 665, 667 (1991); R. D. Nicholls et al., 342 NATURE 281, 281-85 (1989).
The methylation patterns of DNA from tumor cells are generally different from those of normal cells. Laird et al., supra. Tumor cell DNA is generally undermethylated relative to normal cell DNA, but selected regions of the tumor cell genome may be more highly methylated than the same regions of a normal cell's genome. Hence, detection of altered methylation patterns in the DNA of a tissue sample is an indication that the tissue is cancerous. For example, the gene for Insulin-Like Growth Factor 2 (IGF2) is hypomethylated in a number of cancerous tissues, such as Wilms' tumor, rhabdomyosarcoma, lung cancer and hepatoblastoma. Rainier et al., 362 NATURE 747-49 (1993); Ogawa et al., 362 NATURE 749-51 (1993); S. Zhan et al., 94 J. CLIN. INVEST. 445-48 (1994); P. V. Pedone et al., 3 HUM. MOL. GENET. 1117-21 (1994); H. Suzuki et al., 7 NATURE GENET. 432-38 (1994); S. Rainier et al., 55 CANCER RES. 1836-38 (1995).
The present invention is directed to a method of detecting differential methylation at CpNpG sequences by cutting test and control DNAs with a restriction enzyme that will not cut methylated DNA, and then detecting the difference in size of the resulting restriction fragments.
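The principle of the preceding paragraph can be sketched in a few lines of Python. This is a toy illustration only, not the assay itself: the recognition site CCGG, the cut offset, the rule that any methylated position within the site blocks cutting, and the names digest, test_methylation and control_methylation are all assumptions made for the example.

```python
def digest(sequence, methylated_positions, site="CCGG", cut_offset=1):
    """Return fragment lengths after an in-silico digest.

    Assumption (hypothetical enzyme model): the enzyme cuts cut_offset
    bases into every occurrence of `site`, unless any position inside
    that occurrence is listed in `methylated_positions` (0-based).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        site_positions = range(idx, idx + len(site))
        # A methylated base anywhere in the site protects it from cutting.
        if not any(p in methylated_positions for p in site_positions):
            cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# The same 22-base sequence serves as control and test DNA; only the
# methylation state differs (both methylation sets are made up).
sequence = "AAAACCGGTTTTTTCCGGAAAA"
control_methylation = set()   # neither site is methylated
test_methylation = {14}       # first C of the second site is methylated

control_fragments = digest(sequence, control_methylation)
test_fragments = digest(sequence, test_methylation)

print("control fragments:", control_fragments)  # [5, 10, 7]
print("test fragments:   ", test_fragments)     # [5, 17]
```

In this sketch the protected site in the test DNA merges two fragments into one longer fragment, so the difference in fragment sizes between test and control reports the difference in methylation.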
While methylation-sensitive restriction enzymes have been used for observing differential methylation in various cells, no commercial assays exist for use on human samples because differentially methylated sequences represent such a minute proportion of the human genome that they are not readily detected. The human genome is both highly complex, in that it contains a great diversity of DNA sequences, and highly repetitive, in that it contains a large amount of DNA with very similar or identical sequences. The high complexity and repetitiveness of human DNA confound efforts at detecting and isolating the minute amount of differentially methylated DNA which may be present in a test sample. The present invention remedies the detection problem by providing new procedures for screening a selected subset of the mammalian genome which is most likely to contain genetic functions.
The present invention provides techniques for detecting and isolating differentially methylated or mutated segments of DNA which may be present in a tissue sample in only minute amounts by using one or more rounds of DNA amplification coupled with subtractive hybridization to identify such segments of DNA. DNA amplification has been coupled with subtractive hybridization in the Representational Difference Analysis (RDA) procedures disclosed in U.S. Pat. No. 5,436,142 to Wigler et al. and Nikolai Lisitsyn et al., Cloning the Differences Between Two Complex Genomes, 259 SCIENCE 946 (1993). However, for the subtractive hybridization step of such RDA procedures to proceed in a reasonable time and with reasonable efficiency, only a subset of the genome can be examined. To accomplish this necessary reduction in the complexity of the sample DNA, Wigler et al. and Lisitsyn et al. disclose cutting DNA samples with restriction enzymes that cut infrequently and randomly. However, selecting enzymes that cut the genome randomly means that the portion of the genome which is examined is not enriched for any particular population of DNA fragments. Thus, when RDA is used, only a random subset of the human genome, which includes repetitive elements, noncoding regions and other sequences which are generally not of interest, can be tested in a single experiment.
In contrast, the present invention is directed to methods which use enzymes that cut frequently and that specifically cut CG-rich regions of the genome. These enzymes are chosen because CG-rich regions of the genome are not evenly distributed in the genome; instead, CG-rich regions are frequently found near genes, and particularly near the promoter regions of genes. This means that the proportion of the genome that is examined by the present methods will be enriched for genetically encoded sequences as well as regulatory sequences. Moreover, unlike the RDA method, the present methods selectively identify regions of the genome which are hypomethylated or hypermethylated by using enzymes which specifically cut non-methylated CG-rich sequences. The present invention therefore represents an improvement over RDA methods because of its ability to select DNA fragments which are likely to be near or to encode genetic functions.
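As a rough illustration of what "CG-rich" can mean for a stretch of sequence, the following Python sketch computes two commonly used descriptive measures: the G+C fraction and the observed/expected CpG ratio, taken here as (number of CG dinucleotides x sequence length) / (number of C x number of G). The function name cpg_stats and both example sequences are hypothetical, and these measures are offered only as an aid to the reader; they are not part of the methods described here.

```python
def cpg_stats(sequence):
    """Return (gc_fraction, cpg_observed_over_expected) for a DNA string.

    Assumption: expected CpG count is (C count * G count) / length, a
    widely used heuristic for describing CG-rich regions.
    """
    seq = sequence.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


# Two made-up 10-base sequences: one CG-rich, one AT-rich.
cg_rich = "CGCGCGCGAT"
at_rich = "ATATATATAT"

print(cpg_stats(cg_rich))  # (0.8, 2.5)
print(cpg_stats(at_rich))  # (0.0, 0.0)
```

An enzyme whose recognition site consists of C and G bases would be expected to find far more sites in a sequence like the first than in one like the second, which is the sense in which such enzymes sample CG-rich regions preferentially.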