The present invention relates to a method for detecting 5-methylcytosine in genomic DNA samples.
The levels of observation that have been well studied through the methodological developments of recent years in molecular biology are the genes themselves, the transcription of these genes into RNA, and the resulting proteins. The question of which gene is switched on at which point in the development of an individual, and how the activation and inhibition of specific genes in specific cells and tissues are controlled, can be correlated with the degree and character of the methylation of the genes or of the genome. It is therefore reasonable to assume that pathogenic conditions express themselves in an altered methylation pattern of individual genes or of the genome.
The present invention describes a method for detecting the methylation state of genomic DNA samples. The method can, at the same time, also be used for detecting point mutations and single nucleotide polymorphisms (SNPs).
5-methylcytosine is the most frequent covalently modified base in the DNA of eukaryotic cells. It plays a role, for example, in the regulation of transcription, in genetic imprinting, and in tumorigenesis. Therefore, the identification of 5-methylcytosine as a component of genetic information is of considerable interest. However, 5-methylcytosine positions cannot be identified by sequencing, since 5-methylcytosine has the same base pairing behavior as cytosine. Moreover, the epigenetic information carried by the 5-methylcytosines is completely lost during PCR amplification.
A relatively new and currently the most frequently used method for analyzing DNA for 5-methylcytosine is based on the specific reaction of bisulfite with cytosine which, upon subsequent alkaline hydrolysis, is converted into uracil, which corresponds to thymine in its base pairing behavior. 5-methylcytosine, however, remains unmodified under these conditions. Consequently, the original DNA is converted in such a manner that methylcytosine, which originally could not be distinguished from cytosine by its hybridization behavior, can now be detected as the only remaining cytosine using “normal” molecular biological techniques, for example, by amplification and hybridization or sequencing. All of these techniques are based on base pairing, which can now be fully exploited. In terms of sensitivity, the prior art is defined by a method which encloses the DNA to be analyzed in an agarose matrix, thus preventing the diffusion and renaturation of the DNA (bisulfite reacts only with single-stranded DNA), and which replaces all precipitation and purification steps with fast dialysis (Olek, A. et al., Nucl. Acids Res. 1996, 24, 5064-5066). Using this method, it is possible to analyze individual cells, which illustrates the potential of the method. Until now, however, only individual regions of up to approximately 3000 base pairs in length have been analyzed; a global analysis of cells for thousands of possible methylation events is not possible. Moreover, this method cannot reliably analyze very small fragments from small sample quantities either. These are lost in spite of the diffusion protection provided by the matrix.
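The base conversion described above can be illustrated in silico. The following is a minimal sketch only, assuming an idealized, complete bisulfite reaction on a single strand; the function names and the representation of methylated positions as a set of zero-based indices are hypothetical and chosen purely for illustration.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Idealized bisulfite treatment of one DNA strand.

    Unmethylated cytosine is converted to uracil, which reads as T
    after amplification; 5-methylcytosine remains C.
    methylated_positions holds zero-based indices (an assumption of
    this sketch, not part of the described method).
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # unmethylated C -> U -> read as T
        else:
            converted.append(base)  # 5-methylcytosine and other bases unchanged
    return "".join(converted)


def remaining_cytosines(original, converted):
    """Positions that were C before treatment and are still C afterwards."""
    pairs = zip(original.upper(), converted)
    return [i for i, (before, after) in enumerate(pairs)
            if before == "C" and after == "C"]


treated = bisulfite_convert("ACGTCCG", methylated_positions={1})
methylated = remaining_cytosines("ACGTCCG", treated)
```

In this sketch, every cytosine that survives the conversion is interpreted as a 5-methylcytosine, which mirrors the statement that methylcytosine is detected as the only remaining cytosine.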
An overview of further methods of detecting 5-methylcytosines can be gathered from the following survey article: Rein, T., DePamphilis, M. L., Zorbas, H., Nucleic Acids Res. 1998, 26, 2255.
With few exceptions (e.g., Zeschnigk, M. et al., Eur. J. Hum. Genet. 1997, 5, 94-98), the bisulfite technology is currently used only in research. In all cases, however, short specific fragments of a known gene are amplified subsequent to a bisulfite treatment and either completely sequenced (Olek, A. and Walter, J., Nat. Genet. 1997, 17, 275-276), or individual cytosine positions are detected by a primer extension reaction (Gonzalgo, M. L. and Jones, P. A., Nucl. Acids Res. 1997, 25, 2529-2531, WO Patent 9500669) or by enzymatic digestion (Xiong, Z. and Laird, P. W., Nucl. Acids Res. 1997, 25, 2532-2534). In addition, detection by hybridization has also been described (Olek et al., WO 99 28498).
Further publications dealing with the use of the bisulfite technique for methylation detection in individual genes are: Xiong, Z. and Laird, P. W. (1997), Nucl. Acids Res. 25, 2532; Gonzalgo, M. L. and Jones, P. A. (1997), Nucl. Acids Res. 25, 2529; Grigg, S. and Clark, S. (1994), BioEssays 16, 431; Zeschnigk, M. et al. (1997), Human Molecular Genetics 6, 387; Feil, R. et al. (1994), Nucl. Acids Res. 22, 695; Martin, V. et al. (1995), Gene 157, 261; WO 97 46705, WO 95 15373 and WO 45560.
An overview of the prior art in oligomer array manufacturing can be gathered from a special edition of Nature Genetics (Nature Genetics Supplement, Volume 21, January 1999) and from the literature cited there.
Different methods are known for immobilizing DNA. The best-known method is the fixed binding of a DNA which has been functionalized with biotin to a streptavidin-coated surface (Uhlen, M. et al. 1988, Nucleic Acids Res. 16, 3025-3038). The binding strength of this system corresponds to that of a covalent chemical bond without being one. To be able to covalently bind a target DNA to a chemically prepared surface, a corresponding functionality of the target DNA is required. DNA itself does not possess any suitable functionalization. There are different variants of introducing a suitable functionalization into a target DNA: two functionalizations which are easy to handle are primary aliphatic amines and thiols. Such amines are quantitatively converted with N-hydroxysuccinimide esters, and thiols react quantitatively with alkyl iodides under suitable conditions. The difficulty lies in introducing such a functionalization into a DNA. The simplest variant is the introduction via a PCR primer in a PCR. Disclosed variants use 5′-modified primers (NH2 and SH) and a bifunctional linker.
An essential component of the immobilization on a surface is its constitution. Systems described heretofore are mainly composed of silicon or metal. A further method of binding a target DNA is based on the use of a short recognition sequence (e.g., 20 bases) in the target DNA for hybridization to a surface-immobilized oligonucleotide. Enzymatic variants for introducing chemically activated positions in a target DNA have been described as well. In this case, a 5′-NH2-functionalization is carried out enzymatically on a target DNA.
For scanning an immobilized DNA array, fluorescently labeled probes have often been used. Particularly suitable for fluorescence labeling is the simple attachment of Cy3 and Cy5 dyes to the 5′-OH of the specific probe. The fluorescence of the hybridized probes is detected, for example, via a confocal microscope. Cy3 and Cy5 dyes, besides many others, are commercially available.
Further prior art in oligomer array manufacturing can be gathered from U.S. Pat. No. 5,994,065 on methods for preparing solid supports for target molecules such as oligonucleotides with reduced non-specific background signal.
More recent methods for detecting mutations are specified in the following.
Worth mentioning as a special case of sequencing is single-base primer extension (Genetic Bit Analysis) (Head, S. R., Rogers, Y. H., Parikh, K., Lan, G., Anderson, S., Goelet, P., Boyce-Jacino, M. T., Nucleic Acids Research 25(24): 5065-5071, 1997; Picoult-Newberg, L., Genome Res. 9(2): 167-174, 1999). A combined amplification and sequencing is described in U.S. Pat. No. 5,928,906, where a base-specific termination on matrix molecules is used. A further method uses a ligase/polymerase reaction for identifying nucleotides (U.S. Pat. No. 5,952,174).
Matrix Assisted Laser Desorption Ionization Mass Spectrometry (MALDI) is a very efficient development for the analysis of biomolecules (Karas, M. and Hillenkamp, F. (1988), Laser desorption ionization of proteins with molecular masses exceeding 10000 daltons. Anal. Chem. 60: 2299-2301). An analyte is embedded in a light-absorbing matrix. Using a short laser pulse, the matrix is evaporated, thus transporting the analyte molecule into the vapor phase in an unfragmented manner. The analyte is ionized by collisions with matrix molecules. An applied voltage accelerates the ions into a field-free flight tube. Due to their different masses, the ions are accelerated at different rates. Smaller ions reach the detector sooner than larger ones.
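The relationship between ion mass and arrival time described above can be made concrete with a short calculation. The following is a minimal sketch only, assuming an idealized linear time-of-flight arrangement in which every ion gains the kinetic energy z·e·U from the applied voltage and then drifts at constant velocity through the field-free tube; the default voltage of 20 kV and tube length of 1 m are hypothetical values chosen for illustration and are not taken from the text.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge=1, voltage=20000.0, tube_length=1.0):
    """Drift time in seconds through a field-free tube.

    Assumes the ion has kinetic energy charge * e * voltage on
    entering the tube (idealized; voltage in volts and tube_length
    in meters are illustrative assumptions).
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return tube_length / velocity


t_small = flight_time(1000.0)  # lighter ion
t_large = flight_time(4000.0)  # ion with four times the mass
```

Under these assumptions the flight time scales with the square root of the mass-to-charge ratio, so an ion of four times the mass takes twice as long to reach the detector.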
MALDI is ideally suited to the analysis of peptides and proteins. The analysis of nucleic acids is somewhat more difficult (Gut, I. G. and Beck, S. (1995), DNA and Matrix Assisted Laser Desorption Ionization Mass Spectrometry. Molecular Biology: Current Innovations and Future Trends 1: 147-157). The sensitivity for nucleic acids is approximately 100 times worse than for peptides and decreases disproportionately with increasing fragment size. For nucleic acids, which have a multiply negatively charged backbone, the ionization process via the matrix is considerably less efficient. For MALDI, the selection of the matrix plays an eminently important role. For the desorption of peptides, several very efficient matrices have been found which produce a very fine crystallization. For DNA, several responsive matrices are currently in use; however, this has not reduced the difference in sensitivity. The difference in sensitivity can be reduced by chemically modifying the DNA in such a manner that it becomes more similar to a peptide. Phosphorothioate nucleic acids, in which the usual phosphates of the backbone are substituted by thiophosphates, can be converted into a charge-neutral DNA using simple alkylation chemistry (Gut, I. G. and Beck, S. (1995), A procedure for selective DNA alkylation and detection by mass spectrometry. Nucleic Acids Res. 23: 1367-1373). The coupling of a charge tag to this modified DNA increases the sensitivity to the same level as that found for peptides. A further advantage of charge tagging is the increased stability of the analysis against impurities, which make the detection of unmodified substrates considerably more difficult.
Genomic DNA is obtained from cell, tissue or other test samples using standard methods. This standard methodology is found in references such as Fritsch and Maniatis eds., Molecular Cloning: A Laboratory Manual, 1989.
Commonalities between promoters consist not only in the occurrence of TATA or GC boxes but also in the transcription factors for which they possess binding sites and in the distance at which these sites are located from each other. The existing binding sites for a specific protein do not match completely in their sequence, but conserved sequences of at least 4 bases are found, which can be extended further by inserting wobbles, i.e., positions at which different bases are located in each case. Moreover, these binding sites are present at specific distances from each other.
However, the distribution of the DNA in the interphase chromatin, which occupies the largest portion of the nuclear volume, is subject to a very special arrangement. Thus, the DNA is attached at several locations to the nuclear matrix, a filamentous pattern at the inner side of the nuclear membrane. These regions are designated as matrix attachment regions (MAR) or scaffold attachment regions (SAR). The attachment has an essential influence on transcription and replication. These MAR fragments have no conserved sequences, but they consist to 70% of A or T and are located in the vicinity of cis-acting regions, which regulate transcription in a general manner, and in the vicinity of topoisomerase II recognition sites.
In addition to promoters and enhancers, further regulatory elements, so-called “insulators”, exist for different genes. These insulators can, for example, inhibit the action of the enhancer on the promoter if they are located between enhancer and promoter or, if located between heterochromatin and a gene, can protect the active gene from the influence of the heterochromatin. Examples of such insulators include: firstly, so-called “LCR” (locus control regions) consisting of several sites which are hypersensitive to DNase I; secondly, certain sequences such as scs (specialized chromatin structures) or scs′, 350 or 200 bp long, respectively, which are highly resistant to degradation by DNase I and flanked on both sides by hypersensitive sites (at a distance of 100 bp in each case). The protein BEAF-32 binds to scs′. These insulators can be located on both sides of the gene.