The invention relates to a method of investigating by mass spectrometry the genetic material deoxyribonucleic acid (DNA) replicated by polymerase chain reaction (PCR), for the identification of known mutations and polymorphisms; it particularly relates to the analysis of single nucleotide polymorphisms (SNPs) by matrix assisted laser desorption and ionization (MALDI).
The invention consists of using a set of nucleoside triphosphates for the selective PCR replication of the DNA in which one or more of the nucleoside triphosphates have been made much heavier by attaching a chemical group, but in such a way that replication by the polymerase is not disturbed. In this way a single nucleotide polymorphism in DNA pieces with a length of about 40 to 50 bases can very easily be made visible by mass spectrometry without any further manipulation.
The subject of this invention is a method for easily and quickly detecting mutative changes at certain known points of the genomic DNA of an organism. Special consideration is given here to polymorphisms in which a single base exchange is found at a certain point in the genome with a statistical frequency. This type of polymorphism has in recent years been given the designation "single nucleotide polymorphism" (SNP).
SNPs have in the meantime acquired considerable importance for genotyping. It is assumed that the human genome contains about 3 million such SNPs, that is, about 3 million points at which a base is exchanged for a different base with a statistical frequency. Such a base exchange can take place within a gene or in non-expressed areas between the genes. For this reason, and due to the large redundancy of the genetic code, an SNP can be without any phenotypical effect. Certain forms (so-called alleles) of SNPs can, however, also be linked to a phenotypical variation, e.g. by the exchange of an amino acid in a protein, by a change in the gene expression or its regulation etc. The phenotypical variation can, for example, be expressed in a changed tolerance to environmental influences, a changed pharmaceutical effect, or, under extreme circumstances, in a genetically conditioned disease. SNPs are inherited half from the father and half from the mother, so SNPs can also be applied in the analysis of individuals (genetic passport).
SNPs are acquiring increasing importance for genotyping and particularly for the linkage analysis of multicausal diseases. Their higher frequency in the genome, which makes a denser marker network possible, and their lower mutation rate compared with the STR markers (short tandem repeats) used to date represent a considerable advantage.
The basis for detecting such and other mutations is the selective PCR (polymerase chain reaction), a replication method for DNA pieces in the test tube, which was developed by K. B. Mullis only in 1983 (he was awarded the Nobel Prize for it in 1993) and which, after the introduction of temperature-stable polymerases, spread through the genetic laboratories with unprecedented speed.
PCR is the targeted replication of a piece of the double-stranded DNA (dsDNA) accurately selected by the replication method itself. Selection of the DNA segment is performed by a pair of so-called primers, two single-stranded DNA pieces (ssDNA) each having a length of about 20 nucleotides, which (described somewhat briefly and simplified) hybridize at both ends (the future ends) of the selected DNA piece. Enzymatic replication is performed by a DNA polymerase, which represents a chemical factory inside a molecule, by passing through a simple temperature cycle. The PCR reaction takes place in aqueous solution in which a few molecules of the original DNA and sufficient quantities of DNA polymerase, primers, nucleoside triphosphates, activators, and stabilizers are present. In each thermal cycle (for example the melting of the double helix at 94° C., hybridization of the primers at 55° C., reconstitution to a double helix by attachment of new DNA building blocks by the polymerase at 72° C.) the number of selected DNA segments is basically doubled. Therefore, in 30 cycles, around 1 billion DNA segments are generated from one single double strand of the DNA as original material. (In a more exact description, both primers hybridize on the two different single strands of the DNA and the shortening to the selected DNA segment including the two attached primers only occurs statistically during further replication).
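The doubling arithmetic behind the figure of around 1 billion segments can be checked with a minimal sketch. The function name pcr_copies and the efficiency parameter are illustrative assumptions, not part of the described method; an efficiency of 1.0 stands for idealized perfect doubling in every cycle.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of copies after a number of thermal cycles.

    efficiency = 1.0 models ideal doubling per cycle (an idealization);
    lower values model incomplete replication per cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One double strand, 30 ideal cycles: 2**30, i.e. roughly one billion.
    print(f"{pcr_copies(1, 30):.3e}")
```

With ideal doubling, 30 cycles give 2^30 copies, which is the order of magnitude stated above; any efficiency below 1.0 reduces the yield accordingly.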
Mass spectrometry with ionization of heavy molecules either by matrix-assisted laser desorption (MALDI) or by electrospray (ESI) is a very efficient method of analyzing biomolecules. For instance, the ions can be analyzed with regard to their mass in time-of-flight mass spectrometers. Since the flight velocity of the ions in the mass spectrometer is about 10^7 times faster than the migration velocity of the molecules in the gel of electrophoresis, the mass spectrometry method is far faster than the previously used gel electrophoresis method, even if the spectrum measurement is repeated 10 to 100 times in order to achieve a good signal-to-noise ratio.
Because it permits a higher sample throughput, the MALDI method has become more widespread than ESI for analyzing DNA. The MALDI method consists of first embedding the analyte molecules on a sample support in a UV-absorbing matrix, usually an organic acid. The sample support is introduced into the ion source of a mass spectrometer. A short UV laser pulse of about 3 nanoseconds in length evaporates the matrix into the vacuum; the analyte molecule is thereby transported, largely unfragmented, into the gaseous phase. Ionization of the analyte molecule is achieved by collisions with matrix ions forming simultaneously. An applied voltage accelerates the ions into a field-free flight tube. Because of their different masses, the ions are accelerated in the ion source to different velocities. Smaller ions reach the detector earlier than larger ones. The time of flight is converted into the mass of the ions.
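The conversion between time of flight and mass can be sketched for an idealized linear instrument, in which an ion of charge z falls through a voltage U and then drifts through a field-free tube of length L, so that z·e·U = m·v²/2. The acceleration voltage of 20 kV and the tube length of 1 m are assumed example values, and the time spent in the ion source and any delayed extraction are ignored.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # atomic mass unit in kilograms


def flight_time(mass_u: float, charge: int = 1,
                voltage: float = 20e3, length: float = 1.0) -> float:
    """Drift time in seconds for an ion of the given mass (atomic mass units).

    Idealized model: all kinetic energy comes from the voltage drop,
    and the ion then moves at constant velocity through the tube.
    """
    mass_kg = mass_u * DALTON
    velocity = math.sqrt(2.0 * charge * E_CHARGE * voltage / mass_kg)
    return length / velocity


def mass_from_time(time_s: float, charge: int = 1,
                   voltage: float = 20e3, length: float = 1.0) -> float:
    """Inverse relation: mass in atomic mass units from a measured drift time."""
    return 2.0 * charge * E_CHARGE * voltage * (time_s / length) ** 2 / DALTON


if __name__ == "__main__":
    for mass in (6000.0, 15000.0):
        t = flight_time(mass)
        print(f"{mass:8.0f} u -> {t * 1e6:.1f} microseconds")
```

Because the flight time grows with the square root of the mass, a fixed mass difference produces a smaller relative time difference for heavier ions, which is one reason why resolution matters more as the DNA pieces get longer.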
Technical innovations in hardware have significantly improved the method of time-of-flight mass spectrometry with MALDI ionization. Worth mentioning is the delayed acceleration (Delayed Extraction) with which an improved resolution of the signals is achieved at a point in the spectrum, but also an even more reduced fragmentation. By means of an additional dynamic change in acceleration voltage it is possible to achieve a good resolution in a large mass range (for example see DE 196 38 577).
Naturally the MALDI method of ionization can also be coupled to other types of mass spectrometry such as RF quadrupole ion traps or ion cyclotron resonance spectrometers.
MALDI is ideally suited to analyzing peptides and proteins. The analysis of nucleic acids is much more difficult. For nucleic acids the ionization yield in the MALDI process is about 100 times lower than for peptides and decreases disproportionately with increasing mass. On the one hand, DNA pieces are very fragile and easily decompose in the MALDI process, while on the other hand they tend to form adducts with numerous alkali ions. Both processes, fragmentation and adduct formation, cause the determination of mass to become increasingly inaccurate as mass increases.
Although one can determine a DNA piece with a length of 20 to 25 bases (around 6,000 to 8,000 atomic mass units) accurately to within three to five atomic mass units, it is no longer the case for DNA having a length of about 40 to 50 bases (around 12,000 to 16,000 atomic mass units). In the latter case the mass difference has to be about 40 to 60 mass units to ensure reliable differentiation. The four natural nucleobases of DNA, however, only have mass differences of 9 to a maximum of 40 atomic mass units so a base exchange can no longer be reliably detected at this length of DNA pieces. Only with extremely careful work and extremely efficient cleaning to keep adduct formation to a minimum is it possible to detect mass differences of 20 atomic mass units in this mass range.
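The range of 9 to 40 atomic mass units can be reproduced from the average masses of the four nucleotide residues in a DNA strand. The residue masses below are commonly tabulated average values and are an assumption of this sketch, as are the function names; the thresholds of 40 and 60 mass units are the bounds quoted above for pieces of 40 to 50 bases.

```python
from itertools import combinations

# Average masses (atomic mass units) of the nucleotide residues in a DNA
# strand; commonly tabulated values, assumed here for illustration.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def exchange_mass_differences() -> dict:
    """Mass difference for each of the six possible base exchanges."""
    return {
        a + b: round(abs(RESIDUE_MASS[a] - RESIDUE_MASS[b]), 2)
        for a, b in combinations("ACGT", 2)
    }


def unresolved(threshold: float) -> list:
    """Exchanges whose mass difference falls below the required threshold."""
    return [pair for pair, diff in exchange_mass_differences().items()
            if diff < threshold]


if __name__ == "__main__":
    for pair, diff in sorted(exchange_mass_differences().items(),
                             key=lambda item: item[1]):
        print(f"{pair[0]} <-> {pair[1]}: {diff:6.2f} u")
    print("below 40 u:", unresolved(40.0))
    print("below 60 u:", unresolved(60.0))
```

In this model the smallest difference is the A/T exchange and the largest the G/C exchange; none of the six reaches the upper bound of 60 mass units.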
The minimum length of a PCR-amplified DNA product around an SNP (single nucleotide polymorphism) is about 40 to 50 bases because two primers with a length of 20 bases have to be used and the primers cannot always be placed directly adjacent to the SNP point. For these PCR products there is therefore no longer any reliable mass-spectrometric detection of a base exchange by MALDI ionization, which is otherwise so convenient and fast.
Recently a method of mutation diagnostics became known which uses MALDI mass spectrometry and which can be used in particular for SNP analysis (Little, D. P., Braun, A., Darnhofer-Demar, B., Frilling, A., Li, Y., McIver, R. T. and Köster, H.; Detection of RET proto-oncogene codon 634 mutations using mass spectrometry. J. Mol. Med. 75, 745-750, 1997). Initially a normal PCR is performed with a pair of first primers in order to have sufficient DNA material available for further steps. After an intermediate cleaning procedure to remove residual primers and nucleoside triphosphates a new primer is added. This second primer is synthesized so that it attaches to the template strand in the vicinity of a known point mutation or of an SNP. Between the position of this SNP and the 3′ end of the primer (the primer is extended at that end) the sequence of the template strand may contain a maximum of three of the four nucleobases. The fourth base occurs for the first time either at the position of the polymorphism itself (allele 1) or after it (allele 2). With a polymerase and a special set of deoxynucleoside triphosphates (the maximum of three complementary ones which occur up to the polymorphism) and a dideoxynucleoside triphosphate (with the base which is complementary to the allele of the polymorphism) the primer is then extended by copying. The dideoxynucleoside triphosphate terminates the chain extension. Depending on the allele of the polymorphism the chain extension is terminated at the SNP or several nucleotides later. This method has been termed "PROBE" by the authors.
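The termination logic of this primer extension can be illustrated with a toy model. The function probe_extension and the two short template sequences are hypothetical examples, not taken from the cited paper; the template is given in the order in which the polymerase reads it, starting directly after the 3′ end of the primer.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def probe_extension(template_downstream: str, dd_base: str) -> str:
    """Bases added to the primer until the dideoxy base is incorporated.

    template_downstream: template bases in the order the polymerase reads them.
    dd_base: the base supplied only as a dideoxynucleoside triphosphate;
             its incorporation terminates the chain.
    """
    added = []
    for template_base in template_downstream:
        new_base = COMPLEMENT[template_base]
        added.append(new_base)
        if new_base == dd_base:
            break  # chain terminated by the dideoxynucleotide
    return "".join(added)


if __name__ == "__main__":
    # Hypothetical templates: two bases, then the SNP (C or T), then more bases.
    allele_1 = "TACAC"   # SNP base C pairs with the terminating ddG
    allele_2 = "TATAC"   # SNP base T; the first C follows two bases later
    print(probe_extension(allele_1, "G"))
    print(probe_extension(allele_2, "G"))
```

In this example the product for the first allele ends at the SNP, whereas the product for the second allele is two nucleotides longer, so the two alleles differ by whole nucleotides in mass.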
The method is very favorable because it ends with short DNA products with a length of about 25 nucleotides, which are very suitable for MALDI analysis, and because the mass difference is always at least one base. On the other hand, it calls for a larger number of thermal and cleaning phases. Initially the PCR products must be cleaned of the first primer, enzyme and all nucleoside triphosphates, whereby one must take into consideration that the purification, particularly the removal of primer, becomes increasingly difficult the shorter the DNA product is. Only then can the second primer, which is to be extended, be added with the special set of nucleoside triphosphates. New thermal cycles then have to be run to extend the primer. A further cleaning phase is required before the extension product can be measured with MALDI. The authors have solved the problem of these phases by fixing the DNA to a surface not only by physical adsorptive means but also chemically; the DNA must be detached again later, however, which in turn complicates the reactions.
The invention consists of changing the mass of at least one of the four nucleoside triphosphates used in PCR amplification by a chemical change (derivatizing) in such a way that, on the one hand, the PCR reaction is not disturbed, but on the other hand, a base exchange becomes reliably detectable due to the then considerably changed mass of the PCR product. The decision as to which nucleoside triphosphate is best used with changed mass depends on the type of the base exchange.
Derivatization tends to lead to a mass enlargement rather than a reduction. In the case of the nucleoside triphosphate G this enlargement only needs to be about 14 atomic mass units in order to detect the base reliably in an exchange by one of the two lightest nucleoside triphosphates (C and T). However, a mass difference of at least 20 atomic mass units is better; and 40 to 80 atomic mass units as derivatization increase are ideal for this method.
Changed-mass nucleoside triphosphates are not even required for all four bases. Two changed-mass nucleoside triphosphates are adequate for detecting all the base exchanges because the bases in the DNA strand and counter strand always occur in pairs (G and C, A and T). Preferably the masses of the two heaviest nucleoside triphosphates (G and A) are enlarged. The mass change only needs to be detectable in one strand. Normally (if no special measures are taken) both strands are always measured simultaneously in the MALDI process. If only one strand is used (there are also methods for this) the strand used for measurement can be selected accordingly.
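The strand and counter-strand argument can be sketched numerically. The residue masses are commonly tabulated average values, and the increments of 80 mass units for G and 40 for A are purely hypothetical choices for illustration; the function name is likewise an assumption.

```python
from itertools import combinations

# Average residue masses in atomic mass units (assumed, commonly tabulated).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def best_strand_difference(increments: dict) -> dict:
    """Largest mass change seen in either strand for each base exchange.

    increments: extra mass per base for the modified triphosphates,
                e.g. {"G": 80.0, "A": 40.0} (hypothetical values).
    """
    mass = {b: m + increments.get(b, 0.0) for b, m in RESIDUE_MASS.items()}
    result = {}
    for x, y in combinations("ACGT", 2):
        strand = abs(mass[x] - mass[y])
        counter = abs(mass[COMPLEMENT[x]] - mass[COMPLEMENT[y]])
        result[x + y] = round(max(strand, counter), 2)
    return result


if __name__ == "__main__":
    print("unmodified:     ", best_strand_difference({}))
    print("G +80, A +40:   ", best_strand_difference({"G": 80.0, "A": 40.0}))
```

In this toy model the assumed increments lift every exchange above 40 mass units in at least one strand. Equal increments on G and A would leave an exchange between A and G at its natural difference of 16 mass units, which illustrates why the choice of modified triphosphate depends on the type of base exchange.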
The nucleobases consist of the two purines, adenine (A) and guanine (G), and the two pyrimidines, cytosine (C) and thymine (T). (In RNA, uracil (U) occurs instead of thymine.) According to current knowledge, G and A can be derivatized more easily for structural reasons, and this is more favorable for the invention in any case. Derivatization of the pyrimidines is not precluded though.
In particular it is a basic idea of the invention to derivatize purines A and G in position 7 (letters b and e in FIG. 1). For this it is necessary to replace the nitrogen atoms of position 7 by methine groups. The result is 7-deaza-purine nucleosides which are then transformed into triphosphates. The hydrogen on the carbon atom in position 7 is then replaced by a correspondingly heavy group. Very different groups can be used. In particular, the modified purine nucleosides can be derivatized in position 7 by attaching residues of the form -R, -(CH2)n-R or -C≡C-R, whereby R can be a residue of the form -H, -F, -Cl, -Br, -I, -OH, -SH, -SeH, -alkyl, -alkenyl, -alkynyl, -OCH3, -SCH3, -CHF2, -CF3, -CH2CH2-(OCH2CH2)n-O-alkyl, -NH2, -(NHCOCH2)n-NH2, -(NHCOCHCH3)n-NH2, -OCOCH2NH2, -OCOCH2(NHCOCH2)n-NH2, -OCOCHCH3NH2, -OCOCHCH3(NHCOCH3)n-NH2, -OCH2F, -OCHF2, -OCF3, -SCH2F, -SCHF2 or -SeCH3, inasmuch as this produces a mass difference of at least 14 atomic mass units, although at least 20 or even 40 atomic mass units are preferable. Furthermore, the specialist will be aware of further residues which fulfill the same purpose.
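The net mass change of such a derivatization can be estimated with a small sketch. It assumes that the relevant comparison is with the natural purine, so that the ring nitrogen in position 7 is replaced by a carbon atom carrying the residue R directly; the atomic masses are rounded standard values, and the selection of residues and the tier labels are illustrative only.

```python
# Rounded standard atomic masses in atomic mass units.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "F": 18.998, "Cl": 35.45, "Br": 79.904, "I": 126.904}

# A few residues R from the list above, written as element counts.
RESIDUES = {
    "-H": {"H": 1},
    "-F": {"F": 1},
    "-Cl": {"Cl": 1},
    "-Br": {"Br": 1},
    "-I": {"I": 1},
    "-CF3": {"C": 1, "F": 3},
}


def net_shift(residue: str) -> float:
    """Mass change relative to the natural purine: ring N replaced by C-R."""
    residue_mass = sum(ATOMIC_MASS[el] * n for el, n in RESIDUES[residue].items())
    return ATOMIC_MASS["C"] + residue_mass - ATOMIC_MASS["N"]


def tier(shift: float) -> str:
    """Classify a shift against the 14, 20 and 40 mass unit levels in the text."""
    if shift >= 40.0:
        return "at least 40 u"
    if shift >= 20.0:
        return "at least 20 u"
    if shift >= 14.0:
        return "at least 14 u"
    return "below 14 u"


if __name__ == "__main__":
    for name in RESIDUES:
        shift = net_shift(name)
        print(f"{name:5s} {shift:8.3f} u  {tier(shift)}")
```

Under these assumptions the unsubstituted 7-deaza-purine (R = -H) is about one mass unit lighter than the natural base, so the mass gain comes entirely from the attached residue.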
There is also the possibility of derivatizing 8-aza-7-deaza-purine nucleosides as described above. Position 6 of the guanine, to which an oxygen atom is normally attached, can also be derivatized for mass modification. In this way it is possible to integrate a sulfur or selenium atom without essentially disturbing the Watson-Crick base pairing with the opposite pyrimidine base.
For mass modification O1′ can be substituted by S1′ in the deoxyribose. A DNA building block can also be replaced by a phosphorothioate.
It is a further basic idea of the invention to manufacture appropriate chemical kits for this SNP analysis. These can, on the one hand, contain only the modified nucleoside triphosphates, for example in mixtures with unmodified nucleoside triphosphates. However, buffers, activators for the polymerase, and stabilizers may also be included. It is also possible to manufacture ready-to-use kits which already contain the inactivated polymerase and to which only the specific primers and activators have to be added in use.