Today, the testing of samples for the presence of particular nucleic acid sequences is becoming increasingly important. This is partly due to the fact that the nucleotide sequence of a nucleic acid is a unique feature of each organism. First, a number of diseases are genetic in the sense that the nucleotide sequence of a "normal" gene is changed in some manner. Such a change could arise by the substitution of one base for another. Changes comprising more than one base are also conceivable. Given that three bases code for a single amino acid, a change in one base (a point mutation) could result in a change in the amino acid which, in turn, could result in a defective protein being made in a cell. Sickle cell anemia is a classic example of such a genetic defect caused by a change of a single base in a single gene, the beta-globin gene (A→T transversion at codon 6). Important point mutations can also be found, for example, in LDL receptors (Atherosclerosis (1992), 96:91-107) or in the apolipoprotein-B gene (CGG→CAG mutation of codon 3500; Proc. Natl. Acad. Sci. U.S.A. (1989) 86:587-591). Other examples of diseases caused by single gene defects include Factor IX and Factor VIII deficiency, breast cancer, cystic fibrosis, Factor V Leiden, Fragile X, Huntington disease, myotonic dystrophy, Haemophilia A, Haemophilia B, Neurofibromatosis type I, adenosine deaminase deficiency, purine nucleotide phosphorylase deficiency, ornithine transcarbamylase deficiency, argininosuccinate synthetase deficiency, beta-thalassemia, alpha-1 antitrypsin deficiency, glucocerebrosidase deficiency, phenylalanine hydroxylase deficiency and hypoxanthine-guanine phosphoribosyltransferase deficiency.
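The effect of a single base change on the encoded amino acid can be illustrated with a short Python sketch. It is an illustration only: the codon table is deliberately reduced to the four codons needed here, the function names are hypothetical, and the beta-globin codon 6 sequence (GAG changed to GTG) is taken from the standard description of the sickle cell mutation rather than from the text above.

```python
# Reduced codon table (standard genetic code), limited to the codons used below.
CODON_TABLE = {
    "GAG": "Glu",
    "GTG": "Val",
    "CGG": "Arg",
    "CAG": "Gln",
}


def substitute(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def describe_point_mutation(codon, position, new_base):
    """Describe the amino acid change caused by a single base substitution."""
    mutant = substitute(codon, position, new_base)
    return f"{codon} ({CODON_TABLE[codon]}) -> {mutant} ({CODON_TABLE[mutant]})"


# Beta-globin codon 6: A -> T at the second base (assumed codon GAG).
print(describe_point_mutation("GAG", 1, "T"))
# Apolipoprotein B codon 3500: CGG -> CAG, as cited in the text.
print(describe_point_mutation("CGG", 1, "A"))
```

In both cases a change at one position of the triplet replaces one amino acid by another, which is the mechanism by which a point mutation can lead to a defective protein.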
Second, the main cause of cancer is considered to be alterations in the cellular genes which directly or indirectly control cell growth and differentiation. There are at least thirty families of genes, called oncogenes, which are implicated in human tumor formation. Members of one such family, the ras gene family, are frequently found to be mutated in human tumors. In their normal state, proteins produced by the ras genes are thought to be involved in normal cell growth and maturation. Mutation of the ras gene, causing an amino acid alteration at one of three critical positions in the protein product, results in conversion to a form which is implicated in tumor formation. A gene having such a mutation is said to be "mutant" or "activated." Unmutated ras is called "wild-type" or "normal" ras. It is thought that such a point mutation leading to ras activation can be induced by carcinogens or other environmental factors. Over 90% of pancreatic adenocarcinomas, about 50% of adenomas and adenocarcinomas of the colon, about 50% of adenocarcinomas of the lung and carcinomas of the thyroid, and a large fraction of haematological malignancies such as acute myeloid leukemia, lymphomas and myelodysplastic syndrome have been found to contain activated ras oncogenes. Overall, some 10 to 20% of human tumors have a mutation in one of the three ras genes (H-ras, Ki-ras, or N-ras).
Another example of a gene which is highly involved in the development of cancer is the TP53 gene. It is altered by mutations and/or deletions in more than half of all human cancers. The point mutations are scattered over more than 250 codons and mostly occur as missense mutations. In this respect, the TP53 gene differs from other tumor suppressor genes such as the retinoblastoma tumor suppressor gene (Rb1) and the p16 gene which are most frequently inactivated by deletions or nonsense mutations.
Most malignant tumors show alterations in both alleles of the TP53 gene. This usually involves the complete deletion of one allele and inactivation of the other allele by missense mutations. The result is either a complete lack of TP53 protein or expression of an altered protein. The missense mutations in the highly conserved regions of TP53 have also been associated with increased levels of TP53 protein. This seems to result from mutation-induced conformational changes, which stabilize the protein and extend its half-life from 4 to 8 hours. The majority of missense mutations cluster in the four highly conserved domains in the central core of the protein. This region is responsible for the sequence-specific DNA binding and is therefore of critical importance for the functional integrity of TP53. Seven mutational hot spots have been identified within these domains. These are located at amino acid residues 175, 213, 245, 248, 249, 273, and 282.
Third, infectious diseases are caused by parasites, micro-organisms and viruses, all of which have their own nucleic acids. The presence of these organisms in a sample of biological material is often determined by a number of traditional methods (e.g., culture). Each organism has its own unique genome, and if there are genes or sequences of nucleic acids that are specific to a single species (to several related species, to a genus or to a higher level of relationship), this sequence will provide a "fingerprint" for that organism (or species, etc.), e.g. in the gene of the reverse transcriptase of HIV (A→T mutation in codon 215: Science (1989) 246:1155-1158). Examples of other viruses include HPV, EBV, HSV, Hepatitis B and C and CMV. Examples of micro-organisms include bacteria and more particularly include various strains of mycoplasma, legionella, mycobacteria, chlamydia, candida, gonococci, shigella and salmonella. As information on the genomes of more organisms is obtained by the scientific community, the repertoire of micro-organisms that can be identified by the present invention will increase. In the nucleic acids of some bacterial strains, a particularly great similarity is found in the sequence of their ribosomal genes and their rRNA.
Current attempts in the field of examining samples for different nucleotide sequences focus on the use of only one single difference in the sequence of nucleotides in order to be able to discriminate between nucleic acids. Such differences may be a consequence of e.g. nucleotide exchanges caused by point mutation or in the case of micro-organisms a consequence of inter-species differences. Natural examples of such closely related nucleic acids are alleles, i.e. alternative variants of sequences of a given gene on a defined site on a chromosome.
In each example set forth above, one can isolate nucleic acids from a sample and determine whether the sample contains any of the above-mentioned sequences, i.e. sequences specific for "genetic disease", cancer, infectious diseases or infectious organisms, by identifying one or more sequences that are specific for a disease or organism. A difficulty when identifying these differences or changes in the nucleotide sequence is, however, that the detection is not readily applicable in those instances where the number of copies of the target sequence present in a sample is low. In such instances it is difficult to distinguish signal from noise. One way around this problem is to increase the signal. Accordingly, a number of methods have been described to amplify the target sequences present in a sample. One of the best known and most widely used amplification methods is the polymerase chain reaction (referred to as PCR), which is described in detail in U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,800,159.
Based upon the PCR technique, a number of methods for detection of sequence variations have been described. From Oncogene Research 1 (1989), 235-241 and Nucl. Acids Res. 17 (1989), 8093-8099 a method is known where the area which presumably contains the allelic variant is first amplified in a PCR using specially designed primers and is then treated with a restriction enzyme. The alleles can then be diagnosed once they have been analyzed with restriction fragment length polymorphisms (RFLP). Electrophoretic separation of the cleavage products according to size then reveals whether the corresponding allele was or was not contained in the sample. The disadvantage of this procedure is that it requires a specific restriction digestion. Apart from the fact that this is a cumbersome procedure, for each mutation that does not already produce an RFLP it is necessary that a primer can be designed adjacent to the point mutation such that digestion is possible with a restriction enzyme that cleaves exactly at the given site. This may be difficult due to the reasons listed in said publications.
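The principle of distinguishing alleles by restriction digestion of a PCR product can be sketched in Python as follows. This is a simplified illustration, not the procedure of the cited publications: the two 18-base amplicons are hypothetical toy sequences, the function name `digest` is an assumption, and EcoRI (recognition site GAATTC, cutting after the first G) is used only because its site is well known.

```python
def digest(amplicon, site, cut_offset):
    """Return the fragment lengths obtained by cutting `amplicon` at every
    occurrence of `site`. `cut_offset` is the number of bases of the
    recognition site that remain on the left-hand fragment."""
    fragments = []
    start = 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


# Hypothetical amplicons differing by a single base inside the recognition site.
ALLELE_1 = "ATGCGTGAATTCGTTAGC"  # contains GAATTC
ALLELE_2 = "ATGCGTGAGTTCGTTAGC"  # A -> G destroys the site

print(digest(ALLELE_1, "GAATTC", 1))  # two fragments: cleaved
print(digest(ALLELE_2, "GAATTC", 1))  # one fragment: uncleaved
```

The differing fragment lengths correspond to the band pattern seen after electrophoretic separation. The sketch also makes the limitation plain: the approach works only when the variant position happens to fall within, or can be engineered into, a recognition site.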
EP-A-0 332 435 and U.S. Pat. No. 5,605,794 describe a method for selectively detecting a nucleic acid which differs from an adjacent nucleic acid by only one nucleotide. The principle employed is as follows: of the oligonucleotides hybridized to the nucleic acid to be detected, in theory only those can be extended enzymatically whose terminal nucleotide in the direction of extension is complementary to the corresponding nucleotide of the nucleic acid (of the one allele) to be detected. The oligonucleotide is hence selected such that it is only complementary to the nucleic acid to be tested. Thus, the oligonucleotide hybridized to the other allele is theoretically not extended. It turned out, however, that in practice the oligonucleotide hybridized to the other allele is, though only to a minor extent, also extended. This reduces the sensitivity and, particularly, the specificity of the method. Non-specific extension may easily occur, especially when T is part of the 3'-terminal mismatch or when the mismatch is a C:A mismatch (Kwok et al. (1990), Nucleic Acids Research, 18:999-1005). In order to increase specificity, EP-A-0 332 435 proposes to select the nucleotide sequence of an oligonucleotide such that the terminal area contains another nucleotide that is not complementary to the corresponding nucleotide of the two nucleic acids. For the detection of both alleles, two reactions must be carried out, with only one of the two alleles being detected per reaction. This procedure requires the synthesis of two allele-specific primers and one complementary strand primer. The sample is amplified in two parallel reactions: in the first PCR with the complementary strand primer and one of the allele-specific primers, and in the second PCR with the complementary strand primer and the second allele-specific primer.
If the suspected allele-specific PCR product is not detected in one of the reactions, it is assumed that the respective allele is not present in the sample. Since homozygous DNA-samples contain only one of the two alleles which can be detected in only one of the two reactions, it is necessary to use two additional primers which produce the same control product in all reactions. This control product is different from the specific product in order to control the efficiency of the respective PCR of the other allele and to establish the absence of the respective allele. If a control product is present in the PCR product but no specific product of the allele is found, the sample is not likely to contain the allele tested for in the reaction. In this method, the presence or absence of two alleles must be established in two separate reactions and each individual PCR must comprise a control PCR.
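The two-reaction scheme described above can be sketched in Python under an idealized assumption, namely that a primer is extended only when its 3'-terminal base matches the allele. As noted above, mismatched primers are in practice also extended to a minor extent, which this sketch ignores. All sequences and names are hypothetical, and each allele is written as the strand that has the same sequence as the primer.

```python
# Hypothetical alleles differing at index 9 (A in wild-type, T in mutant).
WILD_TYPE = "ACGTTGCAGAGGTC"
MUTANT = "ACGTTGCAGTGGTC"

# Allele-specific primers whose 3'-terminal base lies on the variable position.
PRIMERS = {
    "wild-type": "ACGTTGCAGA",
    "mutant": "ACGTTGCAGT",
}


def is_extended(primer, allele):
    """Idealized rule: extension occurs only if the 3'-terminal primer base
    matches the allele at the corresponding position."""
    return allele[len(primer) - 1] == primer[-1]


def run_reactions(sample):
    """Run one reaction per allele-specific primer on a sample given as a
    list of the alleles it contains; report which reactions yield product."""
    return {
        name: any(is_extended(primer, allele) for allele in sample)
        for name, primer in PRIMERS.items()
    }


print(run_reactions([WILD_TYPE, WILD_TYPE]))  # homozygous wild-type
print(run_reactions([WILD_TYPE, MUTANT]))     # heterozygous
```

For a homozygous sample only one of the two reactions yields a product, which is why a control product is needed in each reaction to distinguish a truly absent allele from a failed PCR.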
Biochem. Biophys. Res. Commun. (1989), 160:441-447 proposes to increase the selectivity by decreasing the dNTP concentration. Even if this additional measure is taken, the detection of alleles in separate batches can yield non-specific products.
In a ligase chain reaction (WO 89/09835), thermostable ligase is used to specifically link two adjacent oligonucleotides. This occurs only if they are hybridized to a complementary target at a stringent hybridization temperature and if base-pairing at the site of linkage is complete. If two alleles differ from each other as a consequence of a mutation at the linkage site, the above condition of complete base-pairing is fulfilled for only one of the alleles. Two additional oligonucleotides, which are complementary to the first two, are then necessary to amplify the ligation product in the ligase chain reaction. To date, the detection of two alleles requires two reactions with at least six oligonucleotides, and the amplification product is detected with a radioactive label (Proc. Natl. Acad. Sci. U.S.A. (1991), 88:189-193).
From Proc. Natl. Acad. Sci. U.S.A. (1985), 82:1585-1588 and from New England Journal of Medicine (1987), 317:985 a method of detecting alleles is known which is based upon differential hybridization of "allele-specific" oligonucleotides (ASO) with the alleles under examination. Two oligonucleotides, each 20 bp in length, for example, are synthesized. Each matches one of the two different alleles but has a mismatch to the other allele located in the middle of the oligonucleotide sequence. Discrimination between alleles is then possible by differential hybridization with labelled oligonucleotides. This applies to the analysis of both human genomic DNA and PCR products. Direct and discrete analysis of genomic DNA is also possible with this method but requires additional digestion and electrophoresis.
Nucleic Acids Research (1989), 17:2437-2448 and EP-A-0 333 465 describe a method of testing pre-amplified human genomic DNA for the presence of various alleles in a few additional PCR cycles by competition of allele-specific primers (competitive oligonucleotide priming=COP). The above-described ASO technique is thereby converted into a PCR technique. In the original ASO technique, an error rate of 5% caused by cross-hybridization is acceptable, since a comparison of the signal intensities during corresponding controls allows an unequivocal interpretation of the results. In a PCR reaction, however, where the primers are allele-specific oligonucleotides, the error rate for a sample that contains only one of the alleles would after ten cycles amount to 12% if this error occurred in a reagent mixture where both alleles are amplified. It could indeed be demonstrated that primer competition increased selectivity; however, the area of interest of the genomic DNA was first amplified in a PCR and the analysis for the different alleles was then carried out in ten subsequent cycles. Two allele-specific primers and a complementary strand primer were used in these cycles, and in two reactions one of the allele-specific primers was radioactively labelled. A selective detection of different alleles has been demonstrated for oligonucleotides of 12 to 16 bases in length, whereas longer oligonucleotides under the given conditions also yielded non-specific products.
There is a need for a simple yet highly specific method for directly detecting at least one single base difference in a sample containing nucleic acids such as genomic DNA, in which detection steps are minimized, resulting in a method which may be performed quickly, accurately and easily with minimal operator skills.
Methods that are based on differential hybridization can only be applied in certain situations and are moreover very complex and susceptible to interference. Also, pre-amplification is a procedural step, e.g. during COP, which preferably is eliminated.
All of the above mentioned methods for detecting variant nucleic acids have the same feature of relying on unmodified nucleotides as the discriminating factor in the detection of the variant nucleic acids.