The ability to determine the identity of a nucleotide within a characterised sequence of DNA has many applications in the fields of medical and forensic science. For instance, changes in one or more individual, ie. single, bases in genomic DNA have been shown to be associated with a number of human hereditary diseases including muscular dystrophy and cystic fibrosis. The identification of such mutations at the prenatal and postnatal stages can be a valuable diagnostic tool. Similarly, the identity of single bases at several polymorphic sites in human DNA can provide an accurate method for matching forensic samples with genetic material taken from known subjects.
Methods for the detection of characterised sequences or variations are known in which the region of DNA containing the variation is first amplified by the Polymerase Chain Reaction (PCR) and the sample is then tested using immobilised oligonucleotide probes which correspond to the possible variations in the region (Saiki et al. 1989; Proc Natl Acad Sci U.S.A. 86: 6230-6234). Such methods are cumbersome because a probe is required for each possible variation, and a separate reaction must be carried out for each probe.
Methods are also known for detecting a single base variation in which first a segment of DNA is amplified by PCR using two primers, one of which has been conjugated to biotin. The resulting biotin-DNA is immobilised and used as a template for a single detection-step primer which anneals to the DNA immediately upstream of the site of the variation. The variation is then investigated using a pair of radiolabelled nucleoside triphosphates corresponding to two possible base variations. These are added to the immobilised DNA/primer mixture in the presence of a suitable polymerase.
The identity of the base variation can then be ascertained by using a scintillation counter to measure the radioactivity incorporated into the eluted detection primer. Alternatively a digoxigenin label can be used which can be detected by spectrophotometry. This method has the disadvantage that a separate incorporation experiment must be carried out for each possible variation in each variable region. By using two distinguishable radiolabels, the number of experiments can be reduced slightly. However, each variable region must still be analysed separately, which makes the method laborious when analysing several polymorphic sites, for instance when compiling stringent forensic data or screening for several different inherited diseases. The present inventors have now provided a method that addresses some, and in preferred forms all, of these problems.
According to a first aspect of the present invention there is provided a method for determining the identity of at least two discrete single nucleotide bases each adjacent to a predetermined target nucleotide base sequence in a target sample comprising one or more types of polynucleotide chain, the method comprising mixing the target sample with (i) nucleotide primers which are complementary to the predetermined base sequences such that they anneal thereto at positions adjacent to the bases to be identified, (ii) at least two types of chain terminator, each type labelled with a characteristic fluorescent group, and (iii) a nucleotide chain extending enzyme such that terminators complementary to the bases to be identified are incorporated into the nucleotide primers; separating the types of extended nucleotide primer on the basis of size and/or charge; and identifying the terminators incorporated into each type of nucleotide primer by reference to its fluorescent characteristics. In its preferred embodiments, the present invention provides a method for rapidly determining several discrete bases simultaneously.
Preferably the chain terminators are dideoxynucleoside triphosphates (ddNTPs); however, other terminators such as might occur to the skilled addressee, eg. nucleotide analogs or arabinoside triphosphates, are also encompassed by the present invention.
Preferably the polynucleotide is DNA; however, the invention could also be applied to RNA were suitable enzymes to become available. The primary requirement for the method to operate is that each of the unknown bases is adjacent to a nucleotide base sequence which is sufficiently elucidated to allow the design of a working primer, i.e. one which can initiate accurate template-mediated polymerisation. The term `adjacent` in this context means one base upstream of the unknown base, i.e. in the 3' direction with respect to the template strand of the target DNA.
As is known, ddNTPs differ from conventional deoxynucleoside triphosphates (dNTPs) in that they lack a hydroxyl group at the 3' position of the sugar component. This prevents chain extension of incorporated ddNTPs, and thus leads to termination. Although the use of ddNTPs in conjunction with dNTPs for the sequencing of DNA chains by the Sanger-Coulson method is well documented, in the present invention ddNTPs are used without dNTPs; hence chain extension by the chain extending enzyme terminates after the addition of only one base, which is complementary to the base being determined.
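The single-base extension described above can be illustrated with a minimal sketch. It is not part of the disclosed method: the function names and the example sequences are hypothetical, and annealing is idealised as an exact match between the primer and a unique site on the template. Both strands are written 5' to 3'.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_primer(template, primer):
    """Return the single ddNTP base added to the 3' end of the primer.

    Assumption (illustrative only): the primer anneals perfectly to one
    site on the template. With no dNTPs present, extension stops after
    one terminator, which is complementary to the template base lying
    immediately 5' of the annealing site.
    """
    site = reverse_complement(primer)   # template region bound by the primer
    pos = template.find(site)
    if pos < 0:
        raise ValueError("primer does not anneal to the template")
    if pos == 0:
        raise ValueError("no template base available beyond the primer")
    unknown_base = template[pos - 1]    # the base to be determined
    return COMPLEMENT[unknown_base]


# Hypothetical template; the base to be determined is the 'C' at index 6.
template = "ACGTTGCAAGCTTAGG"
primer = reverse_complement("AAGCTTAGG")
print(extend_primer(template, primer))
```

In this example the primer is extended by a single G, from which the unknown template base is inferred to be C.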
Each of the ddNTPs used in the present invention is labelled with a distinguishable fluorescent group, thereby allowing all possible base identities to be ascertained in a single operation. Any distinctive fluorescent label which does not interfere with the incorporation of the ddNTP into a nucleotide chain may be suitable. Dye labels having these characteristics are discussed by Lee et al. 1992; Nucleic Acids Research Vol. 20 10: 2471-2483. The fluorescently labelled nucleotides generated by the methods of the current invention can be conveniently scanned using conventional laboratory equipment, for instance the Applied Biosystems Inc. Model 373 DNA Sequencing system.
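As a rough sketch of how the scanned output might be interpreted, the following assumes that separation has already resolved each extended primer to an integer length and a dye colour. The dye-to-base assignment, the site names and the function name are hypothetical and are not taken from any instrument's software; in practice dye labels can also shift electrophoretic mobility, which this sketch ignores.

```python
from collections import defaultdict

# Hypothetical assignment of one dye colour to each labelled ddNTP.
DYE_TO_BASE = {"blue": "G", "green": "A", "yellow": "T", "red": "C"}


def call_sites(peaks, primer_lengths):
    """Assign incorporated terminators to each primer by product size.

    peaks          -- iterable of (fragment_length, dye) pairs from the scan
    primer_lengths -- dict mapping a site name to its primer length

    Each extended primer is exactly one base longer than the primer, so
    primers of different lengths give products that can be told apart.
    Two dyes at the same length indicate two terminators at that site.
    """
    dyes_at = defaultdict(set)
    for length, dye in peaks:
        dyes_at[length].add(dye)
    return {
        site: sorted(DYE_TO_BASE[dye] for dye in dyes_at.get(n + 1, ()))
        for site, n in primer_lengths.items()
    }


# Hypothetical scan: three primers of 18, 24 and 30 bases.
peaks = [(19, "red"), (25, "blue"), (25, "green")]
primers = {"site1": 18, "site2": 24, "site3": 30}
print(call_sites(peaks, primers))
```

Here site1 incorporated ddCTP, site2 incorporated both ddATP and ddGTP, and site3 gave no signal. Using primers of distinct lengths is what allows several sites to be read from one separation.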
Preferably the target DNA in the sample to be investigated is first amplified by means of the Polymerase Chain Reaction (PCR) technique well known to those skilled in the art. Enriching the target DNA used in the method can provide a quicker, more accurate template-directed synthesis by the nucleotide chain extending enzyme. Since target DNA used in the method can consist of several different regions or chains of DNA, these can potentially be generated in a single PCR step by using several different primer pairs. The invention can be carried out without any need to separate the target chains.
Preferably the target nucleotide sequence in the sample, or a corresponding nucleotide sequence derived from it (eg. by PCR) is purified before mixing with agents (i) to (iii) by incorporating a capture group into it and immobilising it through that group. By carrying out PCR with primers which have been conjugated to a capture group, a population of target DNA can be generated which can be readily immobilised onto an insoluble, solid-phase substrate adapted to complement the capture group. Alternatively the capture group can be annealed to the target DNA directly. Any pair of chemical species which bind strongly, and one of which can be annealed to nucleotide chains, can be used. Suitably the biotin/avidin pair can be employed, with the biotin being annealed to the target DNA and the avidin being attached to a solid substrate eg. latex or polystyrene coated magnetic beads.
Immobilisation greatly facilitates the efficient removal of unincorporated primers and labelled ddNTPs, which will in turn improve the analysis of the extended primers to see which ddNTPs have been incorporated into them. This is particularly important when the invention is being applied to identify a large number of nucleotide bases in a single operation and hence where there will be many extended primer products to separate and analyse.
The number of types of ddNTP which are used in the method will depend on the number of possible identities which the bases to be determined could possess. Thus, for instance, if none of the bases to be determined is likely to be an adenosine residue, then ddTTP, the terminator complementary to adenosine, can be omitted from the reaction mixture. In most cases, however, it will be preferable to have four ddNTP species present, so as to be able to accurately detect all possible combinations.
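The choice of terminators can be expressed as a short sketch. The function name and the input format (one string of candidate template bases per site) are assumptions made for illustration only.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def required_terminators(candidate_bases_per_site):
    """List the ddNTPs needed to detect every candidate template base.

    candidate_bases_per_site -- iterable of strings, one per site, each
    giving the bases the unknown template position might hold.
    A terminator is needed for the complement of each candidate base.
    """
    needed = set()
    for bases in candidate_bases_per_site:
        needed.update(COMPLEMENT[base] for base in bases)
    return sorted("dd" + base + "TP" for base in needed)


# Two hypothetical sites: one may be C or G, the other C or T.
print(required_terminators(["CG", "CT"]))
```

Because no site in this example can be adenosine, ddTTP does not appear in the result.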
The nucleotide chain extending enzyme is preferably a DNA polymerase, or viable fragment thereof (such as the Klenow fragment). Most preferably the DNA polymerase is a thermostable polymerase, such as that from Thermus aquaticus (`Taq polymerase`).