DNA sequencing is an essential tool in molecular genetic analysis. The ability to determine DNA nucleotide sequences has become increasingly important as efforts have proceeded to determine the sequences of the large genomes of humans and other organisms.
Techniques enabling the rapid detection of a single DNA base change, or a few base changes, are also important tools for genetic analysis, for example in clinical situations in the analysis of genetic diseases or certain cancers. Indeed, as more and more diseases are discovered to be associated with changes at the genetic level, most notably single nucleotide polymorphisms (SNPs), the need grows for methods both of screening for SNPs or other mutations or genetic changes (by sequencing representative genomic samples) and of scoring SNPs (or other mutations/changes). Thus, as well as the development of novel sequencing technologies for determining the sequence of longer stretches of DNA, the art has also seen a rapid rise in the development of technologies for detecting single (or a few) base changes. Such protocols, which determine more limited sequence information relating to only one or a few bases, are termed mini-sequencing.
The method most commonly used as the basis for DNA sequencing, or for identifying a target DNA base, is the enzymatic chain-termination method of Sanger. Traditionally, such methods relied on gel electrophoresis to resolve, according to their size, DNA fragments produced from a larger DNA segment. However, in recent years various sequencing technologies have evolved which rely on a range of different detection strategies, such as mass spectrometry and array technologies.
One class of sequencing methods assuming importance in the art is that which relies upon the detection of inorganic pyrophosphate (PPi) release as the detection strategy. It has been found that such methods lend themselves admirably to large scale genomic projects or clinical sequencing or screening, where relatively cost-effective units with high throughput are needed.
Methods of sequencing based on the concept of detecting inorganic pyrophosphate (PPi) which is released during a polymerase reaction have been described in the literature (for example WO 93/23564, WO 89/09283, WO 98/13523 and WO 98/28440). As each nucleotide is added to a growing nucleic acid strand during a polymerase reaction, a pyrophosphate molecule is released. It has been found that pyrophosphate released under these conditions can readily be detected, for example enzymically, e.g. by the generation of light in the luciferase-luciferin reaction. Such methods enable a base to be identified in a target position and DNA to be sequenced simply and rapidly whilst avoiding the need for electrophoresis and the use of labels.
At its most basic, a PPi-based sequencing reaction involves simply carrying out a primer-directed polymerase extension reaction in the presence of a nucleotide, and detecting whether or not that nucleotide has been incorporated by detecting whether or not PPi has been released. Conveniently, this detection of PPi release may be achieved enzymatically, and most conveniently by means of a luciferase-based light detection reaction termed ELIDA (see further below).
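By way of illustration only, the basic cycle described above can be sketched as a short idealised simulation. This is a toy model under stated assumptions, not a description of any particular instrument or protocol: the function names, the cyclic A, C, G, T dispensation order and the assumption of perfect, complete incorporation are all hypothetical choices made for the example. Each nucleotide is offered in turn, and the number of PPi molecules released is taken to equal the number of bases incorporated in that step.

```python
# Toy model of PPi-based sequencing (illustrative assumptions only).
# Watson-Crick complement used to decide which nucleotide can be incorporated.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_ppi_sequencing(template, dispensation_order="ACGT", max_cycles=100):
    """Return a list of (nucleotide, ppi_released) pairs, one per dispensation.

    `template` is written in the order in which the polymerase copies it,
    so template[0] is the first base downstream of the primer.
    Idealised: every complementary nucleotide is incorporated, and one PPi
    is released per incorporated nucleotide. "A" stands for whichever
    adenine nucleotide is dispensed (dATP or an analogue).
    """
    position = 0
    signals = []
    for _ in range(max_cycles):
        for nucleotide in dispensation_order:
            if position >= len(template):
                break
            released = 0
            # A run of identical template bases gives several incorporations
            # from a single dispensation, hence several PPi molecules.
            while (position < len(template)
                   and COMPLEMENT[template[position]] == nucleotide):
                released += 1
                position += 1
            signals.append((nucleotide, released))
        if position >= len(template):
            break
    return signals


def decode_signals(signals):
    """Rebuild the synthesised strand from the per-dispensation PPi counts."""
    return "".join(nucleotide * released for nucleotide, released in signals)


if __name__ == "__main__":
    trace = simulate_ppi_sequencing("TGGAC")
    print(trace)
    print(decode_signals(trace))
```

In this sketch a dispensation that releases no PPi indicates that the offered nucleotide was not complementary to the next template base, which is the yes/no determination on which the method rests.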
It has been found that dATP, when added as a nucleotide for incorporation, interferes with the luciferase reaction used for PPi detection. Accordingly, a major improvement to the basic PPi-based sequencing method has been to use, in place of dATP, a dATP analogue (specifically dATPαS) which is incapable of acting as a substrate for luciferase, but which is nonetheless capable of being incorporated into a nucleotide chain by a polymerase enzyme (WO 98/13523).
Further improvements to the basic PPi-based sequencing technique include the use of a nucleotide-degrading enzyme such as apyrase during the polymerase step, so that unincorporated nucleotides are degraded, as described in WO 98/28440, and the use of a single-stranded nucleic acid binding protein in the reaction mixture after annealing of the primers to the template, which has been found to have a beneficial effect in reducing the number of false signals, as described in WO 00/43540.
However, even with the modified and improved PPi-based sequencing methods mentioned above, there is still room for improvement, for example to increase the efficiency and/or accuracy of the procedure, or, as discussed further below, to increase the sequence read length possible. The present invention addresses these needs.
In particular, the present invention is concerned with methods of PPi-based sequencing which use an α-thio analogue of deoxy ATP (dATP) (or dideoxy ATP (ddATP)), namely a (1-thio)triphosphate (or α-thiotriphosphate) analogue of deoxy or dideoxy ATP, preferably deoxyadenosine [1-thio]triphosphate, or deoxyadenosine α-thiotriphosphate (dATPαS) as it is also known. dATPαS (as with all α-thio nucleotide analogues) occurs as a mixture of isomers, the Rp isomer and the Sp isomer.
When dATPαS (and/or another α-thio nucleotide) is used, it has been found that the efficiency of the sequencing method decreases as the number of cycles increases, and in particular that the attainable read length is limited (e.g. to 40-50 bases). This is believed to be due to the accumulation of inhibitory substances in the reaction system. The present invention is particularly concerned with reducing or removing such inhibitory effects.