Ever since Watson and Crick elucidated the structure of the DNA molecule in 1953, genetic researchers have wanted to find fast and efficient ways of sequencing individual DNA molecules. Between 1975 and 1977, Sanger/Barrell and Maxam/Gilbert developed two new methods for DNA sequencing, which represented a major breakthrough in sequencing technology. All methods in extensive use today are based on the Sanger/Barrell method, and developments in DNA sequencing in the last 23 years have more or less been modifications of this method.
Polynucleotides are polymeric molecules comprising repeating units of nucleotides bound together in a linear fashion. Examples of polynucleotides are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA polymers are made up of strings of four different nucleotide bases known as adenine (A), guanine (G), cytosine (C), and thymine (T). The particular order, or “sequence,” of these bases in a given gene determines the structure of the protein encoded by the gene. Furthermore, the sequence of bases surrounding the gene typically contains information about how often the particular protein should be made, in which cell types, etc. Knowledge of the DNA sequence in and around a gene provides valuable information about the structure and function of the gene, the protein it encodes, and its relationship to other genes and proteins. RNA is structurally and chemically related to DNA; however, the sugar component of RNA is ribose (as opposed to deoxyribose in DNA), and the base thymine is replaced by uracil.
It is appreciated by those skilled in the art that there is a direct relationship between particular DNA sequences and certain disease states. This fact has encouraged many pharmaceutical companies to invest heavily in the field of genomic research in the hope of discovering the underlying genetic nature of these diseases.
Another reason that sequence information is important is the expected ability to determine an individual's susceptibility to particular diseases based on his or her genetic sequence. The field of genetic diagnostics is dedicated to identifying nucleotide sequence elements whose presence in a genome correlates with development of a particular disorder or feature. The more information is available about genomic sequence elements observed in the population, the more powerful this field becomes. Furthermore, the more rapidly information becomes available about the prevalence and penetrance of sequence elements in the general population, as well as about the presence of such elements in the genomes of particular individuals being tested, the more effective the analysis becomes.
Yet another reason that sequence information is valuable is that a number of pharmaceutical companies seek to develop drugs that are custom-tailored to an individual's genetic profile. The hope is to provide targeted, potent drugs, possibly with decreased dosage levels appropriate to the genetic characteristics of the particular individual to whom the drug is being administered.
Most currently available nucleotide sequencing technologies determine the nucleotide sequence of a given polynucleotide strand by generating a collection of complementary strands of different lengths, so that the collection includes molecules terminating at each base of the target sequence and ranging in size from just a few nucleotides to the full length of the target molecule. The target molecule's sequence is then determined by analyzing the truncated complementary strands and determining which terminate with each of the four DNA nucleotides. A “ladder” is constructed by arranging the truncated molecules in order by length, and the terminal residue of each rung is read off to provide the complement of the target polynucleotide sequence.
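The ladder-reading step described above can be illustrated with a minimal sketch. It is a simplified model, not a description of any particular sequencer: the function names and the example target GATTACA are hypothetical, every fragment is assumed to be written 5' to 3', and the ladder is assumed to start at a length of one nucleotide (a real ladder begins a few nucleotides in, after the primer).

```python
# Hypothetical illustration of reading a sequencing "ladder".
# Assumption: all strands are written 5'->3' and contain only A, C, G, T.

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand of seq, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq))


def read_ladder(fragments):
    """Order truncated strands by length and read each rung's terminal base."""
    rungs = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in rungs)


target = "GATTACA"
complement = reverse_complement(target)

# Simulate the collection of truncated complementary strands, one ending at
# each base, deliberately supplied longest-first to show that sorting matters.
fragments = [complement[:n] for n in range(len(complement), 0, -1)]

complement_read = read_ladder(fragments)
recovered_target = reverse_complement(complement_read)
print(complement_read, recovered_target)
```

Reading the rungs yields the complement of the target; taking the reverse complement of that read recovers the target sequence itself.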
Currently available DNA sequencing systems are very powerful. However, they are limited by their speed, their complexity, and their cost. The speed of currently available automated sequencers is limited by the inability of the machines to analyze more than several hundred (typically around 600) nucleotides of sequence at a time. Allowing for the overlaps needed to correctly piece together strands less than 1000 bases long, the standard sequencing process has to be performed as many as 70 million times in order to determine the human genome sequence (Technology Review 102(2):64-68 1999 March/April; incorporated herein by reference). At a theoretical rate of even 100 million bases per day, it will take at least a year to sequence the human genome once. With these techniques, large-scale sequencing cannot become a clinical tool. For genetic diagnostics to become practical in a clinical setting, the sequencing rate will have to be increased by at least three to five orders of magnitude.
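The arithmetic behind these figures can be checked with a short calculation. The sketch below combines the numbers quoted above under one stated assumption that is not in the text: each of the 70 million runs is taken to yield about 600 bases of raw sequence. The constant and function names are illustrative only.

```python
# Back-of-the-envelope check of the throughput figures quoted in the text.
# Assumption (not stated in the source): every run yields ~600 raw bases.

BASES_PER_RUN = 600                # typical read length quoted in the text
RUNS_NEEDED = 70_000_000           # upper estimate of runs for one genome
BASES_PER_DAY = 100_000_000        # theoretical sequencing rate


def days_to_sequence(runs, bases_per_run, bases_per_day):
    """Days needed to produce runs * bases_per_run bases of raw sequence."""
    return runs * bases_per_run / bases_per_day


def days_after_speedup(baseline_days, orders_of_magnitude):
    """Days needed if the rate rises by the given number of powers of ten."""
    return baseline_days / 10 ** orders_of_magnitude


baseline = days_to_sequence(RUNS_NEEDED, BASES_PER_RUN, BASES_PER_DAY)
print(f"baseline: {baseline:.0f} days")
for orders in (3, 5):
    hours = days_after_speedup(baseline, orders) * 24
    print(f"{orders} orders of magnitude faster: {hours:.2f} hours")
```

Under this assumption the raw sequence totals 42 billion bases, or 420 days at 100 million bases per day, which is consistent with the "at least a year" estimate. A rate increase of three to five orders of magnitude would bring this down to roughly ten hours or a few minutes, respectively.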
The complexity of current sequencing technology arises from the need to amplify and modify the genetic molecules being sequenced. This modification is carried out either chemically or enzymatically, and amplification is achieved by numerous cycles of heating and cooling. One of the more popular ways of amplifying and modifying the DNA to be sequenced is the polymerase chain reaction (PCR). PCR involves successive rounds of denaturing, annealing, and extension using a DNA polymerase, resulting in the exponential amplification of the original strand of DNA. The length of time associated with each part of the cycle depends on the fluid volume and the length of DNA to be amplified. Typical times are on the order of 10-30 seconds for the denaturation step, 5-30 seconds for the annealing step, and 1-4 minutes for the extension step. This cycle is usually carried out 15 to 30 times. Therefore, normal PCR times are one to three hours, depending on the length of the DNA to be amplified. The fundamental factors that constrain the denaturing, annealing, and labeling steps are the number of detectable strands needed, the time needed to carry out this process, and the processivity of the enzyme. This entire process is time-consuming and requires following involved procedures.
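The step times and cycle counts given above can be combined into a rough estimate of PCR duration and yield. The following sketch is illustrative only: the function names are hypothetical, amplification is assumed to be ideal (perfect doubling every cycle), and only the hold times of the three steps are counted. Temperature ramping between steps and any initial or final hold are ignored, which may partly account for practical run times being longer than the lower figure computed here.

```python
# Rough PCR estimate from the step times quoted in the text.
# Assumptions: ideal doubling per cycle; ramp times between steps ignored.


def pcr_hold_time_minutes(denature_s, anneal_s, extend_s, cycles):
    """Total hold time in minutes for the given per-step times in seconds."""
    return cycles * (denature_s + anneal_s + extend_s) / 60


def ideal_copies(cycles, starting_copies=1):
    """Copies after the given number of cycles if every cycle doubles the DNA."""
    return starting_copies * 2 ** cycles


# Shortest case: 10 s + 5 s + 1 min per cycle, 15 cycles.
shortest = pcr_hold_time_minutes(10, 5, 60, cycles=15)
# Longest case: 30 s + 30 s + 4 min per cycle, 30 cycles.
longest = pcr_hold_time_minutes(30, 30, 240, cycles=30)

print(f"hold time: {shortest} to {longest} minutes")
print(f"ideal yield: {ideal_copies(15)} to {ideal_copies(30)} copies per template")
```

With these inputs the hold time alone ranges from under 20 minutes to 2.5 hours, and ideal doubling over 15 to 30 cycles gives roughly thirty thousand to one billion copies per starting template.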
Currently there is a need for a more efficient method for sequencing polynucleotides. The present invention provides for such a method.