There is great interest in identifying the composition and sequence of various biomolecules, such as human DNA, with accuracy and specificity. Sequencing technology, however, is time-consuming and expensive to develop and implement. For example, sequencing the DNA of a single individual for the Human Genome Project required over $3 billion of funding.
It is estimated that the DNA of any two people differs by approximately 1 base in 1000. Knowledge of such genetic variations among human populations may allow the scientific community to identify genetic trends that are related to various medical predispositions, conditions, or diseases, and may lead to the realization of truly personalized medicine, where treatments are customized for a given individual based on that individual's DNA. A reduction in the time and cost of DNA sequencing is needed to develop such knowledge and to tailor medical diagnostics and treatments based on the genetic makeup of individual patients.
One particular obstacle inherent in known methods is the inability to accurately position repetitive sequences in DNA fragments. Furthermore, known methods cannot determine the length of short tandem repeats, which are associated with several human genetic diseases.
One emerging sequencing technology employs nanopore or micropore devices. Nanopores are substantially cylindrical holes formed in a membrane or solid media, said holes having diameters that range from about 1 nm to about 200 nm. Some existing methods using nanopores have attempted to detect single DNA bases as they move through a nanopore under a bias voltage. However, it is difficult to detect single DNA bases as each base passes through the nanopore. Furthermore, nanopores small enough to track single-stranded DNA are unreliable and difficult to form.
Other methods have attempted to use nanopores to detect DNA hybridization probes or oligonucleotides on a DNA molecule and to recover the DNA sequence information using the method of Sequencing-By-Hybridization (SBH). SBH is a two-step procedure. First, the collection of all subsequences that make up a target sequence is determined by detecting hybridization of sequence-specific probes, or a pool of probes, to the target sequence. Second, an algorithm that relies on combinatorial methods reconstructs the full sequence of the target from the collection of subsequences. Most SBH methods have relied on standard DNA probes, termed k-mers (see e.g., E. M. Southern. “DNA chips: analysing sequence by hybridization to oligonucleotide on a large scale” Trends in Genetics, 12(3), 110-115 (1996)).
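The two steps described above can be illustrated with a minimal sketch. It is not the algorithm of any cited method: it assumes idealized, error-free hybridization, a probe length k of at least 2, and a target in which no k-mer occurs more than once. The function names `spectrum` and `reconstruct` are hypothetical, and the backtracking search is chosen for clarity only; its running time grows exponentially and it is suitable only for very short targets.

```python
def spectrum(target, k):
    """Step 1 (idealized): the set of all length-k subsequences of the target,
    i.e. the probes that would be detected as hybridizing."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(kmers):
    """Step 2: every sequence that uses each k-mer exactly once,
    found by chaining k-mers whose ends overlap by k-1 bases."""
    if not kmers:
        return []
    k = len(next(iter(kmers)))  # assumes all probes share one length, k >= 2
    results = set()

    def extend(seq, remaining):
        if not remaining:
            results.add(seq)
            return
        for kmer in remaining:
            # The last k-1 bases of the growing sequence must match
            # the first k-1 bases of the next k-mer.
            if seq[-(k - 1):] == kmer[:k - 1]:
                extend(seq + kmer[-1], remaining - {kmer})

    for start in kmers:
        extend(start, frozenset(kmers) - {start})
    return sorted(results)


probes = spectrum("ATGCAGT", 3)
candidates = reconstruct(probes)
```

For this short target every overlap of length k-1 is unique, so the search returns exactly one candidate sequence, the original target.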
SBH can also be carried out by attaching a large set of single-stranded fragments or probes to a substrate to form a sequencing chip. When a solution of labeled, single-stranded target DNA fragments is exposed to the chip, the target fragments hybridize with complementary sequences on the chip. The hybridized fragments can be identified using a radiometric or optical detector, depending on the selected label. Each hybridization provides information about whether the fragment sequence is a subsequence of the target DNA. The target DNA can then be sequenced based on which strings are and are not substrings of the target sequence.
The efficiency of SBH methods is poor. For example, large probe arrays are required to sequence modest target lengths. Furthermore, this experimental approach does not generate information regarding the binding position of a given fragment along the target sequence with respect to other fragments, and the number of times that a fragment binds a target is also undetermined. While SBH may be useful for sequencing variants of known molecules, it is not useful for sequencing organic biomolecules at high throughput and accuracy. Still further, the algorithms that are used to reconstruct the target sequence from the hybridization data have not proven useful in practice because known SBH methods do not return sufficient information to sequence long fragments. Thus, these limitations have prevented the adoption of SBH as a primary sequencing method. There is therefore a need for improved methods of sequencing organic biomolecules such as DNA.
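The missing position and count information noted above can be made concrete with a small sketch. It assumes an idealized readout that reports only which 3-base probes bind, with no errors; the example sequences and the helper name `spectrum` are illustrative and are not taken from any cited method.

```python
def spectrum(target, k):
    """Idealized SBH readout: which length-k probes bind, with no
    information on where they bind or how many times."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


# Two different 10-base targets that yield the same set of 3-mers,
# so presence/absence data alone cannot tell them apart.
a = "ATGCGTGGCA"
b = "ATGGCGTGCA"
same_spectrum = spectrum(a, 3) == spectrum(b, 3)

# A tandem repeat of 3 units and one of 6 units also give the same readout,
# so the repeat length cannot be recovered.
short_repeat = "AC" * 3
long_repeat = "AC" * 6
repeat_hidden = spectrum(short_repeat, 3) == spectrum(long_repeat, 3)
```

Both comparisons evaluate to true: the two distinct targets share one probe set, and the repeat length leaves no trace in the readout.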