New diagnostic tools for personalized medicine and the rapidly evolving field of genetics require inexpensive, fast, reliable, enzyme-free, and high-throughput sequencing techniques. Although several recently developed DNA sequencing techniques have sought to reduce sequencing cost and time, the nucleic acid sequences they report are statistically significant ensemble averages. These ensemble averages can be used to derive some correlation between nucleotide sequences and physiological behavior, but trace levels of genetic variations or mutations can dominate biological function. This is exemplified by the rapid emergence of multi-drug resistant strains of bacteria, or superbugs, and of fast-mutating pathogens, which nominally exist in trace quantities before drug treatment. Recent studies involving fast identification of DNA sequences that encode drug resistance, such as those encoding β-lactamases, which confer resistance to penicillin-based antibiotics, have shown that these techniques are essential for providing timely, targeted medical intervention, underscoring the need for reliable single-molecule sequencing tools for rapid and high-throughput sequencing. Current second-generation sequencing technologies are capable of detecting single nucleotide polymorphisms (SNPs) using deep and ultra-deep (about 100 reads per polynucleotide) sequencing methods and single-copy PCR (polymerase chain reaction) amplification. However, these methods are expensive and technically complex, making them difficult to apply in clinical settings. While recent studies have outlined the potential use of single-cell genomics for medicine and non-invasive clinical applications, these studies involve enzymatic amplification of DNA from single molecules and DNA sequencing using traditional sequencing tools (optical markers).
Thus, the present techniques for identification of DNA rely on enzyme-based DNA amplification, which can introduce sequence bias and can potentially lead to errors in DNA sequence detection for trace or single-cell samples. Other new techniques have tried to reduce sequencing errors in de novo sequencing through the use of nucleic acid markers and specific enzymes, but these allow sequencing of DNA molecules only.
Electronic identification of DNA sequences is a candidate for next-generation sequencing technology, as it may offer an enzyme-free technique without DNA amplification. This method may offer the possibility of reducing the processing time and errors associated with other techniques. Several groups have been exploring nanopore-based measurement of the conductance of DNA nucleotides, relying on either the change in ionic current along the pore or the decay in tunneling current as a base traverses the pore. In these experiments, DNA is made to travel through a very small hole, where its structure is probed. However, this method lacks single-molecule resolution and suffers from an insufficient change in conductance due to nucleotide modifications, thus limiting its potential use for diagnostics and epigenomic identification. Other studies have explored scanning tunneling microscopy for single-molecule detection and identification. Although imaging of single DNA molecules using scanning tunneling microscopy has been accomplished, none of these studies has offered a reliable method or device for accurate, reproducible, and efficient identification and discrimination of individual nucleotides, nucleosides, and nucleobases, or the ability to sequence nucleotides, nucleosides, and nucleobases in a molecule with multiple nucleotides, nucleosides, nucleobases, and combinations thereof.
RNA sequencing presents unique challenges. In recent years, massively parallel RNA sequencing has allowed high-throughput quantification of gene expression and identification of rare transcripts, including small RNA characterization and transcription start site identification, among others. However, most RNA sequencing methods rely on cDNA synthesis as well as a number of manipulations that introduce bias at multiple levels, including priming with random hexamers, ligation, amplification, and sequencing. Moreover, a number of common natural modifications (5-methylcytosine, pseudouridine) and chemical modifications (N7-methylguanine) do not stop reverse transcriptase during cDNA synthesis and therefore are not detected using high-throughput DNA sequencing methods. Commonly used reverse transcriptases are also known to introduce artifacts into the cDNA, e.g., a tendency to delete nucleotides in regions of RNA secondary structure. This leads to a “blurring” of the sequencing pattern in the resultant cDNA. Further, DNA methylation, which is not detected by present sequencing techniques, has been found to be a dominant marker for cancer cells and can be used to distinguish the somatic changes that occur between cancerous cells and non-cancerous cells.