High throughput DNA sequencing is essential to a broad array of genomic studies, such as whole genome and metagenome sequencing, expression profiling of mRNAs and miRNAs, discovery of alternatively spliced and polyadenylated transcripts, analysis of histone and chromatin changes involved in epigenetic events, and identification of binding sites for transcription factors and RNA binding proteins. Sequencing of individual human genomes is especially appealing, with its potentially unlimited but as yet unachieved promise for personalized medicine.
Given the ever-growing importance of high throughput DNA sequencing for biological and anthropological research, agriculture and medicine, there is a need for sequencing technologies that are low-cost and rapid on the one hand, and have high sensitivity and accuracy on the other. Sequencing by Synthesis (SBS) has driven much of the “next generation” sequencing technology, allowing the field to approach the $100,000 Genome [Fuller et al. 2009, Hawkins et al. 2010, Morozova et al. 2009, and Park 2009]. With further improvements in nucleotide incorporation detection methods, SBS could be an engine that drives third-generation platforms leading to the reality of the “$1,000 Genome”. At the same time, non-fluorescent detection approaches are likely to decrease the cost of obtaining data by avoiding expensive cameras and imaging tools, and SBS also offers the possibility of high sensitivity, which would both allow longer reads and permit single molecule sequencing, thereby removing one of the most time-consuming and biased steps: the generation and amplification of DNA templates.
Current commercial next-generation sequencing platforms have certainly made substantial inroads in this direction, with the current cost of sequencing a human genome at high draft coverage significantly below $10,000 [Fuller et al. 2009, Hawkins et al. 2010, Morozova et al. 2009, and Metzker 2010]. Expression studies (e.g. using RNA-Seq) and epigenetic studies (e.g. using Methyl-Seq, ChIP-Seq), among many others, have also benefited greatly from these platforms [Ozsolak et al. 2011, Varley et al. 2010, and Park 2009]. Nonetheless, these costs are still prohibitive for most laboratories and for clinical applications.
All of the current approaches have one or more additional limitations: biased coverage of GC-rich or AT-rich portions of genomes; inability to accurately sequence through homopolymer stretches; inability to directly sequence RNA; high reagent costs; difficulty in sequencing beyond 200 or so nucleotides, which in turn makes de novo assembly of previously unsequenced genomes difficult; and insufficient throughput due to a ceiling on the number of possible reads per run.
To overcome these obstacles, a number of third-generation sequencing platforms have appeared on the market, or are in development. All of these have issues with accuracy and most have limited throughput. For example, attempts to sequence DNA using Raman detection have been reported [Kneipp et al. 1998] but thus far have been unsuccessful.
In addition to high throughput DNA sequencing, detection of protein-protein interactions is essential for the study of cell biology. Examples of protein-protein interactions include the generation of protein assemblies for enzymatic reactions in metabolic pathways (e.g., fatty acid synthesis), the assembly of ribosomes (protein synthesis), the association of ubiquitin with proteins destined to be degraded, the formation of multi-subunit membrane channels and pumps for transport of ions, the cooperation of transcription factors in enhancing or inhibiting transcription of genes, the formation of cellular junctions and cell-cell interactions, and countless others. Mutations in these proteins that affect their assembly or interactions play a crucial role in a number of diseases, and are particularly relevant to the development of tumors.
Numerous assays have been developed for detection of specific protein-protein interactions. Biochemical approaches include gel shift assays, cross-linking assays, immunoprecipitation, immunoblotting, etc. The yeast two-hybrid and three-hybrid systems are genetic approaches that have been developed to identify target proteins that can bind to a bait protein molecule. Several of these methods characterize the partners and the complexes by gel electrophoresis, with at least one of the partners radiolabeled. Other assays, including surface binding assays such as protein arrays, may use fluorescent tags. Finally, it is possible to reveal the interacting proteins by mass spectrometry.
Recently, the use of Raman spectroscopy for molecular detection has been considered. When Raman spectroscopy is combined with modified surfaces in which colloidal gold or other metals coat structures such as pillars or antennae, taking advantage of surface plasmonics, one can obtain extraordinarily enhanced Raman signals (by as much as 15 orders of magnitude). However, despite its impressive potential for signal enhancement, a major problem with surface-enhanced Raman spectroscopy (SERS) is consistency, which is related in part to the disposition of the Raman scattering group relative to the gold particles or nanocrystals. Thus it is clear that transformative methods are needed.