Spectroscopy is an analytical technique concerned with the measurement of the interaction of radiant energy with matter and with the interpretation of that interaction both at the fundamental level and for practical analysis. Interpretation of the spectra produced by various spectroscopic instruments has been used to provide fundamental information on atomic and molecular energy levels, the distribution of species within those levels, the nature of processes involving change from one level to another, molecular geometries, chemical bonding, and interaction of molecules in solution. Comparisons of spectra have provided a basis for the determination of qualitative chemical composition and chemical structure, and for quantitative chemical analysis.
One particular spectroscopic technique, known as Raman spectroscopy, utilizes the Raman effect, which is a phenomenon observed in the scattering of light as it passes through a material medium, whereby the light suffers a change in frequency and a random alteration in phase. Raman spectroscopy is a spectrochemical technique that is complementary to fluorescence, and has been an important analytical tool due to its excellent specificity for chemical group identification. One of the major limitations of Raman spectroscopy is its low sensitivity. Recently, the Raman technique has been rejuvenated following the discovery of enormous Raman enhancement of up to 10^6 for molecules adsorbed on microstructures of metal surfaces.
Deoxyribonucleic acid (DNA) is the main carrier of genetic information in most living organisms. DNA is essentially a complex molecule built up of deoxyribonucleotide repeating units. Each unit comprises a sugar, phosphate, and a purine or pyrimidine base. The deoxyribonucleotide units are linked together by the phosphate groups, joining the 3' position of one sugar to the 5' position of the next. The alternate sugar and phosphate residues form the backbone of the molecule, and the purine and pyrimidine bases are attached to the backbone via the 1' position of the deoxyribose. This sugar-phosphate backbone is the same in all DNA molecules. What gives each DNA its individuality is the sequence of the purine and pyrimidine bases.
The resolution of the genetic material of an organism into a linear sequence of the material's elements is known as "genetic mapping". In decreasing order of size these elements include the chromosomes, the genes, the codons, and the nucleotides of the DNA. Complete mapping of the genetic material of an organism implies the complete description of its DNA sequences. This has been achieved in only a few cases, for small viruses whose sequences are only a few thousand nucleotides long. Higher plants or animals have vastly longer sequences.
Current DNA sequencing techniques are based on two methods developed in the late 1970's. Most previous research efforts have been devoted to automation of existing sequencing methods and the use of fluorescent labels. Two of the most prominent sequencing techniques are known respectively as the Sanger method and the Maxam-Gilbert method. In both, radiolabeled DNA fragments are generated chemically or enzymatically. The DNA fragments are then separated on a molecular weight basis using polyacrylamide gel electrophoresis. The gels are dried and autoradiographed to image the DNA band pattern. The band pattern is interpreted to yield sequence information, and the sequence data are entered into a computer for further analysis.
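The band-pattern interpretation step described above can be sketched in a few lines. This is a minimal, idealized illustration and not the procedure of either published method: it assumes four lanes (one per base), that every fragment length appears in exactly one lane, and that no bands are missing or merged. The function name `read_ladder` and the sample lane data are hypothetical.

```python
def read_ladder(lanes):
    """Read a base sequence from an idealized four-lane band pattern.

    lanes maps a base letter to the fragment lengths observed in the lane
    for that base. Shorter fragments migrate farther in the gel, so sorting
    all bands by length gives the order of bases along the strand.
    """
    bands = []
    for base, lengths in lanes.items():
        for length in lengths:
            bands.append((length, base))
    bands.sort()  # shortest fragment first
    return "".join(base for _, base in bands)


# Hypothetical band pattern, for illustration only.
example_lanes = {"A": [1, 5], "C": [2], "G": [3, 6], "T": [4]}
print(read_ladder(example_lanes))  # ACGTAG
```

Real gels need additional handling for compressed, missing, or ambiguous bands, which this sketch deliberately omits.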
Although these sequencing methods are very useful, they suffer from several notable limitations. For example, the radioactive labels used for detection present a potential health hazard. These radioisotopes are also unstable and expensive for large-scale applications. Moreover, they require highly trained personnel and their disposal often creates serious environmental and safety problems.
L. M. Smith et al., in Nature, Vol. 321, p. 674 (Jun. 12, 1986), describe a fluorescence method for partial automation of DNA sequence analysis. The DNA fragments obtained by the Sanger method are detected by measuring the emission of a fluorescent label covalently attached to the oligonucleotide primer used in enzymatic DNA sequence analysis. Four different fluorescent labels are used, one for each of the reactions specific for the bases adenine (A), cytosine (C), guanine (G), and thymine (T). The reaction mixtures are combined and co-electrophoresed down a single polyacrylamide gel tube. The basic structure of the Smith et al. sequencer is illustrated in FIG. 1. The sequencer includes upper and lower buffer reservoirs 10 and 12 between which extends a polyacrylamide gel column 14. The fluorescent bands of DNA are detected near the bottom of the tube 14 with a detector 16 and the sequence information is acquired directly by a computer 18. An idealized output of the sequencer is illustrated in FIG. 2.
Another system for DNA sequencing, in which four chemically related yet distinguishable fluorescence-tagged dideoxynucleotides are used to label DNA by a modified Sanger protocol, has recently been developed by Prober et al., as described in Science, 238, 336 (1987). It was suggested that the fluorescent sequencing fragments are resolved temporally rather than spatially in a single band by conventional polyacrylamide electrophoresis. The dyes used are a family of 9-(carboxymethyl)-3-hydroxy-6-oxo-6H-xanthines or succinylfluoresceins. These fluorophores have largely overlapping yet distinct emission bands. The automated sequencer is capable of determining 50 bases per hour per lane. A fully loaded gel thus yields a throughput of about 600 bases per hour.
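The throughput figures quoted above can be checked with simple arithmetic. The sketch below uses only the two stated rates; the lane count is inferred from them rather than stated in the text, and the 5,000-base target is a hypothetical example length. It ignores sample preparation, gel loading, and any redundant reads.

```python
BASES_PER_HOUR_PER_LANE = 50   # stated per-lane rate
GEL_BASES_PER_HOUR = 600       # stated rate for a fully loaded gel

# Lane count implied by the two stated rates (an inference, not a quoted figure).
implied_lanes = GEL_BASES_PER_HOUR // BASES_PER_HOUR_PER_LANE


def hours_to_read(n_bases, rate=GEL_BASES_PER_HOUR):
    """Hours of gel time needed to read n_bases once at the given rate."""
    return n_bases / rate


print(implied_lanes)                   # 12
print(round(hours_to_read(5000), 2))   # 8.33
```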
There has been little progress made in the development of advanced detection technologies that will provide new or improved systems for DNA sequence detection and analysis. Current instruments use straightforward fluorescence detection with optical filters. These detection techniques are based solely on a single spectroscopic method, which may be prone to misreading errors. For example, the fluorescence spectra of the three chemical dyes or "labels" NBD (4-chloro-7-nitrobenzo-2-oxa-1,3-diazole), Texas Red dye, and Fluorescein, which are commonly used for DNA sequencing, are broad, structureless and overlap one another severely.
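To illustrate why broad, overlapping bands invite misreading, the following sketch models two emission bands as unit-area Gaussians of equal width and computes the fraction of area they share. This is a simplified model under stated assumptions: real dye spectra are not Gaussian, and the peak separation and widths used here are hypothetical values chosen only for illustration, not measured data for any of the dyes named above.

```python
import math


def overlap_fraction(separation_nm, sigma_nm):
    """Shared area of two unit-area Gaussian bands of equal width.

    For two Gaussians with standard deviation sigma whose peaks are d apart,
    the area under the lower of the two curves is erfc(d / (2 * sqrt(2) * sigma)).
    """
    if sigma_nm <= 0:
        raise ValueError("sigma_nm must be positive")
    return math.erfc(abs(separation_nm) / (2.0 * math.sqrt(2.0) * sigma_nm))


# Hypothetical numbers: same peak separation, different band widths.
broad = overlap_fraction(separation_nm=30.0, sigma_nm=20.0)
narrow = overlap_fraction(separation_nm=30.0, sigma_nm=2.0)
print(f"broad bands share {broad:.1%} of their area")
print(f"narrow bands share {narrow:.1%} of their area")
```

At a fixed peak separation, the shared area falls off steeply as the bands narrow, which is why labels with sharper spectral features are easier to tell apart.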
Thus, there is a need for new or improved methodologies and instrumentation that will improve the accuracy, speed and efficiency of DNA sequencing.