Since the cracking of the genetic code in the middle of the twentieth century, determining the base sequences of DNA and RNA has been a tool for elucidating the primary structure of peptides and proteins. Sequence information is also useful for determining gross polynucleotide structure and for studying the control of gene expression. The base sequences of non-coding polynucleotide regions are likewise useful for studying mutation events, phylogenetic linkages, polynucleotide structural characteristics, cell cycle control, cancer, and the mechanisms of transcription and translation.
Two sequencing methods are commonly used: the Maxam-Gilbert, or chemical degradation, method and the Sanger method, also called the dideoxy terminator or enzymatic method. Either method yields a family of DNA strands in which each strand species is one base longer than the next smaller species. By tagging the strands to indicate which nucleotide is additional to the next smaller strand species, the sequence of bases of the polynucleotide can be determined. Gel electrophoresis is commonly used to resolve the different lengths for analysis and determination of sequence.
During the sequencing reaction each polynucleotide strand can be tagged by labelling the primers, by labelling the terminal base itself, or by labelling a plurality of one base incorporated into each strand.
Two common labelling methods are the use of radioisotopes and fluorescent tags. Using a different fluorescent tag for each terminal base allows sequencing analysis to be accomplished in a single electrophoresis lane. Other methods require multiple parallel lanes for sequencing one polynucleotide fragment.
For example, the original Sanger method required four (4) parallel reaction vessels. Each vessel was identical except for the terminating dideoxy base included in the reaction mixture. Thus when products from the four vessels were electrophoresed in four parallel lanes, each lane revealed only the DNA strands terminating in the respective dideoxy base. By comparing the four lanes containing a DNA ladder of lengths of DNA differing by only one base and knowing the terminating base of each lane, the sequence could be determined.
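The four-lane comparison described above can be illustrated with a minimal Python sketch. It is not part of any cited method. The function name `read_four_lane_ladder`, the input layout (a mapping from terminating base to the fragment lengths observed in that lane), and the example band positions are all assumptions made for illustration. Lengths are taken to be counted in bases from the end of the primer, and the result is the sequence of the synthesized strand.

```python
def read_four_lane_ladder(lanes):
    """Reconstruct a base sequence from four terminator-specific lanes.

    `lanes` maps each terminating base to the fragment lengths (in bases,
    counted from the end of the primer) observed in that lane.
    This input layout is hypothetical and chosen only for illustration.
    """
    length_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in length_to_base:
                raise ValueError(f"length {length} was observed more than once")
            length_to_base[length] = base

    # The ladder must contain one band for every length, with no gaps.
    ordered = sorted(length_to_base)
    for shorter, longer in zip(ordered, ordered[1:]):
        if longer != shorter + 1:
            raise ValueError(f"no band found between lengths {shorter} and {longer}")

    # Reading the bands from shortest to longest gives the sequence.
    return "".join(length_to_base[length] for length in ordered)


# Made-up band positions for a seven-base read.
example_lanes = {"A": [1, 5], "C": [2, 6], "G": [3], "T": [4, 7]}
print(read_four_lane_ladder(example_lanes))  # ACGTACT
```

The sketch rejects a ladder with a missing or duplicated length, since in either case the four lanes would not determine a single sequence.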
Another method being developed uses non-radioactive isotopic labels and mass spectrometry to determine the polynucleotide sequence.
At present the most automated systems use either Sanger or Maxam-Gilbert sequencing chemistry, and tag the resultant DNA species with fluorescent probes. On-line detection is accomplished as each band is electrophoresed past a detection window. Commercial embodiments of this technology, however, are limited to thirty-six or fewer simultaneous sequence determinations per electrophoresis plate.
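Single-lane fluorescent detection of this kind can be sketched as follows, assuming one fluorescent tag per terminal base and one detector channel per tag. The channel-to-base assignment, the `call_bases` name, the peak record layout, and the intensity values are hypothetical and are not taken from any commercial instrument. Shorter strands pass the detection window first, so ordering the peaks by detection time orders them by length.

```python
# Hypothetical assignment of detector channels 0..3 to terminal bases.
CHANNEL_TO_BASE = ("A", "C", "G", "T")


def call_bases(peaks):
    """Call one base per detected band.

    `peaks` is a list of (detection_time, intensities) pairs, where
    `intensities` holds one reading per detector channel. The layout
    is an assumption made for this sketch.
    """
    sequence = []
    # Bands reach the detection window in order of increasing length.
    for _time, intensities in sorted(peaks, key=lambda peak: peak[0]):
        # The brightest channel identifies the tag on the terminal base.
        brightest = max(range(len(CHANNEL_TO_BASE)), key=lambda ch: intensities[ch])
        sequence.append(CHANNEL_TO_BASE[brightest])
    return "".join(sequence)


# Made-up readings, deliberately listed out of time order.
example_peaks = [
    (12.0, (0.1, 0.9, 0.0, 0.1)),
    (10.5, (0.8, 0.1, 0.1, 0.0)),
    (13.2, (0.0, 0.1, 0.1, 0.7)),
]
print(call_bases(example_peaks))  # ACT
```

A practical base caller would also have to handle overlapping emission spectra and unevenly spaced bands, which this sketch ignores.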
An automated electrophoresis apparatus is described in U.S. Pat. No. 5,279,721 to Schmid. Molecules are electrophoretically separated, based on molecular weight, by a horizontal electric field. An impermeable sheet is then removed allowing a vertical electrical field to effect transfer of the separated substances to a blot membrane.
While automating some aspects of electrophoresis and electroblotting, the apparatus described in U.S. Pat. No. 5,279,721 does not sequence a polynucleotide or provide means for the required multiple serial reactions. Rather, it addresses Southern blotting procedures wherein specific nucleotide sequences are detected by complementary binding with a probe nucleotide strand.
Another method, described in U.S. Pat. No. 5,302,509 to Cheeseman, uses a solid support to anchor a DNA template to the apparatus and determines each complementary base species as it is added during the synthesis process. This method does not use gel electrophoresis for separation.
Solid phase supports are also described in WO 93/20232. Here two or more regions of target DNA could be sequenced by annealing them to opposite selective sequencing primers. A modified Sanger reaction followed. In a preferred embodiment formamide was used to chemically melt the DNA from the Dynabead supports before electrophoresis into the separating gel. This method lends itself to PCR amplification of very small quantities of DNA prior to the sequencing reactions.
These and other sequencing schemes are advancing due to the impetus of the Human Genome Project. The goal of the project, to sequence the entire human genome (and selected genes of other species), has been likened to the 1960s-era space program to put a man on the moon by the end of the decade. Many researchers are therefore proposing methods to rapidly and inexpensively sequence massive lengths of genetic material. Reducing the costs and errors inherent in human manipulations is a common thread of these proposals.