The growing need to sequence unknown nucleic acids has spurred demand for rapid, inexpensive methods of sequencing large amounts of DNA. For example, the Human Genome Initiative will require the sequencing of about 3 billion base pairs of DNA. However, current sequencing methodologies, such as Sanger or Maxam-Gilbert sequencing, may not be capable of high enough throughput to allow a project of this magnitude to be completed in a reasonable time.
Many researchers have therefore turned their attention to sequencing methods that process sequences in parallel, rather than the serial sequencing methods described above. The promise of parallel "sequencing by hybridization" (SBH) methods is that large amounts of information can potentially be obtained rapidly, in a single experiment. SBH involves the use of multiple probes, disposed in an array format, which bind to a sample of a target nucleic acid that has been cleaved into smaller fragments. At present, however, SBH has been attempted only on small DNA targets and with small probe arrays.
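The principle of SBH described above can be illustrated with a minimal sketch. The sketch assumes an idealized array carrying every possible probe of length k and assumes that only perfectly complementary probes bind; the function names (`probe_array`, `hybridization_pattern`) are hypothetical and chosen only for illustration.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_array(k):
    """All 4**k possible probes of length k (an idealized, complete array)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def hybridization_pattern(target, k):
    """Probes that would bind the target, ASSUMING perfect matches only.

    A probe binds when its reverse complement occurs somewhere in the target,
    which stands in here for the cleaved target fragments.
    """
    target_kmers = {target[i:i + k] for i in range(len(target) - k + 1)}
    return {p for p in probe_array(k) if reverse_complement(p) in target_kmers}


pattern = hybridization_pattern("ACGTTG", 3)
```

In this idealized picture, the set of positive probes is the information from which the target sequence would be inferred. Real hybridization is not limited to perfect matches, which is the source of the difficulties discussed next.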
Certain problems have arisen in attempts to implement SBH schemes. One serious difficulty is the need to discriminate correctly between target fragments that are perfectly matched to a probe sequence and target fragments that are bound to a probe sequence despite one or more mismatched bases. This "mismatch discrimination" problem creates the possibility that sequences will be misidentified. The problem is especially acute when the sequences to be distinguished differ significantly in their intrinsic binding energies. For example, in general, AT-rich sequences bind less strongly to their complementary probes than do GC-rich sequences of the same length to their respective complementary probes. Thus, it can be difficult to distinguish between perfectly matched AT-rich sequences and partially mismatched GC-rich sequences. In view of these difficulties, hybridization of mismatched sequences is undesirable, as it makes the unambiguous determination of the target sequence harder to achieve.
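The AT-rich versus GC-rich difficulty can be made concrete with a rough numerical sketch. It uses the Wallace rule of thumb for short oligonucleotides (about 2 °C per A·T pair and 4 °C per G·C pair) as a crude stand-in for binding strength. As a simplifying assumption that is not taken from the text above, a mismatched position is scored as contributing nothing; real mismatches are typically more destabilizing than this. The function name `wallace_score` is hypothetical.

```python
def wallace_score(probe, perfect_probe):
    """Crude duplex stability score based on the Wallace rule of thumb.

    `perfect_probe` is the probe sequence that would match the target exactly.
    Each position where `probe` agrees with it adds 4 (G or C) or 2 (A or T).
    ASSUMPTION: a mismatched position adds 0; this is a simplification.
    """
    if len(probe) != len(perfect_probe):
        raise ValueError("sequences must have the same length")
    score = 0
    for base, ideal in zip(probe, perfect_probe):
        if base != ideal:
            continue  # mismatched position contributes nothing in this model
        score += 4 if base in "GC" else 2
    return score


# Perfectly matched, AT-rich 8-mer.
at_rich_perfect = wallace_score("AATTATAT", "AATTATAT")

# GC-rich 8-mer probe bound to a target with one mismatched base.
gc_rich_mismatch = wallace_score("GCGCCGGC", "GCGACGGC")
```

Under these assumptions the mismatched GC-rich duplex scores higher than the perfectly matched AT-rich duplex, so a single signal or stringency threshold applied across the whole array would not separate true matches from mismatches.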