The slow rate at which the sequence of the four nucleotides in DNA samples can be determined is a major technical obstacle to further advancement of molecular biology, medicine, and biotechnology. Nucleic acid sequencing methods which involve separation of DNA molecules in a gel have been in use since 1978. The other proven method for sequencing nucleic acids is sequencing by hybridization (SBH).
The traditional method of determining a sequence of nucleotides (i.e., the order of the A, G, C and T nucleotides in a sample) is performed by preparing a mixture of randomly terminated, differentially labeled DNA fragments by degradation at specific nucleotides, or by dideoxy chain termination of replicating strands. The resulting DNA fragments, in the range of 1 to 500 bp, are then separated on a gel to produce a ladder of bands wherein adjacent bands differ in length by one nucleotide.
The array-based approach of SBH does not require single-base resolution in separation, degradation, synthesis or imaging of a DNA molecule. Using mismatch-discriminative hybridization of short oligonucleotides K bases in length, lists of constituent K-mer oligonucleotides may be determined for target DNA. The sequence of the target DNA may then be assembled by uniquely overlapping these oligonucleotides.
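The assembly step can be illustrated with a minimal sketch. It assumes idealized, error-free hybridization (every K-mer present in the target is detected and no others), a known starting K-mer, and K of at least 2. The function names `kmer_spectrum` and `assemble` are hypothetical and chosen only for this illustration.

```python
def kmer_spectrum(target, k):
    """Return the set of K-mers contained in `target`.

    Stands in for an idealized hybridization experiment in which only
    perfectly matched K-mer probes give a positive signal (assumption).
    """
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def assemble(spectrum, start):
    """Extend `start` one base at a time using unique (K-1)-base overlaps.

    Stops when no K-mer overlaps the current end, or when more than one
    does (an ambiguous branch that this simple sketch does not resolve).
    """
    k = len(start)
    remaining = set(spectrum) - {start}
    sequence = start
    while True:
        suffix = sequence[-(k - 1):]
        candidates = [p for p in remaining if p.startswith(suffix)]
        if len(candidates) != 1:
            break
        next_kmer = candidates[0]
        sequence += next_kmer[-1]  # the overlap contributes one new base
        remaining.remove(next_kmer)
    return sequence


spectrum = kmer_spectrum("ATGCCGTAAC", 4)
reconstructed = assemble(spectrum, "ATGC")
```

In this toy case every overlap of K-1 bases is unique, so the target is recovered in full. Repeated (K-1)-mers in a real target would create branch points at which the sketch simply stops.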
There are several approaches available to achieve sequencing by hybridization. In a process called SBH Format 1, DNA samples are arrayed, and labeled probes are hybridized with the samples. Replica membranes with the same sets of sample DNAs may be used for parallel scoring of several probes and/or probes may be multiplexed. DNA samples may be arrayed and hybridized on nylon membranes. Each membrane array may be reused many times. Format 1 is especially efficient for batch processing large numbers of samples.
In SBH Format 2, probes are arrayed at locations on a substrate which correspond to their respective sequences, and a labeled DNA sample fragment is hybridized to the arrayed probes. In this case, sequence information about a fragment may be determined in a simultaneous hybridization reaction with all of the arrayed probes. The same oligonucleotide array may be reused for sequencing other DNA fragments. The arrays may be produced by spotting or by in situ synthesis of probes.
In Format 3 SBH, two sets of probes are used. One set may be in the form of arrays of probes with known positions, and another, labeled set may be stored in multiwell plates. In this case, target DNA need not be labeled. Target DNA and one or more labeled probes are added to the arrayed sets of probes. If one attached probe and one labeled probe both hybridize contiguously on the target DNA, they are covalently ligated, producing a detected sequence whose length equals the sum of the lengths of the ligated probes. The process allows for sequencing long DNA fragments, e.g. a complete bacterial genome, without subcloning the DNA into smaller pieces.
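The Format 3 detection logic can be sketched as follows. This is a simplified model, not a description of the actual chemistry: probes and target are written on the same strand (complementarity is ignored), the attached probe is assumed to lie immediately upstream of the labeled probe, and hybridization and ligation are assumed to be error-free. The function name `format3_signals` is hypothetical.

```python
from itertools import product


def format3_signals(target, attached_probes, labeled_probes):
    """Return ligation events as {(attached, labeled): ligated_sequence}.

    A pair is reported only if the two probes match the target at
    directly adjacent positions, so that they could be ligated
    (idealized assumption: attached probe first, labeled probe second).
    """
    signals = {}
    for attached, labeled in product(attached_probes, labeled_probes):
        ligated = attached + labeled
        if ligated in target:
            signals[(attached, labeled)] = ligated
    return signals


signals = format3_signals(
    "ATGCCGTAACGG",
    attached_probes=["ATGCC", "TTTTT"],
    labeled_probes=["GTAAC", "CGTAA"],
)
```

Here only the pair `("ATGCC", "GTAAC")` yields a signal, and the detected sequence is 10 bases long, the sum of the two 5-base probes. The labeled probe `CGTAA` also matches the target, but not at a position adjacent to an attached probe, so no ligation product is reported.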
In the present invention, SBH is applied to the efficient identification and sequencing of one or more DNA samples. The procedure has many applications in DNA diagnostics, forensics, and gene mapping. It may also be used to identify mutations responsible for genetic disorders and other traits, to assess biodiversity, and to produce many other types of data dependent on DNA sequence.