Recent sequencing techniques allow the simultaneous determination of large numbers of nucleotide sequences. This removes the need to perform separate sequencing reactions in different capillaries or separate reaction wells. Typically, a DNA sample is fragmented by mechanical or enzymatic techniques, after which individual DNA fragments are bound to a substrate (e.g. the wall of a reaction chamber or a microcarrier/bead) via one type of nucleotide linker molecule attached to the fragment, which also functions as a universal primer. For technologies other than single molecule sequencing, a PCR-based amplification step follows. For example, sequencing techniques such as “454” pyrosequencing (Roche) use individual microbeads as a substrate, which are arranged in microwells of a reaction chamber. Subsequently, in all techniques, nucleotides are incorporated stepwise and identified for each DNA molecule bound to the substrate. This process is repeated a number of times, and the sequencing reads of all the individual fragments are aligned to obtain the complete sequence of the target DNA sample under investigation. These techniques are known in the art as “next generation sequencing” and are commercialized by companies such as Helicos, Illumina, Applied Biosystems and Roche. Next generation sequencing methods require that the different reactions, which are performed at the same time, can be physically separated from each other.
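By way of illustration only, the alignment of individual sequencing reads to obtain the sequence of a target sample can be sketched in a few lines of Python. The function names (fragment, place_reads), the example sequence, the read length and the step size are hypothetical choices made for this sketch and are not taken from any of the cited technologies. Real aligners tolerate mismatches and use indexed references, whereas this sketch assumes error-free reads and exact matching.

```python
from typing import List


def fragment(sequence: str, read_length: int, step: int) -> List[str]:
    """Cut a sequence into overlapping reads of fixed length (toy fragmentation)."""
    return [
        sequence[start:start + read_length]
        for start in range(0, len(sequence) - read_length + 1, step)
    ]


def place_reads(reference: str, reads: List[str]) -> str:
    """Place each read at its first exact match on the reference.

    Positions not covered by any read are reported as 'N'.
    """
    covered = [None] * len(reference)
    for read in reads:
        position = reference.find(read)
        if position == -1:
            continue  # read cannot be placed; skipped in this sketch
        for offset, base in enumerate(read):
            covered[position + offset] = base
    return "".join(base if base is not None else "N" for base in covered)


# Hypothetical 21-nucleotide target, read as six overlapping 6-mers.
target = "GATTACAGGCTTACCGATGCA"
reads = fragment(target, read_length=6, step=3)
assembled = place_reads(target, reads)
print(reads)
print(assembled)
```

In this sketch each read overlaps its neighbour by three nucleotides, so every position of the target is covered at least once.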
Enrichment of a DNA sample can be performed prior to sequencing in order to reduce the complexity of the sample and select specific areas of the genome for sequencing. Methods are described in Hodges et al. (2007) Nature Genetics 39, 1522-1527 to select or enrich genomic DNA fragments for subsequent sequencing by the choice of hybridization probes. More versatile methods for DNA hybridization have been developed wherein multiplexing methods are performed partially or entirely in solution and wherein the individual reactions are indexed using microcarriers with different colors or using encoded markers (reviewed in Braeckmans et al. (2002) Nature Rev. Drug Discovery 1, 447-448). In this context, Braeckmans et al. (2003) Nature Mat. 2, 169-173, suggest the use of photobleached encoded particles for DNA hybridization assays.
A drawback of current sequencing methods is the comparatively short individual read length of DNA that can be sequenced, leading to sequences that have limited information content. This often makes it difficult to position a determined sequence in a reference genome sequence (annotation of sequences). This is especially difficult in the case of bisulfite sequencing, where prior to the sequencing reaction unmethylated C nucleotides within CpG sequences in a fragment are converted to T nucleotides, resulting in reduced information for alignment of sequencing reads with the reference genome and consequently increased difficulty in correct assembly of the final nucleic acid sequence. Especially for future clinical diagnostic applications such as cancer sequencing, microbiology and clinical genetics, it would be advantageous to reduce the time needed to determine the sequence of a patient genomic sample and to obtain the highest possible accuracy.
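The loss of alignment information caused by bisulfite conversion can be illustrated with a minimal Python sketch. The sequences and the function names (bisulfite_convert, find_all) are hypothetical and chosen only for this illustration. As simplifying assumptions, the sketch converts every unmethylated C on the forward strand to T, and compares the converted read against an equally converted reference using exact matching.

```python
from typing import Iterable, List


def bisulfite_convert(seq: str, methylated: Iterable[int] = ()) -> str:
    """Return seq with every unmethylated C replaced by T (forward strand only).

    `methylated` lists 0-based positions of C nucleotides that are protected
    from conversion.
    """
    protected = set(methylated)
    return "".join(
        "T" if base == "C" and index not in protected else base
        for index, base in enumerate(seq)
    )


def find_all(reference: str, read: str) -> List[int]:
    """Return every 0-based position at which read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


# Hypothetical reference with two regions that differ only at one C/T position.
reference = "ACGTTTGAGGATGTTTGA"
read = "ACGTTTGA"

# Before conversion the read has a single possible position.
hits_before = find_all(reference, read)

# After conversion of read and reference, the C that distinguished the two
# regions is gone and the read matches at two positions.
hits_after = find_all(bisulfite_convert(reference), bisulfite_convert(read))

print(hits_before, hits_after)
```

In this example the unconverted read can be positioned unambiguously, whereas the converted read matches two locations, which is the kind of ambiguity that becomes more likely as read length decreases.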