Integrated genetic and physical genome maps are extremely valuable for map-based gene isolation, for comparative genome analysis, and as sources of sequence-ready clones for genome sequencing projects. The availability of an integrated map of the physical and genetic markers of a species is of great benefit to genome research on that species: integrated maps allow for precise and rapid gene mapping and for precise mapping of microsatellite loci and SNP markers. Various methods have been developed for assembling physical maps of genomes of varying complexity. One of the better characterized approaches uses restriction enzymes to generate large numbers of DNA fragments from genomic subclones (Brenner et al., Proc. Natl. Acad. Sci., (1989), 86, 8902-8906; Gregory et al., Genome Res. (1997), 7, 1162-1168; Marra et al., Genome Res. (1997), 7, 1072-1084). The resulting fragment patterns, or fingerprints, are compared to identify related clones and to assemble overlapping clones into contigs. The utility of fingerprinting for ordering large-insert clones of a complex genome is limited, however, by variation in DNA migration from gel to gel, the presence of repetitive DNA, an unusual distribution of restriction sites and skewed clone representation. Most high-quality physical maps of complex genomes have therefore been constructed using a combination of fingerprinting and PCR-based or hybridisation-based methods. A further disadvantage of fingerprinting technology is that it is based on fragment-pattern matching, which is an indirect method.
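By way of illustration only, the fragment-pattern matching underlying fingerprinting can be sketched as follows. The sketch assumes that each fingerprint is a list of fragment sizes in base pairs, that two fragments are treated as the same band when their sizes differ by no more than a tolerance (standing in for gel-to-gel migration variation), and that two clones are taken to overlap when they share a minimum number of bands. The function names, the tolerance of 5 bp and the threshold of 3 shared bands are hypothetical choices made for this example and are not taken from the cited references.

```python
from itertools import combinations


def shared_bands(fp_a, fp_b, tolerance=5):
    """Count bands of fp_a that pair with a band of fp_b within `tolerance` bp."""
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = shared = 0
    # Walk both sorted size lists in parallel, pairing each band at most once.
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def build_contigs(fingerprints, min_shared=3, tolerance=5):
    """Group clones into contigs by linking every pair that shares enough bands."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if shared_bands(fingerprints[a], fingerprints[b], tolerance) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical fragment sizes (bp) for four clones.
    example = {
        "cloneA": [500, 1200, 2300, 3100, 4500],
        "cloneB": [1203, 2298, 3100, 5200, 6100],
        "cloneC": [3102, 5198, 6100, 7400],
        "cloneD": [800, 1500, 9000],
    }
    print(build_contigs(example))
    # [['cloneA', 'cloneB', 'cloneC'], ['cloneD']]
```

In this example cloneA and cloneC share only one band directly, yet are placed in the same contig because each shares three bands with cloneB. The sketch also shows why the approach is indirect: overlap is inferred from similarity of fragment sizes alone, so a larger tolerance or repetitive DNA yielding similarly sized fragments could link clones that do not in fact overlap.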
It would be preferable to create physical maps by generating the contigs from actual sequence data, i.e. by a more direct method. A sequence-based physical map is not only more accurate, but also contributes to the determination of the whole genome sequence of the species of interest. Recently, methods for high-throughput sequencing have become available that would allow the complete nucleotide sequences of clones to be determined in a more efficient and cost-effective manner.
However, detection by sequencing of the entire restriction fragment is still relatively uneconomical. Furthermore, the current state-of-the-art sequencing technologies, such as those disclosed elsewhere herein (from 454 Life Sciences, www.454.com, Solexa, www.solexa.com, and Helicos, www.helicosbio.com), despite their overwhelming sequencing power, can only provide sequence fragments of limited length. Moreover, the current methods do not allow for the simultaneous processing of many samples in one run.
It is the goal of the present invention to devise and describe a strategy that allows for the high-throughput generation of a physical map based on a combination of restriction digestion, pooling, highly accurate amplification and high-throughput sequencing. Using this method, physical maps can be generated even for complex genomes.