Integrated genetic and physical genome maps are extremely valuable for map-based gene isolation and comparative genome analysis, and as sources of sequence-ready clones for genome sequencing projects. The availability of an integrated map of physical and genetic markers of a species has an enormous effect on genome research. Integrated maps allow for precise and rapid gene mapping and for precise mapping of microsatellite loci and SNP markers. Various methods have been developed for assembling physical maps of genomes of varying complexity. One of the better characterized approaches uses restriction enzymes to generate large numbers of DNA fragments from genomic subclones, the fragment pattern of each clone serving as its fingerprint (Brenner et al., Proc. Natl. Acad. Sci., (1989), 86, 8902-8906; Gregory et al., Genome Res. (1997), 7, 1162-1168; Marra et al., Genome Res. (1997), 7, 1072-1084). These fingerprints are compared to identify related clones and to assemble overlapping clones into contigs. The utility of fingerprinting for ordering large-insert clones of a complex genome is limited, however, by variation in DNA migration from gel to gel, the presence of repetitive DNAs, unusual distribution of restriction sites and skewed clone representation. Most high-quality physical maps of complex genomes have therefore been constructed using a combination of fingerprinting and PCR-based or hybridisation-based methods. A further disadvantage of fingerprinting technology is that it is based on fragment-pattern matching, which is an indirect method.
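By way of illustration only, the comparison of fingerprints and the grouping of overlapping clones into contigs described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the scoring scheme of any of the cited publications: fingerprints are assumed to be lists of fragment sizes, two fragments are assumed to match when their sizes differ by no more than a fixed tolerance, and clones are grouped when the fraction of shared fragments reaches a threshold. The function names, the clone names, the tolerance and the threshold are all hypothetical.

```python
from itertools import combinations


def shared_bands(fp_a, fp_b, tolerance=5):
    """Count fragments of fp_a that pair with a distinct fragment of fp_b
    whose size differs by at most `tolerance` (an assumed, illustrative value)."""
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


def overlap_score(fp_a, fp_b, tolerance=5):
    """Fraction of the smaller fingerprint that is shared with the other one."""
    if not fp_a or not fp_b:
        return 0.0
    return shared_bands(fp_a, fp_b, tolerance) / min(len(fp_a), len(fp_b))


def group_into_contigs(fingerprints, min_score=0.5, tolerance=5):
    """Group clones into contigs by single linkage (union-find):
    any pair scoring at least `min_score` ends up in the same group."""
    parent = {name: name for name in fingerprints}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if overlap_score(fingerprints[a], fingerprints[b], tolerance) >= min_score:
            parent[find(a)] = find(b)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical fragment sizes (in base pairs) for four clones.
clones = {
    "cloneA": [500, 1200, 2300, 4100],
    "cloneB": [1203, 2298, 4102, 6000],
    "cloneC": [4099, 6003, 7500, 9000],
    "cloneD": [300, 850, 1600, 3100],
}
contigs = group_into_contigs(clones)
print(contigs)  # [['cloneA', 'cloneB', 'cloneC'], ['cloneD']]
```

In this sketch the outcome depends directly on the chosen tolerance and threshold. Gel-to-gel variation in DNA migration makes a suitable tolerance difficult to fix, and repetitive DNA can produce matching fragments in clones that do not overlap, which illustrates why pattern matching of this kind is an indirect method.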
It would be preferable to create physical maps by generating the contigs based on actual sequence data, i.e. by a more direct method. A sequence-based physical map is not only more accurate but also contributes to the determination of the whole genome sequence of the species of interest. Recently, methods for high throughput sequencing have become available that would allow the complete nucleotide sequences of clones to be determined in a more efficient and cost-effective manner.
However, detection by sequencing of the entire restriction fragment is still relatively uneconomical. Furthermore, the current state-of-the-art sequencing technologies, such as those disclosed elsewhere herein (from 454 Life Sciences, www.454.com, Solexa, www.solexa.com, and Helicos, www.helicosbio.com), can, despite their overwhelming sequencing power, only provide sequence fragments of limited length. Also, the current methods do not allow for the simultaneous processing of many samples in one run.
Further, populations carrying mutations, either induced or naturally occurring, are used in modern genomics research to identify genes affecting traits of importance by reverse genetics approaches. This applies in particular to plants and crops of agronomic importance, but such populations are also useful for other organisms such as yeast, bacteria, etc. Other organisms, such as animals, birds, mammals, etc., can also be used, but these populations are typically more cumbersome to obtain or to control. Nevertheless, it is observed that the invention described herein is of a very general nature and can also be applied to such organisms. Mutagenized populations represent complementary tools for gene discovery, as such populations are commonly used to screen known genes for loss-of-function mutations or to assess phenotype changes in organisms carrying the mutated gene. The rate-limiting step is the screening work associated with the identification of organisms carrying a mutation in the gene of interest.