International efforts to generate knockouts of all mouse genes (Austin, C. P. et al. 2004; Auwerx, J. et al. 2004), such as the NIH Knockout Mouse Project (KOMP), have been initiated; however, these efforts will concentrate on coding regions, which represent about 2.5% of the genome. The remaining 97.5%, the non-coding region, is often referred to as “junk DNA”. Based upon comparisons between the newly sequenced mammalian genomes, as well as partial sequencing of other vertebrate genomes, more than 300,000 conserved non-coding elements (CNEs) (also referred to as conserved non-genic sequences, CNGs) have been identified within this presumed “junk DNA”. Many of these CNEs show greater sequence conservation among disparate vertebrate species than does the average protein-coding sequence (Dermitzakis, E. T., et al. 2005; Bejerano, G. et al. 2004; Boffelli, D., et al. 2004; Sandelin, A. et al. 2004; Margulies, E. H. et al. 2005; Vavouri, T., et al. 2006; Bejerano, G., et al. 2004). Increasing evidence suggests that such non-coding regions play important regulatory roles, particularly for genes controlling development. In many cases, mutations in these regions cause significant disease phenotypes. However, it is currently unrealistic to assess their functions directly by generating a specific deletion of each of these CNEs in the mouse. A more practical approach to dissecting the functional roles of such non-coding regions is to systematically generate relatively large deletions, of up to several hundred kilobase pairs (kb), that each encompass multiple CNEs.
In Drosophila, the transposon-based gene-trap has been used to generate a large collection of FRT-bearing alleles, allowing investigators to use FLP/FRT site-specific recombination to mediate trans recombination in vivo between homologous chromosomes in order to generate large deletions and duplications covering the entire genome (Ryder, E. et al. 2004; Golic, K. G. & Golic, M. M. 1996). In the mouse, an in vitro Cre/loxP-based method in embryonic stem (ES) cells has been used to generate megabase-sized deletions and duplications (Zheng, B., et al. 2000; Mills, A. A. & Bradley, A. 2001). However, this in vitro protocol is very labor-intensive and requires multiple rounds of ES cell genomic manipulation. An in vivo Cre/loxP method, named TAMERE, that uses the Sycp1-Cre driver and takes advantage of homologous chromosome pairing during meiosis, has been used to generate trans-allelic recombination in mice (Herault, Y., et al. 1998). Although this method was successful in generating deletions and duplications for the closely linked Hoxd genes (Kmita, M., et al. 2002), it has been limited to generating only relatively small deletions of up to about 15 kb (Genoud, N. et al. 2004). Since deletions of this size can be readily achieved by conventional gene targeting/knockout technology, TAMERE offers no additional advantage. Another in vivo Cre/loxP method, named STRING (Spitz, F., 2005), that also uses the Sycp1-Cre driver and requires very tedious and lengthy breeding, has been able to generate very large deletions of more than several megabase pairs. The main problem with TAMERE and STRING is that the deletions they generate are either too small or too large; the most useful deletions, of from 20 kb to 2 Mb, would be very difficult, if not impossible, to create by TAMERE or STRING. Needed is a simple and efficient strategy for the generation of large deletions and duplications at the most useful resolutions of 20 kb to 2 Mb.
Needed also is a simple and efficient strategy for in vivo generation of translocations.
The development of phage-based homologous recombination systems has greatly simplified the generation of transgenic and knockout constructs, making it possible to engineer large segments of genomic DNA, such as those carried on bacterial artificial chromosomes (BACs) or P1 artificial chromosomes (PACs), that replicate at low copy number in Escherichia coli. The use of phage recombination to carry out genetic engineering has been called recombinogenic engineering, or recombineering. Needed are improved compositions and methods of recombineering to facilitate large-scale construction of target vectors.