A key feature of the retroviral replication cycle is that the virus integrates into the host chromosome. Retroviral DNA integration was initially thought to occur in an essentially random manner, for the most part giving no preference to any particular nucleotide sequence as a target for proviral establishment. It has also been suggested that the observed randomness of integration is due to the nonspecific DNA binding affinity of the integrase protein (Sandmeyer et al., 1990, Annu Rev Genet, 24:491-518). However, it has recently been reported that retroviruses may exhibit a propensity for integrating into highly preferred target sites (Pryciak and Varmus, 1992, Cell, 69:769-80; Rohdewohld et al., 1987, Journal of Virology, 61:336-343; Shih et al., 1988, Cell, 53:531-537). This nonrandom integration may result from the restricted access of retroviral integrase protein to genomic DNA, or from an interaction with specific target sequences. In general, the observed integration bias has hindered efforts to randomly saturate the mammalian genome with proviral tags (Sandmeyer et al., 1990, Annu Rev Genet, 24:491-518).
Experimentally, Bushman et al. have used an artificial in vitro integration system to further bias the integration reaction, employing a retroviral integrase fused to the DNA binding domains of the bacteriophage Lambda repressor protein. These fusion proteins proved capable of directing retroviral integration into sequences adjacent to Lambda repressor DNA binding sites (Bushman, 1994, Proc Natl Acad Sci, USA, 91:9233-9237; Goulaouic and Chow, 1996, Journal of Virology, 70, No. 1:37-46). Other groups have expanded on this concept by establishing mutant viral lines containing fusions between the retroviral integrase and the well-characterized prokaryotic DNA binding protein LexA (Goulaouic and Chow, 1996, Journal of Virology, 70, No. 1:37-46; Katz et al., 1996, Virology 217:178-190). These preliminary in vitro studies, each using a single prokaryotic DNA binding activity, provide proof of concept that engineered integrase molecules can mediate nonrandom integration in an artificial biochemical assay. However, the useful application of chimeric integrase would ideally require the following scientific breakthroughs: 1) the production of a chimeric integrase that incorporates a DNA binding domain from a biologically relevant protein with known function in the target cell; 2) the demonstration that the chimeric integrase may be incorporated into an infectious viral particle; 3) the demonstration that the presence of the chimeric integrase does not interfere with reverse transcription; 4) a showing that the chimeric integrase retains the ability to process the inverted repeats at both ends of the retroviral DNA product of reverse transcription; and 5) the demonstration that the chimeric integrase can direct the nonrandom, or biased, integration of the retroviral genome to targeted regions of the cellular genome.
Additionally, the above studies require the development of specialized retroviral packaging cell lines, and preferably amphotropic packaging cell lines, that express and incorporate the chimeric integrase molecules into high titer stocks (>10^5 per ml) of infectious virus.
The use of modified retroviral vectors to both trap and mutate genes has allowed for the identification of novel genes as well as the analysis of corresponding mutant phenotypes (Chen et al., 1994, Genes & Development 8:2293-2301; Gasca et al., 1995, Developmental Genetics, 17:141-154; von Melchner, 1989, J Virol, 63:3227-3233). Recent advances in vector technology have resulted in the development of efficient gene-trap strategies that have enabled researchers to both discover and disrupt genes (von Melchner et al., 1992, Genes & Dev 6:919-927; Yoshida et al., 1995, Transgenic Research 4:277-287). Although such approaches have yielded a sizable amount of raw genetic information, the general absence of practical genetic systems in most higher eukaryotes has largely prevented researchers from organizing the raw data into regulatory hierarchies. Consequently, only a minor fraction of the mammalian gene products identified from DNA sequence data have been functionally defined in the context of the biochemical pathways or regulatory cascades in which they are involved.
By developing the technological breakthroughs necessary for the biologically relevant exploitation of chimeric integrase molecules, and further combining targeted integration with high-efficiency gene trap technology, the present invention defines a novel and improved method of gene discovery. This method allows for the rapid identification, cloning, sequencing, and disruption of genes in proximity to, encoding, or regulated by, DNA binding protein target sequences.