Gene trapping provides a powerful approach for simultaneously mutating and identifying genes. Although vector insertion into the cellular genome can be a random process, gene trap vectors have been designed that select for events in which the vector has inserted into and mutated a gene. By exploiting the cellular splicing machinery, these vectors eliminate the large background of insertion events in which the vector has not integrated into a gene. Most mammalian genes are divided into exons and introns. Exons are the portions of the gene that are spliced into mRNA and encode the protein product of the gene. In genomic DNA, these coding exons are separated by noncoding intron sequences. Although RNA polymerase transcribes both intron and exon sequences, the intron sequences must be removed from the transcript so that the resulting mRNA can be translated into protein. Accordingly, all mammalian, and most eukaryotic, cells have the machinery to splice exons into mRNA. Gene trap vectors have been designed to integrate into introns or genes in a manner that allows the cellular splicing machinery to splice vector-encoded exons to cellular mRNAs. Commonly, gene trap vectors contain selectable marker sequences that are preceded by strong splice acceptor sequences and are not preceded by a promoter. Thus, when such vectors integrate into a gene, the cellular splicing machinery splices exons from the trapped gene onto the 5' end of the selectable marker sequence. Typically, such selectable marker genes can only be expressed if the vector encoding the gene has integrated into an intron. The resulting gene trap events are subsequently identified by selecting for cells that can survive selective culture.
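The selection logic described above can be illustrated with a toy model. The following Python sketch is purely illustrative: the names `Insertion`, `marker_expressed`, and `select` are hypothetical, and the rule that the host gene must be transcribed is a simplifying assumption drawn from the fact that the marker lacks its own promoter. It is not a description of any particular vector system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insertion:
    """Hypothetical record of one vector integration event (toy model)."""
    in_intron: bool        # vector integrated into an intron of a gene
    gene_expressed: bool   # host gene is transcribed in the selected cell type


def marker_expressed(ins: Insertion) -> bool:
    """Promoterless splice-acceptor trap (simplified assumption).

    The marker has no promoter of its own, so it is expressed only when
    upstream exons of a transcribed gene are spliced onto its 5' end.
    """
    return ins.in_intron and ins.gene_expressed


def select(insertions: list[Insertion]) -> list[Insertion]:
    """Keep only the events that would survive selective culture."""
    return [ins for ins in insertions if marker_expressed(ins)]


events = [
    Insertion(in_intron=True, gene_expressed=True),    # productive trap
    Insertion(in_intron=True, gene_expressed=False),   # silent gene, no marker
    Insertion(in_intron=False, gene_expressed=False),  # intergenic background
]
survivors = select(events)
print(len(survivors))
```

In this model only the first event survives selection, which mirrors how the vector design removes the background of insertions that did not land in a gene.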
Gene trapping has proven to be a very efficient method of mutating large numbers of genes. The insertion of the gene trap vector creates a mutation in the trapped gene and also provides a molecular tag that eases identification of the trapped gene. When ROSAβgeo was used to trap genes, it was demonstrated that at least 50% of the resulting mutations produced a phenotype when examined in mice. This indicates that gene trap insertion vectors are useful mutagens. Although gene trapping is a powerful tool for mutating genes, the potential of the method was limited by the difficulty of identifying the trapped genes. Methods that have been used to identify trap events rely on the fusion transcripts that result from the splicing of exon sequences from the trapped gene to sequences encoded by the gene trap vector. Common gene identification protocols used to obtain sequences from these fusion transcripts include 5' RACE, cDNA cloning, and cloning of genomic DNA surrounding the site of vector integration. However, these methods have proven labor intensive, not readily amenable to automation, and generally impractical for high-throughput use.
More recently, vectors have been developed that rely on a new gene trapping strategy in which the vector contains a selectable marker gene preceded by a promoter and followed by a splice donor sequence instead of a polyadenylation sequence. These vectors do not provide selection unless they integrate into a gene and subsequently trap downstream exons that provide a polyadenylation sequence. Integration of such vectors into the chromosome results in the splicing of the selectable marker gene to 3' exons of the trapped gene. These vectors provide a number of advantages. They can be used to trap genes regardless of whether the genes are normally expressed in the cell type in which the vector has integrated. In addition, cells harboring such vectors can be screened using automated (e.g., 96-well plate format) gene identification assays such as 3' RACE (see generally, Frohman, 1994, PCR Methods and Applications, 4:S40-S58). Using these vectors, it is possible to produce large numbers of mutations and rapidly identify the mutated, or trapped, gene. However, prior to the present invention, the broad exploitation of such vectors has been hampered by the limited number of target genes that can be efficiently trapped.
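The difference between this strategy and a promoterless design can be sketched as a toy decision rule. The Python example below is a minimal illustration under stated assumptions: the names `Integration` and `marker_selectable` are hypothetical, and the model reduces selection to a single condition, namely whether a downstream exon supplying a polyadenylation sequence is spliced onto the marker transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Integration:
    """Hypothetical record of one vector integration event (toy model)."""
    downstream_exon_with_polya: bool  # a 3' exon with a polyadenylation sequence can be spliced on
    gene_expressed: bool              # host gene is normally transcribed in this cell type


def marker_selectable(site: Integration) -> bool:
    """Promoter + marker + splice donor, no polyadenylation sequence (simplified).

    The vector's own promoter drives the marker, so host gene expression is
    not consulted; selection depends only on acquiring a polyadenylation
    sequence by splicing to a downstream exon of the trapped gene.
    """
    return site.downstream_exon_with_polya


silent_gene = Integration(downstream_exon_with_polya=True, gene_expressed=False)
intergenic = Integration(downstream_exon_with_polya=False, gene_expressed=False)

print(marker_selectable(silent_gene))
print(marker_selectable(intergenic))
```

The `gene_expressed` field is deliberately unused by `marker_selectable`, which reflects the stated advantage that genes can be trapped whether or not they are normally expressed in the host cell type.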