The present invention relates to recombinant vectors incorporating structural elements that, after the vectors have integrated into the host cell genome, increase the number of cellular genes that can be identified as well as effectively mutated. The described vectors are important tools for gene discovery, gene cloning, gene mutation, gene regulation, shuttling nucleic acid sequences throughout the genome, and gene activation and overexpression.
Gene trapping provides a powerful approach for simultaneously mutating and identifying genes. Gene trap vectors can be nonspecifically inserted into the target cell genome, and such vectors have consequently been constructed to select for events in which the vector has inserted into and mutated a gene. By exploiting the cellular splicing machinery, these selectable vectors remove the large background of insertion events in which the vector has not integrated into a gene.
Most mammalian genes are divided into exons and introns. Exons are the portions of the gene that are spliced into mRNA and encode the protein product of a gene. In genomic DNA, these coding exons are divided by noncoding intron sequences. Although RNA polymerase transcribes both intron and exon sequences, the intron sequences must be removed from the transcript so that the resulting mRNA can be translated into protein. Accordingly, all mammalian, and most eukaryotic, cells have the machinery to splice exons into mRNA. Gene trap vectors have been designed to integrate into introns or genes in a manner that allows the cellular splicing machinery to splice vector encoded exons to cellular mRNAs. Often, such gene trap vectors contain selectable marker sequences that are preceded by strong splice acceptor sequences and are not preceded by a promoter. Accordingly, when such vectors integrate into a gene, the cellular splicing machinery splices exons from the trapped gene onto the 5′ end of the selectable marker sequence. Typically, such selectable marker genes can only be expressed if the vector encoding the gene has integrated into an intron. The resulting gene trap events are subsequently identified by selecting for cells that can survive selective culture.
Gene trapping has proven to be a very efficient method of mutating large numbers of genes. The insertion of the gene trap vector creates a mutation in the trapped gene, and also provides a molecular tag that can be exploited to identify the trapped gene. When ROSAβgeo was used to trap genes, it was demonstrated that at least 50% of the resulting mutations produced a phenotype when examined in mice. This indicates that the gene trap insertion vectors are useful mutagens. Although a powerful tool for mutating genes, the potential of the method had been limited by the difficulty in identifying the trapped genes. Methods that have been used to identify trap events rely on the fusion transcripts resulting from the splicing of exon sequences from the trapped gene to sequences encoded by the gene trap vector. Common gene identification protocols used to obtain sequences from these fusion transcripts include 5′ RACE, cDNA cloning, and cloning of genomic DNA surrounding the site of vector integration. However, these methods have proven labor intensive, not readily amenable to automation, and generally impractical for high-throughput use.
Recently, vectors have been developed that rely on a new strategy of gene trapping that uses a vector that contains a selectable marker gene preceded by a promoter and followed by a splice donor sequence instead of a polyadenylation sequence. These vectors do not provide selection unless they integrate into a gene and subsequently trap downstream exons that provide the polyadenylation sequence required for expression of the selectable marker. Integration of such vectors into the chromosome results in the splicing of the selectable marker gene to 3′ exons of the trapped gene. These vectors provide a number of advantages. They can be used to trap genes regardless of whether the genes are normally expressed in the cell type in which the vector has integrated. In addition, cells harboring such vectors can be screened using automated (e.g., 96-well plate format) gene identification assays such as 3′ RACE (see generally, Frohman, 1994, PCR Methods and Applications, 4:S40-S58). Using these vectors it is possible to produce large numbers of mutations and rapidly identify the mutated, or trapped, gene. However, prior to the present invention, the commercial scale exploitation of such vectors has been limited by the number of target genes that can be efficiently trapped using such vectors.
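The difference between the two selection strategies described above can be illustrated with a toy model. The following Python sketch is a deliberate simplification and not part of the described vectors: the `Integration` record, its three fields, and both function names are hypothetical, and real selection depends on further factors (reading frame, splice site strength, transcript stability) that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Integration:
    """Hypothetical description of a single vector insertion site."""
    in_intron: bool          # vector landed inside an intron of a gene
    same_orientation: bool   # vector reads in the same direction as the gene
    gene_expressed: bool     # the gene is transcribed in this cell type


def five_prime_trap_selectable(site: Integration) -> bool:
    # Promoterless marker preceded by a splice acceptor: the marker relies
    # on the trapped gene's own promoter and upstream exons, so the gene
    # must be expressed in the target cell.
    return site.in_intron and site.same_orientation and site.gene_expressed


def three_prime_trap_selectable(site: Integration) -> bool:
    # Marker with its own promoter and a splice donor but no
    # polyadenylation sequence: it only needs a downstream cellular exon
    # to supply polyadenylation, whether or not the gene is expressed.
    return site.in_intron and site.same_orientation


if __name__ == "__main__":
    silent_gene = Integration(in_intron=True, same_orientation=True,
                              gene_expressed=False)
    print(five_prime_trap_selectable(silent_gene))
    print(three_prime_trap_selectable(silent_gene))
```

Under these assumptions, an insertion into a gene that is silent in the target cell is recovered only by the 3′ design, which is the advantage the passage attributes to these vectors.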
The relative inefficiency of first generation 3′ gene trap vectors has limited the total number of genes that can be rapidly and practically trapped, identified, analyzed, and effectively mutated. This inefficiency prompted the development of more efficient methods of 3′ gene trapping: methods that allow a greater percentage of genes in the target cell genome to be trapped and rapidly identified by, for example, DNA sequence analysis.
The present invention relates to the construction of novel vectors comprising a 3′ gene trap cassette that allows for high efficiency 3′ gene trapping. The presently described 3′ gene trap cassette comprises, in operable combination, a promoter region, an exon (typically characterized by a translation initiation codon and open reading frame and/or internal ribosome entry site), a splice donor sequence, and, optionally, intronic sequences. The splice donor (SD) sequence is operatively positioned such that the exon of the 3′ gene trap cassette is spliced to the splice acceptor (SA) site of a downstream exon or a cellularly encoded exon. As such, the described 3′ gene trap cassette (or gene trap vector incorporating the same) shall not incorporate a splice acceptor (SA) sequence and a polyadenylation site operatively positioned downstream from the SD sequence of the gene trap cassette. In a preferred embodiment, the exon component of the 3′ gene trap cassette, which also serves as a sequence acquisition cassette, will comprise exon sequence and a splice donor sequence derived from genetic material that naturally occurs in a eukaryotic cell.
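The composition rules stated above can be expressed as a simple check over an ordered list of cassette elements. This is only an illustrative sketch: the element names (`"promoter"`, `"exon"`, `"splice_donor"`, `"splice_acceptor"`, `"polyadenylation_site"`, `"intron"`) and the function `is_valid_3prime_cassette` are hypothetical labels chosen for this example, and "operatively positioned" is reduced here to relative order along the cassette.

```python
from typing import Sequence

# Elements the cassette must contain, in 5'->3' order.
REQUIRED_ORDER = ("promoter", "exon", "splice_donor")


def is_valid_3prime_cassette(elements: Sequence[str]) -> bool:
    """Check an ordered 5'->3' list of element names against the stated rules."""
    try:
        positions = [elements.index(name) for name in REQUIRED_ORDER]
    except ValueError:
        # A required element is missing.
        return False
    if positions != sorted(positions):
        # Required elements are present but out of order.
        return False
    # Downstream of the splice donor there must not be both a splice
    # acceptor and a polyadenylation site; optional intronic sequence
    # is allowed.
    downstream = elements[positions[-1] + 1:]
    has_sa = "splice_acceptor" in downstream
    has_polya = "polyadenylation_site" in downstream
    return not (has_sa and has_polya)


if __name__ == "__main__":
    print(is_valid_3prime_cassette(
        ["promoter", "exon", "splice_donor", "intron"]))
    print(is_valid_3prime_cassette(
        ["promoter", "exon", "splice_donor",
         "splice_acceptor", "polyadenylation_site"]))
```

The second call models a cassette that would terminate its own transcript rather than splice to a cellular exon, which is the arrangement the passage excludes.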
An additional embodiment of the present invention is the use of the described vectors to acquire novel DNA sequence information from gene trapped exons from an infected target cell or a plurality of target cells.
Additional embodiments of the present invention include recombinant vectors, particularly viral vectors, that have been genetically engineered to incorporate the described 3′ gene trap cassette. Preferably, although not necessarily, these vectors will additionally incorporate a selectable marker that allows for maintenance and detection of vector sequence in the target cell. The selectable marker can be utilized as a 5′ gene trap cassette that is placed upstream from, and in the same orientation as, the 3′ gene trap cassette. Optionally, a 5′ gene trap cassette incorporating a selectable marker can be used in conjunction with a vector encoded mutagenic mini-exon sequence operably positioned, inter alia, to enhance splicing of cellular transcripts to the selectable marker of the 5′ gene trap cassette.
Additionally, the vector can include one or more mutagenesis enhancer sequence(s) such as, but not limited to, a sequence encoding a self-cleaving RNA, a transcription terminator, an exon that changes the reading frame (or encodes one or more stop codons), and/or a terminal exon, or any mixture or combination thereof, operatively positioned between the 5′ gene trap cassette and the 3′ gene trap cassette of the disclosed vectors.
An additional embodiment of the present invention is the use of the novel 3′ gene trap cassette, or vectors comprising the same, to mutate and trap genes in a population of target cells, or tissues, in vitro or in vivo, and/or to obtain the polynucleotide sequence of unknown genes (i.e., discover new genes). As such, general methods of gene mutation, identification, and phenotypic screening are described that use the described 3′ gene trap cassette, and vectors comprising the same.
Another embodiment of the present invention is the use of the presently described vectors (e.g., viral vectors comprising a mini-exon and/or 3′ gene trap cassette) to activate gene expression in target cells. Preferably, the vectors are retroviral vectors that are nonspecifically integrated (using viral integration machinery) into the target cell genome. Additionally, assays are described that employ the described 3′ gene trap cassette, or vectors incorporating the same, to activate, genetically or phenotypically select for, and subsequently identify new genes.
Additional embodiments of the presently described invention include libraries of eukaryotic cells having genes that have been simultaneously mutated (by one or more of the described mutagenic components), and identified (using the described 3′ gene trap cassette) using the described vectors, and/or cDNA libraries produced by exploiting the targeting frequency and the sequence acquisition features of the described vectors.
Another embodiment of the present invention is a method of obtaining DNA sequence information from a target cell, comprising the steps of nonspecifically integrating a 3′ gene trap cassette (or mutagenic mini-exon), obtaining the chimeric RNA transcript produced when the gene trap cassette (or mutagenic mini-exon) is spliced by the target cell's endogenous splicing machinery to an endogenous exon encoded within the target cell genome, and obtaining sequence information from the endogenously encoded exon from the target cell genome.
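The final step of the method above, recovering the endogenous exon sequence from a sequenced chimeric transcript, can be sketched computationally. The following Python example is a minimal illustration for the 3′ gene trap cassette case, in which the vector-encoded exon lies 5′ of the trapped exon in the fusion transcript. The function name `endogenous_portion` and both sequences are made-up placeholders, not sequences from the described vectors, and exact string matching stands in for the alignment and quality filtering a real pipeline would need.

```python
from typing import Optional


def endogenous_portion(chimeric_cdna: str, vector_exon: str) -> Optional[str]:
    """Return the sequence following the vector-encoded exon in a read.

    Returns None when the vector exon is not found, i.e. the read does not
    look like a fusion transcript from the gene trap cassette.
    """
    read = chimeric_cdna.upper()
    tag = vector_exon.upper()
    start = read.find(tag)
    if start == -1:
        return None
    # Everything 3' of the vector exon was contributed by the exon(s)
    # spliced on from the target cell genome.
    return read[start + len(tag):]


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    vector_exon = "ATGGCC"
    read = "ggATGGCCTTTAAACCC"
    print(endogenous_portion(read, vector_exon))
```

The returned sequence is what would then be compared against sequence databases to identify the trapped gene.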