This invention relates generally to methods of efficiently isolating coding sequences from complex genomic DNAs. More particularly, the present invention relates to a novel methodology for gene trapping in bacterial and bacteriophage-derived artificial chromosomes. The procedures are exemplified by either targeted gene trapping using homologous recombination methodology or random gene trapping employing a transposon system. Included in the invention are methods of preparing a gene map from a bacterial or bacteriophage-derived artificial chromosome contig, the resulting gene maps, methods of constructing a cDNA library from bacterial or bacteriophage-derived artificial chromosomes contigs, and the resulting cDNA libraries.
In recent years, the sequencing of the genomes of individual species, including humans, has become a major goal of biomedical research. The most prevalent procedure for sequencing the coding regions of a gene relies on RNA based methods, such as direct screening of a cDNA library. However, such methods are inherently biased towards the identification of nucleic acids which are prevalent in the tissue sample being studied. Therefore, genes which are expressed solely in tissues that are difficult to obtain, and/or expressed under relatively rare circumstances, have a good chance of being missed. Particularly in the latter case, these genes are likely to play a unique role during a specific cellular challenge, and thus could be important in a specific diseased state.
Exon trapping is one method of potentially overcoming the inherent bias of the mRNA based procedures of genomic sequencing. Exon trapping was originally developed to efficiently isolate coding sequences from complex genomic sequences [Duyk et al., PNAS 87, 8995-8999 (1990); Buckler et al., PNAS 88:4005-4009 (1991)]. This method is based on the selection of exons which are flanked by functional 5′ and 3′ splice sites. Conventional exon trapping vectors contain a driving promoter (e.g., the SV40 promoter or the metallothionein-1 promoter) which controls the expression of an exon having a 5′ splice site; an intron with multiple cloning sites; and a 3′ exon having a 3′ splice site and a poly-adenylation (poly A) signal sequence. Genomic fragments containing potential exons are first subcloned into the intron. The resulting plasmid DNA is then transfected into COS-7 cells, which transcribe and then process the RNA products. The mature RNAs containing the trapped exons can be amplified by reverse transcriptase PCR and subcloned. The trapped exons can be identified by sequencing the cloned cDNA products. In addition to its simplicity and efficiency, exon trapping is also independent of the amount, location, and timing of the expression of a given gene, and therefore is preferable to mRNA based methods. Consequently, exon trapping has become widely employed in transcription map construction for positional cloning and in general genomic sequencing.
Unfortunately, current exon trapping systems have a number of limitations. First, the size of the genomic insert in the exon-trapping vector is limited to 1-2 kilobases (kb), so the resulting trapped exon is usually a single small exon (80-150 basepairs (bp)). Such small exons are usually difficult to use in subsequent biological procedures, such as library screening, Northern blot analysis, or in situ hybridizations. Second, different exons from a single gene will be dispersed in different trapping vectors. Therefore, reconstruction of the gene from the small pieces of the gene requires considerable additional work. Third, subcloning of small genomic fragments may disrupt the elements necessary for proper splicing, thereby increasing the chance of missing certain exons. Fourth, current exon trapping systems can only be used in combination with specific cell lines (i.e., COS cells). However, since specific cellular factors are required to support the SV40 origin of replication, exons that are spliced in a tissue specific manner could be missed in the COS cells.
One recent advance towards solving some of these problems uses cosmid-based exon trapping vectors [Datson et al., NAR 24, 1105-1111 (1996)]. A specially designed cosmid vector is used, with a promoter and 5′ splice site on one end, and a 3′ splice site and poly-adenylation signal sequence on the other end. The genomic insert now can be as large as 40 kb. In this case, multiple exons can be trapped together. Such a trapped gene segment can be greater than 800 bp. The major disadvantage of this system is that it is necessary to use a specialized genomic cosmid library. Furthermore, cosmid clones are inherently unstable.
An alternative to using a cosmid based system is to use one or more of the E. coli based cloning systems which have been developed to construct large genomic DNA insert libraries. These are bacterial artificial chromosomes (BACs) and P-1 derived artificial chromosomes (PACs) [Mejia et al., Genome Res. 7:179-186 (1997); Shizuya et al., Proc. Natl. Acad. Sci. 89:8794-8797 (1992); Ioannou et al., Nat. Genet., 6:84-89 (1994); Hosoda et al., Nucleic Acids Res. 18:3863 (1990)]. BACs are based on the E. coli fertility plasmid (F factor); and PACs are based on the bacteriophage P1. The size of DNA fragments from eukaryotic genomes that can be stably cloned in Escherichia coli as plasmid molecules has been expanded by the advent of PACs and BACs. These vectors propagate at a very low copy number (1-2 per cell) enabling genomic inserts up to 700 kb in size to be stably maintained in recombination deficient hosts. The host cell is required to be recombination deficient to ensure that non-specific and potentially deleterious recombination events are kept to a minimum. As a result, libraries of PACs and BACs are relatively free of the high proportion of chimeric or rearranged clones typical in yeast artificial chromosomes (YACs) [Burke et al., Science 236:806; Peterson et al., Trends Genet. 13:61 (1997); Choi, et al., Nat. Genet., 4:117-223 (1993); Davies, et al., Biotechnology 11:911-914 (1993); Matsuura, et al., Hum. Mol. Genet., 5:451-459 (1996); Peterson et al., Proc. Natl. Acad. Sci., 93:6605-6609 (1996); Schedl, et al., Cell, 6:71-82 (1996); Monaco et al., Trends Biotechnol 12:280-286 (1994); Boysen et al., Genome Research, 7:330-338 (1997)]. In addition, isolating and sequencing DNA from PACs or BACs involves simpler procedures than for YACs, and PACs and BACs have a higher cloning efficiency than YACs [Shizuya et al., Proc. Natl. Acad. Sci. 89:8794-8797 (1992); Ioannou et al., Nat. Genet., 6:84-89 (1994); Hosoda et al., Nucleic Acids Res. 18:3863 (1990)]. Such advantages have made BACs and PACs important tools for physical mapping in many genomes [Woo et al., Nucleic Acids Res., 22:4922 (1994); Kim et al., Proc. Natl. Acad. Sci. 93:6297-6301 (1996); Wang et al., Genomics 24:527 (1994); Wooster et al., Nature 378:789 (1995)]. Furthermore, the PACs and BACs are circular DNA molecules that are readily isolated from the host genomic background by classical alkaline lysis [Birnboim et al., Nucleic Acids Res. 7:1513-1523 (1979)]. In addition, BACs have been found to be an important source of genomic DNA for the direct sequencing of the human genome [Rowen et al., Science 278:605-607 (1997)]. On the other hand, their use in gene identification is still extremely limited. Indeed, heretofore, BACs and PACs have not been shown to be useful in methods that directly isolate genes, such as exon trapping.
Therefore, there is a need to efficiently sequence coding regions of eukaryotic genes, and in particular human genes, which are expressed relatively rarely and/or only at specific times (such as the genes involved in circadian rhythms or those involved in body weight homeostasis); and/or are predominantly expressed in tissues that are difficult to obtain, such as the human organ of Corti. In addition, there is a need to produce new and improved gene maps for BAC or PAC contigs. Furthermore, there is a need to compile new cDNA libraries that are not biased by the expression pattern of the tissue that serves as the source for the mRNAs used to construct the cDNA library.
The citation of any reference herein should not be construed as an admission that such reference is available as “Prior Art” to the instant application.
The present invention provides a novel and efficient method of determining the nucleotide sequence of a portion of a eukaryotic gene that minimally contains one exon which has a 3′ splice acceptor, i.e., any exon other than the first exon. Preferably the portion of the eukaryotic gene contains two or more exons that have 3′ splice acceptors. In a more preferred embodiment, the portion of the eukaryotic gene contains three or more exons that have 3′ splice acceptors. In the most preferred embodiment, the portion of the eukaryotic gene contains all of the exons of the gene except the first exon.
The present invention includes methods of including a eukaryotic promoter exon/intron unit (PEU) in a Bacterial or Bacteriophage-Derived Artificial Chromosome (BBPAC) with a trappable eukaryotic gene. In one aspect of the present invention, the PEU is placed into existing BBPACs. Preferably the PEU is inserted operatively upstream to one or more exons of the trappable eukaryotic gene. More preferably, the PEU is inserted operatively upstream to all, or all but the first exon of the trappable eukaryotic gene.
In an alternative aspect of the present invention, BBPACs are constructed using vectors containing one or more PEUs, and genomic DNA containing trappable eukaryotic genes is inserted into the vectors to form a BBPAC containing one or more PEUs and trappable eukaryotic genes. Preferably a PEU is operatively upstream to one or more exons of the inserted trappable eukaryotic gene. More preferably, the PEU is operatively upstream to all but the first exon of the inserted trappable eukaryotic gene.
In one embodiment the trappable eukaryotic gene(s) are vertebrate genes. In a preferred embodiment of this type the vertebrate genes are mammalian genes. In a more preferred embodiment of this type the mammalian genes are human genes. In another embodiment the trappable eukaryotic gene(s) are invertebrate genes. In a preferred embodiment of this type the invertebrate genes are insect genes. In still another embodiment the trappable eukaryotic genes are plant genes.
In a related aspect, the present invention provides methods of obtaining a cell that contains a BBPAC containing a PEU and a trappable eukaryotic gene. Preferably the PEU is operatively upstream to one or more exons of the trappable eukaryotic gene. More preferably, the PEU is operatively upstream to all but the first exon of the trappable eukaryotic gene.
The present invention also includes methods of transcribing a trappable eukaryotic gene contained in a BBPAC in a eukaryotic cell. In one such embodiment the eukaryotic cell is a vertebrate cell. In a preferred embodiment of this type the vertebrate cell is a mammalian cell. In a more preferred embodiment the mammalian cell is a human cell. In another embodiment the eukaryotic cell is an invertebrate cell. In a preferred embodiment of this type the invertebrate cell is an insect cell. In still another embodiment the eukaryotic cell is a plant cell.
The present invention further provides methods of determining the nucleotide sequence of a trappable eukaryotic gene contained in a BBPAC. In another aspect, the present invention provides methods of preparing a gene map for a BBPAC contig. Still another aspect of the present invention includes methods of constructing a cDNA library from genomic DNA contained in a BBPAC genomic library. In a preferred embodiment the BBPAC either contains all of the exons of the eukaryotic gene, or alternatively, all of the exons of the gene except the first exon. In another preferred embodiment the BBPAC is a BAC.
The PEU of the present invention is specifically constructed to contain at least one 5′ vector-derived exon and at least part of one intron (e.g., a fragment of an intron). In one such embodiment, the PEU does not contain a 3′ polyadenylation sequence. In a preferred embodiment the PEU is a bi-directional eukaryotic promoter-exon/intron unit (BPEU).
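The intended outcome of placing a PEU upstream of genomic exons can be illustrated with a toy model. The sketch below is only an idealization under stated assumptions: every downstream exon that has a 3′ splice acceptor is joined in order to the vector-derived exon(s), and exons lacking an acceptor (such as a gene's first exon) are skipped. The names `Exon` and `trapped_transcript` are hypothetical and do not come from the specification; splice-site strength, alternative splicing, and polyadenylation are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Exon:
    """A genomic exon lying downstream of the inserted PEU (hypothetical type)."""
    name: str
    has_3prime_acceptor: bool  # a gene's first exon lacks a 3' splice acceptor


def trapped_transcript(vector_exons: List[str], downstream: List[Exon]) -> List[str]:
    """Return exon names of the idealized mature mRNA.

    Assumption: each downstream exon with a 3' splice acceptor is spliced
    in order onto the vector-derived exon(s); exons without an acceptor
    are not incorporated.
    """
    trapped = [exon.name for exon in downstream if exon.has_3prime_acceptor]
    return list(vector_exons) + trapped


# Illustrative gene with three exons; only exons 2 and 3 have 3' acceptors.
gene = [
    Exon("exon1", has_3prime_acceptor=False),
    Exon("exon2", has_3prime_acceptor=True),
    Exon("exon3", has_3prime_acceptor=True),
]
mrna = trapped_transcript(["vector_exon"], gene)
```

In this idealization the first exon of the gene is never recovered, which is consistent with the statement that the trapped portion contains any exon other than the first.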
The PEU of the present invention can be introduced into a host cell containing the BBPAC via a shuttle vector. In a preferred embodiment the shuttle vector is a conditional replication shuttle vector. The conditional replication shuttle vector is preferably a temperature sensitive shuttle vector (TSSV) having a temperature-sensitive origin of replication, such that the TSSV replicates at a permissive temperature, but does not replicate at a non-permissive temperature. In a particular embodiment, the permissive temperature is 30° C., and the non-permissive temperature is 43° C. The TSSV is diluted out when the host cell containing the TSSV is grown at the non-permissive temperature.
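The dilution of a non-replicating vector can be put in rough numbers. The following sketch assumes, purely for illustration, that replication stops completely at the non-permissive temperature and that the existing copies are simply partitioned among daughter cells at each division; under those assumptions the fraction of cells still carrying the vector after n generations cannot exceed c / 2^n, where c is the starting copy number per cell. The function name and the example values are hypothetical.

```python
def max_fraction_with_vector(copies_per_cell: int, generations: int) -> float:
    """Upper bound on the fraction of cells that still carry the vector.

    Assumptions (not taken from the specification): no replication of the
    vector at the non-permissive temperature, and each cell division
    doubles the number of cells while the total number of vector copies
    stays constant.
    """
    if copies_per_cell < 0 or generations < 0:
        raise ValueError("copies_per_cell and generations must be non-negative")
    return min(1.0, copies_per_cell / 2 ** generations)


# Example: five copies per cell, twenty generations without replication.
bound = max_fraction_with_vector(5, 20)  # about 4.8e-06
```

Even with several copies per cell at the start, a modest number of generations at the non-permissive temperature leaves the free vector in only a tiny fraction of the culture, which is why later selection for the first marker gene reports on integration rather than on free vector.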
One aspect of the present invention relates to a method for placing a PEU operatively upstream to an exon of a trappable eukaryotic gene of a BBPAC. One such embodiment comprises introducing a conditional replication shuttle vector into a host cell under conditions in which the conditional replication shuttle vector can replicate and transform the host cell. The host cell comprises a BBPAC that contains a trappable eukaryotic gene, BBPAC vector DNA, and a second marker gene. The conditional replication shuttle vector comprises a first marker gene and the PEU. The PEU comprises a eukaryotic promoter, at least one 5′ vector-derived exon, and at least one intron or fragment thereof. A 5′ vector-derived exon is adjacent to the intron or fragment thereof and is operatively downstream from the eukaryotic promoter. The PEU and the first marker gene are configured on the conditional replication shuttle vector such that when the PEU is transferred from the conditional replication shuttle vector to the BBPAC, the first marker gene remains with the conditional replication shuttle vector. In a related embodiment of this type the PEU further comprises a third marker gene and/or the first marker gene can be counter-selected against. In a preferred embodiment, the first marker gene is a tetracycline resistance gene that can be counter-selected against by growing the cell in the presence of fusaric acid.
The transformed host cell is then grown under conditions in which the conditional replication shuttle vector can replicate, and under conditions that select for a cell that contains the first and second marker genes. The PEU is then transferred from the conditional replication shuttle vector to the BBPAC of the selected cell, while the first marker gene remains with the conditional replication shuttle vector. When the BBPAC contains a trappable eukaryotic gene, the PEU can integrate into the BBPAC and place one or more exons of the trappable eukaryotic gene operatively downstream of the PEU. (Of course, the presence of the trappable eukaryotic gene in the BBPAC is not required for the insertion of the PEU into the BBPAC, since the PEU can integrate into a BBPAC which does not contain a trappable gene.)
In a preferred embodiment of this type the PEU is transferred from the conditional replication shuttle vector to the BBPAC through homologous recombination between the conditional replication shuttle vector and the BBPAC.
In an alternative embodiment, the PEU is transferred from the conditional replication shuttle vector to the BBPAC by the addition of transposase to the host cell. In this case the PEU is positioned in between a pair of inverted transposon ends on the conditional replication shuttle vector. In one embodiment of this type, the host cell contains a nucleic acid encoding transposase; in another embodiment the BBPAC contains a nucleic acid encoding transposase; in still a third such embodiment the conditional replication shuttle vector contains a nucleic acid encoding transposase. In this case, the transposase remains with the conditional replication shuttle vector when the PEU is transferred. In all of these alternative embodiments, the transcription of the nucleic acid encoding transposase can be placed under the control of an inducible promoter and in such cases, the addition of transposase to the host cell is achieved by adding an inducer of the inducible promoter to the host cell. This facilitates the transcription of an mRNA encoding transposase which can then be translated by the host cell, resulting in the expression of transposase.
The present invention also includes methods of isolating a cell that contains a BBPAC with a trappable eukaryotic gene and a PEU. Preferably the PEU is operatively upstream of one or more exons of the trappable eukaryotic gene. One such embodiment comprises growing the cell under conditions in which the conditional replication shuttle vector cannot replicate, and in which a cell that contains the second and third marker genes is selected for, while a cell that contains the first marker gene is selected against. A cell containing a BBPAC having a PEU is then isolated. In this manner cells containing a PEU operatively upstream of one or more exons of a trappable eukaryotic gene can be obtained. In such an embodiment the PEU can further comprise the third marker gene and the first marker gene can be counter-selected against.
The present invention also includes a method of transcribing one or more exons of a trappable eukaryotic gene contained in a BBPAC in a eukaryotic cell. One such embodiment comprises isolating a BBPAC containing the trappable eukaryotic gene operably downstream of the PEU from an isolated cell that comprises the BBPAC. The isolated BBPAC is transfected into a eukaryotic cell and the eukaryotic cell is cultured. In this case the eukaryotic promoter of the PEU facilitates the transcription of the trappable eukaryotic gene into an mRNA. In a related embodiment the mRNA is used as a template for preparing a cognate cDNA in order to determine the nucleotide sequence of the trappable eukaryotic gene contained in the BBPAC by determining the nucleotide sequence of the cognate cDNA.
The present invention further provides additional methods of placing a PEU operatively upstream to a trappable eukaryotic gene contained in a BBPAC. One such embodiment comprises introducing a conditional replication shuttle vector into a host cell that contains the BBPAC under conditions in which the conditional replication shuttle vector can replicate and transform the host cell. The BBPAC contains a trappable eukaryotic gene, BBPAC vector DNA, and a second marker gene. The conditional replication shuttle vector contains a first marker gene and a recombination cassette. The recombination cassette comprises a PEU flanked on both its 5′ and 3′ ends by nucleotide sequences that are homologous to BBPAC vector DNA. The recombination cassette and the first marker gene are linked together on the conditional replication shuttle vector such that when the PEU integrates into the BBPAC, the first marker gene does not remain linked to the integrated PEU. The PEU comprises a eukaryotic promoter, at least one 5′ vector-derived exon, and at least one intron or fragment thereof. In a preferred embodiment the PEU does not contain an exon encoding a 3′ polyadenylation sequence. The 5′ vector-derived exon is adjacent to the intron or fragment thereof and operatively downstream from the eukaryotic promoter. Thus when the BBPAC contains a trappable eukaryotic gene, the PEU can integrate into the BBPAC, placing the exon of the trappable eukaryotic gene operatively downstream of the PEU. In a related embodiment of this type the PEU further comprises a third marker gene and/or the first marker gene can be counter-selected against. In a preferred embodiment of this type the first marker gene is a tetracycline resistance gene that can be counter-selected against by growing the cell in the presence of fusaric acid.
The transformed host cell is grown under conditions in which the conditional replication shuttle vector can replicate, and a cell that contains the first and second marker genes can be selected for. In this case a first homologous recombination event is allowed to occur between the recombination cassette and the BBPAC to form a co-integrate. The cell is then grown under conditions in which the conditional replication shuttle vector cannot replicate and in which a cell that contains the first and second markers is selected for. A cell containing the co-integrate between the recombination cassette and the BBPAC is thus selected for. This cell is then grown under conditions in which the conditional replication shuttle vector cannot replicate and in which a cell that contains the second marker gene is selected for. A second homologous recombination event is then allowed to occur between the conditional replication shuttle vector and the BBPAC. The PEU is thus allowed to integrate into the BBPAC and place the exon of the trappable eukaryotic gene operatively downstream of the PEU. In one such embodiment the eukaryotic promoter is a mammalian promoter and/or the eukaryotic gene is a mammalian gene. In another such embodiment the eukaryotic promoter is a plant promoter, and the eukaryotic gene is a plant gene.
A cell containing BBPAC having the integrated PEU can be isolated in a related embodiment. Such an embodiment comprises growing the cell under conditions in which a cell that contains the second and third marker genes is selected for, while a cell that contains the first marker gene is selected against. The cell containing the BBPAC having the PEU is then isolated. In this embodiment the PEU further comprises the third marker gene, and the first marker gene can be counter-selected against.
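The marker logic of the two recombination steps and the final isolation step can be checked with a small bookkeeping sketch. It is a simplified illustration, not a model of recombination itself: it takes a fixed set of hypothetical cell types, assumes the free shuttle vector contributes its markers only at the permissive temperature (i.e., that it has already been diluted out otherwise), and asks which cell types survive each selection. All names (`Cell`, `survivors`, `M1` to `M3`, the cell labels) are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List

M1, M2, M3 = "marker1", "marker2", "marker3"  # first, second, third marker genes


@dataclass(frozen=True)
class Cell:
    name: str
    bbpac_markers: FrozenSet[str]                     # markers physically on the BBPAC
    free_vector_markers: FrozenSet[str] = frozenset() # markers on a free shuttle vector


def survivors(cells: Iterable[Cell], permissive: bool,
              required: Iterable[str],
              counter_selected: Iterable[str] = ()) -> List[str]:
    """Names of cell types that survive one selection step.

    Simplification: a free shuttle vector is counted only when growth is
    at the permissive temperature.
    """
    names = []
    for cell in cells:
        present = set(cell.bbpac_markers)
        if permissive:
            present |= cell.free_vector_markers
        if set(required) <= present and not (set(counter_selected) & present):
            names.append(cell.name)
    return names


POPULATION = [
    Cell("untransformed", frozenset({M2})),
    Cell("free_vector", frozenset({M2}), frozenset({M1, M3})),
    Cell("co_integrate", frozenset({M1, M2, M3})),
    Cell("resolved_with_PEU", frozenset({M2, M3})),
    Cell("resolved_revertant", frozenset({M2})),
]

step1 = survivors(POPULATION, True, {M1, M2})          # vector can replicate
step2 = survivors(POPULATION, False, {M1, M2})         # only co-integrates keep marker 1
step3 = survivors(POPULATION, False, {M2})             # allows resolution to occur
step4 = survivors(POPULATION, False, {M2, M3}, {M1})   # isolate BBPAC carrying the PEU
```

Under these assumptions, only the co-integrate passes the second step and only the resolved BBPAC carrying the PEU passes the final step, which mirrors the role of the third marker gene and of counter-selection against the first marker gene.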
A particular embodiment of the present invention further includes a method of transcribing one or more exons of the trappable eukaryotic gene contained in a BBPAC in a eukaryotic cell. This embodiment comprises isolating the BBPACs containing the PEU from the isolated cell, and then transfecting the isolated BBPACs into eukaryotic cells. The eukaryotic cells are then cultured. When the PEU is operatively upstream of an exon (or more than one exon) of the trappable eukaryotic gene, the eukaryotic promoter of the PEU facilitates the transcription of the exon(s) of the trappable eukaryotic gene into an mRNA.
A related aspect of the present invention includes a method of determining the nucleotide sequence of the exon(s) of the trappable eukaryotic gene contained in the BBPAC. One such embodiment comprises preparing cognate cDNA by using the mRNA as a template, and determining the nucleotide sequence of the cognate cDNA. The nucleotide sequence of the exons of the trappable eukaryotic gene contained in the BBPAC is thus determined.
A preferred embodiment for a method for placing a PEU into a BBPAC containing a trappable eukaryotic gene comprises introducing a conditional replication shuttle vector into a host cell containing the BBPAC under conditions in which the conditional replication shuttle vector can replicate and transform the host cell. The BBPAC contains a trappable eukaryotic gene, BBPAC vector DNA, and a second marker gene; the conditional replication shuttle vector contains a RecA-like protein gene, a first marker gene, and a recombination cassette. The recombination cassette comprises a PEU flanked on both its 5′ and 3′ ends by nucleotide sequences that are homologous to BBPAC vector DNA. The recombination cassette, the RecA-like protein gene, and the first marker gene are linked together on the conditional replication vector such that when the PEU integrates into the BBPAC, the RecA-like protein gene and the first marker gene remain linked together, but neither the RecA-like protein gene nor the first marker gene remains linked to the integrated PEU. The PEU comprises a eukaryotic promoter, at least one 5′ vector-derived exon, and at least one intron or fragment thereof. In a more preferred embodiment of this type the PEU does not contain an exon encoding a 3′ polyadenylation sequence. The 5′ vector-derived exon is adjacent to the intron or fragment thereof, and operatively downstream from the eukaryotic promoter. When the trappable eukaryotic gene comprises an exon with a 3′ splice acceptor, the PEU can integrate into the BBPAC and place the exon of the trappable eukaryotic gene operatively downstream of the PEU. In a preferred embodiment of this type neither the host cell nor the BBPAC, independently or in conjunction, can support homologous recombination without the conditional replication shuttle vector.
The transformed host cell can be grown under conditions in which the conditional replication shuttle vector can replicate, the RecA-like gene can be expressed, and in which a cell that contains the first and second marker genes is selected for, and in which a first homologous recombination event is allowed to occur between the recombination cassette and the BBPAC to form a co-integrate. This cell is then grown under conditions in which the conditional replication shuttle vector cannot replicate and in which a cell that contains the first and second marker genes is selected for. In this way, a cell containing the co-integrate between the recombination cassette and the BBPAC is selected for. This cell is then grown under conditions in which the conditional replication shuttle vector cannot replicate and in which a cell that contains the second marker gene is selected for and wherein a second homologous recombination event is allowed to occur between the conditional replication shuttle vector and the BBPAC. The PEU can thus integrate into the BBPAC, placing the exon of the trappable eukaryotic gene operatively downstream of the PEU.
In an alternative embodiment of this type, the PEU further comprises a third marker gene and/or the first marker gene can be counter-selected against. In a preferred embodiment of this type the first marker gene is a tetracycline resistance gene that can be counter-selected against by growing the cell in the presence of fusaric acid.
A related aspect of the invention further comprises a method of isolating such a cell which contains a BBPAC having a PEU. One such embodiment comprises growing the cell under conditions in which a cell that contains the second and third marker genes is selected for, while a cell that contains the first marker gene is selected against. A cell containing a BBPAC having a eukaryotic promoter exon/intron unit (PEU) is obtained. In one such embodiment the PEU further comprises the third marker gene, and the first marker gene can be counter-selected against.
A further related aspect of the present invention comprises a method of transcribing a trappable eukaryotic gene contained in a BBPAC in a eukaryotic cell. One such embodiment comprises isolating the BBPAC containing the PEU from the isolated cell and transfecting the isolated BBPAC into the eukaryotic cell. The eukaryotic cell is then cultured. When the PEU is operatively upstream of an exon (or more than one exon) of the trappable eukaryotic gene, the eukaryotic promoter of the PEU facilitates the transcription of the exon(s) of the trappable eukaryotic gene into an mRNA.
A related embodiment further includes a method of determining the nucleotide sequence of the exon(s) of the trappable eukaryotic gene contained in the BBPAC comprising preparing a cognate cDNA using the mRNA as a template and then determining the nucleotide sequence of the cognate cDNA. The nucleotide sequence of the trappable eukaryotic gene contained in the BBPAC is thus determined. In a preferred embodiment of this type preparation of the cognate cDNA is performed by PCR.
As is true for the entire invention, in this aspect of the invention, preferably the PEU is a bi-directional eukaryotic promoter-exon/intron unit (BPEU). Similarly it is preferred that the conditional replication shuttle vector is a temperature sensitive shuttle vector (TSSV) having a temperature-sensitive origin of replication, such that the TSSV replicates at a permissive temperature, but does not replicate at a non-permissive temperature. In another preferred embodiment the BBPAC is a BAC. In one particular embodiment the PEU comprises two 5′ vector-derived exons and one intron or fragment thereof. In a preferred embodiment the two 5′ vector-derived exons consist of the first exon of beta-globin and a fusion exon containing the second exon of beta-globin fused to the HIV-tat exon, and the intron is the HIV-tat intron or fragment thereof. In this embodiment the fusion exon is adjacent to the HIV-tat intron.
Another variation of the present invention includes a method of placing a eukaryotic promoter exon/intron unit (PEU) into a BBPAC by introducing a conditional replication shuttle vector into a host cell containing the BBPAC under conditions in which the conditional replication shuttle vector can replicate and transform the host cell. The BBPAC contains a trappable eukaryotic gene, BBPAC vector DNA, and a second marker gene, whereas the conditional replication shuttle vector comprises a first marker gene, the PEU, a mini-transposon containing a pair of inverted transposon ends, a nucleic acid encoding transposase, and an inducible promoter. The expression of transposase is maintained under the control of the inducible promoter. The PEU is positioned in between the pair of inverted transposon ends, and the nucleic acid encoding transposase, the inducible promoter, and the first marker gene are positioned outside of the pair of inverted transposon ends. The PEU comprises a eukaryotic promoter, at least one 5′ vector-derived exon, and at least one intron or fragment thereof. Preferably the PEU does not contain an exon encoding a 3′ polyadenylation sequence. A 5′ vector-derived exon is adjacent to the intron or fragment thereof and operatively downstream from the eukaryotic promoter. When the trappable eukaryotic gene comprises an exon with a 3′ splice acceptor, the PEU can integrate into the BBPAC and place the exon of the trappable eukaryotic gene operatively downstream of the PEU. The transformed host cell is grown under conditions in which the conditional replication shuttle vector can replicate, and in which a cell that contains the first and second marker genes is selected for. The inducible promoter of this cell is induced and transposase is expressed. The PEU can then integrate into the BBPAC and place the exon of the trappable eukaryotic gene operatively downstream of the PEU.
A cell that contains the first and second marker genes is then selected for. In a related embodiment of this type the PEU further comprises a third marker gene and/or the first marker gene can be counter-selected against. In a preferred embodiment of this type, the first marker gene is a tetracycline resistance gene that can be counter-selected against by growing the cell in the presence of fusaric acid.
A related embodiment further includes a method of isolating the cell containing the BBPAC having the PEU which comprises growing the cell under conditions in which the conditional replication shuttle vector cannot replicate, and in which a cell that contains the second and third marker genes is selected for, while a cell that contains the first marker gene is selected against. In a preferred embodiment of this type, the selection for the second and third marker genes is performed in one step, and the counterselection for the first marker gene is performed in a subsequent step. In either case a cell containing a BBPAC having a eukaryotic promoter exon/intron unit (PEU) is isolated. In this embodiment the PEU further comprises the third marker gene, and the first marker gene can be counter-selected against.
In a particular embodiment of this type the inducible promoter is the β-galactosidase promoter and the bacterial host expresses lacIq. In a related embodiment the conditional replication shuttle vector encodes lacIq. In either case the inducing of the inducible promoter comprises contacting the bacterial host cell with IPTG. In a preferred embodiment the amount of IPTG used to contact the bacterial host is controlled so that the BBPAC receives only a single transposon or none at all.
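The rationale for limiting induction can be illustrated with a short calculation. The following is only a sketch under an assumption that is not stated in this disclosure, namely that the number of transposon insertions per BBPAC follows a Poisson distribution whose mean rises with the level of induction. The function name and the example mean value are hypothetical.

```python
import math


def insertion_probabilities(mean_insertions: float) -> dict:
    """Return Poisson probabilities of zero, one, or several insertions.

    ASSUMPTION (not from the source text): insertions per BBPAC are
    Poisson distributed with the given mean.
    """
    if mean_insertions < 0:
        raise ValueError("mean_insertions must be non-negative")
    p_none = math.exp(-mean_insertions)      # P(0 insertions)
    p_single = mean_insertions * p_none      # P(exactly 1 insertion)
    p_multiple = 1.0 - p_none - p_single     # P(2 or more insertions)
    return {"none": p_none, "single": p_single, "multiple": p_multiple}


if __name__ == "__main__":
    # Hypothetical low-induction setting: on average 0.1 insertions per BBPAC.
    for outcome, probability in insertion_probabilities(0.1).items():
        print(f"{outcome}: {probability:.4f}")
```

Under this assumed model, keeping the mean number of insertions well below one leaves most BBPACs with no transposon, and nearly all of the remainder with exactly one, which is the outcome the preferred embodiment aims for.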
The conditional replication shuttle vector is preferably a temperature sensitive shuttle vector (TSSV) having a temperature-sensitive origin of replication, such that the TSSV replicates at a permissive temperature, but does not replicate at a non-permissive temperature. The BBPAC is preferably a BAC. The first marker gene is preferably a tetracycline resistance gene that can be counter-selected against by growing the cell in the presence of fusaric acid. In a more preferred embodiment the PEU is a bi-directional eukaryotic promoter-exon/intron unit (BPEU).
This aspect of the invention also provides for isolating a BBPAC which has a PEU from a cell containing the BBPAC. As any person having skill in the art would readily recognize, once a PEU is introduced into a BBPAC, the subsequent isolation and manipulation of the BBPAC is independent of the method for placing the PEU into the BBPAC.
One specific embodiment comprises performing an alkaline lysis of the isolated cell, thereby isolating the BBPAC DNA. The isolated BBPAC DNA is next electroporated into competent bacterial cells, and then the bacterial cells are grown under conditions in which the conditional replication shuttle vector cannot replicate and in which cells that contain the second and third marker genes are selected for. Alkaline lysis of these bacterial cells is then performed to isolate the purified BBPAC DNA.
In addition the present invention also provides an embodiment that further includes a method of transcribing one or more exons of the trappable eukaryotic gene contained in a BBPAC in a eukaryotic cell which comprises transfecting the purified BBPAC into a eukaryotic cell and then culturing the eukaryotic cell. When the BBPAC contains a PEU operatively upstream to one or more exons of the trappable eukaryotic gene, the eukaryotic promoter facilitates the transcription of the trappable eukaryotic gene into an mRNA.
A particular embodiment further includes a method of determining the nucleotide sequence of one or more exons of the trappable eukaryotic gene contained in the BBPAC which comprises preparing a cognate cDNA by using the mRNA as a template, and then determining the nucleotide sequence of the cognate cDNA. The nucleotide sequence of one or more exons of the trappable eukaryotic gene contained in the BBPAC is thus determined. In a preferred embodiment of this type, preparing the cognate cDNA is performed by PCR.
The present invention further provides a method of mapping the insertion site of the PEU. In a preferred embodiment of this type, the mapping is performed by pulsed-field gel electrophoresis. In a related embodiment the mapping is performed by Southern blot.
Still another aspect of the present invention includes methods of preparing a gene map for a BBPAC contig that contains trappable eukaryotic genes. This aspect of the invention comprises introducing a eukaryotic promoter exon/intron unit (PEU) into each BBPAC of a BBPAC contig. In one embodiment of this aspect of the invention, the PEU is placed into an existing BBPAC contig. Preferably the PEU is inserted operatively upstream to one or more exons of the trappable eukaryotic gene. More preferably, the PEU is inserted operatively upstream to either all, or all but the first exon of the trappable eukaryotic gene.
In an alternative aspect of the present invention, a BBPAC contig is constructed using vectors containing one or more PEUs, and genomic DNA containing trappable eukaryotic genes is inserted into the vectors to form a BBPAC contig with BBPACs having one or more PEUs and trappable eukaryotic genes. Preferably a PEU is operatively upstream to one or more exons of the inserted trappable eukaryotic gene. More preferably, the PEU is operatively upstream to all, or all but the first exon of the inserted trappable eukaryotic gene.
The trappable eukaryotic gene(s) for this aspect of the invention can be any eukaryotic gene including vertebrate genes, preferably mammalian genes, and more preferably human genes; invertebrate genes, preferably insect genes; or plant genes.
The insertion of the PEU into the BBPAC is preferably performed by one of the methods of placing a PEU operatively upstream to a trappable eukaryotic gene contained in a BBPAC described herein. The BBPACs are then isolated and transfected into eukaryotic cells, which are cultured. When the BBPAC contains a PEU operatively upstream to a trappable eukaryotic gene, the eukaryotic promoter facilitates the transcription of the trappable eukaryotic gene into an mRNA. Preferably, cognate cDNAs are prepared using the mRNAs as a template, and the physical location of each gene is assigned within the BBPAC by hybridization of the cognate cDNAs to the BBPACs of the BBPAC contig. Alternatively, the mRNAs can be used in the hybridization. In one such embodiment, RNA probes are generated, e.g., from the cDNA, and are used in in situ hybridization determinations. In any case, preferably the BBPAC contig is a BAC contig.
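One way to picture the assignment of a gene's physical location from hybridization results is as an interval intersection over the overlapping clones of the contig. The sketch below is illustrative only: the clone names, the kilobase coordinates, and the function `locate_gene` are hypothetical and do not appear in this disclosure, and it assumes that the position of each BBPAC within the contig is already known.

```python
from typing import Dict, Optional, Sequence, Tuple

Interval = Tuple[int, int]  # (start_kb, end_kb) of a clone within the contig


def locate_gene(contig: Dict[str, Interval],
                positive_clones: Sequence[str]) -> Optional[Interval]:
    """Return the region shared by every clone that a cDNA hybridized to.

    The result is None when the positive clones have no common overlap,
    which would suggest inconsistent hybridization data.
    """
    if not positive_clones:
        raise ValueError("at least one hybridizing clone is required")
    start = max(contig[name][0] for name in positive_clones)
    end = min(contig[name][1] for name in positive_clones)
    return (start, end) if start < end else None


if __name__ == "__main__":
    # Hypothetical three-clone contig with overlapping inserts.
    contig = {"BAC_A": (0, 150), "BAC_B": (100, 260), "BAC_C": (220, 400)}
    # A cDNA that hybridizes to BAC_A and BAC_B lies within their overlap.
    print(locate_gene(contig, ["BAC_A", "BAC_B"]))
```

The more clones a given cDNA hybridizes to, the narrower the shared interval becomes, so overlapping clones in the contig refine the assigned location.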
Still another aspect of the present invention provides methods of constructing a cDNA library from genomic DNA, comprising trappable eukaryotic genes, contained in a BBPAC genomic library. Prior to, or alternatively as part of one such embodiment, a BBPAC genomic DNA library is subdivided into individual BBPAC genomic sub-libraries, wherein the BBPACs of the BBPAC sub-library contain trappable eukaryotic genes. Thus in one embodiment the BBPAC genomic library is subdivided into a sub-library that comprises 20 to 1000 BBPACs. In an alternative embodiment the BBPAC genomic library is subdivided into a sub-library comprising 40 to 500 BBPACs. In still another embodiment the BBPAC genomic library is subdivided into a sub-library comprising 80 to 250 BBPACs. In a preferred embodiment the BBPAC genomic library is subdivided into a sub-library comprising 100 to 200 BBPACs. In another preferred embodiment the BBPAC genomic library is subdivided into a sub-library comprising 20 to 80 BBPACs. In preferred embodiments the BBPAC genomic library is a BAC library.
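The subdivision of a library into sub-libraries of a chosen size range can be sketched as simple bookkeeping over clone identifiers. The following is a minimal illustration, not a prescribed procedure: the function `subdivide_library`, the clone naming scheme, and the default pool size of 150 are hypothetical, while the default bounds of 100 and 200 correspond to one of the preferred ranges recited above.

```python
from typing import List, Sequence


def subdivide_library(clones: Sequence[str], pool_size: int = 150,
                      min_size: int = 100,
                      max_size: int = 200) -> List[List[str]]:
    """Split clone identifiers into consecutive sub-libraries (pools)."""
    if not (min_size <= pool_size <= max_size):
        raise ValueError("pool_size must lie within [min_size, max_size]")
    pools = [list(clones[i:i + pool_size])
             for i in range(0, len(clones), pool_size)]
    # Fold an undersized final pool into the preceding one, but only when
    # the merged pool would still respect max_size; otherwise it is left
    # as a smaller final pool.
    if len(pools) > 1 and len(pools[-1]) < min_size:
        if len(pools[-2]) + len(pools[-1]) <= max_size:
            tail = pools.pop()
            pools[-1].extend(tail)
    return pools


if __name__ == "__main__":
    # Hypothetical library of 930 clones with made-up identifiers.
    library = [f"BAC{i:04d}" for i in range(930)]
    for index, pool in enumerate(subdivide_library(library), start=1):
        print(f"sub-library {index}: {len(pool)} BBPACs")
```

Other recited ranges, such as 20 to 80 BBPACs, can be explored by passing different values for `pool_size`, `min_size`, and `max_size`.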
The genomic libraries for this aspect of the invention can be derived from any eukaryotic genome, including a vertebrate genome, preferably a mammalian genome, and more preferably the human genome; an invertebrate genome, preferably an insect genome; or a plant genome.
PEUs are placed into the BBPACs of a BBPAC sub-library. The insertion of the PEU into the BBPAC is preferably performed by one of the methods of placing a PEU in a BBPAC described herein. The BBPACs are then isolated and transfected into eukaryotic cells, and the eukaryotic cells are cultured. When the BBPAC contains a PEU operatively upstream to one or more exons of a trappable eukaryotic gene, the eukaryotic promoter facilitates the transcription of the exon(s) of the trappable eukaryotic gene into an mRNA. Cognate cDNAs are prepared using the mRNAs as a template. A related embodiment further comprises determining the nucleotide sequence of the cognate cDNA. The nucleotide sequences of the exons of the trappable eukaryotic genes contained in the BBPAC genomic library are thus determined.
Accordingly, it is a principal object of the present invention to provide a method for sequencing trappable eukaryotic genes contained in BBPAC genomic libraries.
It is a further object of the present invention to provide a method of introducing a PEU (preferably a BPEU) operatively upstream to a trappable eukaryotic gene contained in a BBPAC.
It is a further object of the present invention to provide a method of obtaining bacterial cells that contain a BBPAC comprising a PEU operatively upstream to a trappable eukaryotic gene.
It is a further object of the present invention to provide a method of procuring isolated BBPACs containing a PEU and a trappable eukaryotic gene.
It is a further object of the present invention to provide a method of transcribing a trappable eukaryotic gene.
It is a further object of the present invention to provide a method of determining the sequence of the transcribed trappable eukaryotic gene.
It is a further object of the present invention to provide a method of providing a gene map for a BBPAC contig.
It is a further object of the present invention to provide a map of the insertion sites of a PEU placed into a BBPAC.
It is a further object of the present invention to provide a cDNA library from a BBPAC eukaryotic genomic library.
It is a further object of the present invention to provide a method of sequencing a cDNA library prepared from a BBPAC eukaryotic genomic library.
These and other aspects of the present invention will be better appreciated by reference to the following drawings and Detailed Description.