1. Field of the Invention
The present invention is related to the field of molecular biology and more particularly to the field of screening DNA libraries such as genomic and cDNA libraries to isolate a desired gene, and to the field of construction of targeting vectors for use in targeting a chromosomal gene, wherein the targeting vectors are constructed utilizing homologous recombination in E. coli. 
2. Description of Related Art
The development of technologies for targeted gene disruption in mouse embryonic stem cells (ES cells) has profoundly shaped biological research and the technique is now routinely used in laboratories. As the human genome project comes to completion and the mouse genome project comes to center stage, the demand for knockout mice is certain to increase dramatically in hopes of defining functions of the large volume of genes discovered by whole genome sequencing. The production of knockout mice, however, is still a time-consuming process. A number of molecular manipulations must be performed to build a knockout construct to target the gene of interest in ES cells. First, genomic clones of the gene must be isolated and characterized. Second, a knockout construct is built in which a positive selection marker (usually the neomycin or puromycin resistance gene) is flanked by genomic sequences of several kilobases and a negative selection marker (usually the thymidine kinase of herpes virus) is placed at one end followed by plasmid backbone. These manipulations can be a rate-limiting step for generation of a knockout mouse and can often discourage the decision to make a genetically altered mouse.
To identify a genomic region of interest, prior methods used plaque hybridization with a radioactive probe, usually a cDNA from the gene of interest, to identify the lambda phage containing the homologous genomic region of interest. This is labor intensive, requiring multiple rounds of purification to identify just a few homologous clones. In addition, since it is based on hybridization, it can often pick up related sequences such as pseudogenes that have related but not strictly identical DNA sequences. Similarly, the methods to identify full length cDNAs from libraries also require plaque screens using hybridization and radioactive probes and multiple rounds of screening. Thus, the methods previously used to identify phage containing homologous genomic DNA and cDNA are both laborious and time consuming.
It was contemplated by the present inventors that the described processes may be simplified by taking advantage of homologous recombination in E. coli. Several recombination pathways have been identified in E. coli with the RecBCD pathway playing a major role in the double-strand break repair pathway. RecBCD encodes a helicase and a nuclease that unwind and degrade DNA to generate 3′ single-stranded tails utilized by the RecA protein for invasion to initiate the recombination process. However, homologous recombination efficiency between a linear piece of DNA and the host chromosome is very low in E. coli cells that express wildtype RecBCD because the introduced linear DNA molecules are degraded rather efficiently before recombination has had a chance to proceed. It has been found that a short sequence, 5′-GCTGGTGG-3′, called a Chi site, can stimulate homologous recombination, as this short sequence is inhibitory to the nuclease function of RecBCD. In order for a linear DNA molecule to recombine with the host chromosome or a resident plasmid, either RecBCD must be inactivated or Chi sites must be present on the linear DNA. In a strain with a mutant RecBCD (JC8679, for example), linear DNA can recombine with a host chromosome or plasmid with modest efficiency. Initial attempts by the inventors to utilize the mutant RecBCD strain were not ideal, however, because the recombination efficiency was too low to screen for single copy genes within the complexity of the mouse genome. There is still a need, therefore, for simple methods of screening a genomic library and constructing targeting vectors for use in the knock out of genes of interest in mammalian species, such as mice.
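By way of illustration only, the presence of Chi sites on a linear DNA molecule can be checked computationally. The following minimal Python sketch scans a sequence for the octamer 5′-GCTGGTGG-3′ on either strand. The function names `reverse_complement` and `find_chi_sites` are hypothetical names chosen for this example, and the sketch assumes the sequence is supplied as a plain string of A, C, G and T characters; it says nothing about recombination efficiency.

```python
from typing import List, Tuple

# Chi octamer as given in the text, written 5' to 3'.
CHI = "GCTGGTGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_chi_sites(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, strand) for every Chi octamer in seq.

    Strand "+" means the octamer reads 5'-GCTGGTGG-3' on the given strand;
    strand "-" means it reads that way on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", CHI), ("-", reverse_complement(CHI))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            # Step by one base so overlapping matches are not skipped.
            start = seq.find(motif, start + 1)
    return sorted(hits)
```

A linear fragment for which `find_chi_sites` returns an empty list carries no Chi site on either strand and, per the discussion above, would be expected to require inactivation of RecBCD in order to recombine.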
The present disclosure overcomes drawbacks in the prior art by providing compositions and methods that simplify the screening of DNA libraries to select genes of interest through the use of homologous recombination in E. coli. A particular advantage of the present invention is that one can identify and select a gene of interest based on only about 60-100 bases of homology and can at the same time modify that gene fragment for use as a knockout targeting vector, for example. The invention is particularly useful in the screening of large libraries such as mammalian genomic libraries for the isolation of genomic copies of mammalian genes, for example, and in the construction of knockout targeting vectors. The invention is also useful for the screening of any DNA target, including cDNA libraries, BAC libraries, or cosmid libraries for various applications such as to extend partial sequences or to fill in sequence gaps in such libraries or genes within those libraries.
The advantages of the present compositions and methods arise from the ability to select a gene of interest from a DNA library based on the homology required for homologous recombination and simultaneously insert a positive selection marker into that gene so that only the targeted clones survive in the selection media. Using the compositions and methods described herein, one is thus able to isolate nucleic acid segments or clones of interest from a nucleic acid library via homologous recombination using regions of homology as small as about 60-100 base pairs (bp). By this description of the homology being about 60-100 base pairs, it is understood that this represents a minimum amount of sequence homology of about 56, 58 or 60 base pairs, but that much larger regions may be used, such as 200-500 or even several thousand bases or more of homology as desired by the practitioner. It is also understood that the regions of targeting homology are separated by the selection marker so that, in certain embodiments approximately one half the region of homology will appear at each end of the selectable marker region in the targeting construct. In certain preferred embodiments, then one may use regions of homology of about 26, 28, 30, 40, 50, 60, 75 or even 100 bases for each targeting region, making a total of about 50 or 60 to 200 bases of total homology.
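The arrangement of homology on either side of the selection marker may be sketched as follows. This minimal Python example is illustrative only: the function name `build_targeting_fragment`, its parameters, and the default arm length of 30 bases (one of the per-arm values mentioned above) are assumptions made for this example. It assumes a simple insertion of the marker at a chosen position in the known target sequence, with no deletion of target sequence, and treats the marker as an opaque placeholder string.

```python
def build_targeting_fragment(target: str, insert_at: int, marker: str,
                             arm_len: int = 30) -> str:
    """Return left arm + marker + right arm for a linear targeting fragment.

    target    -- known sequence of the gene of interest (5' to 3')
    insert_at -- 0-based position in target where the marker is inserted
    marker    -- sequence of the selectable marker cassette (placeholder)
    arm_len   -- bases of homology on each side of the marker
    """
    if arm_len < 1:
        raise ValueError("arm_len must be at least 1")
    if insert_at - arm_len < 0 or insert_at + arm_len > len(target):
        raise ValueError("target is too short for the requested arms")
    left_arm = target[insert_at - arm_len:insert_at]
    right_arm = target[insert_at:insert_at + arm_len]
    return left_arm + marker + right_arm
```

With the default arm length, the resulting fragment carries 60 bases of total homology, split evenly between the two ends of the marker, which corresponds to the lower end of the range discussed above.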
Once the clones are selected and isolated, they can then be sequenced and used to construct complete genes or cDNA sequences, to fill gaps in sequence data, or even for genomic walking to obtain further sequence data. Furthermore, the compositions and methods of the present disclosure allow the screening of a library and production of a finished genetic targeting construct in less than a week, in contrast to months of work that are often required to produce such constructs using conventional methods.
Another aspect of the present disclosure is the use of recombination functions from the bacteriophage λ in E. coli. The recombination function of λ phage is carried out by two gene products, exo, a nuclease that acts progressively on double strand DNA to generate a 3′ single stranded overhang and beta, a single-strand DNA (ssDNA) binding protein capable of annealing complementary ssDNA strands. In certain preferred embodiments, the homologous recombination includes the inactivation of the E. coli RecBCD by a λ phage gam gene product.
In a first type of screening assay, one may introduce a linear DNA fragment composed of a selectable marker flanked by regions of homology to the gene of interest into a library-containing cell culture by electroporation, followed by selection for the drug resistance marker. In a second type of assay, one may instead provide the recombinogenic fragment in vivo by placing the fragment encoding the selectable marker flanked by regions of homology into a specialized plasmid designed so that the fragment can be excised in the cell by an inducible restriction enzyme. Growing a phage library on the cells that are excising the fragment allows the recombination to occur, and phage incorporating the selectable marker can then be selected. Utilizing this system, a positive selection marker may be flanked by regions of homology of only about 28-50 bases on each side, and recombined into a genomic library to screen for the gene of interest. The use of this efficient recombination function with a positive selection marker allows rapid library screening and isolation of clones without the time-consuming steps of plaque lift or PCR based assays.
Included in the present disclosure are also preferred methods of constructing a genomic library in a λ bacteriophage cloning vector system, in which the vector may contain a removable stuffer fragment that expresses a lambda repressor gene. In the use of this system, vectors will only form plaques if the stuffer has been replaced by a genomic DNA fragment, greatly simplifying genomic library construction. Another advantage of preferred cloning vectors is that the preferred vectors are automatically converted from a phage to a plasmid by utilizing a recombination system such as the cre-lox mediated recombination system.
In certain embodiments, the present invention may be described as a method of screening a DNA library for a gene of interest, including the steps of obtaining a DNA library containing the gene of interest, obtaining a nucleic acid fragment that encodes a bacterial positive selection marker flanked by DNA fragments that are homologous with respective sequences contained in the gene of interest and transforming a host cell containing the library with the nucleic acid fragment, where the host cell is an E. coli cell that expresses a highly efficient recombination function. As used herein the term transforming a cell is meant to convey its ordinary meaning as understood in the art. Transformation indicates that a gene has been introduced into a prokaryotic cell and is stably replicated in that cell. The most preferred cells express the exo and beta recombination functions of bacteriophage λ. It is an aspect of the disclosure that other bacterial cells with an enhanced recombination function could also be used to practice the disclosed methods. Particularly other enteric bacteria or other gram negative cells may be used for library screening. It is also understood that homologs of exo and beta derived from phage other than lambda may also be used in the methods of the present disclosure. In alternative embodiments, the endogenous bacterial recombination functions such as the RecE and RecT genes of E. coli and their homologs in other bacteria may be enhanced by overexpression in order to achieve library screening using homologous recombination.
In the practice of a preferred method, one incubates the host cell under conditions effective to allow the fragment to recombine into the library and the host cells are grown under selective conditions to identify recombination events. For example, if the selection marker is tetracycline resistance, then the cells would be incubated in media containing tetracycline so that only cells expressing the marker would survive. Colonies of selected cells can then be further tested to confirm which represent homologous recombination events and clones may be isolated from those colonies.
The methods and compositions of the present invention are applicable to any type of target DNA, including but not limited to genomic libraries, cDNA libraries, bacterial artificial chromosome (BAC), or cosmid libraries, and which may be derived from any organism or type of organism, such as an animal, bacteria, plant, or yeast. The DNA may be derived, therefore, from a mammalian, insect, plant, fish, mouse, rat, human, primate, bovine, ovine, feline, canine, porcine, guinea pig, rabbit, hamster, Drosophila, Caenorhabditis elegans, Arabidopsis, corn, wheat, rye, rice, or avian source, or from any other plant, animal, bacteria or yeast in which enough DNA sequence is known to create the targeting homology.
It is an aspect of the disclosure that in certain embodiments the bacteriophage λ exo and beta recombination functions are expressed in the host cell. In addition, the λ bacteriophage gam gene may be expressed in the host cell as well as the E. coli recA gene. One or more of these genes may be expressed from a plasmid in the host cell, or one or more of them may be integrated into the genome of the cell. The recombination genes may also be expressed from a constitutive promoter; more preferred, however, is a regulated promoter. A regulated promoter is a promoter that only initiates DNA transcription when certain conditions are met in the cell. Such conditions include the presence or absence of a certain chemical, salt or metabolite, a certain temperature, the presence of phage anti-termination proteins or particular polymerases such as the T7 RNA polymerase. Certain regulated promoters include conditional or inducible promoters that are activated in the presence of a certain nutrient that may be a sugar such as lactose, galactose or arabinose or analogues thereof such as the araB, lacZ, galE or tac promoter. Other types of regulated promoters include the trp promoter, the λ PL or PR promoters or the tetracycline promoter.
The bacterial positive selectable marker to be recombined into the target DNA may be any marker known in the art, and is preferably an antibiotic resistance marker, and may also be a nutritional marker or a tRNA gene or a λ gene. Antibiotic resistance markers may include, but are not limited to resistance to ampicillin, tetracycline, streptomycin, penicillin, chloramphenicol or neomycin. In certain embodiments, a second bacterial marker is expressed on the library plasmid or vector. In this way, a double selection is possible so that both the resistance expressed in the targeting construct and the resistance expressed from the library vector must be present for a positive selection. In those embodiments in which the gene isolated from a library is to be used as a targeting vector, or as a vector in a eukaryotic cell, a eukaryotic positive selection marker may be placed adjacent the bacterial positive selectable marker in the targeting fragment to enable one to select homologous recombinants in a eukaryotic cell.
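The logic of the double selection may be illustrated with a short Python sketch. It is a simplified model only: the function name `surviving_clones`, the clone labels, and the choice of tetracycline for the targeting fragment and ampicillin for the library vector are assumptions made for this example. A clone is taken to survive only if it carries resistance to every antibiotic present in the medium.

```python
from typing import Dict, List, Set


def surviving_clones(clones: Dict[str, Set[str]], medium: Set[str]) -> List[str]:
    """Return the names of clones resistant to every antibiotic in the medium.

    clones -- maps a clone name to the set of resistances it expresses
    medium -- set of antibiotics present in the selective medium
    """
    return sorted(name for name, resistances in clones.items()
                  if medium <= resistances)


# Hypothetical example: the targeting fragment carries tetracycline
# resistance and the library vector carries ampicillin resistance.
example_clones = {
    "recombinant": {"tetracycline", "ampicillin"},
    "library_vector_only": {"ampicillin"},
    "fragment_only": {"tetracycline"},
}
```

Under these assumptions, calling `surviving_clones(example_clones, {"tetracycline", "ampicillin"})` retains only the clone in which the marker from the targeting fragment resides together with the marker of the library vector.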
In certain embodiments, a linear fragment containing the targeting construct is introduced into a cell by electroporation or other means known in the art. In alternate embodiments, the fragment may be introduced as a plasmid, and the fragment removed by a restriction enzyme in vivo. In this second embodiment, the fragment is preferably flanked by restriction sites that are recognized by a restriction enzyme that is expressed by the host cell under the control of an inducible promoter such that the fragment can be controllably excised to allow homologous recombination to occur. A preferred restriction enzyme for use in E. coli is I-SceI.
Although it is understood that one may practice the present invention using any pre-existing library that is available, one may also construct a library. A preferred vector for use in such aspects of the practice of the invention would include a lambda left arm segment, a recombination site, a multicloning site containing a plurality of restriction endonuclease recognition sequences, a stuffer fragment, wherein the stuffer fragment encodes a lambda gene under the control of a constitutive promoter, the expression of which affects the packaging or lysogenic growth of bacteriophage λ, a second multicloning site containing a plurality of restriction endonuclease recognition sequences, possibly a negative selection marker such as herpes virus thymidine kinase under the operative control of a promoter, effective to act as a negative selection in a eukaryotic host, a bacterial positive selection marker, a bacterial origin of replication, a direct repeat of the recombination site and a lambda right arm segment. A preferred library may be constructed by removing the stuffer fragment by endonuclease digestion, ligating the digested vector in the presence of fragments of DNA that encode the library sequences and amplifying the plaque forming units. In certain embodiments, the method would also include infecting an E. coli host cell wherein the host cell expresses a site specific recombinase gene product effective to convert the library to plasmid form. It is also understood that a lambda suppressing gene, such as the repressor cI may be expressed at high levels by the host cell, rather than from the vector and that such expression would also maintain the library in plasmid form.
A recombinase target site, or sequence specific recombinase target site is a short nucleic acid segment or sequence that is recognized by a sequence or site-specific recombinase and that becomes a crossover region during the site-specific recombination event. Preferred recombination sites include loxP, loxP2, loxP23, loxP3, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, RS or att. The lox sites are nucleotide sequences at which the product of the cre gene of bacteriophage P1, Cre recombinase, can catalyze a site-specific recombination event. The frt recombination site is a nucleotide sequence at which the product of the FLP gene of the yeast 2 μm plasmid, FLP recombinase, can catalyze a specific recombination event.
In the preferred library vector, the stuffer fragment encodes the lambda cI gene under the control of a strong constitutive promoter. A strong constitutive promoter is desired so that the vector will not make plaques if the stuffer is intact, but will only form lytic plaques when the stuffer is replaced by target DNA. A preferred promoter is the Con I promoter. A strong constitutive promoter is defined herein as a promoter that requires no inducer and is sufficiently active to direct expression of an amount of repressor protein effective to block λ replication.
The invention may also be described in certain embodiments as a method of obtaining a targeting vector for use in producing a mammalian embryonic stem cell with a disrupted gene of interest. This vector is then effective for producing a knock out mammal. The preferred method includes obtaining a DNA library comprising said gene of interest of said mammal; obtaining a nucleic acid fragment that encodes a bacterial positive selection marker and a eukaryotic positive selection marker flanked by DNA fragments, wherein said DNA fragments are homologous with respective sequences contained in the gene of interest; co-transforming a host cell with the library and with the nucleic acid fragment, wherein the host cell is an E. coli cell that expresses the exo and beta recombination functions of bacteriophage λ; incubating the transformed host cell under conditions effective to allow homologous recombination between the library and the fragment such that the selectable marker is transferred into the library vector; incubating the host cell in a selective medium to select recombination events; and isolating a clone from the selected cell.
The present invention also encompasses methods of screening a library for a selected nucleic acid sequence. The methods include inserting a selectable marker into the library at the position of the selected sequence by homologous recombination in an E. coli cell, wherein the E. coli cell expresses a bacteriophage λ recombination function, preferably the exo and beta recombination functions of bacteriophage λ.