The present invention relates to methods and materials, such as vectors, for the systematic and random insertion of genetic material into the genome of an organism.
The invention allows the rapid mutagenesis of organisms, particularly fungi, so that essentially every gene of an organism can be mutated, and allows the reliable and efficient identification of the gene knocked out in each mutagenesis event. The invention also facilitates very high efficiency of homologous recombination, particularly in species, such as filamentous fungi, that have previously been notorious for a low frequency of such events.
Numerous methods for introducing foreign genetic material into living cells have become routine since the first instances of genetic engineering almost a quarter century ago. Foreign genetic material can be introduced into the cell on a vector that may replicate, or by incorporation into the genome of the host cell. The introduction of such foreign genetic material has allowed the expression of a protein in a species that usually does not produce the protein. It has also allowed the regulation of the expression of a protein (overexpression and underexpression) by introducing modified regulatory sequences that make the transcription and translation of the protein more or less efficient. Another use for genetic engineering has been the modification of the biological activity of a structural protein or enzyme by altering the coding region of a gene and thus altering the amino acid sequence of the protein produced. The altered amino acid sequence can lead to changes in conformation, changes in surface charge, and changes in the higher-order structure of the protein (tertiary and quaternary structure), all of which can lead to changes in biological activity.
With the recent growth of the field of "functional genomics" out of the discipline of genomics or gene sequencing, the manipulation of DNA in organisms has taken on another urgent task. In addition to sequencing the genetic material of an organism, functional genomics seeks to identify the function of the genes of a target organism on an industrial scale. By determining the function of most, if not all, genes and the products of those genes in an organism, functional genomics can accelerate the identification of gene and protein targets and allow the identification of compounds that will modulate those genes and gene products to alleviate disease, improve human and animal health, and improve the quality and quantity of food crops. To achieve this, it is necessary to develop rapid, high volume techniques for systematically altering the expression of essentially every gene in an organism, identifying the corresponding gene and monitoring the effect of the gene alteration on the phenotype of the organism.
Automated processes in molecular genetics have allowed the systematic analysis of genomes from microorganisms, such as yeast and bacteria, by DNA sequencing. Attention is focused on rapidly ascribing functions to newly discovered genes. It is widely recognized in the field of genetics that gene function is most desirably assigned through the analysis of organisms containing defined gene mutations (mutants).
Previous methods of introducing genetic material into a eukaryotic organism are sufficient for mutating a single gene. Such methods include protoplast fusion, transformation by electroporation, particle bombardment, chemical perturbation of cellular envelopes (membranes and walls), phage and viral infection, transduction and physical insertion of DNA into cells. Many of these methods are limited to introducing DNA into a cell in the form of a vector, where the DNA is expressed to produce its gene product. The desired characteristics of a useful gene insertion method for functional genomics include the insertion of a gene or DNA fragment into essentially every gene of the genome of the target organism in an efficient and systematic manner. However, the majority of methods for inserting DNA into the genome of an organism are not target specific. Methods for targeted integration of DNA into a specific location in the genome of an organism are less reliable and often have low efficiency. Accordingly, there is a need for efficient methods for site specific integration of DNA into the genome of an organism.
One method for the site specific insertion of DNA into another piece of DNA, including genomic DNA, involves the use of viral integration systems, such as Cre-lox (Sauer (1996) Nucleic Acids Res. 24:4608-4613) and Flp recombinase (Seibler and Bode (1997) Biochemistry 36:1740-1747). These systems insert DNA at specific sites in the genomic DNA of a host, but those specific sites must first be randomly engineered into the genome. Recently, the ability of enzymes known as transposases to transfer DNA fragments from one location in DNA into another random location in DNA has been discovered (Devine et al., U.S. Pat. No. 5,677,170; Devine et al., U.S. Pat. No. 5,728,551; Hackett et al., WO 98/40510; Plasternak et al., WO 97/29202; Reznikoff et al., WO 98/10077; Craig WO 98/37205; Strathman et al., (1991) Proc. Nat. Acad. Sci. USA 88:1247-1250; Phadnis et al., (1989) Proc. Nat. Acad. Sci. USA 86:5908-5912; Way et al., (1984) Gene 32:269-279; Kleckner et al., (1991) Method. Enzymol. 204:139-180; Lee et al., (1987) Proc. Nat. Acad. Sci. USA 84:7876; Brown et al. (1987) Cell 49:347-356; Eichinger et al. (1988) Cell 54:955-966; Eichinger et al. (1990) Genes Dev. 4:324-330). Generally, a transposase recognizes a relatively short DNA sequence known as an inverted repeat that is located on the flanks of an internal piece of DNA. The DNA sequence comprising the internal DNA sequence and the two flanking inverted repeat sequences is known as a transposon or transposable element. The transposase has the ability to excise the transposon and insert it into another piece of DNA with which it comes into contact. Typically, the location of the insertion is not totally random, but occurs preferentially at target sequence locations (so called "hot spots"; Kleckner et al., (1991) Method. Enzymol. 204:139-180). Like the viral systems, the insertions are site specific, but the sites are randomly located in the genome and do not allow site directed insertion.
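The structure of a transposable element described above (an internal sequence flanked on each side by an inverted repeat) can be illustrated with a minimal Python sketch. The function names and the toy sequences are hypothetical and chosen only for illustration; they do not correspond to any real transposon, and real transposases may tolerate imperfect repeats, which this simple exact-match check does not model.

```python
# Illustrative sketch only: helper names and sequences are invented.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_terminal_inverted_repeats(element: str, repeat_len: int) -> bool:
    """Check whether the first `repeat_len` bases of `element` equal the
    reverse complement of its last `repeat_len` bases (a perfect inverted repeat)."""
    if repeat_len <= 0 or len(element) < 2 * repeat_len:
        return False
    left = element[:repeat_len].upper()
    right = element[-repeat_len:].upper()
    return left == reverse_complement(right)


# Toy element: invented repeat + invented internal sequence + inverted copy of the repeat.
toy_repeat = "ACAGGTTG"
toy_internal = "ATGAAACCCGGGTTTTAA"
toy_element = toy_repeat + toy_internal + reverse_complement(toy_repeat)

print(has_terminal_inverted_repeats(toy_element, len(toy_repeat)))
```

The exact-match comparison is a deliberate simplification; a tolerance for mismatches could be added if imperfect repeats need to be recognized.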
One use for transposons has been to introduce a desired gene randomly into the organism's genome. Another use of transposons is as a sequencing tool since the sequence of the transposon is often known, especially at the borders, such that use of primers designed for the transposon would allow sequencing of the DNA into which the transposon is inserted. The lack of randomness in insertion location would detract from the use of transposons as tools to systematically sequence essentially all genes in an organism or to systematically knock out essentially all genes in an organism. Therefore, their use in functional genomics would appear to be limited.
Using transposons has thus far involved engineering the transposon into a plasmid (e.g., Reznikoff et al., WO 98/10077) and introducing the plasmid into a target organism such that the transposed gene is expressed by the plasmid (Devine et al., U.S. Pat. No. 5,677,170; Devine et al., U.S. Pat. No. 5,728,551). Alternatively, genetic material has been introduced into the genome of an organism by directly transferring the transposon from a plasmid to the genome of a target organism in the presence, within the cell, of the transferring transposase (Hackett et al., WO 98/40510; Plasternak et al., WO 97/29202). For this to occur, the interior of the cell to be transposed must include a transposable element on a plasmid and the corresponding transposase. Consequently, the only use of transposons to get DNA into the genome of an organism has been to directly transpose the transposable DNA, in the presence of a transposase, into a site specific, but not site directed, location (Hackett et al., WO 98/40510; Plasternak et al., WO 97/29202). Additionally, vectors containing a transposon event have been limited to plasmids, and the use of the transposed vectors has been the expression of the transposed gene's protein. Moreover, the introduction of the transposon usually occurs at one of the hot spots, not randomly. The use of transposons to introduce DNA into filamentous fungi, and particularly to introduce DNA either directly or indirectly into the fungal genome, has only recently been accomplished (Migheli et al. (1999) Genetics 151:1005-1013).
To accomplish site directed insertion of DNA into the genome of an organism, the method of homologous recombination is necessary, particularly when the objective of insertion is to mutate essentially every gene of the organism. However, there is a general difficulty in transforming filamentous fungal cells by homologous recombination. Such recombination has been notoriously inefficient.
Genome-wide mutagenesis is particularly problematic in filamentous fungi for several reasons. First, active and tractable endogenous transposons have not been described for the vast majority of filamentous fungi. Second, during DNA-transformation, homologous recombination occurs less frequently than nonhomologous (illegitimate or ectopic) recombination. During ectopic recombination, the introduced DNA construct does not recombine with its homologous genome segment but recombines at varied sites throughout the genome. Thus, in a resultant group of transformants, strains containing site directed mutations such as gene knockouts (KO's) as a result of homologous recombination must be identified against a large background of strains containing ectopic (nonhomologous) recombination events. Finally, large homologous chromosomal DNA regions (>1000 bp) are needed to direct homologous recombination. Thus several rounds of standard recombinant DNA technology (digestion of DNA with restriction enzymes, isolation of DNA fragments, ligation into plasmid vectors, transformation of E. coli and screening of bacterial colonies) are needed to assemble a single gene KO vector construct. This requirement is detrimental to efficient automation.
Filamentous fungi are a large and diverse group within the kingdom Mycota. They impact human health as important recyclers of terrestrial biomass, as hosts for industrial chemical, vitamin, enzyme and pharmaceutical production, as agents of deterioration and decay and as pathogens of plants and animals. This group of organisms is generally regarded as distinct from distantly-related unicellular fungi such as the yeast Saccharomyces cerevisiae. This distinction is obvious in terms of growth morphology (multicellular filamentous hyphae as opposed to unicellular buds) and metabolism (e.g., S. cerevisiae is a facultative anaerobe whereas filamentous fungi are strictly aerobic). The systematic analysis and assignment of function to all the genes of filamentous fungi and other eukaryotes would provide much new and valuable information about these important organisms.
The present invention provides techniques and materials to allow the systematic mutation of essentially all genes in a eukaryotic organism, especially a filamentous fungus, by facilitating the homologous recombination of all the genes of the organism. Homologous recombination is facilitated by using large insert vector libraries (e.g., cosmid, BAC, etc.) as a substrate for transposon mediated mutagenesis of the genomic DNA carried by the vector. The use of a large insert vector, such as a cosmid, which is capable of containing large inserts of cloned DNA, provides large flanking DNA sequences that are homologous to genomic DNA on each side of the inserted transposon. Optionally, the genomic DNA sequences that flank the transposon can then be sequenced using primers targeted to the ends of the inserted transposon. The large genomic sequences that flank the transposon allow for increased frequencies of homologous recombination with the genome of the target eukaryotic organism, especially in species where homologous recombination efficiency has previously been low. Transposon mediated mutagenesis of cosmids is not recommended by manufacturers of commercially available transposon systems. Therefore, the present invention uses new methods and materials to solve the problem of homologous recombination in difficult species and to enable the rapid, large scale production of genomic mutants as well as the routine sequencing of the gene being mutated. The present invention allows the industrialization of both the identification of essentially all genes in an organism and the assignment of function to each of those genes by analysis of the corresponding genomic mutation.
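The optional sequencing step described above can be sketched as follows, under stated assumptions. A read primed near a transposon end and extending outward is assumed to begin with some primer-proximal bases, then the remainder of the transposon end, then the flanking genomic DNA. The function name, the end sequence and the read below are all hypothetical, and the exact-match search ignores sequencing errors.

```python
from typing import Optional


def flank_from_read(read: str, transposon_end: str) -> Optional[str]:
    """Return the portion of `read` that follows the known transposon end,
    or None if the end sequence is not found in the read (exact match only)."""
    read_u = read.upper()
    end_u = transposon_end.upper()
    idx = read_u.find(end_u)
    if idx == -1:
        return None
    return read_u[idx + len(end_u):]


# Invented sequences for illustration; not taken from any real transposon or genome.
toy_transposon_end = "TTCAGGCATA"
toy_read = "ACGT" + toy_transposon_end + "GATTACAGATTACA"

print(flank_from_read(toy_read, toy_transposon_end))
```

The recovered flank could then be compared against known sequence to identify the disrupted gene; that comparison is outside the scope of this sketch.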
Thus, in one aspect, the present invention provides a method for facilitating site directed homologous recombination in a eukaryotic organism to produce mutants comprising:
1) providing at least one cosmid, wherein said cosmid comprises a first vector and genomic DNA from a target eukaryotic organism and wherein said first vector is not more than 6.4 kb in length and comprises a first selectable marker functional for selection in bacteria;
2) providing a second vector comprising a transposable element, said transposable element comprising a nucleotide sequence coding for a second selectable marker flanked on each side by an inverted repeat sequence, wherein said second selectable marker is bifunctional for selection in bacteria and in the target organism, and wherein said inverted repeat sequences are functional as a binding site for a transposase;
3) incubating at least one of said cosmids with said second vector in vitro, in the presence of a transposase specific for the inverted repeat sequences on said second vector, such that said transposable element transposes into said genomic DNA to produce a disrupted cosmid;
4) amplifying said disrupted cosmid in a bacterial cell and selecting for the presence of said first and second selectable markers in said bacterial cell;
5) introducing the disrupted cosmid amplified in step 4) into a target cell from said target organism so that homologous recombination can occur between said genomic DNA in said disrupted cosmid and the genome of said target organism and thereby produce a mutated target cell; and
6) selecting for the presence of said second selectable marker and screening for successful homologous recombination produced by step 5) in said mutated target cell.
Minimal plasmid vectors are provided for use in the generation of large insert libraries, particularly for use in cloning large fragments of DNA, such as are found in genomic DNA samples. The vectors of the invention comprise an origin of replication for bacterial cells, a selectable marker gene for bacterial cells, a bacteriophage packaging site, and a multiple cloning site comprising recognition sites for one or more rare-cutting restriction endonucleases, which endonucleases preferably include one or more homing endonucleases. The vectors are less than about 6.5 kb in length, and may be less than about 2.3 kb in length. In a preferred embodiment of the invention, the large insert vector library is a cosmid library or a BAC library, more preferably a cosmid library. Preferred cosmid vectors are pcosKA5, pcosJH1 and pPGFRKA1 (pcosKA4).
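The benefit of a minimal vector for large insert cloning can be illustrated with simple arithmetic. The sketch below assumes, as an approximation not stated in this document, that bacteriophage packaging accepts a total length of roughly 37 to 52 kb; the constant names and the function name are hypothetical. Under that assumption, every kilobase removed from the vector is available for cloned genomic DNA.

```python
# Assumed approximate packaging window in kb; these limits are an
# assumption for illustration and are not taken from this document.
PACKAGING_MIN_KB = 37.0
PACKAGING_MAX_KB = 52.0


def insert_size_window(vector_kb: float,
                       min_total_kb: float = PACKAGING_MIN_KB,
                       max_total_kb: float = PACKAGING_MAX_KB) -> tuple:
    """Return (smallest, largest) insert size in kb that keeps
    vector + insert within the assumed packaging window."""
    if vector_kb <= 0 or vector_kb >= max_total_kb:
        raise ValueError("vector size must be positive and below the packaging maximum")
    smallest = max(0.0, min_total_kb - vector_kb)
    largest = max_total_kb - vector_kb
    return (smallest, largest)


# Vector sizes mentioned in the text: 6.4 kb and about 2.3 kb.
for vector_kb in (6.4, 2.3):
    low, high = insert_size_window(vector_kb)
    print(f"vector {vector_kb} kb: insert between {low:.1f} and {high:.1f} kb")
```

The packaging limits are passed as parameters so that a different window can be substituted if the assumed values do not apply to a given packaging system.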
In a preferred embodiment of the invention, said transposable element and transposase are systems of Himar1, AT-2, GPS-1, GPS-2, EZ::TN, SIF or Mu.
The most preferred embodiment of the invention relates to homologous recombination in filamentous fungi, particularly Magnaporthe grisea, Magnaporthe graminicola, Botrytis cinerea, Erysiphe graminis, Aspergillus niger, Aspergillus fumigatus or Phytophthora infestans.