Diversification of DNA molecules provides a way to generate new proteins with properties not found in nature. This can be achieved in vitro (by manipulation outside living cells) by a variety of methods, such as the insertion of specific novel and random sequence oligonucleotides in selected regions of genes (Gold, L et al 1997 Proceedings of the National Academy of Sciences, 94: 59-97), the cleavage of variant DNA molecules (two sequences that have similar functions but differ at one or more sites) and reassembly of the fragments in new combinations (Stemmer, WPC 1994 Proceedings of the National Academy of Sciences, 91: 10747-10751), or amplification of DNA by the polymerase chain reaction under conditions where the polymerase is error prone (Leung, D W et al 1989 Technique 1: 11-15). Screening for desired properties of any protein coded by the resultant novel nucleotide sequence requires transcription and translation of the sequence to yield the corresponding peptide, together with the appropriate post-translational modification. With these methods, this is usually achieved by introducing each new construct into a cell by transfection or electroporation (hereafter both processes will be covered by the term transfection) to form a transformed cell. This is a complex and time-consuming procedure, since each construct must be checked to ensure that it is correctly inserted and complete. The difficulty is compounded where the new construct codes for one component of a multimeric protein, since the procedure must be done twice for each new combination. That problem can be reduced by the use of fungal heterokaryons to lower the number of transfections required to one per component of a combinatorial array (U.S. Pat. No. 5,643,745, to Stuart, issued Jul. 1, 1997). However, the number of transfections required is still large and each must be checked to ensure the DNA insert is correctly placed and complete.
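The scale of the transfection burden described above can be illustrated with simple arithmetic. The following is a minimal sketch only. It assumes a two-component protein with an independent library of variants for each component, and it reads "twice for each new combination" as two transfections per combination. The function names and library sizes are hypothetical and are not taken from the cited references.

```python
def transfections_per_combination(n_a: int, n_b: int) -> int:
    """Conventional route: two transfections for every A/B combination."""
    return 2 * n_a * n_b


def transfections_heterokaryon(n_a: int, n_b: int) -> int:
    """Heterokaryon route: one transfection per component variant.

    Combinations are then assembled by fusing strains, not by transfection.
    """
    return n_a + n_b


if __name__ == "__main__":
    n_a, n_b = 100, 100  # hypothetical library sizes
    print(transfections_per_combination(n_a, n_b))  # 20000
    print(transfections_heterokaryon(n_a, n_b))     # 200
```

Even on the heterokaryon route the count grows linearly with library size, which is the residual burden noted above.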
Genetic recombination
Genetic recombination in eukaryotes, higher organisms that have a true nucleus, occurs during the prophase of the reduction division that converts a diploid cell having two complete sets of homologous chromosomes to a tetrad (or sometimes an octad) of haploid cells each with one complete set of chromosomes. Two manifestations of recombination events are recognized: crossing over, in which genes located at different sites (loci) on the same chromosome are recombined by reciprocal exchange of chromosome sections between a pair of homologous chromosomes, and gene conversion, in which the number of copies of a pair of allelic genes, i.e. genes that occupy the same locus on homologous chromosomes, is unequal in the tetrad or octad. In gene conversion, instead of a 2:2 segregation of the parental alleles, the tetrad comprises three haploid cells carrying one of the parental versions of the gene and one carrying the other parental version. Crossing over was first discovered in the fruit fly Drosophila (Morgan Proc. Soc. Exp. Biol. Med. 8:17 1910) and gene conversion in the fungus Neurospora (M B Mitchell Proc. Natl. Acad. Sci. USA 41: 216-220 1955). There is now evidence that both crossing over and gene conversion occur universally in species that reproduce sexually and that a process having similar outcomes occurs in bacteria and their viruses and plasmids.
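The distinction between normal segregation and gene conversion in a tetrad can be expressed as a simple count of parental alleles. The sketch below is illustrative only: the allele labels "A" and "a" and the function name are hypothetical, and octads are not handled.

```python
from collections import Counter


def classify_tetrad(alleles, parent_1="A", parent_2="a"):
    """Classify a four-spore tetrad by the ratio of the two parental alleles."""
    counts = Counter(alleles)
    if len(alleles) != 4 or set(counts) - {parent_1, parent_2}:
        raise ValueError("expected four spores carrying only the parental alleles")
    ratio = (counts[parent_1], counts[parent_2])
    if ratio == (2, 2):
        return "2:2 segregation"
    if ratio in ((3, 1), (1, 3)):
        return "gene conversion (3:1)"
    return "other aberrant segregation"


if __name__ == "__main__":
    print(classify_tetrad(["A", "A", "a", "a"]))  # 2:2 segregation
    print(classify_tetrad(["A", "A", "A", "a"]))  # gene conversion (3:1)
```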
Genetic recombination in eukaryotes occurs in diploid cells (cells that contain two complete sets of homologous chromosomes) that are undergoing meiosis. Prior to the division, each of the two chromosome sets is replicated, generating two pairs of identical sister chromatids. The process of genetic recombination involves the establishment of joints between two homologous but not necessarily identical DNA sequences, one located on one chromatid of one sister pair and the other on a homologous chromatid which is a member of the other sister pair. The joints establish regions in one or both chromatids where one strand of the DNA duplex has the sequence of one homologue and the second strand has the sequence of the other homologue. Where the DNA sequences of the homologues differ, bases will be in mismatched pairs, that is, pairs which are not A:T or G:C (A=deoxyadenosine, T=deoxythymidine, C=deoxycytidine and G=deoxyguanosine). Enzymatic machinery corrects mismatched base pairs and the joints between the molecules are resolved, separating the two chromatids once more. For each site of mismatch, in half of the cases, the base pair present in one chromatid is now replaced by the base pair originally present in the other homologous chromatid. This accounts for gene conversion. In some cases the joints between molecules are resolved such that there is a reciprocal exchange of the regions on each side of the joint. This process is called crossing over and also leads to novel combinations of the DNA sequence information by which the parental homologues differed. Each chromatid is incorporated into one of the haploid cells (cells having only one set of chromosomes) that arise from meiosis, becoming a member of the complete set of chromosomes present in each cell.
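The origin of mismatched base pairs in a heteroduplex can be shown with two short sequences. The sketch below is a simplified illustration: it assumes equal-length, aligned sequences written as the top strand of each homologue, and it pairs the top strand of one homologue with the bottom strand of the other. The sequences and the function name are hypothetical.

```python
# Watson-Crick partners for the four DNA bases
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def heteroduplex_mismatches(top_h1: str, top_h2: str):
    """List mismatched pairs formed when the top strand of homologue 1
    is paired with the bottom strand of homologue 2.

    Returns (0-based position, base on strand from h1, base on strand from h2).
    """
    if len(top_h1) != len(top_h2):
        raise ValueError("sequences must be aligned and of equal length")
    mismatches = []
    for pos, (b1, b2) in enumerate(zip(top_h1, top_h2)):
        partner = COMPLEMENT[b2]  # base contributed by the bottom strand of h2
        if COMPLEMENT[b1] != partner:  # not an A:T or G:C pair
            mismatches.append((pos, b1, partner))
    return mismatches


if __name__ == "__main__":
    # The homologues differ at one site (C versus T at position 3)
    print(heteroduplex_mismatches("ATGCCGTA", "ATGTCGTA"))  # [(3, 'C', 'A')]
```

Correction of the C:A pair to either C:G or T:A restores one or the other parental base pair at that site.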
The molecular processes of crossing over and gene conversion are yet to be fully understood. In the most widely accepted model for the molecular events of recombination (FIG. 3) (H Sun et al Cell 64: 1155-1161, 1991) it is supposed that one of the two homologous chromatids suffers a break in both strands of the DNA molecule and that the strands that end with a 5' phosphate are resected, leaving a single-stranded tail of several hundred bases that ends with a 3' hydroxyl group. It is proposed that the single-stranded tail pairs with the complementary strand of the unbroken chromosome to initiate the joint. The joint is thought to be completed by DNA synthesis from the 3' ends, which provides a replacement strand for the DNA lost in the initial resection, followed by rejoining of the breaks. This will form a double junction between the molecules in the manner shown in FIG. 3. Each junction is free to move. This leads to strand exchange between the two DNA molecules, forming heteroduplex DNA. It is supposed that recombination is completed by scission of the junctions and correction of mispaired bases. Scission of the junctions can occur by breaks in either the "inner" or "outer" strands with equal probability (FIG. 3). Due to the limitations of a two-dimensional representation of the junctions, the expectation of an equal frequency of these two modes of scission is not self-evident. In reality, however, the two pairs of complementary strands, both the inner and outer pair, are identically juxtaposed. If the resolution of both junctions occurs in the inner strands or alternatively in the outer strands, only gene conversion can occur. If the resolution of one junction is by scission of the inner strands and the other junction by scission of the outer strands, the flanking regions are reciprocally exchanged and there is both a crossover event and also the possibility of gene conversion.
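The resolution rule in this model can be enumerated directly. The sketch below encodes only what the model states: each junction is cut in the inner or the outer strands, and the flanking regions are exchanged only when the two junctions are cut in different modes. It additionally assumes that the two junctions are resolved independently, which the model description does not state explicitly. The function names are hypothetical.

```python
from itertools import product

MODES = ("inner", "outer")


def resolution_outcome(junction_1: str, junction_2: str) -> str:
    """Outcome for the flanking regions after both junctions are cut."""
    if junction_1 not in MODES or junction_2 not in MODES:
        raise ValueError("scission mode must be 'inner' or 'outer'")
    # Same mode at both junctions: flanks keep their parental arrangement
    # Different modes: flanks are reciprocally exchanged
    return "crossover" if junction_1 != junction_2 else "non-crossover"


def expected_crossover_fraction() -> float:
    """Fraction of crossovers when both modes are equally likely and the
    two junctions are resolved independently (an assumption)."""
    outcomes = [resolution_outcome(a, b) for a, b in product(MODES, repeat=2)]
    return outcomes.count("crossover") / len(outcomes)


if __name__ == "__main__":
    for a, b in product(MODES, repeat=2):
        print(a, b, resolution_outcome(a, b))
    print(expected_crossover_fraction())  # 0.5
```

Gene conversion remains possible in every case, because it depends on mismatch correction in the heteroduplex and not on the mode of scission.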
There is direct evidence that recombination is initiated by double-strand breaks in the yeast Saccharomyces cerevisiae (A Schwacha and N Kleckner, Cell 83: 1-20 1995). However, the exact series of events by which these are processed to complete a recombination event is not clear. Indeed, Bowring and Catcheside, working with the fungus Neurospora crassa (Genetics 143: 129-136 1996), have shown that most of the crossing over events previously thought to be associated with gene conversion are several hundred kilobases away, too far to be directly associated, suggesting that gene conversion and crossing over can be catalyzed by different recombination pathways.
Biological processes, including recombination, are error prone. M K Watters and D R Stadler (Genetics 139: 137-145 1995) examined the spectrum of spontaneous mutations (changes in the sequence of DNA bases in a gene, from that present in wild-type cells, that render it defective) in the mtr gene of Neurospora crassa. They found that the spectrum of mutations that occur during the sexual phase, which includes meiosis and recombination, is distinct from the spectrum that occurs during asexual reproduction by normal vegetative growth. Error-prone recombination is thus a source of sequence diversification in vivo, additional to that obtainable by the generation of new combinations of the multiple sequence differences that distinguish homologous DNA sequences.
Genetic recombination in eukaryotes occurs in diploid cells that contain two complete sets of chromosomes and thus two complete sets of genes. The diploid state is established by the fusion of two haploid cells, usually of different parentage. This can be achieved by the fusion of gametes, as in the fusion of eggs and sperm in humans and other animals or of pollen cells with ovules in plants, or by fusion of two strains in the fungi where ability to fuse is usually controlled by mating type genes that ensure those strains that fuse are of different mating type and thus not genetically identical. In plants and animals, the fusion of haploid gametes establishes a clone of diploid cells which normally develops into an individual adult member of the species where genetic recombination occurs in specialist diploid cells in those parts of adults that give rise either to eggs or sperm. In the fungi, fusion of haploid strains usually gives rise to a dikaryon (a cell having haploid nuclei of two types, each with the genetic composition of one of the two strains, in a common cytoplasm). The dikaryon can form the main phase of the life cycle, as in the macrofungi ("mushrooms" and "toadstools"), or can be transient and give rise to diploid cells, immediately or after a limited number of mitotic cell divisions, that then undergo meiosis.
Genetic recombination in eukaryotes occurs during meiosis, the reduction division in which a diploid nucleus gives rise to four haploid nuclei each having only one set of chromosomes. During this process, the genetic information in the two sets of chromosomes present in the nucleus of the diploid cell is recombined. New gene combinations can be generated by reassortment of chromosomes between the sets present in the two haploid cells that contributed to the diploid cell undergoing meiosis, and also by crossing over and gene conversion, which generate new combinations of the sequence information present in pairs of homologous chromosomes. In prokaryotes, genetic recombination can occur between DNA sequences present in the chromosome and those carried by plasmids, such as the fertility factor F of Escherichia coli, or by bacteriophage, such as phage λ (lambda), and between two phage molecules, two plasmids or any combination thereof.
The methods and compositions used to cross together two genetically distinct individuals or strains of a living organism in order to obtain individuals with new gene combinations by reassortment, crossing over and gene conversion vary from species to species and for most species are within the common art of the biological sciences. Some species are better characterized genetically than others, because they are of particular economic, ecological or aesthetic importance or are particularly favorable for research into the fundamental processes of biology. The best characterized species include the bacterium Escherichia coli, the plants Arabidopsis thaliana and Oryza sativa, the insect Drosophila melanogaster, the mammal Mus musculus, the nematode Caenorhabditis elegans, the slime mould Dictyostelium discoideum and the fungi Saccharomyces cerevisiae, Aspergillus nidulans and Neurospora crassa. In each case there are compendia of standard methods for their growth and for conducting crosses. For example, for Neurospora crassa these include D D Perkins et al (Microbiol. Rev. 46: 426-570 1982) and R H Davis and F J deSerres (Methods in Enzymol. 17A: 79-143 1970). The following details of methods and compositions for genetic recombination in the fungi are given as examples and are not intended to limit the application of the invention for in vivo diversification of DNA sequences to these species. Nevertheless, bacteria are in general not suitable for the purpose of diversifying and expressing eukaryote sequences, because they lack the correct processing pathways for proper gene expression and modification of any protein product. Amongst the eukaryotes, only in the fungi has understanding of the relevant molecular processes reached the level required for practical application of the present invention.
Recombination Hotspots
Crossing over and gene conversion during meiosis do not occur at random positions within chromosomes. Recombination is particularly frequent in regions called recombination hotspots, also called recombinators. Recombination hotspots typically occur at several locations on a chromosome, frequently, but not always, in the regulatory region 5' of the coding sequence of a gene. They have been directly demonstrated in several species, including the yeasts Schizosaccharomyces pombe at the ade6 gene and Saccharomyces cerevisiae at the arg4 and his4 loci (M Lichten and A S H Goldman Ann. Rev. Genet. 29: 423-444 1995), the filamentous Ascomycete N. crassa at cog (D G Catcheside & T Angel Aust. J. Biol. Sci. 27: 219-229 1974) and at the am and his-3 loci, and the Basidiomycete Schizophyllum commune at mating type loci (G Simchen and J Stamberg Heredity 24: 369-381 1969). Recombination hotspots that have been studied include the arg4 and his4 hotspots in yeast. However, the ability of yeast recombinators to diversify heterologous DNA has not been demonstrated and, unlike cog and other recombinators in Neurospora, the yeast recombinators have not been shown to be regulated.
There is indirect evidence that recombination hotspots are widely distributed in higher eukaryotes, including Homo sapiens (K F Lindahl Trends Genet. 7: 273-276 1991) and plants including Zea mays (L Civardi et al Proc. Natl. Acad. Sci. USA 91: 8268-8272 1994). Recombination hotspots in bacteria include χ (chi) (R S Myers and F W Stahl, Ann. Rev. Genet. 28: 49-70 1995), which stimulates recombination between any pair of bacterial chromosomes, phages or plasmids, and site-specific recombinators such as att, which stimulate insertion and excision of phage such as phage λ (lambda).
In the case of the filamentous fungi, it is known that at least some of the recombination hotspots are subject to regulatory genes that turn them off. The genetic systems that regulate hotspot activity are well known only in Neurospora, where the genes rec-1, rec-2 and rec-3 each turn off a different subset of hotspots scattered in the Neurospora genome. rec-1 blocks recombination at the nit-2 and his-1 loci. rec-2 blocks recombination at the his-3 locus and also in the chromosomal regions between his-3 and ad-3, between arg-3 and sn, and between pyr-3 and his-5. rec-3 blocks recombination at the am and his-2 loci. Control of recombination by rec genes in N. crassa has been reviewed by D E A Catcheside (Genetical Research, 47: 157-165 1986).
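The regulatory relationships listed above can be summarized as a small lookup table. The sketch below records only the rec gene targets named in this paragraph. The dictionary layout, the interval labels and the function name are hypothetical conveniences, and the table is not a complete map of the Neurospora genome.

```python
# rec gene -> loci or chromosomal intervals at which it blocks recombination
REC_TARGETS = {
    "rec-1": ["nit-2", "his-1"],
    "rec-2": ["his-3", "his-3/ad-3 interval", "arg-3/sn interval",
              "pyr-3/his-5 interval"],
    "rec-3": ["am", "his-2"],
}


def regulators_of(target: str):
    """Return the rec genes listed as blocking recombination at a target."""
    return sorted(gene for gene, targets in REC_TARGETS.items()
                  if target in targets)


if __name__ == "__main__":
    print(regulators_of("his-3"))  # ['rec-2']
    print(regulators_of("am"))     # ['rec-3']
```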
There remains a need for effective reagents for, and methods employing, the process of recombination and recombination hotspots to introduce sequence variation into, that is, to diversify, heterologous DNA.