One of the primary goals of protein design is to generate proteins with new or improved properties. The ability to confer a desired activity on a protein or enzyme has considerable practical application in the chemical and pharmaceutical industry. Directed protein evolution has emerged as a powerful technology platform in protein engineering, in which libraries of variants are searched experimentally for clones possessing the desired properties.
Directed protein evolution harnesses the power of natural selection to evolve proteins or nucleic acids with desirable properties not found in nature. Various techniques are used for generating protein mutants and variants and for selecting desirable functions. Recombinant DNA technologies have allowed the transfer of single structural genes, or of the genes for an entire pathway, to a suitable surrogate host for rapid propagation and/or high-level protein production. Accumulated improvements in activity or other properties are usually obtained through iterations of mutation and screening. Directed evolution is mainly applied in academic and industrial laboratories to improve protein stability, to enhance the activity or overall performance of enzymes and organisms, to alter enzyme substrate specificity, or to design new activities. Most directed evolution projects seek to evolve properties that are useful to humans in an agricultural, medical or industrial context (biocatalysis).
The evolution of whole metabolic pathways is a particularly attractive concept, because most natural and novel compounds are produced by pathways rather than by single enzymes. Metabolic pathway engineering usually requires the coordinated manipulation of all enzymes in the pathway. The evolution of new metabolic pathways and the enhancement of bioprocessing are usually performed through iterative cycles of recombination and screening or selection to evolve individual genes, whole plasmids, multigene clusters, or even whole genomes.
Shao et al. (Nucleic Acids Research 37(2):e16, Epub 2008 Dec. 12) describe the assembly of large recombinant DNA encoding a whole biochemical pathway or genome in a single step via in vivo homologous recombination in Saccharomyces cerevisiae. Each fragment carries two flanking (anchoring) regions, one at its 5′ end and one at its 3′ end, which contain sequences of the 5′ or 3′ end of the adjacent fragment.
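The role of the anchoring regions can be illustrated with a minimal sketch. It is not the published protocol: it assumes that each fragment's last `overlap` bases are exactly identical to the first `overlap` bases of its neighbour, and the function name `assemble_by_anchors` and the toy sequences are hypothetical.

```python
def assemble_by_anchors(fragments, overlap):
    """Order and join fragments via identical end regions (illustrative model only).

    Assumption: the last `overlap` bases of each fragment equal the first
    `overlap` bases of the adjacent fragment, mimicking anchoring regions.
    """
    if any(len(f) <= overlap for f in fragments):
        raise ValueError("every fragment must be longer than the overlap")
    # Index fragments by their 5' anchoring region.
    by_head = {f[:overlap]: f for f in fragments}
    if len(by_head) != len(fragments):
        raise ValueError("5' anchoring regions must be unique")
    tails = {f[-overlap:] for f in fragments}
    # The first fragment is the one whose 5' end matches no other 3' end.
    starts = [f for f in fragments if f[:overlap] not in tails]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting fragment")
    assembled = starts[0]
    used = 1
    while assembled[-overlap:] in by_head:
        nxt = by_head[assembled[-overlap:]]
        assembled += nxt[overlap:]  # append only the non-overlapping part
        used += 1
        if used > len(fragments):
            raise ValueError("fragments form a cycle")
    if used != len(fragments):
        raise ValueError("not all fragments could be joined")
    return assembled


# Toy example: three fragments supplied in arbitrary order, 4-base anchors.
toy_fragments = ["CGTAGGCT", "GGCTTTAA", "ATGCCGTA"]
toy_assembly = assemble_by_anchors(toy_fragments, overlap=4)
```

In this model the order of the input fragments does not matter, because the anchoring regions alone determine which fragment is joined to which.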
Elefanty et al. (Proc. Natl. Acad. Sci. 95, 11897-11902 (1998)) describe gene targeting experiments to generate mutant mice in which the lacZ reporter gene has been knocked into the SCL locus. Reference is made to FIG. 1 showing the SCL-lacZ gene targeting strategy employing two anchoring sequences, i.e. one at each of the 5′ and 3′ ends.
Directed evolution can be performed in living cells, also called in vivo evolution, or may not involve cells at all (in vitro evolution). In vivo evolution has the advantage of selecting for properties in a cellular environment, which is useful when the evolved protein or nucleic acid is to be used in living organisms. In vivo homologous recombination in yeast has been widely used for gene cloning, plasmid construction and library creation.
Library diversity is obtained through mutagenesis or recombination. DNA shuffling allows the direct recombination of beneficial mutations from multiple genes. In DNA shuffling, a population of DNA sequences is randomly fragmented and then reassembled into full-length hybrid sequences.
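The fragmentation and reassembly step can be pictured with a simplified in silico model. The sketch below is an illustration under stated assumptions, not a description of any laboratory protocol: the parental sequences are assumed to be pre-aligned and of equal length, fragment boundaries are drawn uniformly at random, and the names `shuffle_once` and `shuffle_library` are hypothetical.

```python
import random


def shuffle_once(parents, max_crossovers, rng):
    """Build one full-length chimera from aligned, equal-length parents."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    # Choose how many fragment boundaries to use and where they fall.
    n_cuts = rng.randint(0, min(max_crossovers, length - 1))
    cuts = sorted(rng.sample(range(1, length), n_cuts))
    bounds = [0] + cuts + [length]
    # Each segment between two boundaries is copied from a random parent.
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)
        pieces.append(donor[start:end])
    return "".join(pieces)


def shuffle_library(parents, size, max_crossovers=3, seed=0):
    """Generate `size` chimeras; the seed makes the run reproducible."""
    rng = random.Random(seed)
    return [shuffle_once(parents, max_crossovers, rng) for _ in range(size)]


# Toy parents (hypothetical sequences) and a small shuffled library.
toy_parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]
toy_library = shuffle_library(toy_parents, size=5)
```

Every chimera produced this way has the parental length, and each position carries the base of one of the parents at that position. The model deliberately ignores the requirement for sequence similarity at the junctions.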
For the purpose of homologous recombination, naturally occurring homologous genes are used as the source of starting diversity. Single-gene shuffling library members are typically more than 95% identical. Family shuffling, however, allows block exchanges of sequences that are typically more than 60% identical. The functional sequence diversity comes from related parental sequences that have survived natural selection; thus, much larger numbers of mutations are tolerated in a given sequence without introducing deleterious effects on the structure or function.
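The identity percentages quoted above refer to the fraction of matching positions between aligned sequences. As a minimal sketch, assuming two pre-aligned sequences of equal length without gaps (the function name `percent_identity` is hypothetical), the value can be computed as follows:

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions where both sequences carry the same residue.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Toy example: 20 aligned positions, 6 of which differ.
toy_identity = percent_identity("A" * 20, "A" * 14 + "C" * 6)
```

Real homologous genes usually differ in length and contain insertions or deletions, so an alignment step would be needed before such a calculation; that step is outside the scope of this sketch.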
The recombination of DNA fragments of different origin with up to 30% diversity is described in WO1990007576A1. Hybrid genes are produced in vivo by intergeneric and/or interspecific recombination in mismatch repair deficient bacteria or in bacteria in which the mismatch repair (MMR) system is transiently inactivated. This avoids the processes by which damaged DNA is repaired, which would otherwise inhibit recombination between divergent sequences, i.e. homeologous recombination.
A review of the basic mechanisms of MMR is provided by Kunz et al. (Cell. Mol. Life Sci. 66 (2009) 1021-1038).
Targeted homeologous recombination is described in MMR-deficient plants (WO2006/134496A2). Targeting to a locus was possible with sequences differing by up to 10%.
Homologous recombination into bacteria for the generation of polynucleotide libraries is disclosed in WO03/095658A1. An expression library of polynucleotides was generated, wherein each polynucleotide is integrated by homologous recombination into the genome of a competent bacterial host cell, using a non-replicating linear integration cassette comprising the polynucleotide and two flanking sequences homologous with a region of the host cell genome.
The diversity of libraries can be enhanced by taking advantage of the ability of haploid cells to mate efficiently, leading to the formation of a diploid organism. In their vegetative life cycle, S. cerevisiae cells have a haploid genome, i.e. every chromosome is present as a single copy. Under certain conditions the haploid cells can mate, and in this way a diploid cell is formed. Diploid cells can form haploid cells again, especially when certain nutrients are missing. They then undergo a process called meiosis, followed by sporulation, to form four haploid spores. During meiosis the different chromosomes of the two parental genomes recombine. During meiotic recombination DNA fragments are exchanged, resulting in recombined DNA material.
WO2005/075654A1 discloses a system for generating recombinant DNA sequences in Saccharomyces cerevisiae, which is based on the sexual reproductive cycle of S. cerevisiae. Heterozygous diploid cells are grown under conditions which induce the processes of meiosis and spore formation. Meiosis is generally characterized by elevated frequencies of genetic recombination. Thus, the products of meiosis, which are haploid cells or spores, can contain recombinant DNA sequences due to recombination between the two diverged DNA sequences. In an iterative method, recombinant haploid progeny are selected and mated to one another, the resulting diploids are sporulated again, and their progeny spores are subjected to appropriate selection conditions to identify new recombination events. This process is described in wild-type or mismatch repair defective S. cerevisiae cells. For this purpose, the genes of interest, each flanked by two selection markers, are integrated into an identical locus of each of the two sister chromosomes of mismatch repair deficient diploid strains. DNA sequences that are 100% identical to the flanking DNA sequences of the locus where the DNA is to be integrated are added to the 5′ or 3′ end of the new DNA fragment. These flanking target sequences are about 400-450 nucleotides long. The cells are then forced to initiate sporulation, during which the recombination process takes place. The resulting spores and recombinant sequences can be differentiated by selection for the appropriate flanking markers.
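The logic of distinguishing recombinants by their flanking markers can be sketched in a few lines. This is a deliberately reduced model and not the procedure of the cited publication: it assumes two aligned genes of equal length, a single reciprocal crossover inside the gene, and hypothetical marker names M1 to M4; the names `Locus`, `crossover` and `is_recombinant` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """A gene of interest flanked by one marker on each side."""
    left_marker: str
    gene: str
    right_marker: str


def crossover(parent_1, parent_2, position):
    """Single reciprocal crossover inside the gene at `position`."""
    if len(parent_1.gene) != len(parent_2.gene):
        raise ValueError("genes must be aligned to the same length")
    if not 0 < position < len(parent_1.gene):
        raise ValueError("crossover must fall inside the gene")
    # Everything left of the crossover stays with its own left marker;
    # everything right of it is exchanged together with the right marker.
    product_1 = Locus(parent_1.left_marker,
                      parent_1.gene[:position] + parent_2.gene[position:],
                      parent_2.right_marker)
    product_2 = Locus(parent_2.left_marker,
                      parent_2.gene[:position] + parent_1.gene[position:],
                      parent_1.right_marker)
    return product_1, product_2


def is_recombinant(spore, parent_1, parent_2):
    """A spore is scored as recombinant if its marker pair is non-parental."""
    parental_pairs = {
        (parent_1.left_marker, parent_1.right_marker),
        (parent_2.left_marker, parent_2.right_marker),
    }
    return (spore.left_marker, spore.right_marker) not in parental_pairs


# Toy example with hypothetical markers and sequences.
toy_p1 = Locus("M1", "AAAAAA", "M2")
toy_p2 = Locus("M3", "CCCCCC", "M4")
toy_r1, toy_r2 = crossover(toy_p1, toy_p2, position=2)
```

In this model a crossover between the markers produces the non-parental marker combinations M1/M4 and M3/M2, so selecting for such a combination enriches spores that carry a mosaic gene.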
The ability of yeast to efficiently recombine homologous DNA sequences can also be exploited to increase the diversity of a library. When two genes that share 89.9% homology were mutated by PCR and transformed into wild-type yeast, a chimeric library of 10^7 members was created through in vivo homologous recombination, showing several cross-over points throughout the two genes (Swers et al., Nucleic Acids Research 32(3):e36 (2004)).
A method of mitotic homeologous recombination is described by Nicholson et al. (Genetics 154: 133-146 (2000)). The effects on recombination rates of defined mismatches contained in short inverted repeats were investigated in wild-type and MMR-defective strains.
It is the object of the present invention to provide an improved method of preparing and assembling a diversity of gene mosaics, especially for recombining long DNA fragments. As a result, it would be desirable to provide corresponding libraries of variants for the selection of improved recombinants.
The object is achieved by the provision of the embodiments of the present application.