Genetic information is ultimately decoded into proteins, which perform most of the vital functions in living organisms. As one of the important biological macromolecules, protein not only serves as a component of cells but also participates in biochemical reactions with high specificity.
The function of a protein, which is composed of 20 kinds of amino acids, is determined by its structure, which is divided into four levels: primary, secondary, tertiary and quaternary. Since the primary structure of a protein, i.e., its amino acid sequence, contains the information that determines its shape and function, the whole structure or function of the protein can be changed even by a mutation in a single amino acid residue (Shao, Z. and Arnold, F. H., Curr. Opin. Struct. Biol. 6:513-518, 1996).
The diversity of organisms reflects the diversity of the genetic information encoded in DNA or RNA. In nature, genetic information changes slowly and continuously through a natural evolution process comprising mutation, sexual reproduction and natural selection. For example, during meiosis in sexual reproduction, homologous chromosomes derived from two individuals may exchange or reassemble their genetic material through homologous recombination. Such reassembly of DNA gives living organisms more opportunities to accelerate evolution. However, this type of evolution takes a long time in a natural environment, partly because it depends strongly on chance. Therefore, many efforts have been made to obtain, in a short period of time, a gene evolved for a desired purpose and a mutant protein by in vitro mutagenesis in combination with an appropriate screening method (Eigen, M., Naturwissenschaften 58:465-523, 1971; Bradt, R. M., Nature 317:804-806, 1985; Pal, K. F., Bio. Cybern. 69:539-546, 1993).
A method currently in widespread use for creating mutant proteins is site-directed mutagenesis (Sambrook, J. et al., Molecular Cloning 2nd, Cold Spring Harbor Lab Press, 1989). This method replaces the nucleotides at a desired site with a synthetically mutagenized oligonucleotide. However, the method is limited in that it requires exact information on the amino acid sequence and on the function of the site to be mutagenized in the protein. As another method for creating mutant proteins in a recombinant DNA library format, error-prone polymerase chain reaction (error-prone PCR) is widely used (Leung, D. W. et al., Technique 1:11-15, 1989; Caldwell, R. C. et al., PCR Methods and Applications 2:28-33, 1992). Error-prone PCR can be used to construct a mutant DNA library of a gene by controlling the polymerization conditions so as to decrease the fidelity of the polymerase. However, error-prone PCR suffers from the low processivity of the polymerase, which limits the practical applicability of the method for average-sized genes. Another limitation of error-prone PCR is that the frequency with which a plurality of mutations co-occur within a short region of DNA is too low for multiple mutations to be introduced there.
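The last limitation can be illustrated with a rough calculation. The following Python sketch is not part of the cited methods; it assumes that mutations arise independently at each base, and the per-base mutation rate (0.5%) and the window length (9 nucleotides, i.e., three codons) are hypothetical values chosen only for illustration.

```python
from math import comb


def prob_at_least_k_mutations(window_len, per_base_rate, k):
    """Binomial probability that at least k of window_len bases are mutated.

    Assumes each base mutates independently with probability per_base_rate
    (an illustrative simplification, not a measured property of any polymerase).
    """
    p_fewer_than_k = sum(
        comb(window_len, i)
        * per_base_rate ** i
        * (1 - per_base_rate) ** (window_len - i)
        for i in range(k)
    )
    return 1.0 - p_fewer_than_k


if __name__ == "__main__":
    window = 9      # hypothetical window: three adjacent codons
    rate = 0.005    # hypothetical per-base mutation rate
    single = prob_at_least_k_mutations(window, rate, 1)
    double = prob_at_least_k_mutations(window, rate, 2)
    print(f"P(at least 1 mutation in window) = {single:.5f}")
    print(f"P(at least 2 mutations in window) = {double:.5f}")
```

Under these assumptions, the probability of two or more mutations in the window is far smaller than the probability of one, which is the sense in which clustered multiple mutations are rarely obtained.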
To overcome said shortcomings of these methods, various methods for constructing a mutant DNA library from a mixture of homologous polynucleotides have been developed. These include the DNA shuffling method of Maxygen (U.S. Pat. Nos. 5,605,793; 6,117,679; 6,132,970), the Gene Reassembly method of Diversa (U.S. Pat. No. 5,965,408) and the recombination method developed by Frances H. Arnold (U.S. Pat. No. 6,153,410).
The DNA shuffling method of Maxygen, Inc. (U.S. Pat. Nos. 5,605,793; 6,117,679 and 6,132,970; Stemmer, W. P. C., Nature, 370: 389-391, 1994; Stemmer, W. P. C., Proc. Natl. Acad. Sci. USA, 91: 10747-10751, 1994) comprises the steps of fragmenting at least one kind of double-stranded DNA to be shuffled and conducting polymerase chain reaction (PCR) with the combined fragments, wherein homologous fragments from different parent DNAs anneal to each other to form partially overlapping DNA segments, and DNA synthesis occurs with the respective DNA fragments serving simultaneously as templates and as primers for each other, to produce a random recombinant DNA library. However, this method requires a relatively large amount of DNA for preparing the DNA fragments, and the DNase I used in the fragmentation process has to be removed from the resulting DNA fragments to a purity sufficient not to disturb the subsequent polymerization process. Further, the application of the method is limited by the properties of DNase I. For example, DNase I, which is widely used for this purpose, tends to cleave a 3′-phosphodiester bond having a pyrimidine base rather than a purine base at its terminus, which is a serious obstacle to obtaining a completely randomized pool of DNA fragments (Shao, Z. et al., Nucleic Acids Res. 26:681-683, 1998).
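The outcome of such fragment reassembly can be pictured with a deliberately crude Python sketch. It does not model DNase I digestion, annealing or primer extension; it only assumes that the parent sequences are already aligned and of equal length, and that each segment between randomly placed crossover points is copied from a randomly chosen parent. The function name, the parent sequences and the number of crossovers are hypothetical.

```python
import random


def shuffle_aligned_parents(parents, n_crossovers, rng):
    """Build one chimeric sequence from aligned, equal-length parent sequences.

    Crossover points are drawn uniformly at random, and every segment between
    consecutive points is copied from a randomly chosen parent. This is an
    abstraction of the end product of shuffling, not of its chemistry.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + points + [length]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)
        segments.append(donor[start:end])
    return "".join(segments)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the toy library is reproducible
    # Homopolymer "parents" make the origin of each segment easy to see.
    parents = ["A" * 40, "C" * 40, "G" * 40]
    for _ in range(5):
        print(shuffle_aligned_parents(parents, 4, rng))
```

In this sketch the crossover points are uniformly distributed by construction. The cleavage preference of DNase I noted above corresponds to a departure from that uniform distribution.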
The Gene Reassembly method of Diversa Corporation (U.S. Pat. No. 5,965,408) comprises the steps of synthesizing DNA fragments by a polymerization process employing at least one kind of double-stranded DNA to be shuffled as a template, and conducting polymerase chain reaction (PCR) with the combined fragments to produce a random recombinant DNA library. It employs partially synthesized fragments produced by UV treatment of, or adduct formation on, the template DNA, which prevents complete polymerization on the template DNA. Despite the randomness of the constructed DNA library, the method of Diversa Corporation still has problems in view of the mutagenic potential of the reagents used and the tedious optimization of the reaction conditions for treatment with the polymerization-terminating reagent that is needed to obtain fragments of the desired size. In addition, when pyrimidine bases are contiguous on the DNA strand, UV treatment induces pyrimidine dimers such as thymidine dimers, which distort the template DNA and prevent the polymerase from progressing along the strand. As a result, polymerization is likely to terminate at the sites of pyrimidine dimers, so that the DNA fragments obtained have insufficient randomness.
The DNA shuffling and Gene Reassembly methods are characterized in that the formation of partially overlapping DNA segments is a prerequisite step and in that each DNA fragment derived from the starting DNAs to be shuffled serves not only as a template but also as a primer.
Another method proposed by Arnold, the staggered extension process (StEP) (U.S. Pat. No. 6,153,410; Zhao, H. et al., Nat. Biotechnol. 16:258-261, 1998; Encell, L. P. et al., Nature Biotech. 16:234-235, 1998), involves priming template double-stranded polynucleotides with random or specific primers, conducting PCR while controlling the reaction conditions so as to produce, in each reaction cycle, short DNA fragments by staggered extension from the templates, and conducting repeated PCR to accomplish recombination between genes by template switching. In a polymerase reaction, sequence-specific pause sites exist in each of the target DNAs. Accordingly, the StEP method has a problem in that the recombinant DNA library is biased away from randomness, since the extension rates of the DNA fragments extended from the primers differ from each other even if the primers are annealed to the same region of different template DNAs (Encell, L. P. and Loeb, L. A., Nature Biotech., 16: 234-235 (1998)). In the StEP method, the PCR conditions have to be strictly controlled, by shortening the polymerization time and lowering the reaction temperature, in order to obtain short DNA fragments from staggered extension of the primers. Failure to maintain the desirable temperature range (e.g., too low a temperature) during the PCR process in the StEP method may lead to non-specific annealing and further to the formation of undesirable recombinants.
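The template-switching principle described above can be sketched as a toy Python simulation. It is only an illustration under stated assumptions: the templates are aligned and of equal length, the growing strand anneals to a uniformly chosen template in every cycle, and it is extended by a random short length of at most max_step bases. The function name and all parameter values are hypothetical, and pause sites and temperature effects are not modeled.

```python
import random


def step_recombine(templates, max_step, rng):
    """Grow one strand by repeated short extensions with template switching.

    In each cycle the growing strand is assumed to anneal to a randomly
    chosen template and to be extended by 1..max_step bases copied from it.
    Returns the full-length product and the number of template switches.
    """
    length = len(templates[0])
    if any(len(t) != length for t in templates):
        raise ValueError("templates must be aligned and of equal length")
    product = ""
    switches = 0
    previous = None
    while len(product) < length:
        idx = rng.randrange(len(templates))   # template used in this cycle
        step = rng.randint(1, max_step)       # short, staggered extension
        start = len(product)
        product += templates[idx][start:start + step]
        if previous is not None and idx != previous:
            switches += 1
        previous = idx
    return product, switches


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed for a reproducible toy run
    templates = ["A" * 40, "T" * 40]
    product, switches = step_recombine(templates, max_step=8, rng=rng)
    print(product, "template switches:", switches)
```

Here every template is extended with the same step distribution. Giving each template its own extension behavior, as occurs with sequence-specific pause sites, would skew where the switches fall, which is the bias described above.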
A method for constructing a recombinant DNA library that overcomes said drawbacks of the conventional methods would be powerful for producing mutant proteins having improved properties. The present invention is directed to a method of in vitro recombination of heterologous DNA strands, which comprises preparing unidirectional single-stranded DNA fragments, mixing the DNA fragments with specific primers, conducting polymerization, and repeating the above steps to produce a recombinant DNA library. Further advantages of the present invention will become apparent from the following description of the invention with reference to the attached drawings.