1. Field of the Invention
The present invention relates generally to in vitro methods for mutagenesis and recombination of polynucleotide sequences. More particularly, the present invention involves a simple and efficient method for in vitro mutagenesis and recombination of polynucleotide sequences based on polymerase-catalyzed extension of primer oligonucleotides, followed by gene assembly and optional gene amplification.
2. Description of Related Art
The publications and other reference materials referred to herein to describe the background of the invention and to provide additional detail regarding its practice are hereby incorporated by reference. For convenience, the reference materials are numerically referenced and grouped in the appended bibliography.
Proteins are engineered with the goal of improving their performance for practical applications. Desirable properties depend on the application of interest and may include tighter binding to a receptor, high catalytic activity, high stability, the ability to accept a wider (or narrower) range of substrates, or the ability to function in nonnatural environments such as organic solvents. A variety of approaches, including ‘rational’ design and random mutagenesis methods, have been successfully used to optimize protein functions (1). The choice of approach for a given optimization problem will depend upon the degree of understanding of the relationships between sequence, structure and function. The rational redesign of an enzyme catalytic site, for example, often requires extensive knowledge of the enzyme structure, the structures of its complexes with various ligands and analogs of reaction intermediates and details of the catalytic mechanism. Such information is available only for a very few well-studied systems; little is known about the vast majority of potentially interesting enzymes. Identifying the amino acids responsible for existing protein functions and those which might give rise to new functions remains an often-overwhelming challenge. This, together with the growing appreciation that many protein functions are not confined to a small number of amino acids, but are affected by residues far from active sites, has prompted a growing number of groups to turn to random mutagenesis, or ‘directed’ evolution, to engineer novel proteins (1).
Various optimization procedures such as genetic algorithms (2,3) and evolutionary strategies (4,5) have been inspired by natural evolution. These procedures employ mutation, which makes small random changes in members of the population, as well as crossover, which combines properties of different individuals, to achieve a specific optimization goal. There also exist strong interplays between mutation and crossover, as shown by computer simulations of different optimization problems (6-9). Developing efficient and practical experimental techniques to mimic these key processes is a scientific challenge. The application of such techniques should allow one, for example, to explore and optimize the functions of biological molecules such as proteins and nucleic acids, in vivo or even completely free from the constraints of a living system (10,11).
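The two operators described above can be illustrated with a minimal Python sketch. It is a toy model only: sequences are plain strings over a four-letter alphabet, and the function names `mutate` and `crossover`, the uniform substitution model, and the single crossover point are assumptions made for illustration, not procedures taken from the cited references.

```python
import random

def mutate(seq, rate, alphabet="ACGT", rng=random):
    """Replace each position with a different letter with probability `rate`."""
    out = []
    for ch in seq:
        if rng.random() < rate:
            # Point mutation: choose any letter other than the current one.
            out.append(rng.choice([a for a in alphabet if a != ch]))
        else:
            out.append(ch)
    return "".join(out)

def crossover(a, b, rng=random):
    """Single-point crossover: prefix from parent `a`, suffix from parent `b`."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    point = rng.randint(1, len(a) - 1)  # both parents always contribute
    return a[:point] + b[point:]
```

Mutation explores the immediate neighborhood of one individual, whereas crossover combines changes that arose separately in two individuals.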
Directed evolution, inspired by natural evolution, involves the generation and selection or screening of a pool of mutated molecules which has sufficient diversity for a molecule encoding a protein with altered or enhanced function to be present therein. It generally begins with creation of a library of mutated genes. Gene products which show improvement with respect to the desired property or set of properties are identified by selection or screening. The gene(s) encoding those products can be subjected to further cycles of the process in order to accumulate beneficial mutations. This evolution can involve few or many generations, depending on how far one wishes to progress and the effects of mutations typically observed in each generation. Such approaches have been used to create novel functional nucleic acids (12), peptides and other small molecules (12), antibodies (12), as well as enzymes and other proteins (13,14,16). Directed evolution requires little specific knowledge about the product itself, only a means to evaluate the function to be optimized. These procedures are even fairly tolerant to inaccuracies and noise in the function evaluation (15).
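The cycle of library creation, screening and reuse of the best gene can be sketched as a short Python loop. This is a hedged illustration under stated assumptions: `toy_score` is a hypothetical stand-in for an experimental screen (it simply counts matches to an arbitrary target string), and the library size, mutation rate and number of generations are arbitrary values, not parameters from the invention.

```python
import random

ALPHABET = "ACGT"

def mutate(seq, rate, rng):
    """Substitute each position with a different base with probability `rate`."""
    return "".join(
        rng.choice([b for b in ALPHABET if b != base]) if rng.random() < rate else base
        for base in seq
    )

def toy_score(seq, target="ACGTACGTAC"):
    """Hypothetical screen: number of positions that match an arbitrary target."""
    return sum(a == b for a, b in zip(seq, target))

def evolve(parent, score, generations, library_size, rate, seed=0):
    """Repeat: build a mutant library, screen it, carry the best gene forward."""
    rng = random.Random(seed)
    best = parent
    for _ in range(generations):
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)            # retain the parent so the score never drops
        best = max(library, key=score)  # screening step
    return best
```

Note that the loop needs only the scoring function and no knowledge of why a given sequence performs well, which mirrors the point made above about directed evolution.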
The diversity of genes for directed evolution can be created by introducing new point mutations using a variety of methods, including mutagenic PCR (15) or combinatorial cassette mutagenesis (16). The ability to recombine genes, however, can add an important dimension to the evolutionary process, as evidenced by its key role in natural evolution. Homologous recombination is an important natural process in which organisms exchange genetic information between related genes, increasing the accessible genetic diversity within a species. While introducing potentially powerful adaptive and diversification competencies into their hosts, such pathways also operate at very low efficiencies, often eliciting insignificant changes in pathway structure or function, even after tens of generations. Thus, while such mechanisms prove beneficial to host organisms/species over geological time spans, in vivo recombination methods represent cumbersome, if not unusable, combinatorial processes for tailoring the performance of enzymes or other proteins not strongly linked to the organism's intermediary metabolism and survival.
Several groups have recognized the utility of gene recombination in directed evolution. Methods for in vivo recombination of genes are disclosed, for example, in published PCT application WO 97/07205 and U.S. Pat. No. 5,093,257. As discussed above, these in vivo methods are cumbersome and poorly optimized for rapid evolution of function. Stemmer has disclosed a method for in vitro recombination of related DNA sequences in which the parental sequences are cut into fragments, generally using an enzyme such as DNase I, and are reassembled (17,18,19). The non-random DNA fragmentation associated with DNase I and other endonucleases, however, introduces bias into the recombination and limits the recombination diversity. Furthermore, this method is limited to recombination of double-stranded polynucleotides and cannot be used on single-stranded templates. Further, this method does not work well with certain combinations of genes and primers. It is not efficient for recombination of short sequences (less than 200 nucleotides (nts)), for example. Finally, it is quite laborious, requiring several steps. Alternative, convenient methods for creating novel genes by point mutagenesis and recombination in vitro are needed.
The present invention provides a new and significantly improved approach to creating novel polynucleotide sequences by point mutation and recombination in vitro of a set of parental sequences (the templates). The novel polynucleotide sequences can be useful in themselves (for example, for DNA-based computing), or they can be expressed in recombinant organisms for directed evolution of the gene products. One embodiment of the invention involves priming the template gene(s) with random-sequence oligonucleotides to generate a pool of short DNA fragments. Under appropriate reaction conditions, these short DNA fragments can prime one another based on complementarity and thus can be reassembled to form full-length genes by repeated thermocycling in the presence of thermostable DNA polymerase. These reassembled genes, which contain point mutations as well as novel combinations of sequences from different parental genes, can be further amplified by conventional PCR and cloned into a proper vector for expression of the encoded proteins. Screening or selection of the gene products leads to new variants with improved or even novel functions. These variants can be used as they are, or they can serve as new starting points for further cycles of mutagenesis and recombination.
A second embodiment of the invention involves priming the template gene(s) with a set of primer oligonucleotides of defined sequence, or of defined sequence exhibiting limited randomness, to generate a pool of short DNA fragments, which are then reassembled as described above into full-length genes.
A third embodiment of the invention involves a novel process we term the ‘staggered extension’ process, or StEP. Instead of reassembling the pool of fragments created by the extended primers, full-length genes are assembled directly in the presence of the template(s). The StEP consists of repeated cycles of denaturation followed by extremely abbreviated annealing/extension steps. In each cycle the extended fragments can anneal to different templates based on complementarity and extend a little further to create “recombinant cassettes.” Due to this template switching, most of the polynucleotides contain sequences from different parental genes (i.e. are novel recombinants). This process is repeated until full-length genes form. It can be followed by an optional gene amplification step.
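The template switching that underlies StEP can be illustrated with a highly simplified Python simulation. The sketch assumes equal-length, pre-aligned templates, chooses the template for each cycle uniformly at random instead of modeling annealing by complementarity, and ignores polymerase errors; the function name `step_recombine` and the parameter `max_extension` are hypothetical.

```python
import random

def step_recombine(templates, max_extension, rng):
    """Grow one strand by short extensions, switching template every cycle.

    Returns the product and, for each position, the index of the template
    that the position was copied from.
    """
    length = len(templates[0])
    if any(len(t) != length for t in templates):
        raise ValueError("this toy model needs aligned, equal-length templates")
    product, origins = [], []
    while len(product) < length:              # one loop pass = one cycle
        idx = rng.randrange(len(templates))   # anneal to a randomly chosen template
        n = rng.randint(1, max_extension)     # abbreviated extension step
        start = len(product)
        segment = templates[idx][start:start + n]
        product.extend(segment)
        origins.extend([idx] * len(segment))
    return "".join(product), origins
```

In this model, shorter extensions per cycle give more opportunities to switch template before the strand reaches full length, and therefore more crossovers per product.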
The different embodiments of the invention provide features and advantages for different applications. In the most preferred embodiment, one or more defined primers or defined primers exhibiting limited randomness which correspond to or flank the 5′ and 3′ ends of the template polynucleotides are used with StEP to generate gene fragments which grow into the novel full-length sequences. This simple method requires no knowledge of the template sequence(s).
In another preferred embodiment, multiple defined primers or defined primers exhibiting limited randomness are used to generate short gene fragments which are reassembled into full-length genes. Using multiple defined primers allows the user to bias in vitro recombination frequency. If sequence information is available, primers can be designed to generate overlapping recombination cassettes which increase the frequency of recombination at particular locations. Among other features, this method introduces the flexibility to take advantage of available structural and functional information as well as information accumulated through previous generations of mutagenesis and selection (or screening).
In addition to recombination, the different embodiments of the primer-based recombination process will generate point mutations. It is desirable to know and be able to control this point mutation rate, which can be done by manipulating the conditions of DNA synthesis and gene reassembly. Using the defined-primer approach, specific point mutations can also be directed to specific positions in the sequence through the use of mutagenic primers.
The various primer-based recombination methods in accordance with this invention have been shown to enhance the activity of Actinoplanes utahensis ECB deacylase over a broad range of pH values and in the presence of organic solvent and to improve the thermostability of Bacillus subtilis subtilisin E. DNA sequencing confirms the role of point mutation and recombination in the generation of novel sequences. These protocols have been found to be both simple and reliable.
The above discussed and many other features and attendant advantages will become better understood by reference to the following detailed description when taken in conjunction with the accompanying drawings.