Manipulation of nucleic acids and proteins is an important aspect of modern molecular biology. In particular, the science of combinatorial genetics has advanced in recent years as it has become apparent that proteins having altered structure and function can be engineered by swapping large or small portions of the amino acid sequence with other related or unrelated amino acid sequences. Using this approach, it is also possible to engineer novel proteins that bring together in a single molecule the structures and functions of diverse molecules. Such manipulations are most readily undertaken at the nucleic acid level. The nucleic acids thus produced can be transcribed to produce fusion RNAs and translated either in vitro or in vivo using known methods and the recombinant proteins thus produced can be isolated. Other manipulations, such as inserting polynucleotides of interest into a chromosome, deleting sections of a chromosome, or cloning sections of a chromosome are also of interest. Moreover, various approaches are known for shuffling gene pieces and selecting or screening for products having one or more desired activities or properties. Examples of such technologies include U.S. Pat. Nos. 5,605,793, 5,830,721, and 6,132,970. A number of companies including Maxygen, Diversa, Applied Molecular Evolution, and Genencor International among others specialize in directed molecular evolution. See Pollack, A., “Selling Evolution in Ways Darwin Never Imagined,” New York Times, page B1, Oct. 28, 2000.
Recently, non-transposition-based methods for generating large libraries of randomly fused genes have been reported. Ostermeier and co-workers first described a method termed ITCHY that involves the incremental truncation of two genes by use of ExoIII nuclease followed by S1 nuclease treatment, polymerization, and ligation of fragments to form random fusions (Ostermeier et al., 1999; Lutz et al., 2000). A second method has also been reported that utilizes random cleavage of DNA followed by a series of digestion and ligation reactions to create random fusions. This technique, called SHIPREC, also has features that increase the number of useful fusions if both proteins have similar length and domain organization (Sieber et al., 2001). Existing methods for gene shuffling and chromosome manipulation are complex and can exhibit a bias for or against particular recombination sites and sequences. Thus, the art continues to develop more sophisticated manipulations.
Various aspects of an in vitro transposition system that employs sequences from, and sequences derived from, the Tn5 transposon are described in U.S. Pat. Nos. 5,925,545, 5,948,622, and 5,965,443, each of which is incorporated by reference herein as if set forth in its entirety. International Publication No. WO 00/17343, also incorporated by reference herein as if set forth in its entirety, discloses a system for introducing into cells synaptic complexes that comprise a Tn5 transposase and a polynucleotide having flanking sequences that operably interact with the transposase to form a synaptic complex.
Even though these known systems for Tn5-based in vitro transposition are effective and very useful, they do not provide sufficient manipulative control to meet the technological goals noted above.
Efforts are also underway to define so-called minimal bacterial genomes for growth under defined conditions and, similarly, to identify genes essential for growth under defined conditions. Determining the content required for a minimal bacterial genome is of intense interest. One approach is to assemble the theoretical minimal genome in silico by comparing a variety of different microbial genomes. Alternatively, the smallest genome amongst existing genomes (mycoplasma) can be analyzed by mutagenesis. E. coli K12 is a preferred bacterium because of its simplicity in handling and its short generation time. It is desirable to try to generate a minimal or significantly reduced E. coli K12 genome, which may shorten the already short doubling time in rich media.
Recently developed transposon-based approaches involve inserting a transposon into a gene to (1) knock out or disrupt a gene function or (2) introduce into an essential gene a lethal mutation, which cannot be observed because the affected cell does not survive. These methods essentially catalogue transposition into non-essential genes. It is assumed that any gene that contains no transposon insert is essential.
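The inference described above can be illustrated with a minimal sketch. The gene names, coordinates, and insert positions below are hypothetical and chosen only for illustration, and the function name is not drawn from any published tool. The sketch treats any gene with no mapped insert as presumptively essential; it does not distinguish a truly essential gene from one that was simply missed by an undersampled library.

```python
from bisect import bisect_left
from typing import Dict, Iterable, List, Tuple


def presumed_essential(
    genes: Dict[str, Tuple[int, int]],
    insert_positions: Iterable[int],
) -> List[str]:
    """Return the genes that contain no mapped transposon insert.

    genes maps a gene name to inclusive (start, end) coordinates.
    insert_positions holds the mapped coordinate of each recovered insert.
    """
    positions = sorted(insert_positions)
    candidates = []
    for name, (start, end) in genes.items():
        # Locate the first insert at or after the gene start.
        i = bisect_left(positions, start)
        has_insert = i < len(positions) and positions[i] <= end
        if not has_insert:
            candidates.append(name)
    return candidates


# Hypothetical data: three genes and four mapped inserts.
example_genes = {
    "geneA": (100, 500),
    "geneB": (600, 900),
    "geneC": (1000, 1400),
}
example_inserts = [150, 320, 950, 1200]

result = presumed_essential(example_genes, example_inserts)
print(result)
```

With these hypothetical inputs, geneA and geneC each contain at least one insert and are catalogued as non-essential, while geneB contains none and is therefore only presumed essential.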
An important alternative approach involves affirmatively identifying essential genes in libraries of cells, where the cells contain transposons having selectively regulated outwardly-facing promoters inserted upstream from an essential gene. While the cells of interest are not viable on media that cannot activate a transposon promoter, expression is restored by selectively activating either or both of the transposon promoters. Unfortunately, only a few of the transposons in a library will insert into the promoter region of an essential gene.