Thousands of different protein species constitute a major molecular component of cellular life. These molecules are composed of amino acid chains, the sequences of which are encoded by the genes in the organism's DNA. Protein functions are diverse, and specific functions have evolved to meet different cellular demands. Native wild-type protein molecules can be studied for their function biochemically and genetically. The data thus obtained can be informative, but very often such information is relatively limited. A better description of protein function can be gained through mutational analysis, in which various types of mutations are introduced into the protein primary sequence and the mutated proteins are then analyzed for their function. With current recombinant DNA technology (Sambrook et al. 1989, Sambrook and Russell 2001), generation of mutations is relatively easy, and mutational analysis has therefore become a standard approach in functional studies of proteins.
In principle, three different types of mutations can be introduced into a protein sequence: (i) substitutions, (ii) insertions, and (iii) deletions. In a substitution mutation, a particular amino acid (or an amino acid stretch) in a protein is changed to another (or to another amino acid stretch of the same length). In an insertion mutation, an amino acid or a stretch of amino acids is added to the protein, thus increasing the length of the amino acid chain. In a deletion mutation, an amino acid or a stretch of amino acids is eliminated from the protein sequence, and the protein thus becomes smaller in size.
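By way of illustration only, the three mutation types can be sketched as operations on a one-letter amino acid string. The function names, the 0-based position convention, and the example sequence below are hypothetical and are not taken from any of the cited methods.

```python
def substitute(seq: str, pos: int, new: str) -> str:
    """Replace len(new) residues starting at 0-based pos; chain length is unchanged."""
    if pos < 0 or pos + len(new) > len(seq):
        raise ValueError("substitution falls outside the sequence")
    return seq[:pos] + new + seq[pos + len(new):]


def insert(seq: str, pos: int, new: str) -> str:
    """Add the residues in new before 0-based pos; the chain becomes longer."""
    if pos < 0 or pos > len(seq):
        raise ValueError("insertion point falls outside the sequence")
    return seq[:pos] + new + seq[pos:]


def delete(seq: str, pos: int, count: int) -> str:
    """Remove count residues starting at 0-based pos; the chain becomes shorter."""
    if pos < 0 or count < 0 or pos + count > len(seq):
        raise ValueError("deletion falls outside the sequence")
    return seq[:pos] + seq[pos + count:]


# Hypothetical 10-residue wild-type sequence, used only for illustration.
wild_type = "MKTAYIAKQR"

print(substitute(wild_type, 3, "G"))   # same length as the wild type
print(insert(wild_type, 3, "GG"))      # two residues longer
print(delete(wild_type, 3, 2))         # two residues shorter
```

The sketch only tracks the resulting amino acid sequence; it says nothing about how such changes are produced at the DNA level.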
Various mutagenesis methods are currently available for the generation of different types of mutations. These methods are typically straightforward to use. However, in most cases the desired mutations are generated one by one, and their construction is therefore time-consuming and labor-intensive. It would be desirable to generate a number of mutations simultaneously. For certain types of insertion mutations, such an approach has been described (Hayes and Hallet 2000). However, an efficient method for the simultaneous generation of substitution and deletion mutations is still lacking.
One of the in vitro transposition systems we utilised for the present invention was a bacteriophage Mu-derived transposition system that has recently been introduced (Haapa et al. 1999a) and shown to function efficiently in many types of molecular biology applications (Wei et al. 1997, Taira et al. 1999, Haapa et al. 1999a, b, Vilen et al. 2001). Mu transposition proceeds within the context of protein-DNA complexes that are called DNA transposition complexes or transpososomes (Mizuuchi 1991, Savilahti et al. 1995). These complexes are assembled from a tetramer of MuA transposase protein and Mu-transposon-derived DNA end segments (i.e. transposon end sequences recognised by MuA) containing MuA binding sites. Once the complexes are formed, they can react in a divalent metal ion-dependent manner with any target DNA and splice the Mu end segments into the target (Savilahti et al. 1995). In the simplest case, the MuA transposase protein and a short 50 bp Mu right-end (R-end) fragment are the only macromolecular components required for transpososome assembly (Savilahti et al. 1995, Savilahti and Mizuuchi 1996). Analogously, when two R-end sequences are located as inverted terminal repeats in a longer DNA molecule, transposition complexes form by synapsing the transposon ends. Target DNA in Mu DNA transposition in vitro can be linear, open circular, or supercoiled (Haapa et al. 1999a).
The Mu transposition complex, the machinery within which the chemical steps of transposition take place, is initially assembled from four molecules of MuA transposase protein that first bind specific binding sites in the transposon ends (FIGS. 5A and 5B). The 50 bp Mu right-end DNA segment contains two of these binding sites (called R1 and R2, each 22 bp long; Savilahti et al. 1995). When two ends, each bound by two MuA monomers, meet, the transposition complex is formed through conformational changes, the nature of which is not fully understood because of a lack of atomic-resolution structural data on Mu transpososomes. However, the assembly of the minimal Mu transpososome is clearly dependent upon the correct binding of MuA transposase to the Mu ends of the donor DNA. Thus, modifications in the conserved nucleotide sequence of the transposon ends (e.g. the R1 and R2 sequences in the Mu R-end) should potentially have a negative effect on the efficiency of transposition, since every altered nucleotide conceivably interferes with MuA binding. It has been documented (Lee and Harshey 2001, Coros and Chaconas 2001) that the last two base pairs in the Mu transposon end can be modified without a severe effect on transpososome function. However, no detailed analysis has been conducted to elucidate the effects of modified R1 and R2 binding sites. In one example (Laurent et al. 2000), a NotI restriction site was engineered close to the transposon end, which changed one base pair in the R1 sequence. In vivo studies indicate that mutations within the R1 and R2 sequences generally have negative effects on transposition efficiency (Groenen et al. 1985, 1986). In addition, these effects are typically additive.