Safe delivery of transgenes into the human genome remains an open problem of critical importance to clinical genetics. Many existing technologies have major limitations. For instance, retroviruses, lentiviruses, and transposons integrate non-specifically and can therefore cause cancer by insertional mutagenesis1-3. Transgenes can also be integrated using the endogenous homologous recombination pathway, although this process must be stimulated by generating double-strand breaks at the target site using programmable nuclease technologies such as meganucleases4,5, zinc finger nucleases6, TALE nucleases7,8, or the RNA-guided Cas9 protein9,10. This technique is limited by the fact that homologous recombination in humans is less efficient than the competing, mutagenic nonhomologous end joining pathway11,12.
Site-specific recombinases, which catalyze recombination at precise sites, have properties that make them promising candidates for use as safe gene delivery vectors. For instance, many of them require no host-encoded factors for function13. The size of the integrated cassette is also less restricted than with other methods. The sequence specificity of recombinases, i.e. the DNA sequence they are intended to bind, can be altered either by directed evolution or by fusing them to modular DNA-binding domains14-23. Unfortunately, many reprogrammed variants are promiscuous in their activity. This problem is not restricted to artificial variants, as activity at off-target human genomic loci has been reported for some wild-type (WT) recombinases24-29. If recombinases are to be used as gene delivery vectors, it is imperative to identify ways to enhance their accuracy.
One way to improve the accuracy of DNA-binding proteins is to increase the number of specific or decrease the number of non-specific DNA-protein contacts30,31. While powerful, this approach can be inconvenient if the goal is to generate variants of a protein with different specificities: a specificity change would alter the DNA-protein interaction, requiring re-optimization of accuracy. There is therefore a need for ways to systematically enhance accuracy without altering the DNA-protein interface.
Cre catalyzes reversible, directional recombination between two 34 base-pair (bp) sequences named loxP, each of which consists of a pair of 13 bp inverted repeats flanking an 8 bp asymmetric spacer32-35. Mutagenesis studies of loxP have shown that many mutations have non-catastrophic effects on recombination efficiency36-38.
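
The 13-8-13 architecture of loxP described above can be made concrete with a short sketch. The sequence used here is the commonly cited wild-type loxP site, which is not given in the text itself, and the helper names (`reverse_complement`, `split_lox_site`, `has_lox_architecture`) are hypothetical and chosen only for illustration.

```python
# Commonly cited wild-type loxP site (an assumption: the text does not list it),
# written as left repeat + spacer + right repeat.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox_site(site: str) -> tuple[str, str, str]:
    """Split a 34 bp site into (13 bp left repeat, 8 bp spacer, 13 bp right repeat)."""
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def has_lox_architecture(site: str) -> bool:
    """Check for perfect inverted repeats flanking an asymmetric spacer."""
    left, spacer, right = split_lox_site(site)
    repeats_inverted = right == reverse_complement(left)
    # An asymmetric spacer differs from its own reverse complement.
    spacer_asymmetric = spacer != reverse_complement(spacer)
    return repeats_inverted and spacer_asymmetric


if __name__ == "__main__":
    print(split_lox_site(LOXP))
    print(has_lox_architecture(LOXP))
```

Because the spacer is not its own reverse complement, the site as a whole has an orientation, which is what makes the recombination directional. The check deliberately requires perfect inverted repeats, so mutant sites with substitutions in one repeat would fail it even if they still support recombination.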