Genome engineering requires the consolidation of many diverse concepts (Silva, Poirot et al. 2011), the most fundamental being the need to specifically and efficiently target a DNA sequence within a complex genome. Re-engineering a DNA binding protein for this purpose has been mainly limited to a few semi-modular archetypes (Pingoud and Wende 2011) such as artificial zinc-finger proteins (ZFP), the naturally occurring LAGLIDADG homing endonucleases (LHE), and the chimeric Transcription Activator-Like Effector nucleases (TALEN).
Meganucleases, also called homing endonucleases (HEs), can be divided into five families based on sequence and structure motifs: LAGLIDADG, GIY-YIG, HNH, His-Cys box and PD-(D/E)XK (Stoddard 2005; Zhao, Bonocora et al. 2007). The best-studied family is that of the LAGLIDADG proteins, with a considerable body of biochemical, genetic and structural work having established that these endonucleases can be used as molecular tools (Stoddard, Monnat et al. 2007; Arnould, Delenda et al. 2011). Although numerous engineering efforts have focused on LAGLIDADG HEs, members from two other families, GIY-YIG and HNH, are of particular interest. Biochemical and structural studies have established that in both families, member proteins can adopt a bipartite fold with distinct functional domains: (1) a catalytic domain responsible mainly for DNA cleavage, and (2) a DNA-binding domain that provides target specificity (Stoddard 2005; Marcaida, Munoz et al. 2010).
Zinc-finger nucleases (ZFNs), generated by fusing Zinc-finger-based DNA-binding domains to an independent catalytic domain (Kim, Cha et al. 1996; Smith, Berg et al. 1999; Smith, Bibikova et al. 2000), represent another type of engineered nuclease commonly used to stimulate gene targeting and have been successfully used to induce gene correction, gene insertion, and gene deletion. The archetypal ZFNs are based on the catalytic domain of the Type IIS restriction enzyme FokI and Zinc Finger-based DNA binding domains made of strings of 3 or 4 individual Zinc Fingers, each recognizing a DNA triplet (Pabo, Peisach et al. 2001). Two Zinc Finger-FokI monomers have to bind to their respective Zinc Finger DNA-recognition sites on opposite strands in an inverted orientation in order to form a catalytically active dimer that catalyzes double-strand cleavage (Bitinaite, Wah et al. 1998).
Recently, a new class of chimeric nuclease using a FokI catalytic domain has been described (Christian, Cermak et al. 2010; Li, Huang et al. 2011). The DNA binding domain of these nucleases is derived from Transcription Activator-Like Effectors (TALE), a family of proteins used in the infection process by plant pathogens of the Xanthomonas genus. In these DNA binding domains, sequence specificity is driven by a series of repeats of 33-35 amino acids, differing essentially at two positions. Each base pair in the DNA target is contacted by a single repeat, with the specificity resulting from the two variant amino acids of the repeat (the so-called repeat variable dipeptide, RVD). The apparent modularity of these DNA binding domains has been confirmed to a certain extent by modular assembly of designed TALE-derived proteins with new specificities (Boch, Scholze et al. 2009; Moscou and Bogdanove 2009). As with ZFNs, DNA cleavage by a TALEN requires two DNA recognition regions flanking a nonspecific central region.
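The one-repeat-per-base-pair relationship described above can be illustrated with a minimal Python sketch. The RVD-to-base associations used here (NI for A, HD for C, NG for T, NN for G) are the commonly reported ones from the cited TALE literature and are not stated in this text; they are treated as an assumption, and the names `RVD_TO_BASE` and `predict_target` are hypothetical.

```python
# Assumed RVD-to-base associations (commonly reported in the TALE
# literature; not taken from this document). NN is also reported to
# recognize A; only its association with G is used in this sketch.
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",
}

def predict_target(rvds):
    """Return the DNA sequence predicted for an ordered list of RVDs.

    One repeat (one RVD) corresponds to one base of the target.
    Raises KeyError for an RVD that is not in the assumed table.
    """
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)

if __name__ == "__main__":
    print(predict_target(["HD", "NI", "NG", "NN"]))
```

Under these assumptions, a four-repeat array maps to a four-base target, one base per repeat; real designs involve further constraints that this sketch does not model.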
One notable constraint imposed by the FokI nuclease domain is that it must function as a dimer to cleave DNA efficiently. For any given DNA target, this necessitates the design of two distinct ZFNs or two TALENs, such that each pair of zinc finger or TAL effector domains is oriented for FokI dimerization and DNA cleavage (Kleinstiver, Wolfs et al. 2012).
To overcome these drawbacks, the inventors and others have recently developed new types of monomeric chimeric endonucleases, in which a DNA binding domain such as a Zinc Finger, a Homing Endonuclease (Kleinstiver, Wolfs et al. 2012) or a TALE (International PCT application WO2012/138927) is fused to a monomeric catalytic domain.
In considering design possibilities for the monomeric chimeric endonuclease, the inventors reasoned that a low affinity cleavage domain that retained some sequence specificity would alleviate accidental off-site cleavage events resulting from DNA proximity during target-site scanning by the DNA binding domain. The inventors chose a homing endonuclease member of the GIY-YIG protein family, I-TevI (Mueller, Smith et al. 1995; Edgell, Stanger et al. 2004). In contrast to FokI, the I-TevI endonuclease does not require dimerization for DNA processing activity, thereby alleviating the need for “dual” target sites with intervening DNA “spacers” as for current TAL-nucleases and Zinc-finger nucleases.
I-TevI exhibits a tripartite protein layout wherein an N-terminal catalytic domain is tethered by a long, flexible linker to a minimal C-terminal DNA binding domain. In the protein-DNA interaction, the C-terminal domain is responsible for binding specificity as well as the majority of the complex affinity. However, the N-terminal I-TevI catalytic domain has been described as having its own DNA cleavage selectivity (Dean, Stanger et al. 2002), which interferes with the overall specificity of the chimeric endonuclease. This cleavage specificity reduces the number of possible nucleic acid sequences that can be targeted by the chimeric endonucleases.
The I-TevI catalytic domain has been characterized biochemically in vitro as being specific to the CAACGC natural target sequence and, to a certain extent, to sequences defined by the degenerate CN↑NN↓G motif, where arrows represent bottom (↑) and top (↓) strand cleavage. This general motif theoretically increases the number of potential cleavage sites.
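To make the degenerate motif concrete, the following minimal Python sketch lists every CNNNG occurrence in a sequence together with the nick positions implied by the arrows in CN↑NN↓G. The function name `find_motif_sites` and the coordinate convention (0-based offsets on the top strand) are assumptions made for illustration. A match only identifies a candidate site and says nothing about how efficiently it would be cleaved.

```python
import re

# Degenerate motif from the text: C, any three bases, then G.
# A lookahead is used so that overlapping occurrences are all reported.
# Note: the reverse complement of CNNNG is again CNNNG, so every hit also
# matches on the opposite strand; this sketch reads hits in the top-strand
# orientation only.
MOTIF = re.compile(r"(?=(C[ACGT]{3}G))")

def find_motif_sites(seq):
    """Return a list of (start, site, bottom_cut, top_cut) tuples.

    start      : 0-based index of the motif's first base on the top strand
    site       : the five matched bases
    bottom_cut : bottom-strand nick, between motif bases 2 and 3 (CN^NNG)
    top_cut    : top-strand nick, between motif bases 4 and 5 (CNNN^G)
    Both cut values are top-strand offsets: a value k means the nick lies
    between positions k-1 and k.
    """
    top = seq.upper()
    sites = []
    for match in MOTIF.finditer(top):
        start = match.start()
        sites.append((start, match.group(1), start + 2, start + 4))
    return sites

if __name__ == "__main__":
    # Hypothetical example sequence containing the natural CAACGC target.
    for site in find_motif_sites("TTCAACGCTT"):
        print(site)
```

The two nicks are offset by two bases, consistent with the arrow positions in the motif as written.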
However, the inventors have observed dramatic variation in the efficacy of targeted gene disruption by TevI chimeric endonucleases using the above motif. Targeting a sequence merely because it corresponds to this motif often results in poor cleavage activity.