Transcription activator-like (TAL) effectors represent a class of DNA-binding proteins secreted by plant-pathogenic bacteria of genera such as Xanthomonas and Ralstonia via their type III secretion system upon infection of plant cells. Natural TAL effectors have been shown to bind specifically to plant promoter sequences, thereby modulating gene expression and activating effector-specific host genes to facilitate bacterial propagation (Römer, P., et al., Plant pathogen recognition mediated by promoter activation of the pepper Bs3 resistance gene. Science 318, 645-648 (2007); Boch, J. & Bonas, U. Xanthomonas AvrBs3 family-type III effectors: discovery and function. Annu. Rev. Phytopathol. 48, 419-436 (2010); Kay, S., et al. A bacterial effector acts as a plant transcription factor and induces a cell size regulator. Science 318, 648-651 (2007); Kay, S. & Bonas, U. How Xanthomonas type III effectors manipulate the host plant. Curr. Opin. Microbiol. 12, 37-43 (2009)). Natural TAL effectors are generally characterized by a central repeat domain, a carboxyl-terminal nuclear localization signal (NLS) sequence, and a transcriptional activation domain (AD). The central repeat domain typically consists of a variable number of amino acid repeats, between 1.5 and 33.5, that are usually 33-35 residues in length, except for a generally shorter carboxyl-terminal repeat referred to as the half-repeat. The repeats are mostly identical but differ in certain hypervariable residues. DNA recognition specificity of TAL effectors is mediated by hypervariable residues, typically at positions 12 and 13 of each repeat, the so-called repeat variable diresidue (RVD), wherein each RVD targets a specific nucleotide in a given DNA sequence. Thus, the sequential order of repeats in a TAL protein tends to correlate with a defined linear order of nucleotides in a given DNA sequence.
The underlying RVD code of some naturally occurring TAL effectors has been identified, allowing prediction of the sequential repeat order required to bind to a given DNA sequence (Boch, J. et al. Breaking the code of DNA binding specificity of TAL-type III effectors. Science 326, 1509-1512 (2009); Moscou, M. J. & Bogdanove, A. J. A simple cipher governs DNA recognition by TAL effectors. Science 326, 1501 (2009)). Further, TAL effectors generated with new repeat combinations have been shown to bind to target sequences predicted by this code. It has also been shown that the target DNA sequence generally starts with a 5′ thymine base in order to be recognized by the TAL protein.
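The prediction step described above can be illustrated with a minimal sketch. It assumes the commonly reported RVD-to-nucleotide associations (NI for A, HD for C, NN for G, NG for T), which are not listed in this text, and it assumes that the leading 5′ thymine precedes the repeat-bound sequence and is not itself assigned a repeat. The names `RVD_FOR_BASE` and `design_rvds` are hypothetical and serve only as an illustration.

```python
from typing import List

# Assumed one-to-one RVD code (illustrative; natural RVDs can be less specific,
# for example NN is also reported to tolerate A).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvds(target: str, require_5prime_t: bool = True) -> List[str]:
    """Return the RVD order predicted to bind `target` (read 5' to 3').

    If `require_5prime_t` is True, the target must start with T; that T is
    treated as the position preceding the repeats and gets no RVD (assumption).
    """
    seq = target.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("target must be a non-empty string of A, C, G, T")
    if require_5prime_t:
        if seq[0] != "T":
            raise ValueError("target is expected to start with a 5' thymine")
        seq = seq[1:]  # drop the leading T before assigning repeats
    # One repeat (one RVD) per remaining nucleotide, in the same linear order.
    return [RVD_FOR_BASE[base] for base in seq]


if __name__ == "__main__":
    print("-".join(design_rvds("TACGT")))
```

Under these assumptions, the order of the returned RVDs mirrors the order of nucleotides in the target, which is the correlation between repeat order and DNA sequence described above.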
The modular structure of TALs allows for combination of the DNA binding domain with effector molecules such as nucleases. In particular, TAL effector nucleases allow for the development of new genome engineering tools.
Zinc-finger nucleases (ZFNs) and meganucleases are examples of other genome engineering tools. ZFNs are chimeric proteins consisting of a zinc-finger DNA-binding domain and a nuclease domain. One example of a nuclease domain is the non-specific cleavage domain from the type IIS restriction endonuclease FokI (Kim, Y. G., Cha, J., Chandrasegaran, S. Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. USA. 1996 Feb. 6; 93(3):1156-60), typically separated by a linker sequence of 5-7 bp. A pair of FokI cleavage domains is generally required to allow for dimerization of the domain and cleavage of a non-palindromic target sequence from opposite strands. The DNA-binding domains of individual Cys2His2 ZFNs typically contain between 3 and 6 individual zinc-finger repeats and can each recognize between 9 and 18 base pairs (roughly 3 base pairs per finger).
One problem associated with ZFNs is the possibility of off-target cleavage, which may lead to random integration of donor DNA, chromosomal rearrangements, or even cell death, and which still raises concerns about their applicability in higher organisms (Zinc-finger Nuclease-induced Gene Repair With Oligodeoxynucleotides: Wanted and Unwanted Target Locus Modifications Molecular Therapy vol. 18 no. 4, 743-753 (2010)).
Another group of genome engineering proteins are sequence-specific, rare-cutting endonucleases with recognition sites exceeding 12 bp, the so-called meganucleases or homing endonucleases. The large DNA recognition sites of 12 to 40 base pairs usually occur only once in a given genome, and meganucleases (such as, e.g., I-SceI) are therefore considered the most specific restriction enzymes in nature and have been used to modify a wide variety of plant and animal genomes. One example of a meganuclease is PI-SceI, which belongs to the LAGLIDADG (SEQ ID NO: 233) family of homing endonucleases. However, the repertoire of naturally occurring meganucleases is limited, which decreases the probability of finding a specific enzyme for a defined genomic target sequence. Meganucleases are therefore engineered to modify their recognition sequence. To develop tailored meganucleases with new recognition sites, two main approaches have been adopted: random mutagenesis of residues in the binding domain with subsequent selection of functional variants, or fusion of other enzyme domains to meganuclease half-sites to create chimeric meganucleases.
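The relationship between recognition-site length and how rarely a site occurs in a genome can be illustrated with a rough calculation. The sketch below assumes a random sequence with equal base frequencies, exact matching of a single fixed site, and an illustrative genome size of 3.2 billion base pairs. Real genomes deviate from all of these assumptions, and the function name `expected_site_count` is hypothetical.

```python
def expected_site_count(genome_size_bp: int, site_length_bp: int,
                        both_strands: bool = True) -> float:
    """Expected number of exact matches of one fixed site in a random genome.

    Assumes independent, equally likely bases (A, C, G, T), so any given
    window matches with probability 1 / 4**site_length_bp.
    """
    if genome_size_bp < 0 or site_length_bp <= 0:
        raise ValueError("genome size must be >= 0 and site length must be > 0")
    # Number of start positions at which the site could fit on one strand.
    positions = max(genome_size_bp - site_length_bp + 1, 0)
    per_strand = positions / 4 ** site_length_bp
    # A non-palindromic site can also occur on the reverse strand.
    return per_strand * (2 if both_strands else 1)


if __name__ == "__main__":
    genome = 3_200_000_000  # illustrative genome size (assumption)
    for length in (6, 12, 18, 24):
        print(length, expected_site_count(genome, length))
```

Under these simplifying assumptions, a 6 bp site is expected to occur many times in such a genome, whereas for sites of roughly 18 bp or longer the expected count falls well below one, consistent with long recognition sites tending to be unique.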
There is a need to improve these tools to (1) make them more flexible and reliable, (2) develop new means to predict and rationally design new binders, (3) tailor and modify effector activities and (4) efficiently assemble, test and deliver the engineered molecules.