Various methods and compositions for targeted cleavage of genomic DNA have been described. Such targeted cleavage events can be used, for example, to induce targeted mutagenesis, induce targeted deletions of cellular DNA sequences, and facilitate targeted recombination at a predetermined chromosomal locus. See, for example, U.S. Pat. Nos. 8,623,618; 8,034,598; 8,586,526; 6,534,261; 6,599,692; 6,503,717; 6,689,558; 7,067,317; 7,262,054; 7,888,121; 7,972,854; 7,914,796; 7,951,925; 8,110,379; 8,409,861; U.S. Patent Publications 20030232410; 20050208489; 20050026157; 20060063231; 20080159996; 20100218264; 20120017290; 20110265198; 20130137104; 20130122591; 20130177983; 20130177960 and 20150056705, the disclosures of which are incorporated by reference in their entireties for all purposes.
These methods often involve the use of engineered cleavage systems to induce a double-strand break (DSB) or a nick in a target DNA sequence such that repair of the break by an error-prone process such as non-homologous end joining (NHEJ) or repair using a repair template (homology-directed repair or HDR) can result in the knockout of a gene or the insertion of a sequence of interest (targeted integration). Cleavage can occur through the use of specific nucleases such as engineered zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), or the CRISPR/Cas system with an engineered crRNA/tracrRNA (‘single guide RNA’) to guide specific cleavage. Clinical trials using cells modified with engineered nucleases have demonstrated therapeutic utility (see, e.g., Tebas et al. (2014) New Eng J Med 370(10):901). Targeted cleavage using one of the above-mentioned nuclease systems can be exploited to insert a nucleic acid into a specific target location using either HDR- or NHEJ-mediated processes.
In particular, transcription activator-like effector (TALE) proteins have gained broad appeal as a platform for targeted DNA recognition due in large measure to their simple, code-like rules for design. See, e.g., U.S. Pat. Nos. 8,586,526; 8,697,853; 8,685,737; 8,586,363; 8,470,973; 8,450,471; 8,440,432; 8,440,431; 8,420,782 and U.S. Patent Publication No. 20130196373. These design rules relate the DNA base specified by a single TALE repeat to the identity of residues at two key positions (the repeat variable diresidue or “RVD”), and allow for the design of proteins for new sequence targets via simple modular shuffling of these units. When bound to DNA, TALE proteins identify base sequences via contacts from a central array of TALE repeat units, with each unit specifying one base. See Moscou et al. (2009) Science 326:1501; Boch et al. (2009) Science 326:1509-1512; Deng et al. (2012) Science 335:720-723; Mak et al. (2012) Science 335:716-719. Repeats exhibit little diversity except at the RVD (positions 12 and 13), which recognizes the targeted base. Critically, the base preference of a TALE repeat is substantially determined by the identity of its resident RVD. Natural TALEs typically employ just four RVD sequences (NI, HD, NG, or NN) to recognize target bases of A, C, T or G/A, respectively; these four RVDs are known as canonical RVDs.
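The one-repeat-one-base rule described above can be illustrated with a minimal Python sketch. It encodes only the four canonical RVD assignments stated in the text (NI, HD, NG, NN for A, C, T, G/A); the function names `design_rvds` and `recognized_sequences` and the choice of NN for G are hypothetical, introduced here for illustration, and the sketch does not model any aspect of TALE binding beyond this simple code.

```python
from itertools import product

# Canonical RVD code as stated in the text: NI -> A, HD -> C, NG -> T, NN -> G or A.
CANONICAL_RVD_TO_BASES = {
    "NI": {"A"},
    "HD": {"C"},
    "NG": {"T"},
    "NN": {"G", "A"},
}

# Illustrative assumption: one canonical RVD is chosen per target base (NN for G).
BASE_TO_RVD = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def design_rvds(target):
    """Return one canonical RVD per base of the target DNA sequence."""
    return [BASE_TO_RVD[base] for base in target.upper()]


def recognized_sequences(rvds):
    """List every DNA sequence the RVD array would specify under the simple code."""
    options = [sorted(CANONICAL_RVD_TO_BASES[rvd]) for rvd in rvds]
    return ["".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    rvds = design_rvds("ACGT")
    print(rvds)
    print(recognized_sequences(rvds))
```

Because NN is listed as recognizing both G and A, an array designed for a target containing G specifies more than one sequence under this code, which is one way the simplicity of the canonical rules can leave a base incompletely specified.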
However, a key limitation of these rules is that their very simplicity precludes options for enhancing activity and/or specificity. For instance, TALENs created using the natural code can specify unintended bases in their binding sites and can also cleave non-targeted cellular sequences. See, e.g., Miller et al. (2011) Nat Biotechnol 29:143-148; Hockemeyer et al. (2011) Nat Biotechnol 29:731-734; Tesson et al. (2011) Nat Biotechnol 29:695-696; Mali et al. (2013) Nat Biotechnol 31(9): doi:10.1038/nbt.2675; Juillerat et al. (2014) Nucleic Acids Res; Guilinger et al. (2014) Nat Methods 11(4):429-35; Osborn et al. (2013) Mol Ther 21:1151-1159.
Thus, there remains a need for additional TALE protein compositions and methods, particularly for TALEs that exhibit enhanced specificity and/or activity.