Phytopathogenic bacteria of the genus Xanthomonas cause severe diseases in many important crop plants. The bacteria translocate an arsenal of effectors, including members of the large transcription activator-like (TAL)/AvrBs3-like effector family, into plant cells via the type III secretion system (Kay & Bonas (2009) Curr. Opin. Microbiol. 12:37-43; White & Yang (2009) Plant Physiol. doi:10.1104/pp.1109.139360; Schornack et al. (2006) J. Plant Physiol. 163:256-272). TAL effectors, key virulence factors of Xanthomonas, contain a central domain of tandem repeats, nuclear localization signals (NLSs), and an activation domain (AD), and act as transcription factors in plant cells (Kay et al. (2007) Science 318:648-651; Römer et al. (2007) Science 318:645-648; Gu et al. (2005) Nature 435:1122-1125; FIG. 1a). The type member of this effector family, AvrBs3 from Xanthomonas campestris pv. vesicatoria, contains 17.5 repeats and induces expression of UPA (upregulated by AvrBs3) genes, including the Bs3 resistance gene in pepper plants (Kay et al. (2007) Science 318:648-651; Römer et al. (2007) Science 318:645-648; Marois et al. (2002) Mol. Plant-Microbe Interact. 15:637-646). The number and order of repeats in a TAL effector determine its specific activity (Herbers et al. (1992) Nature 356:172-174). The repeats were shown to be essential for DNA binding by AvrBs3 and constitute a novel DNA-binding domain (Kay et al. (2007) Science 318:648-651). How this domain contacts DNA and what determines its specificity have remained enigmatic.
Selective gene expression is mediated via the interaction of protein transcription factors with specific nucleotide sequences within the regulatory region of the gene. The manner in which DNA-binding protein domains are able to discriminate between different DNA sequences is an important question in understanding crucial processes such as the control of gene expression in differentiation and development.
The ability to specifically design and generate DNA-binding domains that recognize a desired DNA target is highly desirable in biotechnology. Such ability can be useful for the development of custom transcription factors that modulate gene expression upon binding to target DNA. Examples include the extensive work on the design of custom zinc finger DNA-binding proteins specific for a desired target DNA sequence (Choo et al. (1994) Nature 372:645; Pomerantz et al. (1995) Science 267:93-96; Liu et al. (1997) Proc. Natl. Acad. Sci. USA 94:5525-5530; Guan et al. (2002) Proc. Natl. Acad. Sci. USA 99:13296-13301; U.S. Pat. Nos. 7,273,923; 7,220,719). Furthermore, polypeptides containing designer DNA-binding domains can be used to modify the target DNA sequence itself by including a DNA-modifying domain, such as a nuclease catalytic domain, within the polypeptide. Such examples include the DNA-binding domain of a meganuclease/homing endonuclease DNA recognition site in combination with a non-specific nuclease domain (see US Pat. Appl. 2007/0141038), modified meganuclease DNA recognition site and/or nuclease domains from the same or different meganucleases (see U.S. Pat. App. Pub. 20090271881), and zinc finger domains in combination with a domain with nuclease activity, typically from a type IIS restriction endonuclease such as FokI (Bibikova et al. (2003) Science 300:764; Urnov et al. (2005) Nature 435:646; Shukla et al. (2009) Nature 459:437-441; Townsend et al. (2009) Nature 459:442-445; Kim et al. (1996) Proc. Natl. Acad. Sci. USA 93:1156-1160; U.S. Pat. No. 7,163,824). The current methods for identifying custom zinc finger DNA-binding domains rely on combinatorial selection from large randomized libraries (typically >10^8 in size) to generate multi-finger domains with the desired DNA specificity (Greisman & Pabo (1997) Science 275:657-661; Hurt et al. (2003) Proc. Natl. Acad. Sci. USA 100:12271-12276; Isalan et al. (2001) Nat. Biotechnol. 19:656-660). Such methods are time-intensive, technically demanding and potentially quite costly. The identification of a simple recognition code for the engineering of DNA-binding polypeptides would represent a significant advancement over the current methods for designing DNA-binding domains that recognize a desired nucleotide target.