The use of clustered regularly interspaced short palindromic repeats (CRISPR) and associated Cas proteins (CRISPR-Cas system) for site-specific DNA cleavage has shown great potential for a number of biological applications. CRISPR is used for genome editing; the genome-scale-specific targeting of transcriptional repressors (CRISPRi) and activators (CRISPRa) to endogenous genes; and other applications of RNA-directed DNA targeting with Cas enzymes.
CRISPR-Cas systems are native to bacteria and Archaea and provide adaptive immunity against viruses and plasmids. Three classes of CRISPR-Cas systems could potentially be adapted for research and therapeutic reagents. Type-II CRISPR systems have a desirable characteristic in utilizing a single CRISPR associated (Cas) nuclease (specifically Cas9) in a complex with the appropriate guide RNAs (gRNAs). In bacteria or Archaea, Cas9 guide RNAs comprise 2 separate RNA species. A target-specific CRISPR-activating RNA (crRNA) directs the Cas9/gRNA complex to bind and target a specific DNA sequence. The crRNA has 2 functional domains, a 5′-domain that is target specific and a 3′-domain that directs binding of the crRNA to the transactivating crRNA (tracrRNA). The tracrRNA is a longer, universal RNA that binds the crRNA and mediates binding of the gRNA complex to Cas9. Binding of the tracrRNA induces an alteration of Cas9 structure, shifting from an inactive to an active conformation. The gRNA function can also be provided as an artificial single guide RNA (sgRNA), where the crRNA and tracrRNA are fused into a single species (see Jinek, M., et al., Science 337 p 816-21, 2012). The sgRNA format permits transcription of a functional gRNA from a single transcription unit that can be provided by a double-stranded DNA (dsDNA) cassette containing a transcription promoter and the sgRNA sequence. In mammalian systems, these RNAs have been introduced by transfection of DNA cassettes containing RNA Pol III promoters (such as U6 or H1) driving RNA transcription, viral vectors, and single-stranded RNA following in vitro transcription (see Xu, T., et al., Appl Environ Microbiol, 2014. 80(5): p. 1544-52).
In the CRISPR-Cas system, using the system present in Streptococcus pyogenes as an example (S.py. or Spy), native crRNAs are about 42 bases long and contain a 5′-region of about 20 bases in length that is complementary to a target sequence (also referred to as a protospacer sequence or protospacer domain of the crRNA) and a 3′ region typically of about 22 bases in length that is complementary to a region of the tracrRNA sequence and mediates binding of the crRNA to the tracrRNA. A crRNA:tracrRNA complex comprises a functional gRNA capable of directing Cas9 cleavage of a complementary target DNA. The native tracrRNAs are about 85-90 bases long and have a 5′-region containing the region complementary to the crRNA. The remaining 3′ region of the tracrRNA includes secondary structure motifs (herein referred to as the “tracrRNA 3′-tail”) that mediate binding of the crRNA:tracrRNA complex to Cas9.
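The two-domain layout of the native crRNA described above can be illustrated with a minimal sketch. The function name `split_crrna`, the default length, and the example sequence are all assumptions made for illustration; the 20-base protospacer portion in particular is an arbitrary placeholder and not a real target sequence.

```python
from typing import Tuple


def split_crrna(crrna: str, protospacer_len: int = 20) -> Tuple[str, str]:
    """Split a crRNA (written 5'->3') into its two functional domains.

    Returns (protospacer_domain, tracrRNA_binding_domain). The default of
    20 bases follows the approximate length given in the text; real crRNAs
    may differ, so the length is a parameter rather than a constant.
    """
    if len(crrna) <= protospacer_len:
        raise ValueError("crRNA is too short to contain both domains")
    return crrna[:protospacer_len], crrna[protospacer_len:]


# Hypothetical 42-base crRNA: an arbitrary 20-base placeholder protospacer
# followed by an illustrative 22-base tracrRNA-binding region.
example_crrna = "ACGUACGUACGUACGUACGU" + "GUUUUAGAGCUAUGCUGUUUUG"

protospacer_domain, tracr_binding_domain = split_crrna(example_crrna)
print(len(protospacer_domain), len(tracr_binding_domain))
```

Keeping the protospacer length as a parameter reflects the "about 20 bases" wording in the text rather than fixing a single value.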
Jinek et al. extensively investigated the physical domains of the crRNA and tracrRNA that are required for proper functioning of the CRISPR-Cas system (Science, 2012. 337(6096): p. 816-21). They devised a truncated crRNA:tracrRNA fragment that could still function in CRISPR-Cas wherein the crRNA was the wild-type 42 nucleotides and the tracrRNA was truncated to 75 nucleotides. They also developed an embodiment wherein the crRNA and tracrRNA are attached with a linker loop, forming a single guide RNA (sgRNA), which varies between 99 and 123 nucleotides in length in different embodiments.
At least three groups have elucidated the crystal structure of Streptococcus pyogenes Cas9 (SpyCas9). In Jinek, M., et al., the structure did not show the nuclease in complex with either a guide RNA or target DNA. They carried out molecular modeling experiments to predict the interactions of the protein in complex with RNA and DNA (Science, 2014. 343, p. 1215, DOI: 10.1126/science.1247997).
In Nishimasu, H., et al., the crystal structure of SpyCas9 is shown in complex with sgRNA and its target DNA at 2.5 angstrom resolution (Cell, 2014. 156(5): p. 935-49, incorporated herein in its entirety). The crystal structure identified two lobes of the Cas9 enzyme: a recognition lobe (REC) and a nuclease lobe (NUC). The sgRNA:target DNA heteroduplex (negatively charged) sits in the positively charged groove between the two lobes. The REC lobe, which shows no structural similarity with known proteins and is therefore likely a Cas9-specific functional domain, interacts with the portions of the crRNA and tracrRNA that are complementary to each other.
Another group, Briner et al. (Mol Cell, 2014. 56(2): p. 333-9, incorporated herein in its entirety), identified and characterized the six conserved modules within native crRNA:tracrRNA duplexes and sgRNA. Anders et al. (Nature, 2014, 513(7519) p. 569-73) elucidated the structural basis for DNA sequence recognition of protospacer adjacent motif (PAM) sequences by Cas9 in association with an sgRNA guide.
The CRISPR-Cas endonuclease system is utilized in genomic engineering as follows. The gRNA complex (either a crRNA:tracrRNA complex or an sgRNA) binds to Cas9, inducing a conformational change that activates Cas9 and opens the DNA binding cleft. The protospacer domain of the crRNA (or sgRNA) aligns with the complementary target DNA and Cas9 binds the PAM sequence, initiating unwinding of the target DNA followed by annealing of the protospacer domain to the target, after which cleavage of the target DNA occurs. Cas9 contains two domains, homologous to the endonucleases HNH and RuvC respectively, wherein the HNH domain cleaves the DNA strand complementary to the crRNA and the RuvC-like domain cleaves the non-complementary strand. This results in a double-stranded break in the genomic DNA. When repaired by non-homologous end joining (NHEJ), the break is typically repaired in an imprecise fashion, resulting in the DNA sequence being shifted by 1 or more bases, leading to disruption of the natural DNA sequence and, in many cases, to a frameshift mutation if the event occurs in a coding exon of a protein-encoding gene. The break may also be repaired by homology directed recombination (HDR), which permits insertion of new genetic material into the cut site created by Cas9 cleavage, based upon exogenous DNA introduced into the cell with the Cas9/gRNA complex.
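The recognition and repair steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, and it rests on assumptions that are not stated in the text: that the PAM for S. pyogenes Cas9 is NGG located immediately 3′ of the protospacer-matching sequence, and that the double-stranded break falls 3 bases upstream of the PAM. The function names `find_cut_sites` and `causes_frameshift` are hypothetical, and only the strand supplied is searched.

```python
from typing import List


def find_cut_sites(dna: str, protospacer: str) -> List[int]:
    """Return 0-based cut positions for perfect protospacer matches followed by NGG.

    `dna` is read 5'->3' on the strand that carries the protospacer-matching
    sequence. A returned value p means the break falls between dna[p-1] and
    dna[p] (assumed to be 3 bases upstream of the PAM).
    """
    dna = dna.upper()
    protospacer = protospacer.upper().replace("U", "T")  # accept RNA or DNA input
    n = len(protospacer)
    sites = []
    for i in range(len(dna) - n - 3 + 1):
        if dna[i:i + n] != protospacer:
            continue
        pam = dna[i + n:i + n + 3]
        if pam[1:] == "GG":  # NGG: any base followed by two guanines
            sites.append(i + n - 3)
    return sites


def causes_frameshift(indel_length: int) -> bool:
    """An insertion or deletion in a coding exon shifts the reading frame
    when its length is not a multiple of 3."""
    return indel_length % 3 != 0


# Placeholder sequences, not real genomic targets.
guide = "ACGUACGUACGUACGUACGU"
target = "TTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "CCC"
print(find_cut_sites(target, guide))
print(causes_frameshift(1), causes_frameshift(3))
```

A complete search would also scan the reverse complement, since a target site can lie on either strand; that step is omitted here to keep the sketch short.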
The wild-type (WT) Cas9 protein cleaves most DNA targets with high efficiency but exhibits a sufficient level of unwanted off-target editing to complicate research applications and to raise serious concerns for medical applications. In this context, off-target cleavage is defined as a DNA cleavage event that occurs at a site where the genomic DNA target site differs from perfect complementarity to the protospacer domain of the crRNA or sgRNA. Introducing cleavage events at non-targeted sites through such off-target cleavage paths is undesirable. Typically, cleavage is only desired at sites in the genome that have perfect complementarity to the gRNA. Several groups have published novel mutant Cas9 enzymes that show reduced off-target cleavage activity (see: Slaymaker et al., Science, 2016, 351:84-88; Kleinstiver et al., Nature, 2016, 529:490-495; and Chen et al., Nature 2017, 550:407-410). The mutants described in these three publications were designed by selective mutation of specific amino-acid residues in the Cas9 protein that were identified as contact sites between the protein and the RNA guide and/or the DNA substrate based on the crystal structure of the Cas9 protein. While knowledge of the mechanism of action is not needed to practice these inventions (i.e., to perform genome editing with improved specificity), it was originally thought that improved-fidelity mutants worked by reducing the relative affinity of the mutant Cas9 nuclease to the substrate DNA compared to the WT enzyme, making it more likely that mismatches between the guide RNA and the substrate DNA would be destabilizing. It was more recently proposed that the mutations restrict transition of Cas9 structure from an inactive conformation to an active conformation and that this transition occurs less efficiently in the presence of mismatches between the RNA guide and the DNA target.
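The definition of an off-target site given above (any departure from perfect complementarity to the protospacer domain) can be expressed as a simple mismatch count. The sketch below is illustrative only: the function names are hypothetical, the sequences are placeholders, and the candidate site is written in the same 5′ to 3′ sense as the protospacer (that is, as the non-complementary DNA strand), so that a perfectly complementary target shows zero differences.

```python
def count_mismatches(protospacer: str, site: str) -> int:
    """Count positions at which a candidate genomic site differs from the protospacer.

    The protospacer may be given as RNA (U) or DNA (T); both sequences must
    have the same length and the same 5'->3' sense.
    """
    a = protospacer.upper().replace("U", "T")
    b = site.upper().replace("U", "T")
    if len(a) != len(b):
        raise ValueError("protospacer and site must have the same length")
    return sum(x != y for x, y in zip(a, b))


def classify_site(protospacer: str, site: str) -> str:
    """Label a site per the definition in the text: zero mismatches is the
    intended target; any mismatch makes cleavage there an off-target event."""
    return "on-target" if count_mismatches(protospacer, site) == 0 else "off-target"


# Placeholder sequences, not real genomic targets.
guide = "ACGUACGUACGUACGUACGU"
perfect_site = "ACGTACGTACGTACGTACGT"
mismatched_site = "ACGTACGTACGAACGTACGA"  # differs at two positions

print(classify_site(guide, perfect_site), count_mismatches(guide, mismatched_site))
```

This counts only base substitutions at aligned positions; it makes no claim about which mismatched sites a given Cas9 variant would actually cleave.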
Regardless of mechanism, these mutant Cas9 enzymes do show reduced cleavage of target DNA having imperfect complementarity to the guide RNA, as desired. However, this improved specificity comes at the cost of reduced on-target activity, which is not desired. In all three prior art examples that disclosed improved-specificity Cas9 mutants, genome editing using CRISPR/Cas9 methods was done using plasmid or other expression-based approaches, i.e., methods that were first described in 2013 (see: Cong et al., Science, 2013, 339 p. 819-823; Mali et al., Science, 2013, 339 p. 823-826). It is now appreciated, however, that plasmid systems introduce complications into genome editing. For example, the plasmid can integrate into the host genome and thereby lead to other genome changes that are not desired, or it can trigger innate immune responses and result in cell death. For these and other reasons, plasmid systems are not ideal for research applications where precision editing is desired and are impractical for medical applications where such side effects cannot be tolerated. More recently, methods using ribonucleoprotein (RNP) complexes, in which recombinant Cas9 protein is pre-complexed with synthetic gRNAs, have been shown to be preferable to using DNA-based expression constructs. RNP methods result in high-activity genome editing with reduced side effects (see: Cho et al., Genome Research, 2014, 24 p. 132-141; Aida et al., Genome Biology, 2015, 16 p. 87-98). It is therefore desirable to develop high-fidelity genome editing methods that are compatible with RNP protocols. The previously cited published examples that describe improved-specificity Cas9 mutants all employed plasmid-based DNA expression cassettes to perform and study genome editing outcomes. This method results in high levels of overexpression of the mutant Cas9 protein for long periods of time, increasing the apparent enzymatic activity of the mutants.
We describe herein that these improved-specificity Cas9 mutants (eSpCas9(1.1) and Cas9-HF1) have reduced enzymatic activity when RNP methods are used to perform genome editing, with the result that cleavage of target DNA sites is significantly compromised when compared with cleavage by the WT Cas9 protein, often to the extent that target sites that work with high efficiency using WT Cas9 protein do not show any evidence of cleavage when using the mutant variants. Therefore, the published mutant Cas9 proteins have limited utility for precision genome editing, especially when the more medically relevant RNP methods are employed. A need thus remains for methods to improve the specificity of Cas9 genome editing. In particular, there exists a need for mutants of Cas9 that show improved specificity while retaining high enzymatic activity similar to the WT Cas9 enzyme when employed in the RNP format.