Small RNA-based defense systems that provide adaptive, heritable immunity against viruses, plasmids, and other mobile genetic elements have recently been discovered in archaea and bacteria. The RNA and protein components of these immune systems arise from the CRISPR (clustered regularly interspaced short palindromic repeats) and Cas (CRISPR-associated) genes, respectively. A CRISPR locus consists of short repeats separated by short, variable regions of similar size (spacers). The spacers are mainly homologous to invader sequences, and the repeats are identical to one another. Cas genes are often located adjacent to the CRISPR locus. Prokaryotes with CRISPR-Cas immune systems capture short invader sequences within the CRISPR loci in their genomes, and small RNAs produced from the CRISPR loci (crRNAs) guide Cas proteins to recognize and degrade (or otherwise silence) the invading nucleic acids.
CRISPR-Cas systems operate through three general steps to provide immunity: adaptation, crRNA biogenesis, and invader silencing. In the adaptation phase, a short fragment of foreign DNA (protospacer) is acquired from the invader and integrated into the host CRISPR locus adjacent to the leader. Protospacer adjacent motifs (PAMs) are found near invader sequences selected for CRISPR integration.
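The adaptation step described above can be illustrated with a minimal Python sketch. It is a toy model, not a description of any particular organism: the function names, the two-letter PAM "GG", the placement of the PAM immediately after the protospacer, and the spacer length are all assumptions chosen for illustration, since PAM sequence and orientation differ between CRISPR-Cas systems.

```python
def find_protospacers(invader, pam="GG", spacer_len=8):
    """Return candidate protospacers lying immediately before each PAM.

    Assumptions (hypothetical, for illustration only): the PAM sits
    directly after the protospacer, and spacers have a fixed length.
    """
    candidates = []
    start = invader.find(pam)
    while start != -1:
        # Only keep sites with enough sequence before the PAM.
        if start >= spacer_len:
            candidates.append(invader[start - spacer_len:start])
        start = invader.find(pam, start + 1)
    return candidates


def integrate_spacer(spacers, new_spacer):
    """Model leader-proximal integration: the newest spacer goes first."""
    return [new_spacer] + spacers


invader = "ATGCATGCAAGGTT"
candidates = find_protospacers(invader)
array = integrate_spacer(["OLDSPACER"], candidates[0])
```

In this model the ordered list of spacers stands in for the CRISPR locus, with index 0 being the position next to the leader, so the array doubles as a record of the order in which invaders were encountered.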
In the crRNA biogenesis phase, CRISPR locus transcripts are processed to release a set of small, individual mature crRNAs (each targeting a different sequence). Mature crRNAs generally retain some of the repeat sequence, which is thought to provide a recognizable signature of the crRNA. In the silencing phase, crRNA-Cas protein complexes recognize and degrade foreign DNAs or RNAs.
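The crRNA biogenesis step can be sketched in the same spirit. The following Python fragment is a simplified model under stated assumptions: the function name, the example repeat and spacer sequences, the length of the retained repeat fragment, and the choice to keep that fragment at the 5' end are hypothetical, and the precursor is assumed to end with a terminal repeat. Which part of the repeat is actually retained depends on the CRISPR-Cas type.

```python
def process_pre_crrna(pre_crrna, repeat, tag_len=4):
    """Split a precursor transcript into mature crRNAs.

    Assumes (for illustration) a layout of
    leader + (repeat + spacer) * n + repeat, and that each mature crRNA
    keeps the last `tag_len` bases of the preceding repeat as its tag.
    """
    units = pre_crrna.split(repeat)
    # units[0] is the leader; an empty final unit follows the last repeat.
    spacers = [u for u in units[1:] if u]
    tag = repeat[-tag_len:]
    return [tag + spacer for spacer in spacers]


repeat = "GUUUUAGA"                      # hypothetical repeat sequence
spacers = ["CCCCGGGG", "ACACACAC"]       # hypothetical spacers
pre = "AAUU" + "".join(repeat + s for s in spacers) + repeat
crrnas = process_pre_crrna(pre, repeat)
```

Each returned string carries a different spacer but the same repeat-derived tag, which mirrors the idea that the retained repeat sequence acts as a shared signature of the crRNAs.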
There are three types of CRISPR-Cas systems. Type II CRISPR-Cas systems have been extensively studied, partly because they offer practical applications in the dairy industry for generating phage-resistant Streptococcus thermophilus (S. thermophilus) strains. In addition to their content and architecture, Type II systems also differ from other types in the biogenesis of crRNA. A set of small non-coding RNAs called tracrRNAs (trans-activating CRISPR RNAs) is produced from a region outside but close to the CRISPR locus. The tracrRNAs are partially complementary to the type II CRISPR repeat sequences and hybridize to the repeats within the long precursor CRISPR RNA. The resulting RNA duplexes are processed by non-CRISPR RNase III to generate mature crRNAs. Cas9, a large type II signature protein, is thought to be the only protein involved in the crRNA-guided silencing of foreign nucleic acids.
Jinek et al., “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity,” Science 337(6096), pp. 816-821 (August 2012), used in vitro reconstitution of the Streptococcus pyogenes (S. pyogenes) type II CRISPR system to show that a crRNA fused to a tracrRNA (called a crRNA-tracrRNA chimera or chimeric guide RNA) is sufficient to direct Cas9 protein to cleave, in a sequence-specific manner, target DNA sequences matching the crRNA. However, the study was based on biochemical assays and did not show whether the Cas9-crRNA-tracrRNA system would work in the cells of eukaryotic organisms.
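The notion of target DNA sequences "matching the crRNA" can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function name and example sequences are hypothetical, only the strand given is searched, exact matching is required, and the target is required to be followed by an NGG motif. The NGG requirement is the PAM commonly reported for S. pyogenes Cas9 and is not stated in the passage above. A 10-base guide is used for brevity; guides of about 20 bases are typical in practice.

```python
def find_target_sites(dna, guide, pam_suffix="GG"):
    """Return start positions where `guide` matches `dna` exactly and is
    immediately followed by a three-base PAM of the form N + pam_suffix.

    Simplifications (assumptions): single strand, no mismatches allowed.
    """
    sites = []
    n = len(guide)
    pos = dna.find(guide)
    while pos != -1:
        pam = dna[pos + n:pos + n + 3]
        # Require a full three-base PAM whose last two bases match.
        if len(pam) == 3 and pam[1:] == pam_suffix:
            sites.append(pos)
        pos = dna.find(guide, pos + 1)
    return sites


guide = "GATTACAGCT"  # hypothetical guide sequence
dna = "TT" + guide + "TGG" + "AAAA" + guide + "TTT"
sites = find_target_sites(dna, guide)
```

Here the guide occurs twice in the example DNA, but only the first occurrence is followed by an NGG motif, so only that position is reported as a candidate cleavage target.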
To explore the potential of RNA-programmed Cas9 for genome-editing applications in mammalian cells, Mali et al., “RNA-Guided Human Genome Engineering via Cas9,” Science Express (Jan. 3, 2013), and Cong et al., “Multiplex Genome Engineering Using CRISPR/Cas Systems,” Science Express (Jan. 3, 2013), independently engineered Cas9 and the RNA components of the bacterial type II CRISPR system in human cells and/or mouse cells. Both labs were able to introduce precise double-stranded breaks at endogenous genomic loci in human cells and/or mouse cells using a human codon-optimized version of the S. pyogenes Cas9 protein directed by short RNAs. The two labs designed and used different nucleic acid sequences to encode the codon-optimized S. pyogenes Cas9 protein.
RNA-guided genome targeting defines a potential new class of genome engineering tools. What is needed in the art are efficient and versatile methods and tools for RNA-programmed genome engineering. Improved, efficient systems using RNA-programmed Cas9 can be used, for example, to study biology by perturbing gene networks and to treat genetic diseases by repairing genetic defects or by reintroducing genes into cells.