AAV. Adeno-associated virus (AAV) is a small virus, which infects humans and several other primate species. AAV is not known to cause disease, and generally causes only a mild immune response. The virus infects both dividing and quiescent cells and can be engineered to persist in an extrachromosomal state without integrating into the genome of the host cell (Russel D W, Deyle D R (2010) Current Opinion in Molecular Therapy. 11: 442-447; Grieger J C, Samulski R J (2005) Advances in Biochemical Engineering/Biotechnology 99: 119-45P). These features make AAV a very attractive candidate for creating viral vectors for gene therapy. Recent human clinical trials using AAV for gene therapy in the retina have shown promise (Maguire A M, et al. (2008) New England Journal of Medicine 358: 2240-8). Moreover, AAV presents a well-known system with an established safety record with the completion of over sixty clinical trials. (Mitchell A M, Nicolson S C, Warischalk J K, and Samulski R J (2010) Curr Gene Ther. 10(5): 319-40).
Wild-type AAV has the ability to stably integrate into the host cell genome at a specific site (designated AAVS1) in human chromosome 19. This feature makes it somewhat more predictable than other viral vectors such as retroviruses, which present the threat of random insertion and mutagenesis. Gene therapy vectors based on AAV, however, generally eliminate this integrative capacity by removal of the rep and cap genes from the DNA of the vector. In their place, a gene of interest can be cloned under the control of a promoter between the viral inverted terminal repeats (ITRs) that aid in concatemer formation in the nucleus after the single-stranded vector DNA is converted by host cell DNA polymerase complexes into double-stranded DNA. AAV-based gene therapy vectors form episomal concatemers in the host cell nucleus. In non-dividing cells, these concatemers remain intact for the life of the host cell. In dividing cells, AAV DNA is lost through cell division, since the episomal DNA is not replicated along with the host cell DNA. Random integration of AAV DNA into the host genome is detectable but occurs at very low frequency.
AAV presents disadvantages as well. The cloning capacity of the vector is relatively limited, and most therapeutic genes require the complete replacement of the virus's 4.7 kilobase genome. Large genes are, therefore, not suitable for use in a standard AAV vector. Options are currently being explored to overcome the limited coding capacity. The AAV ITRs of two genomes can anneal to form head-to-tail concatemers, almost doubling the capacity of the vector. Insertion of splice sites allows for the removal of the ITRs from the transcript, alleviating concatemer formation.
Because of AAV's advantages for gene therapy, researchers have created an altered version of AAV termed self-complementary adeno-associated virus (scAAV). Whereas AAV packages a single strand of DNA and must wait for its second strand to be synthesized, scAAV packages two shorter strands that are complementary to each other. By avoiding second-strand synthesis, scAAV can express more quickly; as a caveat, however, scAAV can encode only half of the already limited capacity of AAV (McCarty D M, Monahan P E, Samulski R J (2001) Gene Therapy 8: 1248-54).
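The capacity trade-off between standard AAV and scAAV can be illustrated with a minimal Python sketch. The limits below are taken from the approximate figures in the text (about 4.7 kilobases, and half of that for scAAV); the function name and the example cassette lengths are hypothetical and serve only as an illustration, not as a design rule.

```python
# Approximate packaging limits taken from the text (in nucleotides).
AAV_CAPACITY_NT = 4700                    # ~4.7 kb single-stranded genome
SCAAV_CAPACITY_NT = AAV_CAPACITY_NT // 2  # scAAV encodes about half of that


def fits(cassette_nt: int, self_complementary: bool = False) -> bool:
    """Return True if a cassette of the given length fits the vector format."""
    limit = SCAAV_CAPACITY_NT if self_complementary else AAV_CAPACITY_NT
    return cassette_nt <= limit


# Hypothetical cassette lengths, for illustration only.
for length in (2000, 3500, 5200):
    print(length, fits(length), fits(length, self_complementary=True))
```

Under these assumed limits, a cassette that fits a standard vector may still be too large for the self-complementary format.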
The AAV genome is built of single-stranded deoxyribonucleic acid (ssDNA), either positive- or negative-sensed, which is about 4.7 kilobase long. The genome comprises inverted terminal repeats (ITRs) at both ends of the DNA strand, and two open reading frames (ORFs): rep and cap. The former is composed of four overlapping genes encoding Rep proteins required for the AAV life cycle, and the latter contains overlapping nucleotide sequences of capsid proteins: VP1, VP2 and VP3, which interact together to form a capsid of an icosahedral symmetry (Carter, B J (2000) In DD Lassic & N Smyth Templeton. Gene Therapy: Therapeutic Mechanisms and Strategies. New York City: Marcel Dekker, Inc. pp. 41-59).
The Inverted Terminal Repeat (ITR) sequences comprise approximately 145 bases each. The first 125 nucleotides of the ITR sequence are palindromic and fold back on themselves to create a T-shaped hairpin structure (Daya, Shyam (2008) Clin. Microbiol. Rev. 21(4) 583-593). The other 20 bases of the ITR remain unpaired and are known as the D sequence. The ITR is the origin of replication and serves as a primer for second-strand synthesis.
With regard to gene therapy, ITRs seem to be the only sequences required in cis next to the therapeutic gene: structural (cap) and packaging (rep) proteins can be delivered in trans. With this assumption, many methods have been established for the efficient production of recombinant AAV (rAAV) vectors containing a reporter or therapeutic gene. However, it has also been reported that the ITRs are not the only elements required in cis for effective replication and encapsidation. Some research groups have identified a sequence designated cis-acting Rep-dependent element (CARE) inside the coding sequence of the rep gene. CARE was shown to augment amplification when present in cis (Nony P, Tessier J, Chadeuf G, et al. (2001) Journal of Virology 75: 9991-4).
On the “left side” of the genome, the rep genes are transcribed from two promoters, p5 and p19, from which two overlapping messenger ribonucleic acids (mRNAs) of different length can be produced. Each of these contains an intron, which may or may not be spliced out. Given the possibilities generated by such a system, four different mRNAs, and consequently four different Rep proteins with overlapping sequence, can be synthesized. Their names depict their sizes in kilodaltons (kDa): Rep78, Rep68, Rep52 and Rep40 (Kyostio S R, et al. (1994) Journal of Virology 68: 2947-57). Rep78 and Rep68 can specifically bind the hairpin formed by the ITR in the self-priming act and cleave at a specific region, designated the terminal resolution site, within the hairpin. They were also shown to be necessary for the AAVS1-specific integration of the AAV genome. All four Rep proteins bind ATP and possess helicase activity. As demonstrated, Rep proteins upregulate the transcription from the p40 promoter (mentioned below), but downregulate both p5 and p19 promoters.
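The combinatorics described above (two promoters, each transcript either spliced or unspliced) can be enumerated in a short Python sketch. The text names the promoters and the four proteins but does not state which transcript yields which protein; the pairing used here follows the commonly cited assignment for AAV2 and should be read as an assumption of this sketch.

```python
from itertools import product

# (promoter, splicing state) -> Rep protein.
# The pairing is an assumption based on the commonly cited AAV2 assignment;
# the text itself only lists the promoters and the four protein names.
REP_BY_TRANSCRIPT = {
    ("p5", "unspliced"): "Rep78",
    ("p5", "spliced"): "Rep68",
    ("p19", "unspliced"): "Rep52",
    ("p19", "spliced"): "Rep40",
}

# Two promoters x two splicing states = four transcripts.
for promoter, state in product(("p5", "p19"), ("unspliced", "spliced")):
    print(promoter, state, REP_BY_TRANSCRIPT[(promoter, state)])
```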
The “right side” of a positive-sensed AAV genome encodes overlapping sequences of three capsid proteins, VP1, VP2 and VP3, which are expressed from one promoter, designated p40. The molecular weights of these proteins are 87, 72 and 62 kilodaltons, respectively. All three are translated from one mRNA, the unspliced transcript producing VP1. After this mRNA is synthesized, it can be spliced in two different manners: either a longer or a shorter intron can be excised, resulting in the formation of two pools of mRNAs: a 2.3 kb- and a 2.6 kb-long mRNA pool. Generally, especially in the presence of adenovirus, the longer intron is preferred, so the 2.3-kb-long mRNA represents the so-called “major splice.” In this form, the first AUG codon that initiates synthesis of VP1 protein is cut out, resulting in a reduced overall level of VP1 protein synthesis. The first AUG codon that remains in the major splice is the initiation codon for VP3 protein. However, upstream of that codon in the same open reading frame lies an ACG sequence (encoding threonine, and serving as the initiation codon for VP2) surrounded by an optimal Kozak context. This contributes to a low level of synthesis of VP2 protein, which is actually VP3 protein with additional N terminal residues, as is VP1. Since the bigger intron is preferentially spliced out, and since in the major splice the ACG codon is a much weaker translation initiation signal, the ratio at which the AAV structural proteins are synthesized in vivo is about 1:1:10, which is the same as in the mature virus particle (Rabinowitz J E, Samulski R J (2000) Virology 278: 301-8).
The unique fragment at the N terminus of VP1 protein possesses phospholipase A2 (PLA2) activity, likely required for releasing the AAV particles from late endosomes. VP2 and VP3 are crucial for correct virion assembly (Muralidhar S, Becerra S P, Rose J A (1994), Journal of Virology 68: 170-6). More recently, however, Warrington et al. have shown that VP2 is not only unnecessary for complete virus particle formation and efficient infectivity, but also tolerates large insertions in its N terminus (Warrington K H, et al. (2004), Journal of Virology 78: 6595-609). In contrast, VP1 shows no such tolerance, probably because of the presence of the PLA2 domain (Id.). The AAV capsid is composed of 60 capsid protein subunits, VP1, VP2, and VP3, that are arranged in an icosahedral symmetry in a ratio of 1:1:10, with an estimated size of 3.9 megadaltons. The crystal structure of the VP3 protein was determined by Xie, Bu, et al. (Xie Q, Bu W, Bhatia S, et al. (2002) Proceedings of the National Academy of Sciences of the United States of America 99: 10405-10).
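The stated capsid size can be checked against the other figures in this description with a short Python calculation. It uses only numbers given in the text: 60 subunits, a 1:1:10 ratio of VP1:VP2:VP3, and molecular weights of 87, 72 and 62 kilodaltons. The variable names are illustrative, and the result counts protein mass only, ignoring the packaged genome.

```python
SUBUNITS = 60
RATIO = {"VP1": 1, "VP2": 1, "VP3": 10}      # VP1:VP2:VP3 stoichiometry
MASS_KDA = {"VP1": 87, "VP2": 72, "VP3": 62}  # molecular weights in kDa

# Scale the ratio so that the copy numbers add up to 60 subunits.
unit = SUBUNITS // sum(RATIO.values())
copies = {vp: parts * unit for vp, parts in RATIO.items()}

# Sum the protein mass of all subunits.
total_kda = sum(copies[vp] * MASS_KDA[vp] for vp in copies)

print(copies)             # copy number of each capsid protein
print(total_kda / 1000)   # capsid protein mass in megadaltons
```

The ratio corresponds to 5 copies each of VP1 and VP2 and 50 copies of VP3, for a total of 3,895 kilodaltons, which is consistent with the estimate of about 3.9 megadaltons.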
Currently, 12 AAV serotypes and nearly 100 variants have been identified in human and nonhuman primate populations (Gao G, Zhong L, Danos O (2011) Methods Mol. Biol. 807:93-118). Serotypes can infect cells from multiple diverse tissue types. Tissue specificity, as determined by the capsid serotype and pseudotyping of AAV vectors to alter their tropism range, will likely affect their efficacy and use in therapy.
Serotype 2 (AAV2) has been the most extensively examined to date. AAV2 presents a natural tropism towards skeletal muscles, neurons, vascular smooth muscle cells, and hepatocytes. Three cell receptors have been described for AAV2: heparan sulfate proteoglycan (HSPG), αVβ5 integrin, and fibroblast growth factor receptor 1 (FGFR-1). The first functions as a primary receptor, while the latter two have a co-receptor activity and enable AAV to enter the cell by receptor-mediated endocytosis.
Although AAV2 is the most popular serotype in various AAV-based research, it has been shown that other serotypes can be more effective as gene delivery vectors. For instance, AAV6 appears much better at infecting airway epithelial cells, AAV7 presents a very high transduction rate of murine skeletal muscle cells (similarly to AAV1 and AAV5), AAV8 is superb in transducing hepatocytes, and AAV1 and AAV5 were shown to be very efficient in gene delivery to vascular endothelial cells. In the brain, most AAV serotypes show neuronal tropism, while AAV5 also transduces astrocytes. AAV6, a hybrid of AAV1 and AAV2, also shows lower immunogenicity than AAV2. Serotypes can differ with respect to the receptors to which they bind. For example, AAV4 and AAV5 transduction can be inhibited by soluble sialic acids (of different form for each of these serotypes), and AAV5 was shown to enter cells via the platelet-derived growth factor receptor. Currently, rAAV8 and rAAV9 show the most prominent features relevant to therapeutic use relative to all other serotypes and under undisturbed physiological conditions (Gao, G, Zhong L, and Danos O (2011) Methods Mol. Biol. 807:93-118).
There are several steps in the AAV infection cycle, from infecting a cell to producing new infectious particles. These are:
1. attachment to the cell membrane
2. receptor-mediated endocytosis
3. endosomal trafficking
4. escape from the late endosome or lysosome
5. translocation to the nucleus
6. uncoating
7. formation of double-stranded DNA replicative form of the AAV genome
8. expression of rep genes
9. genome replication
10. expression of cap genes, synthesis of progeny ssDNA particles
11. assembly of complete virions, and
12. release from the infected cell.
These steps may differ depending on the host cell type, which, in part, contributes to the defined and quite limited native tropism of AAV. Even within the same cell type, replication of the virus can also depend on the cell's cycle phase at the time of infection.
The characteristic feature of the adeno-associated virus is a deficiency in replication and thus, its inability to multiply in unaffected cells. The first factor described as providing successful generation of new AAV particles was the adenovirus, from which the AAV name originated. It was then shown that AAV replication is facilitated by selected proteins derived from the adenovirus genome, by other viruses such as HSV, or by genotoxic agents, such as UV irradiation or hydroxyurea. The minimal set of the adenoviral genes required for efficient generation of progeny AAV particles was discovered by Matsushita, Elliger, et al. (Matsushita T, Elliger S, Elliger C, et al. (1998) Gene Therapy 5: 938-45). This discovery paved the way for new production methods of recombinant AAV, which do not require adenoviral co-infection of the AAV-producing cells. In the absence of helper virus or genotoxic factors, AAV DNA can either integrate into the host genome, or persist in episomal form. In the former case, integration is mediated by Rep78 and Rep68 proteins and requires the presence of ITRs flanking the region being integrated. In mice, the AAV genome has been observed persisting for long periods in quiescent tissues, such as skeletal muscles, in episomal form (a circular head-to-tail conformation).
Engineered Site-Specific Endonucleases. The present invention relates to the use of rAAV vectors to deliver engineered, site-specific endonucleases. Site-specific, rare-cutting endonucleases can be used to “edit” the genomes of living cells or organisms by targeting a double-stranded DNA break to a specific site in the genome that is then repaired by the cell's DNA repair machinery. This process can often result in DNA repair errors that, if they occur in the coding sequence of a gene, can disrupt or frameshift the gene and thereby disable (knock-out) the gene. Alternatively, chromosomal DNA breaks are highly recombinogenic and, so, site-specific endonucleases can be used to promote homologous recombination between the chromosomal DNA sequence and a transgenic sequence provided to the cell. This can result in, for example, the targeted insertion of a transgene or the repair of a mutant gene that is responsible for disease.
Methods for producing engineered, site-specific endonucleases are known in the art. For example, zinc-finger nucleases (ZFNs) can be engineered to recognize and cut pre-determined sites in a genome. ZFNs are chimeric proteins comprising a zinc finger DNA-binding domain fused to the nuclease domain of the FokI restriction enzyme. The zinc finger domain can be redesigned through rational or experimental means to produce a protein which binds to a pre-determined DNA sequence ~18 basepairs in length. By fusing this engineered protein domain to the FokI nuclease, it is possible to target DNA breaks with genome-level specificity. ZFNs have been used extensively to target gene addition, removal, and substitution in a wide range of eukaryotic organisms (reviewed in Durai S, et al. (2005) Nucleic Acids Res 33, 5978).
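As a rough check on the notion of genome-level specificity, the following Python sketch compares the number of distinct 18-basepair sequences with the size of a genome. The genome size of about 3.1 billion basepairs (an approximate haploid human genome) and the assumption of random, uniform base composition are illustrative assumptions not taken from the text. Real genomes are repetitive, so this is an order-of-magnitude estimate only.

```python
SITE_LENGTH_BP = 18
GENOME_SIZE_BP = 3.1e9  # assumed approximate haploid human genome size

# Number of distinct sequences of this length over the alphabet A, C, G, T.
possible_sites = 4 ** SITE_LENGTH_BP

# Expected number of chance occurrences of one given 18-bp sequence,
# counting both strands, under a random-sequence model (an assumption).
expected_random_hits = 2 * GENOME_SIZE_BP / possible_sites

print(possible_sites)
print(expected_random_hits)
```

Under these assumptions the expected number of chance matches is well below one, which is the sense in which an 18-basepair site can be unique within a genome of this size.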
Likewise, TAL-effector nucleases (TALENs) can be generated to cleave specific sites in genomic DNA. Like a ZFN, a TALEN comprises an engineered, site-specific DNA-binding domain fused to the FokI nuclease domain (reviewed in Mak, et al. (2013) Curr Opin Struct Biol. 23:93-9). In this case, however, the DNA binding domain comprises a tandem array of TAL-effector domains, each of which specifically recognizes a single DNA basepair. The large size of a TALEN makes it difficult to package in rAAV, limiting the utility of TALENs for compositions of the present invention comprising rAAV vectors. Thus, vectors created using lentiviruses and/or retroviruses present an attractive candidate when using ZFNs and TALENs.
Compact TALENs are an alternative endonuclease architecture that avoids the need for dimerization (Beurdeley, et al. (2013) Nat Commun. 4:1762). A Compact TALEN comprises an engineered, site-specific TAL-effector DNA-binding domain fused to the nuclease domain from the I-TevI homing endonuclease. Unlike FokI, I-TevI does not need to dimerize to produce a double-strand DNA break, so a Compact TALEN is functional as a monomer. Thus, it is possible to co-express two Compact TALENs in the same cell. Moreover, the Compact TALEN is smaller in size, making it much more attractive in vector design (Id.).
Engineered endonucleases based on the CRISPR/Cas9 system are also known in the art (Ran, et al. (2013) Nat Protoc. 8:2281-2308; Mali et al. (2013) Nat Methods. 10:957-63). A CRISPR endonuclease comprises two components: (1) a CRISPR-associated (Cas) effector nuclease, typically microbial Cas9; and (2) a short “guide RNA” comprising a ~20 nucleotide targeting sequence that directs the nuclease to a location of interest in the genome. By expressing multiple guide RNAs in the same cell, each having a different targeting sequence, it is possible to target DNA breaks simultaneously to multiple sites in the genome. The primary drawback of the CRISPR/Cas9 system is its reported high frequency of off-target DNA breaks, which could limit the utility of the system for treating human patients (Fu, et al. (2013) Nat Biotechnol. 31:822-6).
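The idea of a targeting sequence directing a nuclease to one or more genomic sites, and of near-matching sites as candidate off-targets, can be sketched in Python as a simple sequence search. This is a toy model under stated assumptions: the sequences are hypothetical, the mismatch count (Hamming distance) is a stand-in for sequence similarity and not a model of actual nuclease behavior, and no other sequence requirements for cleavage are modeled.

```python
def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(genome: str, target: str, max_mismatches: int = 0):
    """Yield (position, strand, mismatches) for windows resembling target."""
    n = len(target)
    # Search the forward strand directly and the reverse strand via revcomp.
    for strand, query in (("+", target), ("-", revcomp(target))):
        for i in range(len(genome) - n + 1):
            mismatches = sum(a != b for a, b in zip(genome[i:i + n], query))
            if mismatches <= max_mismatches:
                yield i, strand, mismatches


# Hypothetical 20-nucleotide targeting sequences and a toy "genome".
guide_a = "GACCTGAATCGTTAGCCATG"
guide_b = "ACGTTGCAAGGCTTCCAGTA"
near_a = guide_a[:-1] + "C"  # guide_a with its last base changed
genome = "TTTT" + guide_a + "CCCC" + revcomp(guide_b) + "AAAA" + near_a + "GG"

for name, guide in (("guide_a", guide_a), ("guide_b", guide_b)):
    print(name, list(find_sites(genome, guide, max_mismatches=1)))
```

In this toy example, two different targeting sequences locate two different sites in the same sequence, and allowing one mismatch additionally reports the near-identical site as a candidate off-target.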
In the preferred embodiment of the invention, the DNA break-inducing agent is an engineered homing endonuclease (also called a “meganuclease”). Homing endonucleases are a group of naturally-occurring nucleases, which recognize 15-40 base-pair cleavage sites commonly found in the genomes of plants and fungi. They are frequently associated with parasitic DNA elements, such as group I self-splicing introns and inteins. They naturally promote homologous recombination or gene insertion at specific locations in the host genome by producing a double-stranded break in the chromosome, which recruits the cellular DNA-repair machinery (Stoddard (2006) Q. Rev. Biophys. 38: 49-95).
Homing endonucleases are commonly grouped into four families: the LAGLIDADG family, the GIY-YIG family, the His-Cys box family and the HNH family. These families are characterized by structural motifs, which affect catalytic activity and recognition sequence. For instance, members of the LAGLIDADG family are characterized by having either one or two copies of the conserved LAGLIDADG motif (see Chevalier et al. (2001) Nucleic Acids Res. 29(18): 3757-3774). The LAGLIDADG homing endonucleases with a single copy of the LAGLIDADG motif form homodimers, whereas members with two copies of the LAGLIDADG motif are found as monomers.
I-CreI (SEQ ID NO: 1) is a member of the LAGLIDADG family of homing endonucleases, which recognizes and cuts a 22 basepair recognition sequence in the chloroplast chromosome of the alga Chlamydomonas reinhardtii. Genetic selection techniques have been used to modify the wild-type I-CreI cleavage site preference (Sussman et al. (2004) J Mol. Biol. 342: 31-41; Chames et al. (2005) Nucleic Acids Res. 33: e178; Seligman et al. (2002) Nucleic Acids Res. 30: 3870-9; Arnould et al. (2006) J Mol. Biol. 355: 443-58). More recently, a method of rationally designing mono-LAGLIDADG homing endonucleases has been described that is capable of comprehensively redesigning I-CreI and other homing endonucleases to target widely divergent DNA sites, including sites in mammalian, yeast, plant, bacterial, and viral genomes (WO 2007/047859).
As first described in WO 2009/059195, I-CreI and its engineered derivatives are normally dimeric but can be fused into a single polypeptide using a short peptide linker that joins the C-terminus of a first subunit to the N-terminus of a second subunit (Li, et al. (2009) Nucleic Acids Res. 37:1650-62; Grizot, et al. (2009) Nucleic Acids Res. 37:5405-19). Thus, a functional “single-chain” meganuclease can be expressed from a single transcript. By delivering genes encoding two different single-chain meganucleases to the same cell, it is possible to simultaneously cut two different sites. This, coupled with the extremely low frequency of off-target cutting observed with engineered meganucleases, makes them the preferred endonuclease for the present invention.
For many applications, it is necessary to deliver one or more genes encoding one or more engineered endonucleases to the target cell or organism. For in vivo applications, rAAV is a preferred delivery vector. However, rAAV vectors have long persistence times in many cell types, particularly non-dividing cells. Such persistence can activate an immune response within the cell and cause disruption. Genome editing using engineered endonucleases requires only a short burst of endonuclease expression such that the endonuclease protein accumulates to a sufficient intracellular concentration to cut its recognition sequence in the genome. Long-term expression of an endonuclease can result in unintended off-target DNA cutting or in an immune response directed toward cells expressing the foreign nuclease protein. Thus, there is a need for rAAV vectors encoding site-specific gene editing endonucleases in which the persistence time of the vector is limited.