Many gene therapy approaches rely upon the expression of recombinant genes in heterologous systems. A variety of viral vectors have been described for delivery of immunogenic and therapeutic products to a host. One vector system described in the literature as very attractive for long-term expression of a transgene product is recombinant adeno-associated virus, due to its relatively low immunogenicity and the fact that it is not associated with any clinical sequelae in humans. Adeno-associated virus (AAV) is a small, non-enveloped human parvovirus that packages a linear, single-stranded DNA genome of 4.7 kb. The capsid of an AAV contains 60 copies (in total) of three viral proteins (VPs), VP1, VP2, and VP3, in a predicted ratio of 1:1:10-20, arranged with T=1 icosahedral symmetry [H-J Nam, et al., J Virol., 81(22): 12260-12271 (November 2007)]. The three VPs are translated from the same mRNA, with VP1 containing a unique N-terminal domain in addition to the entire VP2 sequence at its C-terminal region [Nam et al., cited above]. VP2 contains an extra N-terminal sequence in addition to the VP3 sequence at its C terminus.
Codon usage bias has been reported for numerous organisms, from viruses to eukaryotes. Since the genetic code is degenerate (i.e., each amino acid can be encoded by, on average, three different codons), the DNA sequence can be modified by synonymous nucleotide substitutions without altering the amino acid sequence of the encoded protein. Such synonymous codon optimization has been performed for the purpose of optimizing expression in a desired host, as described in the scientific literature and in patent documents. See U.S. Pat. Nos. 5,786,464 and 6,114,148. Much of the early work on such optimization focused on altering the rare codons in the target gene so that they more closely reflect the codon usage of the host, without modifying the amino acid sequence of the encoded protein. Since the early published work in this area, a variety of different algorithms have been described for modifying coding sequences for expression in different bacterial and eukaryotic host cell species.
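The degeneracy and synonymous substitution described above can be illustrated with a minimal Python sketch. It is not the method of any of the cited patents or publications. It builds the standard genetic code, computes the average number of codons per amino acid, and replaces each codon with a synonymous codon that a usage table scores more highly. The function names and the `toy_weights` values are hypothetical, chosen only for illustration, and do not represent measured codon usage for any host. The input is assumed to be an upper-case DNA coding sequence whose length is a multiple of three.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal "*") they encode.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)

# 61 sense codons spread over 20 amino acids.
sense_codons = sum(len(c) for aa, c in SYNONYMS.items() if aa != "*")
mean_degeneracy = sense_codons / 20


def codons(seq):
    """Split a coding sequence into consecutive triplets."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode(seq, weights):
    """Swap each codon for a synonymous codon with a strictly higher weight.

    `weights` maps codons to a preference score (hypothetical host usage);
    codons missing from the table score 0.0 and are left unchanged.
    """
    out = []
    for codon in codons(seq):
        best = max(SYNONYMS[CODON_TABLE[codon]],
                   key=lambda c: weights.get(c, 0.0))
        if weights.get(best, 0.0) > weights.get(codon, 0.0):
            codon = best
        out.append(codon)
    return "".join(out)


# Toy, made-up preference scores for illustration only.
toy_weights = {"CTG": 0.40, "CTA": 0.07, "GGC": 0.34, "GGT": 0.16}
original = "ATGCTAGGTTAA"
recoded = recode(original, toy_weights)
print(round(mean_degeneracy, 2), original, recoded, translate(recoded))
```

In this sketch the nucleotide sequence changes while `translate(original)` and `translate(recoded)` remain identical, which is the defining property of a synonymous modification.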
In 2004, Plotkin et al., Proc Natl Acad Sci USA, 101:12588-12591 (2004), reported significant differences in synonymous codon usage between genes specifically expressed in different tissues. However, more recent work by Sémon et al., Mol Biol Evol, 23(3):523-529 (2006), re-evaluated that work and concluded that variability of synonymous codon usage between tissues is much smaller than variability within tissues. Sémon et al. further report that the synonymous codon usage variability reported by Plotkin et al. was due only to GC-content differences, which affect introns and intergenic regions as well as synonymous codon positions.
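To make the GC-content quantities discussed above concrete, the following Python sketch computes the overall GC fraction of a sequence and the GC fraction at third codon positions (often called GC3). It is an illustration only and is not the analysis performed by Plotkin et al. or Sémon et al. The function names are hypothetical, and treating the third codon position as a stand-in for synonymous positions is a simplifying assumption, since not every third-position change is synonymous.

```python
def gc_fraction(seq):
    """Fraction of G and C nucleotides in an upper-case DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Every third nucleotide, starting from index 2, is a third codon position.
    return gc_fraction(cds[2::3])


# Two sequences encoding the same peptide (Met-Leu-Gly-stop) with
# different synonymous codons, chosen only for illustration.
variant_a = "ATGCTAGGTTAA"
variant_b = "ATGCTGGGCTAA"
print(gc3_fraction(variant_a), gc3_fraction(variant_b))
```

The two variants encode the same peptide yet differ in GC3, which shows how a shift in GC content alone can appear as a difference in synonymous codon usage.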
For a variety of reasons, including cost, efficiency, and safety, there remains a need in the art for vectors which express higher levels of gene products in a target cell.