This invention relates to retrovirus vectors for gene therapy and other applications. A retrovirus vector infects cells at high efficiency and is capable of integrating a DNA copy of itself into the host genome. Such vectors are, in many ways, desirable for introduction and expression of exogenous DNA sequences in animal cells. Prolonged, stable expression of exogenously introduced genes is a requirement for experiments in animal cell culture, as well as whole-organism experiments, such as the production of transgenic mammals and human gene therapy.
One problem with retroviral vectors is that they may insert in a region of the host genome which will suppress expression of the genes carried by the vector. Position effects are known to cause wide variation in the expression of essentially identical constructs introduced into different genomic locations.
Retroviruses are several kilobase pairs (kbp) in length, and in the integrated provirus form, consist of an internal domain flanked by long terminal repeats (LTRs) of several hundred base pairs (bp). These LTRs contain promoter, polyadenylation, and termination signals which direct synthesis of the genomic RNA. The actual RNA genome contains half of the left LTR at one end (U5 at the 5' end) and half of the right LTR at the other end (U3 at the 3' end), along with a short repeat (R) present at each end. During replication, the complete LTR is regenerated at each end, to form the full-length DNA copy. See, e.g., H. E. Varmus and R. Swanstrom, RNA Tumor Viruses, 2nd Ed., pp. 369-512 (1984).
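The relationship between the RNA genome and the integrated provirus can be illustrated with a minimal Python sketch. The region labels (U3, R, U5) follow the description above; the function names and the use of gene names as stand-ins for the internal domain are hypothetical and chosen only for illustration. The sketch models the end result (a complete U3-R-U5 LTR at each end of the DNA copy) and makes no attempt to model the mechanism of reverse transcription.

```python
# Illustrative model only: segments are labels, not real sequences.
LTR = ["U3", "R", "U5"]  # order of regions within one complete LTR


def rna_genome(internal):
    """Layout of the genomic RNA: R-U5 at the 5' end, U3-R at the 3' end."""
    return ["R", "U5"] + list(internal) + ["U3", "R"]


def provirus_from_rna(rna):
    """Layout of the full-length DNA copy, with a complete LTR at each end."""
    if rna[:2] != ["R", "U5"] or rna[-2:] != ["U3", "R"]:
        raise ValueError("RNA lacks the expected R-U5 ... U3-R termini")
    internal = rna[2:-2]  # everything between the terminal regions
    return LTR + internal + LTR


if __name__ == "__main__":
    rna = rna_genome(["gag", "pol", "env"])
    print("-".join(rna))                     # R-U5-gag-pol-env-U3-R
    print("-".join(provirus_from_rna(rna)))  # U3-R-U5-gag-pol-env-U3-R-U5
```

The DNA copy is longer than the RNA by one U3 region at the left end and one U5 region at the right end, which is the sense in which each LTR is "regenerated" during replication.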
The internal domain contains the coding information for at least three genes: gag, pol, and env. These coincide with the first, second, and third open reading frames (ORFs). The gag gene encodes structural proteins which bind to the RNA and form the nucleocapsid. The pol ORF abuts or overlaps that of gag, and the pol proteins are synthesized as readthrough products from the first ORF. This unusual means of limiting expression of the second ORF depends on suppression of a stop codon separating the first and second ORFs, or upon a frameshift event. The pol gene encodes a protease, a reverse transcriptase, and an endonuclease. The protease is responsible for processing of the polyprotein, while the reverse transcriptase copies the RNA genome of the retrovirus into a full-length double-stranded DNA copy. Synthesis of the first ("minus") DNA strand is primed by a tRNA which has a short region of complementarity to the retrovirus RNA. The endonuclease presumably cleaves both the DNA copy of the retrovirus at its termini and the target site in the host genome. The envelope proteins encoded by the env gene are expressed from a spliced transcript.
The steps involved in the conventional production of a retrovirus vector are outlined in FIG. 1. The proteins required for virus particle formation, replication, and integration are supplied from an integrated helper retrovirus which has a deletion in its RNA packaging signal (psi). Although the proteins encoded by the retrovirus are all required for non-defective, replication-competent retroviruses, only relatively small regions, e.g., the termini, the plus- and minus-strand primer binding sites, and psi, are required in cis on the RNA for it to be packaged and replicated within the retrovirus particle. These sequences, in addition to the gene of interest, are present in the retrovirus vector. DNA containing the vector sequences is transfected into the packaging line, where the vector RNA is produced and packaged. As described above, the process of reverse transcription into DNA takes place within the virus nucleocapsid, and the complete LTRs are regenerated. The outside ends of the LTR sequences, which themselves contain short inverted repeats, have been shown to be bound by the retrovirus endonuclease in vitro and to be required for integration.
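The division of labor between helper and vector can be summarized in a short Python sketch. The element names are hypothetical labels for the cis-acting regions listed above (termini, primer binding sites, and psi), and the check is a deliberate simplification: it treats an RNA as transmissible only if every listed cis element is present, and it ignores the low-frequency events by which helper sequences can nevertheless be transmitted.

```python
# Hypothetical labels for the cis-acting regions named in the text.
CIS_REQUIRED = frozenset({
    "5_terminus",
    "3_terminus",
    "minus_strand_pbs",  # minus-strand primer binding site
    "plus_strand_pbs",   # plus-strand primer binding site
    "psi",               # RNA packaging signal
})


def missing_cis_elements(elements):
    """Return the required cis elements absent from a construct, sorted."""
    return sorted(CIS_REQUIRED - set(elements))


def is_transmissible(elements):
    """Simplified rule: all cis elements must be present on the RNA."""
    return not missing_cis_elements(elements)


# The vector carries the cis elements plus the gene of interest, but no viral genes.
vector = set(CIS_REQUIRED) | {"gene_of_interest"}

# The helper encodes the viral proteins (supplied in trans) but lacks psi.
helper = (set(CIS_REQUIRED) - {"psi"}) | {"gag", "pol", "env"}

if __name__ == "__main__":
    print(is_transmissible(vector))      # True
    print(missing_cis_elements(helper))  # ['psi']
```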
For both packaging lines and vectors, murine leukemia viruses have been widely used. There are several reasons for this. First, their molecular biology is relatively well understood. Second, they infect hematopoietic cells efficiently. Third, findings can be extended to whole-animal models. Finally, amphotropic viruses are available which infect cultured cells from different mammalian species.
The murine psi-2 line was one of the first packaging lines to be developed for production of retrovirus vectors. See, e.g., R. Mann, et al., Cell 33: 153 (1983). The genomes of these cells contain integrated copies of Moloney murine leukemia virus (MoMLV) from which the psi sequences have been deleted. The PA12 and psi-AM packaging lines are similar to psi-2 but have envelope proteins contributed by amphotropic viruses to increase host range. Because psi-2 has been shown to transmit helper virus (albeit at very low frequency), additional deficiencies have been introduced in subsequent versions of the helper viruses to further restrict the sequences which are packaged. See, e.g., A. D. Miller and C. Buttimore, Mol. Cell. Biol. 6: 2895 (1986).
There are now several variations of vectors in common use. One such vector, N2, is based on a murine leukemia virus and contains the complete retrovirus LTRs, the primer binding sites, the psi sequence, and a copy of the Tn5 neomycin-resistance gene (neoR) which is expressed from an internal (non-LTR) promoter. Conventional variations on this vector include the following: 1) substitution of different drug resistance markers; 2) expression of a second gene from a spliced message; and 3) modification of the LTRs to inactivate the LTR promoter at the 5' end once integration has occurred. Although these modifications do have technical advantages, they can also result in decreased virus titers, thereby reducing the utility of the modified vectors.
Despite the demonstrated advantages of retroviral vectors for introduction and expression of foreign genes in animal cells, problems remain with their use in human gene therapy. First, replication-competent retroviruses have the ability to multiply in the host cell in which they are directed to integrate. This danger may be reduced by debilitating the vector via removal of internal sequences coding for proteins involved in packaging and replication. Use of debilitated retrovirus vectors, however, necessitates the use of helper cell lines which supply the missing functions but do not have the terminal sequences required for retroviral transposition. This raises the possibility that endogenous retrovirus transcripts from helper cells can be packaged with the vector sequences, or that vector and defective helper sequences can recombine to produce a virus capable of replication. If recombination does take place between the vector and the debilitated packaging virus, viable retrovirus may be produced which can be carried along at low frequency with the vector.
Second, retroviruses insert relatively randomly in the genome. Two problems are implicit in this property. First, insertion of promoters carried on the retrovirus vector may activate cellular genes flanking the integration site. An example of the potential danger in this is activation of adjacent oncogenes. Second, insertion of retroviruses into coding sequences may also be deleterious. In most cases, the affected cells might simply be lost from the population; however, loss-of-function mutations may also promote oncogenesis. Thus, the relatively non-specific pattern of retrovirus insertion carries inherent risks.
Finally, retroviral vectors inserted in different chromosomal locations are expressed at different levels. See, e.g., M. A. Eglitis, et al., Science 230: 1395-1398 (1985). These position effects may be attributable to the transcriptional activity of sequences flanking the insertion site. This property means that populations of cells derived from different progenitors, i.e., with vectors inserted at different positions, will have different levels of vector expression. For example, retroviral vectors may insert in a region of the host genome which will suppress expression of the genes carried by the vector. While such variation may still yield an acceptable "average" level of activity for gene transfers that are performed on large populations of cells, differential expression of independent insertions in transgenic animals could greatly complicate strategies designed to correct genetic defects expressed in different tissues.
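The contrast between an acceptable population average and widely varying individual insertions can be illustrated with a toy simulation in Python. Everything numerical here is an assumption made for illustration: the log-normal distribution, its parameters, and the function name are hypothetical and are not taken from the cited work or from any measurement.

```python
import random
import statistics


def simulate_clone_levels(n_clones, seed=0, sigma=1.0):
    """Draw one hypothetical expression level per independent insertion.

    The log-normal form and the value of sigma are arbitrary assumptions;
    the fixed seed only makes the illustration reproducible.
    """
    rng = random.Random(seed)
    return [rng.lognormvariate(0.0, sigma) for _ in range(n_clones)]


if __name__ == "__main__":
    levels = simulate_clone_levels(1000)
    pooled_mean = statistics.mean(levels)  # what a bulk assay would report
    fold_range = max(levels) / min(levels)  # spread between individual clones
    print(f"pooled mean expression: {pooled_mean:.2f}")
    print(f"highest/lowest clone:   {fold_range:.1f}-fold")
```

In a pooled population the mean is all that is observed, whereas an animal or tissue derived from a single insertion event corresponds to one draw from the distribution and may fall anywhere within its range.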
Therefore, if one could design vectors which insert in a specific, innocuous location in the genome, disruption of host cell functions could be avoided. The present invention provides a heterologous vector incorporating features which allow it to integrate with position specificity, thereby providing significant advantages over current methodologies.