The present invention relates generally to retroviruses and to pro-retroviral polynucleotides, including pro-retroviral DNA and pro-retroviral-like DNA, and more specifically to recombinant vectors derived therefrom for use in delivering genetic information to susceptible target plant cells.
Repetitive DNA sequences are a common feature of the genomes of higher eukaryotes. Repetitive DNA family members in animals and higher plants are tandemly repeated or interspersed with other sequences (Walbot and Goldberg, 1979; Flavell, 1980), and may constitute more than 50% of the genome (Walbot and Goldberg, 1979). Estimates of the proportion of repetitive DNA in the soybean genome range from 36% to 60% (Goldberg, 1978; Gurley et al., 1979).
High copy-number repeats on the order of 10^5 per haploid genome comprise only 3% of the soybean genome, whereas moderately repetitive sequences with copy-numbers in the 10^3 range occupy 30-40% of the genome (Goldberg, 1978). Electron micrographic examination of these moderately repetitive sequences demonstrates that they average about 2 kb in length; however, 4% of those observed exceed 11 kb (Pellegrini and Goldberg, 1979).
Most of the highly repetitive sequences in higher eukaryotic genomes are relatively short and are organized in tandem arrays. For example, the chromosomal region adjacent to the centromere in higher eukaryotes is composed of very long blocks of highly repetitive DNA, called satellite DNA, in which simple sequences are repeated thousands of times or more. Tandemly repeated elements found in the soybean genome also include the ribosomal RNA (rRNA)-encoding genes. The approximately 800 rDNA copies are organized as one or more clusters of tandemly repeated 8-kb or 9-kb units (Friedrich et al., 1979; Varsanyi-Breiner et al., 1979).
The genomes of most higher eukaryotes also contain highly repetitive sequences that are distributed evenly throughout the genome, interspersed with longer stretches of unique (or moderately repetitive) DNA. These interspersed repetitive DNA elements are variable in length, are recognizably related but not precisely conserved in sequence, and exhibit relatively small repeat frequencies (Lapitan, 1992).
The dispersal pattern of interspersed repetitive elements in higher eukaryotic genomes has led to the suggestion that they are, or once were, transposable elements known as transposons (Flavell, 1986; Lapitan, 1992). Transposons are genetic elements that can move from one chromosomal location to another, without necessarily altering the general architecture of the chromosomes involved. The existence of transposons has only found general acceptance within the last few decades. Genes were originally believed to have fixed chromosomal locations that only change as a result of chromosomal rearrangements resulting from illegitimate crossing-over between incompletely homologous short sections of DNA. Then, in the late 1940's, McClintock's pioneering experiments with maize showed that certain genetic elements regularly "jump", or transpose, to new locations in the genome (McClintock, 1984).
Transposable elements (TEs) reside in the genomes of virtually all organisms (Berg and Howe, 1989). TEs encode enzymes that bring about the insertion of an identical copy of themselves into a new DNA site. Transposition events involve both recombination and replication processes that frequently generate two daughter copies of the original transposable element; one remains at the parental site, while the other appears at the target site (Shapiro, 1983).
Two major classes of eukaryotic TEs have been identified, which are distinguished by their mode of transposition (Finnegan, 1989). Class I elements transpose via the creation of an RNA intermediate that is then reverse-transcribed to create a DNA copy that integrates at the target site. This class includes several families of retroelements (retrotransposons and retroviruses), including the copia elements of Drosophila melanogaster, the gypsy/Ty3 family, the Ty1 element of yeast, and the mammalian immunodeficiency and Rous sarcoma (RSV) retroviruses. Each of these retroelement families is characterized in part by the presence of long terminal repeats (LTRs) at their borders (Finnegan, 1989); however, this class also includes non-LTR-containing elements like Cin4 from maize (Schwarz-Sommer and Saedler, 1988) and the mammalian L1 family (Hutchinson et al., 1989).
The copia elements in D. melanogaster possess long terminal direct repeats. There are more than 11 families of copia-like elements; the members of each are well-conserved and are located at 5 to 100 different sites in the Drosophila genome. These elements are about 5000 base pairs (bp) long, with long terminal repeats (LTRs) several hundred bp in length that vary in both sequence and length between families. At the termini of each element are short imperfect inverted repeats of about 10 bp.
Insertion of copia into a new chromosomal site is accompanied by replication of a 3-6 bp stretch of target DNA; the length, but not the sequence, of the direct repeats that consequently appear immediately before and after the element is the same for all members of the same family. Copia elements have one long open reading frame (ORF) that encodes proteins homologous to those of RNA tumor viruses: homologies to reverse transcriptase, integrase, and nucleic acid-binding proteins suggest that these proteins function to create an RNA intermediate for copia transposition.
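The duplication of target DNA that accompanies insertion can be illustrated with a minimal sketch. The function names, the toy sequences, and the choice of a 5 bp duplication are assumptions made purely for illustration (the text above gives only a 3-6 bp range); the code models the resulting sequence arrangement, not the integration biochemistry.

```python
from typing import Tuple


def insert_with_tsd(target: str, element: str, pos: int, tsd_len: int) -> str:
    """Insert `element` into `target` so that the `tsd_len` bases starting
    at `pos` appear on both sides of the element (target-site duplication)."""
    if tsd_len < 0 or pos < 0 or pos + tsd_len > len(target):
        raise ValueError("duplicated stretch must lie within the target")
    site = target[pos:pos + tsd_len]
    # left flank + site + element + duplicated site + right flank
    return target[:pos] + site + element + site + target[pos + tsd_len:]


def flanking_repeats(seq: str, element: str, tsd_len: int) -> Tuple[str, str]:
    """Return the `tsd_len` bases immediately before and after `element`."""
    start = seq.index(element)
    end = start + len(element)
    return seq[start - tsd_len:start], seq[end:end + tsd_len]


# Hypothetical toy data: an 11 bp target and a 10 bp "element".
target = "AAACCGTTGGG"
element = "TGTAGCTACA"
inserted = insert_with_tsd(target, element, pos=3, tsd_len=5)
left, right = flanking_repeats(inserted, element, tsd_len=5)
print(inserted)
print(left, right)
```

In this sketch the two flanking repeats always match each other, while their sequence depends entirely on where in the target the element lands, which mirrors the statement that the length, but not the sequence, of the direct repeats is characteristic of a family.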
Class II elements, like the Drosophila melanogaster P element (Engels, 1989; Rio, 1990) and the maize Ac/Ds element (Federoff, 1989), transpose directly to new sites without the formation of an RNA intermediate. P elements reside at multiple sites in the Drosophila genome and are 0.5 to 1.4 kb in length, bounded by perfect inverted repeats of 31 bp. They represent internally deleted versions of a larger element of about 3 kb called a P factor, which occurs in one or a few copies only in so-called "P strains" of Drosophila. Upon insertion into a new site in the genome, P elements create 8 bp duplications of the target sequence.
The Ac/Ds system in maize consists of Ds elements, which, like the P elements of Drosophila, are derived from a larger complete element called Ac. Ds elements exist in several different lengths, from 0.4 to 4 kb. Unlike P elements, Ds elements remain stationary within the chromosome unless an Ac element is also present. Ds elements contain perfect inverted repeats of 11 bp at their termini, flanked by 6-8 bp direct repeats of the target DNA. When a Ds (or Ac) element transposes, it leaves behind imperfect but recognizable duplications of the 6-8 bp target sequence.
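A perfect terminal inverted repeat of the kind described for P and Ds elements means that the first n bases of the element equal the reverse complement of its last n bases. The following is a minimal sketch of that check; the function names and the 11 bp toy terminus are hypothetical and are not actual Ds or P element sequences.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_perfect_terminal_inverted_repeat(element: str, n: int) -> bool:
    """True if the first `n` bases equal the reverse complement of the last `n`."""
    if n <= 0 or 2 * n > len(element):
        return False
    return element[:n] == reverse_complement(element[-n:])


# Hypothetical element: an 11 bp terminus, an internal region, and the
# reverse complement of the terminus at the other end.
left_end = "GATTACAGGCA"
toy_element = left_end + "TTTTCCCC" + reverse_complement(left_end)
print(has_perfect_terminal_inverted_repeat(toy_element, 11))

# A single mismatch in the right terminus makes the repeat imperfect.
mutated = toy_element[:-1] + "A"
print(has_perfect_terminal_inverted_repeat(mutated, 11))
```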
As stated above, it appears likely that many interspersed repetitive DNA families are, or once were, transposons. In soybean, an interspersed repetitive DNA family whose structural characteristics clearly define it as a transposon family is the Tgm family. The Tgm family is related to the maize En/Spm transposons and consists of fewer than 50 members ranging in size from under 2 kb to greater than 12 kb (Rhodes and Vodkin, 1988).
Retroviruses are Class I transposable elements having an RNA genome that replicates through a DNA intermediate. Although the viral genome is RNA, the intermediate in replication is a double-stranded DNA copy of the viral genome called the provirus (Watson et al., 1987). The provirus resembles a cellular gene and must integrate into host chromosomes in order to serve as a template for transcription of new viral genomes (Varmus, 1982). New genomes are processed in the nucleus by unmodified cellular machinery.
The viral genome RNA looks like a cellular messenger RNA (mRNA), but does not serve as such following infection of a cell. Instead, an enzyme called reverse transcriptase (which is not present in the cell, but is instead carried by the virion) makes a DNA copy of the viral RNA genome, which then undergoes integration into cellular chromosomal DNA as a provirus. Integration of the viral DNA is precise with respect to the viral genome, but is semi-random with respect to the host cell genome, in that some sites are utilized more frequently than others (Shih et al., 1988). The integrated provirus serves as a template for production of new viral RNA genomes, which move to the cell membrane to assemble into virions. These bud from the cell membrane without killing the cell.
Retrovirus virions have icosahedral nucleocapsids surrounded by a proteinaceous envelope. The retroviral genome is diploid, and its general organization is well-known in the art. Typical retroviruses have three protein-encoding genes: gag (group-specific antigen) encodes a precursor polypeptide that is cleaved to yield the capsid proteins; pol encodes a precursor that is cleaved to yield reverse transcriptase and an enzyme involved in proviral integration; and env encodes the precursor to the envelope glycoprotein. A fourth type of retroviral gene, called tat, which serves as a transcriptional enhancer, has been found at the 3′ end of the HTLV-I and -II genomes. A few retroviruses have additional genes, such as onc, that give them the ability to rapidly induce certain types of cancer.
Retroviral genomes contain LTR sequences at both their 5′ and 3′ ends (Weiss, 1984). These sequences include signals needed for replication, transcription, and post-transcriptional processing of viral RNA transcripts. The LTRs are perfect direct repeats created by the addition of sequences (called U5 and U3, derived from the opposite ends of the viral genome) to each end of the viral genome during the creation of the double-stranded DNA intermediate. The U5 region appears to be essential for initiation of reverse transcription and for packaging of viral transcripts (Murphy and Goff, 1988). The U3 region contains a number of cis-acting signals for viral replication, and sequences responsible for much or all of the transcriptional control over viral genes.
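The way identical LTRs arise at both ends of the provirus can be sketched as follows. The sketch assumes the standard textbook layout in which the genomic RNA carries a short terminal repeat, R, at both ends (R-U5 at the 5′ end and U3-R at the 3′ end); R is not named in the text above and is an assumption here, as are the function name and the toy sequences. DNA letters are used throughout for simplicity.

```python
from typing import Tuple


def provirus_from_genome(r: str, u5: str, body: str, u3: str) -> Tuple[str, str, str]:
    """Return (genome, ltr, provirus) for toy region sequences.

    genome   : R-U5-body-U3-R        (assumed layout of the viral genome)
    ltr      : U3-R-U5               (long terminal repeat)
    provirus : LTR-body-LTR          (double-stranded DNA intermediate)
    """
    genome = r + u5 + body + u3 + r
    # U3 is copied to the 5' end and U5 to the 3' end, so both ends
    # of the provirus carry the same U3-R-U5 unit.
    ltr = u3 + r + u5
    provirus = ltr + body + ltr
    return genome, ltr, provirus


# Hypothetical toy regions, far shorter than real ones.
genome, ltr, provirus = provirus_from_genome(r="GG", u5="AAAA", body="CATCATCAT", u3="TTT")
print(genome)
print(ltr)
print(provirus)
```

Under these assumptions the provirus is longer than the genome by exactly the combined length of U3 and U5, which is the sense in which sequences are "added" to each end.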
Retroviral genomes also contain a primer binding site (PBS) near the 5′ end (Dahlberg et al., 1974). This sequence is complementary to the 3′ end of a cellular tRNA. The tRNA is stolen from the host cell during replication and serves as a primer for reverse transcription of the RNA genome soon after infection.
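Because the PBS is complementary to the 3′ end of a tRNA, a candidate PBS can be located in a sequence by searching for the reverse complement of the tRNA's 3′-terminal bases. The sketch below assumes a caller-chosen match length `n` (the text above does not specify one), treats RNA as DNA letters by replacing U with T, and uses made-up sequences; `find_pbs` is a hypothetical helper name.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pbs(genome: str, trna: str, n: int) -> int:
    """Return the index in `genome` of a site complementary to the last `n`
    bases of `trna`, or -1 if no such site exists."""
    if n <= 0 or n > len(trna):
        raise ValueError("n must be between 1 and the tRNA length")
    # Normalize both sequences to uppercase DNA letters.
    trna_3p = trna.upper().replace("U", "T")[-n:]
    pbs = reverse_complement(trna_3p)
    return genome.upper().replace("U", "T").find(pbs)


# Hypothetical toy sequences (not real tRNA or viral sequences).
toy_trna = "GGCCUUAGCUCACCA"
toy_genome = "AACCAA" + "TGGTGAGC" + "TTTACG"
print(find_pbs(toy_genome, toy_trna, n=8))
```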
Once the provirus is integrated into cellular chromosomal DNA, it is stable and replicates along with the host cell DNA. Proviruses are never excised from the site of integration, although they may be lost as a result of deletions. Retrovirus infections usually do not harm the cell, and infected cells continue to divide, with the integrated provirus serving as a template to direct viral RNA synthesis.
Like all viruses, retroviruses require interaction with a specific target cell-surface receptor molecule for infection. In all cases known (and suspected), this molecule is a protein that interacts specifically with a particular virion env protein. The best-studied virion envelope protein-cell surface receptor interaction is that of HIV with the CD4 receptor on human T-cells (Dalgleish et al., 1984). The env protein appears to bind to a small region on the receptor not involved in cell-cell recognition or any other known function. Another retrovirus whose cellular receptor has been identified is Moloney murine leukemia virus (MMLV), which interacts with a cell surface protein that resembles a membrane pore or channel protein. Although the mechanism of interaction of many retroviruses is not yet well understood, it does appear that retroviruses interact with a wide variety of receptor types (Weiss, 1982).
Retroviruses have been studied intensely over the past several decades, mainly because of their ability to cause tumors in animals and to transform cells in culture. The ability of retroviruses to transform cells is based on at least two mechanisms. The first is that certain viruses have incorporated activated proto-oncogenes that upon mutation have acquired the ability to transform cellular growth. The second mechanism of transformation results from insertional mutagenesis upon integration of the viral genome. Because the viral LTRs have promoter and enhancer activities, insertion of an LTR sequence in either orientation adjacent to a cellular gene may lead to inappropriate expression of that gene. If the cellular gene is involved in regulation of cell growth, over- or under-expression or insertional mutagenesis of that gene may lead to uncontrolled growth of the cell.
Retroviral integration is thus potentially mutagenic. Integration of retrotransposons within exonic coding regions may inactivate those genes, while integration within introns or flanking regions may create novel regulatory patterns with significant developmental and evolutionary implications (McDonald, 1990; Robins and Samuelson, 1993; Schwarz-Sommer and Saedler, 1987; Weil and Wessler, 1990; White et al., 1994). Enhancers and trans-activating sequences have been found in retroviral and retrotransposon LTRs (Boeke, 1989; Cavarec et al., 1994; Choi and Faller, 1994; Lohning and Ciriacy, 1994; Mellentin-Michelotti et al., 1994; Varmus and Brown, 1989), and retrotransposon insertions between coding regions and enhancers disrupt gene expression (Cal and Levine, 1995; Georgiev and Corces, 1995; Geyer and Corces, 1992; White et al., 1994).
Element mobilization not only modifies target gene activity, it restructures genomic architecture (King, 1992; Lim and Simmons, 1994; McDonald, 1993; Shapiro, 1992). In fact, one of the major genomic differences between related taxonomic groups appears to be the identity and distribution of repetitive elements, not single-copy coding sequences (McDonald, 1993; Shapiro, 1992). White et al. (1994) have demonstrated that the flanking regions of many maize genes are embedded in sequences containing traces of retrotransposon DNA. Moreover, Palmgren (1994) has found that the BstI retroelement from maize encodes two conserved domains found in plant membrane H+-ATPases, suggesting that element acquisition of host sequences is not confined to vertebrate retroviruses.
McClintock (1984) has proposed that genetic variation, induced in part by transposable element-mediated insertional mutagenesis, is a directed response to conditions that create "genomic stress." Many TEs and retroviruses preferentially insert in transcriptionally active regions of the genome (Engels, 1989; Sandmeyer et al., 1990; Varmus and Brown, 1989). The Ty1 retrotransposon in yeast can be activated by growth in suboptimal temperatures (Paquin and Williamson, 1988) and by exposure to radiation (McEntee and Bradshaw, 1988). Similar observations have been made in Drosophila (McDonald et al., 1988; Strand and McDonald, 1985), maize (McClintock, 1984), and soybean (Sheridan and Palmer, 1977).
In plants, TEs are activated during the induction of tissue culture (Hirochika, 1993; Peschke and Phillips, 1991) and may contribute to somaclonal variation observed for a number of higher plant species including soybean (Amberger et al., 1992; Freytag et al., 1989; Graybosch et al., 1987; Roth et al., 1989). In maize, the activation of transposable elements is correlated with changes in the pattern of DNA methylation that occur during induction of cultures (Brettell and Dennis, 1991; Kaeppler and Phillips, 1993; Peschke et al., 1991), providing a well-characterized basis for gene activation.
In plants, most transposon-like sequences appear to be extinct (Grandbastien, 1992). Although a number of plant species harbor these sequences (Flavell et al., 1992; Grandbastien, 1992; Voytas et al., 1992), active transposition has only been demonstrated or directly implicated in tobacco (Grandbastien et al., 1989; Pouteau et al., 1994) and maize (Johns et al., 1985). RNA transcripts and cDNAs from transposons have been recovered from tobacco (Pouteau et al., 1994; Hirochika, 1993) and maize (Hu et al., 1995), and transposable element-related proteins have been detected in maize (Hu et al., 1995).
The stable introduction of foreign genes into plants represents one of the most significant developments in a continuum of advances in agricultural technology that includes modern plant breeding, hybrid seed production, farm mechanization, and the use of agrichemicals to provide nutrients and control pests. Genetic engineering has been applied to many species in efforts to improve production efficiency and environmental conservation. Genetic engineering complements plant breeding efforts by increasing the diversity of genes and germplasm available for incorporation into crops and shortening the time required for the production of new varieties and hybrids, while also providing opportunities to develop new agricultural products and manufacturing processes.
The first transgenic plants were tobacco plants transformed with a chimeric neomycin phosphotransferase gene carried on the Ti plasmid of Agrobacterium tumefaciens (Horsch et al., 1984). Agrobacterium-mediated Ti plasmid transfer has proved to be an efficient, versatile method of plant transformation. The range of plant species amenable to genetic engineering using Agrobacterium is fairly large. In those systems where Agrobacterium-mediated transformation is efficient, it is the method of choice because of the facile and defined nature of the gene transfer.
Few monocotyledonous plants appear to be natural hosts for Agrobacterium, however, although transgenic plants have been produced in asparagus and transformed tumors have been observed in yam. Many commercially valuable crop species, such as cereal grains (e.g., rice, maize, and wheat), are not efficiently transformed by Agrobacterium, despite extensive efforts made in this direction. This appears to be due to differences in the wound response; those species recalcitrant to Agrobacterium-mediated transformation probably do not express the required wound response (Potrykus, 1991).
Physical methods of gene delivery have been developed in order to transform plants not susceptible to Agrobacterium. These methods include biolistic projection ("particle gun"), microinjection, electroporation, and lipofection (Potrykus, 1991). Most physical transformation experiments have utilized plant protoplasts as the recipient cells; however, other regenerable explants have been utilized, including leaves, stems, and roots. Many plant species have been successfully transformed with physical techniques, but some, notably legumes and cereals, have proved difficult to stably transform by these methods. The applicability of such physical methods to these plants is limited by the difficulties involved in regenerating plants from protoplasts, although some success in this regard has been achieved with some cereals and rice. Little success has been achieved with soybean or maize.
Little experimentation has been reported regarding the use of viral vectors for transformation of plants. Plant viruses exist in a variety of forms; they contain either DNA or RNA as their genetic material, have either rod- or polyhedral-shaped capsids, and can be transmitted by insects, bacteria, or contact with wounded regions (Robertson et al., 1983). Most known plant viruses contain single (+) strand RNA as their genetic material. (+) strand plant viruses can further be divided into those which possess a single RNA chain and those which have several RNA chains, each of which is necessary for viral infectivity and is packaged in its own virion. Cowpea mosaic virus, for example, contains two RNAs, one encoding several proteins including terminal protein and a protease, with the other chain encoding capsid proteins. There also exist segmented double-strand RNA plant viruses. The best-known of these is wound tumor virus (WTV), which contains 12 different segments and which can replicate in either insect or plant cells.
There are fewer plant DNA viruses. Only two known classes exist, one of which contains double-stranded DNA and has a polyhedral capsid. The best understood of this class is cauliflower mosaic virus (CaMV). The second class of DNA plant viruses comprises the geminiviruses, which consist of paired capsids held together like twins, with each capsid containing a circular single-stranded DNA of about 2500 nucleotides. In some cases, the two paired genomes are identical, while in other cases, the two bear almost no sequence relationship.
Early work with a DNA virus showed that a small bacterial antibiotic resistance gene integrated into such a virus could spread systemically throughout infected plants and confer resistance (Brisson et al., 1984). It has been suggested that the small size of DNA viral genomes prohibits the wide application of such vectors as useful transforming agents in plants. However, little has been done to follow up on this work.
Even less work has been performed in plants regarding the application of genetic engineering to the far larger group of plant RNA viruses (Ahlquist et al., 1987; Ahlquist and Pacha, 1990). It has been suggested that because the viral RNA does not integrate into the host genome, and is excluded from the meristems and offspring, the usefulness of such RNA viruses in plant transformation is limited at best (Potrykus, 1991).
In one aspect, the present invention provides retroviral and retroviral-like polynucleotides derived from a plant wherein such polynucleotides are capable of integration into the genome of a plant cell. The invention is also directed to other plant retroviral or retroviral-like polynucleotides obtainable by hybridization under stringent conditions (see, e.g., Sambrook et al.) with the retroviral or retroviral-like polynucleotides expressly disclosed herein. Also within the scope of this aspect of the invention are regulatory sequences comprising, for example, plant retroviral long terminal repeat (LTR) sequences that may be operably linked to a gene so as to modulate expression of the linked gene.
In a second aspect, the invention is directed to plant retroviral or retroviral-type elements capable of targeted integration into a specific region in the plant genome and further to methods for accomplishing such integration.
In a third aspect, the present invention is directed to vectors containing all or part of a regulatory sequence derived from a plant retrovirus or retrovirus-like polynucleotide, and to vectors comprising all or part of the retroviral or retroviral-like genome and a heterologous gene.
In a fourth aspect, the invention is directed to vectors containing one or more plant retroviral or retroviral-like regulatory sequences operably linked to a heterologous gene. A heterologous gene in the context of the present application refers to a gene or gene fusion or a part of a gene derived from a source other than the plant pro-retrovirus, or a cDNA, or a plant retroviral gene under the regulatory control of a promoter other than its natural promoter.
In a fifth aspect, the invention is directed to isolated purified proteins encoded by the polynucleotides disclosed herein, and to analogs, homologs, and fragments of such proteins that retain at least one biological property of the proteins.
In a sixth aspect, the invention is directed to isolated purified proteins produced by expression of a heterologous gene using the vectors of the present invention.
In a seventh aspect, the invention is directed to methods for using vectors comprising all or part of a plant proretroviral or retroviral genome and vectors comprising plant retroviral regulatory sequences operably linked to a heterologous gene to introduce a heterologous gene or a regulatory element into a plant genome, wherein the expression product of the gene comprises a polypeptide or an antisense RNA and wherein the regulatory element is a transcriptional regulatory element.
In an eighth aspect, the invention is directed to a plant retrovirus comprising a plant retroviral or retroviral-like polynucleotide, a capsid, and an envelope.
In a ninth aspect, the invention is directed to methods for producing a plant retrovirus, in which the plant retroviral polynucleotide is packaged in a capsid and envelope, preferably through the use of a packaging cell line, but alternatively by use of other vector systems or by in vitro constitution of the retroviral capsid and envelope.
In a tenth aspect, the invention is directed to plant cells that have been transformed by transduction of a plant retroviral polynucleotide or transformed by a plant retrovirus comprising a heterologous gene according to the methods of the present invention.