Advances in recombinant DNA technology and genetic engineering have provided a means for producing in bacteria specific proteins of commercial and economic importance. In many instances, the specific proteins sought to be produced in bacteria are of eucaryotic origin. In the course of developing such bacterial host factories for eucaryotic protein production, it has become increasingly evident that bacteria are unable to consistently provide the post-translational modifications such as proper protein folding, glycosylation and the like required for functional eucaryotic protein production. It is, therefore, desirable to develop eucaryotic cell systems which can provide these post-translational modifications and thus become efficient factories for functional and/or antigenically homologous eucaryotic protein production.
A key element in the genetic engineering of both eucaryotic and procaryotic cells to effect heterologous protein production is the development of defined vectors or host-vector systems. "Vectors" or "vector systems" are herein defined as nucleic acid (e.g. DNA and/or RNA) molecules capable of providing for the replication of a desired (e.g. heterologous) nucleic acid sequence or sequences in a host cell. "Host-vector" systems are herein understood to mean host cells capable of accepting a given vector molecule into their genomes. The term "genome" is herein defined as including any and all DNA, e.g. chromosomal and episomal, contained within a given host cell or virus particle. A "gene" is herein defined as comprising all the DNA required for expression of a DNA sequence, e.g. production of the protein or fragment thereof encoded in the gene.
While numerous vector systems have been developed for procaryotic hosts and for such eucaryotic hosts as yeast and various mammalian cell lines, few such systems have been described for plants. The term "plant" shall include whole plants, plant parts and individual plant cells unless otherwise specified. Enhanced or de novo protein production in plant hosts may be desirable due to the lower cost of plant cell culture systems as compared to mammalian cell culture systems and for increased production of plant secondary products. Plant secondary products can include such medically important plant products as shikonin, digitalis, vinblastine and vincristine. By means of genetic engineering, the production of such plant secondary products may be increased by providing and/or amplifying the rate-limiting enzymes in the production of such products and/or by increasing the number of copies per cell of a gene coding for such enzymes.
To date, only two vector systems have been described which allow for the introduction of a given gene in higher plants to effect desired protein production. The first system employs a tumor-inducing (Ti) plasmid or portion thereof found in the bacterium Agrobacterium. A portion of the Ti plasmid is transferred from the bacterium to plant cells when Agrobacterium infects plants and produces a crown gall tumor. This transferred DNA is hereinafter referred to as "transfer DNA" (T-DNA). The transfer DNA integrates into the plant chromosomal DNA and can be shown to express the genes carried in the transferred DNA under appropriate conditions. It has further been shown that whole plants regenerated from a single plant cell transformed with a transfer DNA carry the integrated DNA in all cells. These cells, however, generally carry only one to five copies of the transfer DNA and are thus limited in the amount of transfer DNA gene products which may be produced in the transformed plant cells. It is believed that by introducing multiple copies of a given DNA sequence or gene, greater levels of desired protein production may be achieved. Thus, it is desirable to develop a means for introducing or inducing more than about five copies of a given gene per host cell to effect increases in gene-specific products.
The second vector system employs cauliflower mosaic virus (CaMV) DNA as a vector for introduction of desired DNA sequences into plant cells. CaMV is a member of the caulimovirus group and contains a double-stranded DNA genome. To date, the CaMV system has only been applied to whole plants and requires infectious virus production. Thus, the CaMV vector system is limited by three important factors. The first is host range, the second is a limitation on the size of the desired DNA sequences which may be carried in the CaMV DNA vectors and the third is resultant disease caused by the introduction of whole virus DNA. The resultant disease associated with the currently applied CaMV vector systems prevents their potential use in the stable transformation of whole plants to effect such improvements as increased plant resistance to herbicides, resistance to other disease factors, increased protein production, increased crop yield and the like. Furthermore, infection of whole plants requires that the CaMV DNA retain the necessary viral functions for infectivity, replication, movement throughout the whole plant and packaging into infectious virus particles. As a result, the maximum size of a desired DNA sequence which may be carried in CaMV DNA vectors has thus far been limited to 240 base pairs (bp), only enough DNA to encode a small peptide. A further limitation of the CaMV vector system is that the heterologous or foreign DNA so introduced is not seed transmitted.
Recently, a new technique, called electroporation, has been developed for introduction of desired (e.g. heterologous) DNA sequences into a given host. Presently, however, this technique is limited by a low frequency of introduction and, in plants, is only operable in protoplasts. Hence, it cannot be practically employed to engineer plants where regeneration of whole plants from protoplasts is not possible.
It is, therefore, desirable to develop a plant vector system that is capable of carrying both small and large (e.g. greater than 250 bp) heterologous DNA sequences or genes, is able to generate a high copy number of introduced DNA sequences, and exhibits a broad host range.
To date, only one other group of plant viruses has been identified which contains a DNA, rather than RNA, genome. This group comprises the geminiviruses. Geminiviruses are plant viruses characterized by dumbbell-shaped twinned icosahedral particles (as seen by electron microscopy). Some geminiviruses comprise two distinct circular single-stranded (ss) DNA genomes. Examples of such two-genome or binary geminiviruses include tomato golden mosaic virus (TGMV), which has an "A" DNA and a "B" DNA (Hamilton, 1981, 1982, 1983, and 1984; Stein, 1983; and Bisaro, 1982), and cassava latent virus (CLV), which has a "1" DNA and a "2" DNA (Stanley and Gay, 1983). Other geminiviruses such as maize streak virus (MSV) are believed to have a single circular ssDNA genome (Donson, 1984). Typically, two-genome (binary) geminiviruses are transmitted by whiteflies, while single-genome geminiviruses are transmitted by leafhoppers. As a group, geminiviruses infect both monocotyledonous and dicotyledonous plants and thus exhibit a broad host range.
All geminivirus particles carry circular ssDNA. In infected plant cells, geminivirus DNA sequences have been detected as both ss and double-stranded (ds) DNA, in predominantly circular form. In infected plants, such sequences exist in the plant cell nuclei, apparently as episomes, at several hundred copies per nucleus. Thus, unlike the transfer DNA (T-DNA) derived from the Ti plasmids of Agrobacterium, these geminivirus DNA sequences are not integrated into plant chromosomal DNA and generate multiple copies (e.g. more than 5) per infected cell. In infected plants, geminivirus particles released by an infected cell can infect other cells throughout the plant. In the two-genome geminivirus systems such as TGMV, infectivity, replication and movement throughout the whole plant have thus far been shown to require the presence of both the A and B components. Other than reports that the two DNA genomes of binary geminiviruses are simultaneously required for complete and systemic infection of whole plants by binary geminiviruses, the precise mode of and requirements for geminivirus DNA replication itself in plants or plant cells have not been clearly elucidated.
Scientists have speculated about the possibility of using geminiviruses to create episomal DNA molecules which function as plasmids in plant cells (Buck, 1983). A "plasmid" is herein defined as an episomal DNA molecule capable of autonomous replication in a host cell.
In order for geminivirus DNA to be useful as a vector in plants, the DNA must be capable of autonomous replication in plants, be able to generate a high copy number in plants, be able to have inserted therein a heterologous DNA sequence or gene, and be able to simultaneously replicate itself and the inserted gene or DNA sequence in plant cells. Preferably, the DNA should also contain a marker for positive identification and/or selection of plants transformed with the vector and, ideally, should not cause disease symptoms. No one has heretofore taught how to modify geminivirus genomes so as to allow them to function in the foregoing manner.
The term "heterologous" as applied to a nucleic acid (e.g. DNA or RNA) sequence (molecule) or gene means a nucleic acid sequence or gene, respectively, at least a portion of which contains a nucleic acid sequence not naturally contained within a geminivirus genome. The term "heterologous" as applied to a protein means a protein at least a portion of which contains a protein sequence not naturally encoded by a geminivirus genome.