This invention relates to the directional assembly of large genomes, and more specifically, to the directional assembly of large viral genomes.
The genomes of viruses, bacteria, plants and other organisms (including humans) are being systematically cloned and sequenced. Methods are needed to directionally assemble smaller DNA subclones into full-length, functionally intact genomes or chromosomes of these organisms. Such methods could allow for the precise genetic manipulation of individual chromosomes in whole plants and animals and the construction of artificial chromosomes for gene therapy. Conventional approaches have generally not been successful because of the large size of the target nucleic acid and the inability to systematically assemble individual DNA clones into a full-length genome.
Presently known methods for genetically manipulating the genomes of many viruses, plants, animals, and bacteria generally use recombination or transduction methods to introduce foreign sequences or alter genes in the genomes of organisms. These methods can be problematic depending on the payload sequences being introduced and the biology of the organism. In addition, multiple genetic manipulations/recombination events may be required to construct the appropriate genotype.
Molecular genetic analysis of the structure and function of RNA virus genomes has been profoundly advanced by the availability of full-length cDNA clones, the source of infectious RNA transcripts that replicate efficiently when introduced into permissive cell lines. See P. Ahlquist, et al., Proc. Natl. Acad. Sci. USA 81, 7066-7070 (1984); J. C. Boyer et al., Virology 198, 415-426 (1994). Recombinant DNA technology has allowed the isolation of infectious cDNA clones from a variety of positive-stranded RNA viruses including picornaviruses, caliciviruses, alphaviruses, flaviviruses and arteriviruses, whose RNA genomes range in size from approximately 7 to 15 kb. See Agapov, E. V. et al., Proc. Natl. Acad. Sci. USA 95, 12989-12994 (1998); Davis, N. L., et al., Virology 171, 189-204 (1989); Racaniello, V. R. et al., Science 214, 916-919 (1981); Rice, C. M., et al., New Biol. 1, 285-296 (1989); Rice, C. M., et al., J. Virology 61, 3809-3819 (1987); Sosnovtsev, S. et al., Virology 210, 383-390 (1995); Sumyoshi, H., et al., J. Virol. 66, 5425-5431 (1992); Van Dinten, L. C. et al., Proc. Natl. Acad. Sci. USA 94, 991-996 (1997).
The order Nidovirales (the Nidoviruses) includes mammalian, positive polarity, single-stranded RNA viruses in the arterivirus and coronavirus families. Cavanagh, D., et al., Arch. Virol. 128, 395-396 (1993); De Vries, A. A. F., et al., Semin. Virol. 8, 33-47 (1997). Coronaviridae (the coronavirus family) includes the coronavirus and torovirus genera. See Cavanagh et al., supra; Snijder, E. J. et al., J. Gen. Virol. 74, 2305-2316 (1993). Despite significant differences in genome size (13-32 Kb), the polycistronic genome organization and the regulation of gene expression from a nested set of subgenomic mRNAs are similar for all members of the order. See De Vries et al., supra and Snijder et al., supra.
Coronaviridae contain a linear, single-stranded, positive polarity RNA genome of about 27,000-32,000 nucleotides in length. As such, the family contains the largest known RNA viral genomes. Lai, M. M. C. et al., Adv. Virus Res. 48, 1-100 (1997); Siddell, S. G. The Coronaviridae, an introduction, in The Coronaviridae (Plenum Press, New York. Pgs 1-10 (1995)). Transmissible gastroenteritis virus (TGE), a group I coronavirus, contains an approximately 28.5 Kb genomic RNA that is packaged into a helical nucleocapsid structure and is surrounded by an envelope that contains three virus-specific glycoproteins, including the spike glycoprotein (S), the membrane glycoprotein (M), and a small envelope glycoprotein (E). See Eleouet, J. F., et al., Virology 206, 817-822 (1995); Enjuanes, L. et al., Molecular basis of transmissible gastroenteritis coronavirus (TGE) epidemiology, in The Coronaviridae (S. G. Siddell, ed., pp 337-376. Plenum Press, New York (1995)); Rasschaert, D. et al., J. Gen. Virol. 68, 1883-1890 (1987); Risco, C., et al., J. Virol. 70, 4773-4777 (1996).
The TGE genome is polycistronic and encodes nine large open reading frames (ORFs) which are expressed from full length or subgenomic length mRNAs during infection. Sethna, P. B., et al., J. Virol. 65, 320-325 (1991); Sethna, P. B., et al., Proc. Natl. Acad. Sci. USA 86, 5626-5630 (1989). The 5′-most 20 Kb (approximately) contains the RNA replicase genes, which are encoded in two large ORFs designated 1a and 1b, the latter of which is expressed by ribosomal frameshifting. Eleouet, J. F. et al., supra. ORF1a encodes at least two viral proteases and several other nonstructural proteins, while ORF1b contains polymerase, helicase and metal binding motifs typical of an RNA polymerase. See Eleouet, et al., supra; Gorbalenya, A. E., et al., Nucleic Acids Res. 17, 4847-4861 (1989). In the 3′-most 9 Kb (approximately) of the TGE genome, each of the downstream ORFs is preceded by a highly conserved intergenic sequence element, which directs the synthesis of each of the six or seven subgenomic RNAs. See Chen, C. M., et al., Virus Res. 38, 83-89 (1997); Eleouet et al., supra; Enjuanes, et al., supra; Tung, F. Y. T., et al., Virology 186, 676-683 (1992). These subgenomic mRNAs are arranged in a nested set structure from the 3′ end of the genome and contain a leader RNA sequence derived from the 5′ end of the genome. See Lai, M. M. C. et al., supra; McGoldrick, A., et al., Arch Virol. 4, 763-770 (1999); Sethna (1991), supra; Sethna (1989), supra. In addition to the viral mRNAs, full length and subgenomic length negative strand RNAs are implicated in mRNA synthesis. Almazan, F., et al., Proc. Natl. Acad. Sci. USA 97, 5516-5521 (2000). Another unique feature of coronavirus replication is the high RNA recombination frequencies associated with infection. Baric, R. S., et al., Virology 177, 646-656 (1990); Kuo, L., et al., J. Virol. 74, 1393-1406 (2000); Lai et al., supra.
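The nested set structure described above can be illustrated with a minimal Python sketch. It models only the end product: each subgenomic mRNA is taken to be the 5′ leader joined to the portion of the genome running from a body start position to the 3′ end. The function name `subgenomic_mrnas`, the parameter names, and the toy sequence are hypothetical and chosen for illustration only; real intergenic sequence positions and the transcription mechanism itself are not modeled.

```python
from typing import List


def subgenomic_mrnas(genome: str, leader_len: int, body_starts: List[int]) -> List[str]:
    """Return a nested set of transcripts: leader + genome[start:] for each start.

    genome      -- toy genome sequence written 5' to 3' (hypothetical input)
    leader_len  -- length of the 5' leader shared by every transcript
    body_starts -- 0-based positions at which each mRNA body begins
    """
    if not 0 < leader_len < len(genome):
        raise ValueError("leader_len must fall inside the genome")
    leader = genome[:leader_len]
    mrnas = []
    for start in sorted(body_starts):
        if not leader_len <= start < len(genome):
            raise ValueError(f"body start {start} lies outside the body region")
        # Every transcript runs to the 3' end, so shorter ones are
        # contained in longer ones (the "nested set").
        mrnas.append(leader + genome[start:])
    return mrnas


if __name__ == "__main__":
    toy_genome = "LLLaaabbbccc"  # "LLL" stands in for the leader
    for mrna in subgenomic_mrnas(toy_genome, 3, [9, 3, 6]):
        print(mrna)
```

In this toy model every transcript shares both the leader and the 3′ terminus, which is the property the term "nested set" refers to.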
The large size of the coronavirus genome, coupled with the inability to clone portions of the polymerase gene in microbial vectors, has hampered the ability to perform precise manipulations and reverse genetics in Coronaviridae. Recently, a full length cDNA clone of TGE was assembled in a bacterial artificial chromosome (BAC) vector. See Almazan, F., et al., supra. However, the assembly of large RNA and DNA genomes using these BAC vector methods remains problematic.
The family of coronaviruses includes viruses that are responsible for severe economic losses in the swine, cattle and poultry industries and that cause about 30% of the common colds in humans. In children and infants, human coronaviruses may cause more serious lower respiratory tract infections including bronchitis, bronchiolitis and pneumonia. Transmissible gastroenteritis virus (TGE) causes acute diarrhea in piglets, often resulting in mortality rates approaching 100% and an estimated annual loss of more than $30 million in the US alone. Infectious bronchitis virus (IBV) causes severe lower respiratory tract infection in poultry, resulting in losses of approximately $20 million each year. Since presently known TGE and IBV vaccines have not been effective at reducing the severity of disease, new methods are needed to efficiently engineer recombinant TGE viral vaccines and to use these viruses to deliver other antigens from highly virulent pathogenic microorganisms of swine.
The unique replication strategy of coronaviruses makes them attractive candidate vectors to express multiple foreign genes. TGE vectors engineered to express multiple recombinant proteins or foreign antigens from highly pathogenic microorganisms may be effective at reducing overall economic losses from infectious agents in, for example, swine.
The present invention relates to a simple, systematic method for assembling functional full-length genomes of large RNA and DNA viruses. The invention is exemplified by, although not limited to, the assembly of full-length, functional coronavirus genomes. The present inventors have successfully assembled a full length infectious clone of transmissible gastroenteritis virus (TGE). Using a novel approach, six adjoining cDNA subclones that span the entire TGE genome were isolated. Each clone was engineered with unique flanking interconnecting junctions which dictate a precise, systematic assembly with only the correct adjacent cDNA subclones, resulting in an intact TGE cDNA construct of approximately 28.5 Kb in length. Transcripts derived from the full-length TGE construct were found to be infectious, and progeny virions were serially passaged in permissive host cells. Viral antigen and subgenomic mRNA synthesis were evident during infection and throughout passage. Plaque-purified virus derived from the infectious construct was found to replicate efficiently in permissive host cells. The recombinant viruses were sequenced across the unique interconnecting junctions, confirming the presence of the unique marker mutations and restriction sites that were engineered into the component clones. Among other advantages, full-length infectious clones of TGE permit the precise genetic modification of the coronavirus genome.
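The logic by which unique interconnecting junctions dictate a single assembly order can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the laboratory procedure: each subclone is reduced to a name and two junction labels, a junction is treated as compatible only with an identical label on the neighboring subclone, and the names `Subclone`, `assembly_order`, `clone1` through `clone6`, and `J1` through `J5` are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional


class Subclone(NamedTuple):
    name: str
    left: Optional[str]   # junction label at the 5' end; None marks the genome's 5' terminus
    right: Optional[str]  # junction label at the 3' end; None marks the genome's 3' terminus


def assembly_order(subclones: List[Subclone]) -> List[str]:
    """Return the only order in which the subclones can join end to end."""
    # Index subclones by their 5' junction; a repeated label would make
    # the assembly ambiguous, so it is rejected.
    by_left: Dict[str, Subclone] = {}
    for sc in subclones:
        if sc.left is None:
            continue
        if sc.left in by_left:
            raise ValueError(f"junction {sc.left!r} is not unique")
        by_left[sc.left] = sc

    starts = [sc for sc in subclones if sc.left is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one 5'-terminal subclone")

    # Walk from the 5'-terminal subclone, following matching junctions.
    order = [starts[0]]
    while order[-1].right is not None:
        nxt = by_left.get(order[-1].right)
        if nxt is None:
            raise ValueError(f"no subclone matches junction {order[-1].right!r}")
        if len(order) >= len(subclones):
            raise ValueError("junctions form a cycle")
        order.append(nxt)

    if len(order) != len(subclones):
        raise ValueError("not every subclone was incorporated")
    return [sc.name for sc in order]


if __name__ == "__main__":
    # Six hypothetical subclones supplied in arbitrary order.
    parts = [
        Subclone("clone4", "J3", "J4"),
        Subclone("clone1", None, "J1"),
        Subclone("clone6", "J5", None),
        Subclone("clone3", "J2", "J3"),
        Subclone("clone5", "J4", "J5"),
        Subclone("clone2", "J1", "J2"),
    ]
    print(assembly_order(parts))
```

Because every junction label appears on exactly one 3′ end and one 5′ end, the input order of the subclones does not matter; a duplicated or unmatched junction raises an error rather than yielding an ambiguous assembly.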
Accordingly, a first aspect of the present invention is a method of assembling a recombinant viral genome by obtaining a set of subclones of the viral genome, wherein the termini of each subclone are restriction sites, and then ligating the subclones to form a recombinant viral genome. The genome is preferably a full-length viral genome that has the same activity (function) as the natural genome, and more preferably is an infectious viral genome (i.e., is able to infect permissive cells). In certain embodiments, the subclones comprise mutations (i.e., have sequences that are different from the wild type genome). In other embodiments, the assembled genome further comprises a heterologous nucleic acid. In a preferred embodiment, the viral genome is a coronavirus genome. Recombinant viral genomes produced by the present invention are an additional aspect of the present invention. Methods of infecting cells with genomes of the present invention are yet another aspect of the present invention. In preferred embodiments of these methods, the genomes are vectors that express heterologous nucleic acid in the cell.
The foregoing and other aspects of the present invention are explained in detail in the specification set forth below.