A. Alphaviruses
Togaviridae is a family of viruses that includes the genus alphavirus. Alphaviruses are enveloped viruses with a linear, positive-sense single-stranded RNA genome. Members of the alphavirus genus include at least 30 species of arthropod-borne viruses, including Aura (AURA), Babanki (BAB), Barmah Forest (BF), Bebaru (BEB), Buggy Creek, Cabassou (CAB), Chikungunya (CHIK), Eastern equine encephalitis (EEE), Everglades (EVE), Fort Morgan (FM), Getah (GET), Highlands J (HJ), Kyzylagach (KYZ), Mayaro (MAY), Middelburg (MID), Mucambo (MUC), Ndumu (NDU), O'nyong-nyong (ONN), Pixuna (PIX), Ross River (RR), Sagiyama (SAG), Salmon pancreas disease (SPDV), Semliki Forest (SF), Sindbis (SIN), Una (UNA), Venezuelan equine encephalitis (VEE), Western equine encephalitis (WEE) and Whataroa (WHA) virus (“The Springer Index of Viruses,” pgs. 1148-1155, Tidona and Darai eds., 2001, Springer, New York; Strauss and Strauss, Microbiol. Rev., 1994, 58, 491-562). Alphaviruses are evolutionarily differentiated based on the nucleotide sequences encoding the nonstructural proteins, of which there are four (nsP1, nsP2, nsP3 and nsP4). The genus segregates into New World (American) and Old World (Eurasian/African/Australasian) alphaviruses based on geographic distribution. It is estimated that New World and Old World viruses diverged between 2,000 and 3,000 years ago (Harley et al., Clin. Microbiol. Rev., 2001, 14, 909-932).
Among the alphavirus species, there are seven distinct serocomplexes (SF, EEE, MID, NDU, VEE, WEE and BF) into which members of the genus are sub-divided (Khan et al., J. Gen. Virol., 2002, 83, 3075-3084; Harley et al., Clin. Microbiol. Rev., 2001, 14, 909-932). Based on genomic sequence data from six of the seven serocomplexes, alphaviruses have been grouped into three large groups: VEE/EEE, SF and SIN. The VEE/EEE group is made up exclusively of New World viruses distributed across North America, Central America and South America. Members of this group include EEE, VEE, EVE, MUC and PIX. The SF group is primarily Old World, but contains one member (MAY) that is found in South America. Other members of the SF group include SF, MID, CHIK, ONN, RR, BF, GET, SAG, BEB and UNA. The SIN group is also primarily Old World, with the exception of AURA, which is a New World virus related to SIN and can be found in Brazil and Argentina. Other members of this group include SIN, WHA, BAB and KYZ. WEE, HJ and FM are considered recombinant viruses and are thus not included in any of the three groups. NDU and Buggy Creek are currently unclassified.
Many members of the alphavirus genus pose a significant health risk to humans, as well as horses, in many different geographic regions. EEE and WEE can both cause fatal encephalitis in humans and horses; however, EEE is more virulent, with a mortality rate of up to 50%, compared with 3-4% for WEE. VEE can also cause disease in humans and horses, but symptoms are typically flu-like and rarely lead to encephalitis. The encephalitis viruses are distributed primarily in the Americas (“The Springer Index of Viruses,” pgs. 1148-1155, Tidona and Darai eds., 2001, Springer, New York; Strauss and Strauss, Microbiol. Rev., 1994, 58, 491-562).
The SF group of Old World viruses, including RR, ONN and CHIK, has been associated with outbreaks of acute and persistent arthritis and arthralgia (joint pain) in humans. Epidemics of acute, debilitating arthralgia have been caused by ONN and CHIK in Africa and Asia. RR, which is the etiological agent of epidemic polyarthritis, is endemic to Australia and caused a major epidemic throughout the Pacific islands in 1979. The outbreak affected over 50,000 people on the island of Fiji. Other alphaviruses have been linked to acute and persistent arthralgia in northern Europe and South Africa. Although each virus induces a somewhat different disease, infection with RR, ONN or CHIK typically causes symptoms such as generalized to severe joint pain, fever, rash, headache, nausea, myalgia and lymphadenitis. It has been reported that arthralgia associated with alphavirus infection can persist for months or years. CHIK has also been associated with a fatal hemorrhagic condition (“The Springer Index of Viruses,” pgs. 1148-1155, Tidona and Darai eds., 2001, Springer, New York; Strauss and Strauss, Microbiol. Rev., 1994, 58, 491-562; Khan et al., J. Gen. Virol., 2002, 83, 3075-3084).
Another alphavirus causing human disease and mortality is MAY, which is found in the Caribbean and South America. Mayaro virus infection causes fever, rash and arthropathy (diseases of the joint), and exhibits a mortality rate of up to 7% (“The Springer Index of Viruses,” pgs. 1148-1155, Tidona and Darai eds., 2001, Springer, New York).
B. Bioagent Detection
A problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al., Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.
Much of the new technology being developed for detection of biological weapons incorporates a polymerase chain reaction (PCR) step based upon the use of highly specific primers and probes designed to selectively detect individual pathogenic organisms. Although this approach is appropriate for the most obvious bioterrorist organisms, like smallpox and anthrax, experience has shown that it is very difficult to predict which of hundreds of possible pathogenic organisms might be employed in a terrorist attack. Likewise, naturally emerging human diseases that have had devastating consequences for public health have come from unexpected families of bacteria, viruses, fungi, or protozoa. Plants and animals also have their natural burden of infectious disease agents, and there are equally important biosafety and security concerns for agriculture.
An alternative to single-agent tests is to do broad-range consensus priming of a gene target conserved across groups of bioagents. Broad-range priming has the potential to generate amplification products across entire genera, families, or, as with bacteria, an entire domain of life. This strategy has been successfully employed using consensus 16S ribosomal RNA primers for determining bacterial diversity, both in environmental samples (Schmidt et al., J. Bact., 1991, 173, 4371-4378) and in natural human flora (Kroes et al., Proc Nat Acad Sci (USA), 1999, 96, 14547-14552). The drawback of this approach for unknown bioagent detection and epidemiology is that analysis of the PCR products requires the cloning and sequencing of hundreds to thousands of colonies per sample, which is impractical to perform rapidly or on a large number of samples.
Conservation of sequence is not as universal for viruses; however, large groups of viral species share conserved protein-coding regions, such as regions encoding viral polymerases or helicases. As with bacteria, consensus priming has also been described for detection of several viral families, including coronaviruses (Stephensen et al., Vir. Res., 1999, 60, 181-189), enteroviruses (Oberste et al., J. Virol., 2002, 76, 1244-51; Oberste et al., J. Clin. Virol., 2003, 26, 375-7; Oberste et al., Virus Res., 2003, 91, 241-8), retroid viruses (Mack et al., Proc. Natl. Acad. Sci. U. S. A., 1988, 85, 6977-81; Seifarth et al., AIDS Res. Hum. Retroviruses, 2000, 16, 721-729; Donehower et al., J. Vir. Methods, 1990, 28, 33-46), and adenoviruses (Echavarria et al., J. Clin. Micro., 1998, 36, 3323-3326). However, as with bacteria, there is no adequate analytical method other than sequencing to identify the viral bioagent present.
In contrast to PCR-based methods, mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign organisms, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.
There is a need for a method for identification of bioagents which is both specific and rapid, and in which no culture or nucleic acid sequencing is required. Disclosed in U.S. Pre-Grant Publication Nos. 2003-0027135, 2003-0082539, 2003-0228571, 2004-0209260, 2004-0219517 and 2004-0180328, and in U.S. application Ser. Nos. 10/660,997, 10/728,486, 10/754,415 and 10/829,826, all of which are commonly owned and incorporated herein by reference in their entirety, are methods for identification of bioagents (any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus) in an unbiased manner by molecular mass and base composition analysis of “bioagent identifying amplicons” which are obtained by amplification of segments of essential and conserved genes which are involved in, for example, translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like. Examples of the products of these genes include, but are not limited to, ribosomal RNAs, ribosomal proteins, DNA and RNA polymerases, RNA-dependent RNA polymerases, RNA capping and methylation enzymes, elongation factors, tRNA synthetases, protein chain initiation factors, heat shock protein groEL, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, DNA gyrases and DNA topoisomerases, helicases, metabolic enzymes, and the like.
To obtain bioagent identifying amplicons, primers are selected to hybridize to conserved sequence regions which bracket variable sequence regions to yield a segment of nucleic acid which can be amplified and which is amenable to methods of molecular mass analysis. The variable sequence regions provide the variability of molecular mass which is used for bioagent identification. Upon amplification by PCR or other amplification methods with the specifically chosen primers, an amplification product that represents a bioagent identifying amplicon is obtained. The molecular mass of the amplification product, obtained by mass spectrometry for example, provides the means to uniquely identify the bioagent without a requirement for prior knowledge of the possible identity of the bioagent. The molecular mass of the amplification product or the corresponding base composition (which can be calculated from the molecular mass of the amplification product) is compared with a database of molecular masses or base compositions and a match indicates the identity of the bioagent. Furthermore, the method can be applied to rapid parallel analyses (for example, in a multi-well plate format) the results of which can be employed in a triangulation identification strategy which is amenable to rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent identification.
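The relationship between a measured molecular mass, a base composition and a database match described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the referenced methods: the residue masses are approximate average values for DNA nucleotides within a single strand, the end-group offset assumes a strand with 5'-OH and 3'-OH termini, the amplicon length is taken as known, and the database entries and function names are hypothetical.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumed end-group correction (Da) for a single strand with 5'-OH and 3'-OH.
END_GROUP_OFFSET = -61.96


def strand_mass(composition):
    """Mass of a single strand given its base counts, e.g. {"A": 15, ...}."""
    return sum(RESIDUE_MASS[base] * count
               for base, count in composition.items()) + END_GROUP_OFFSET


def candidate_compositions(measured_mass, length, tolerance):
    """Enumerate every base composition of the given length whose
    calculated mass lies within +/- tolerance of the measured mass."""
    hits = []
    for a in range(length + 1):
        for c in range(length + 1 - a):
            for g in range(length + 1 - a - c):
                t = length - a - c - g
                comp = {"A": a, "C": c, "G": g, "T": t}
                if abs(strand_mass(comp) - measured_mass) <= tolerance:
                    hits.append(comp)
    return hits


def identify(measured_mass, length, tolerance, database):
    """Return names of database entries whose stored base composition
    is consistent with the measured mass."""
    candidates = candidate_compositions(measured_mass, length, tolerance)
    return [name for name, comp in database.items() if comp in candidates]


if __name__ == "__main__":
    # Hypothetical database of base compositions for one amplicon.
    database = {
        "hypothetical_virus_1": {"A": 15, "C": 15, "G": 15, "T": 15},
        "hypothetical_virus_2": {"A": 16, "C": 14, "G": 15, "T": 15},
    }
    measured = strand_mass(database["hypothetical_virus_1"])
    print(identify(measured, length=60, tolerance=0.5, database=database))
```

In this sketch a single mass may be consistent with more than one composition when the tolerance is loose, which is one reason a match against a database, or against measurements from several amplicons analyzed in parallel, is useful for narrowing the result.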
The result of determination of a previously unknown base composition of a previously unknown bioagent (for example, a newly evolved and heretofore unobserved virus) has downstream utility by providing new bioagent indexing information with which to populate base composition databases. The process of subsequent bioagent identification analyses is thus greatly improved as more base composition data for bioagent identifying amplicons becomes available.
The present invention provides methods of identifying unknown viruses, including viruses of the Togaviridae family and alphavirus genus. Also provided are oligonucleotide primers, compositions and kits containing the oligonucleotide primers, which define alphaviral identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify alphaviruses at the sub-species level.