Many RNA viruses do not have a single, representative genome but instead form a "quasispecies," a set of related viral variants that coexist in field populations and even within single infected individuals (reviewed in Holland, et al. 1992 Curr Top Microbiol Immunol 176:1-20, Smith, et al. 1997 J Gen Virol 78:1511-1519, Domingo, et al. 1985 Gene 40:1-8, Domingo, et al. 1995 Molecular Basis of Virus Evolution 181-191, Duarte, et al. 1994 Infect Agents Dis 3:201-214). The emergence of immunologically distinct members of a viral quasispecies through mutation and subsequent immune selection is called "antigenic drift." Antigenic drift is thought to be important in HIV infection and in the continuing seasonal influenza epidemics, especially because immunity generated against one viral variant rapidly selects for escape variants. The moderately high failure rate and short-lived efficacy of influenza vaccines (Wilson and Cox 1990 Annu Rev Immunol 8:737-771), the failure of synthetic foot-and-mouth disease virus vaccines (Taboga, et al. 1997 J Virol 71:2606-2614), and the current failure of recombinant HIV vaccines to provide complete protection against field strains of the virus (Berman, et al. 1997 J Inf Dis 176:384-397) have all been attributed to antigenic drift.
If vaccination against a viral quasispecies is to be effective, either ubiquitous, unvarying viral targets must be identified or, alternatively, all advantageous viral variants of one or more antigenic regions must be identified and included in a vaccine.
An object of the present invention is to provide a method of determining the advantageous variants found in a viral population, given aligned nucleotide sequences of antigenic proteins or protein regions of that viral population. Once these advantageous variants are identified, they may be used in drug targeting and in vaccine design applications.
For each amino acid position, the algorithm used to identify the advantageous variants is as follows: 1) The most common (consensus) amino acid is identified as an advantageous variant of the viral population. 2) Replacement variants, that is, viral variants that differ from the consensus in amino acid sequence, that are found to have significantly high replacement to silent mutation ratios are determined to be advantageous to the virus. 3) Conversely, replacement variants with significantly low replacement to silent mutation ratios are recognized as conferring a selective disadvantage on the virus and so are excluded from further consideration. 4) Replacement variants that the nucleotide replacement to silent mutation ratio cannot classify as significantly advantageous or disadvantageous are provisionally identified as advantageous variants; the selective advantage or disadvantage of these variants cannot be determined from the given sequence data set and so must be determined experimentally. A reasonable subset of variants may be selected by including the 2H+σ most common variants (where H is the Shannon information content and σ is the standard error of its estimate).
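As an illustration only, the per-position bookkeeping behind steps 1) and 2) can be sketched in Python. The function name `summarize_position` and its return keys are hypothetical and do not come from the specification. The sketch assumes the standard genetic code and an ungapped codon column taken from the aligned nucleotide sequences. It counts sequences rather than inferred mutation events, and it computes neither the significance test on the replacement to silent ratio nor the standard error σ, since neither is specified in this section.

```python
from collections import Counter
from itertools import product
from math import log2

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def summarize_position(codons):
    """Summarize one aligned codon column (one amino acid position).

    Hypothetical helper: `codons` is a list of three-letter DNA codons,
    one per sequence, with no gaps or ambiguity codes.
    """
    codons = [c.upper() for c in codons]
    n = len(codons)

    # Step 1: the consensus is the most common amino acid at this position.
    aa_counts = Counter(CODON_TABLE[c] for c in codons)
    consensus_aa = aa_counts.most_common(1)[0][0]

    # Shannon information content H of the amino acid distribution, in bits.
    entropy_bits = sum((k / n) * log2(n / k) for k in aa_counts.values())

    # Reference codon: the most common codon that encodes the consensus.
    consensus_codon = Counter(
        c for c in codons if CODON_TABLE[c] == consensus_aa
    ).most_common(1)[0][0]

    # Silent: a different codon that still encodes the consensus amino acid.
    silent = sum(
        1 for c in codons
        if c != consensus_codon and CODON_TABLE[c] == consensus_aa
    )
    # Replacement: sequences whose amino acid differs from the consensus.
    replacement = Counter(
        CODON_TABLE[c] for c in codons if CODON_TABLE[c] != consensus_aa
    )

    return {
        "consensus": consensus_aa,
        "entropy_bits": entropy_bits,
        "silent": silent,
        "replacement": dict(replacement),
    }


if __name__ == "__main__":
    column = ["AAA"] * 6 + ["AAG"] * 2 + ["GAA"] * 2
    print(summarize_position(column))
```

The replacement and silent counts returned here are only the raw inputs to a ratio; classifying a variant as advantageous or disadvantageous would still require the significance test described in steps 2) through 4).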
The identified advantageous viral variants may then be used for purposes including but not limited to: 1) specifying components of vaccines to be used in conjunction with appropriate vaccination vectors and techniques known in the art; 2) identifying appropriate targets for small molecule or other anti-viral compounds; 3) using constructed viral variant panels to screen for broadly neutralizing monoclonal antibodies, to screen for broadly neutralizing anti-viral compounds, and/or to determine the neutralization spectrum of anti-viral compounds or antibodies.
Examples are given for influenza A hemagglutinin 3 (SEQ ID NO: 1) and HIV-1 gp120 (SEQ ID NO: 2-SEQ ID NO: 6).