1. Field of the Invention
The present invention is in the field of virology. The invention relates to the nucleotide sequences of the genomes of 11 molecular clones for non-subtype B isolates of human immunodeficiency virus type 1 (HIV-1), and nucleic acids derived therefrom. This invention also relates to peptides encoded by and/or derived from the nucleic acid sequences of these molecular clones, and host cells containing these nucleic acid sequences and peptides. The invention also relates to diagnostic methods, kits and immunogens which employ the nucleic acids, peptides and/or host cells of the invention.
2. Description of the Related Art
A critical question facing current AIDS vaccine development efforts is to what extent HIV-1 genetic variation has to be considered in the design of candidate vaccines (11,21,42,72). Phylogenetic analyses of globally circulating viral strains have identified two distinct groups of HIV-1, a major M group and an O group (33,45,61,62). Within the M group, ten sequence subtypes (A-J) have been proposed (29,30,45,72). Sequence variation among viruses belonging to these different lineages is extensive, with envelope amino acid sequence variation ranging from 24% between different subtypes to 47% between the two different groups. Given this extent of diversity, the question has been raised whether immunogens based on a single virus strain can be expected to elicit immune responses effective against a broad spectrum of viruses, or whether vaccine preparations should include mixtures of genetically divergent antigens and/or be tailored toward locally circulating strains (11, 21, 42, 72). This is of particular concern in developing countries where multiple subtypes of HIV-1 are known to co-circulate and where subtype B viruses, which have been the source for most current candidate vaccine preparations (10, 21), are rare or nonexistent (5, 24, 40, 72).
Although the extent of global HIV-1 variation is well defined, little is known about the biological consequences of this genetic diversity and its impact on cellular and humoral immune responses in the infected host. In particular, it remains unknown whether subtype-specific differences in virus biology exist that need to be considered for vaccine design. Only a comprehensive analysis of genetically defined representatives of the various groups and subtypes will address the question of whether certain variants differ in fundamental viral properties and whether such differences will need to be incorporated into vaccine strategies. Such studies require well-characterized reference reagents, in particular full-length, replication-competent molecular clones that can be used for functional and biological studies.
Full-length reference sequences representing the various subtypes are also urgently needed for phylogenetic comparisons. Until about 1994, it was generally thought that individuals do not become infected with multiple distinct HIV-1 strains, and so the possibility that recombination between divergent viruses could contribute to the evolution of HIV-1 was not widely considered. However, recent analyses of subgenomic (23,52,54,58) as well as full-length HIV-1 sequences (7,18,53,60) identified a surprising number of HIV-1 strains which clustered in different subtypes in different parts of their genome. All of these originated from geographic regions where multiple subtypes co-circulated and are the results of co-infections with highly divergent viruses (52,60,62).
Recombinant viruses can be detected because their phylogenetic affinities vary depending on the region of the genome analyzed. A useful initial approach is to examine the extent of sequence divergence/similarity between a new sequence and a bank of reference sequences of different subtypes, for example as a diversity plot (18), or using the RIP program (75); if the extent of relative similarity to different subtypes varies along the sequence, this may indicate that the sequence is a recombinant. However, fuller investigation must involve a phylogenetic approach, comparing trees derived by analyses of different regions of the genome, and assessing the confidence of phylogenetic clustering by a statistical approach such as the bootstrap. A thorough analysis would involve taking a window of sequence of a certain size, and moving this window along the genome in steps of a defined size, generating perhaps hundreds of trees for visual examination in the process. There are at least two shortcuts. One is to analyze only a few windows, defining selected regions according to the output of the diversity analysis. Another is to not examine the entire phylogenetic tree of all subtypes, but to focus on one particular phylogenetic question. Thus, if the initial analyses suggest that a sequence may be a recombinant between two particular subtypes, it is possible simply to ask what the bootstrap value is for the clustering of the new sequence with one or another particular subtype, and to plot these values as a function of position along the genome; this is the basis of the “bootscanning” approach (57). Once the subtypes putatively involved in the recombination event have been identified, and the crossover points have been approximately localized, more precisely defined breakpoints can be determined, and their statistical significance assessed, using informative site analysis (19, 52, 53).
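The initial screening step described above, in which the similarity between a new sequence and reference sequences of different subtypes is examined along the genome, can be illustrated with a minimal sketch. The sketch assumes that all sequences have already been aligned to equal length, with "-" marking gaps. The function names, window size and step size are hypothetical choices made for illustration; this is not the RIP program or the bootscanning method, and it performs no tree building or bootstrap analysis.

```python
from typing import Dict, List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns, skipping columns where either has a gap."""
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    # Assumption: a window with no comparable columns scores 0.0.
    return 100.0 * matches / compared if compared else 0.0


def sliding_similarity(
    query: str,
    references: Dict[str, str],
    window: int = 400,  # illustrative window size, in alignment columns
    step: int = 50,     # illustrative step size, in alignment columns
) -> List[Tuple[int, Dict[str, float]]]:
    """Return (window start, {reference name: percent identity}) for each window."""
    n = len(query)
    for name, seq in references.items():
        if len(seq) != n:
            raise ValueError(f"reference {name!r} is not aligned to the query")
    rows = []
    for start in range(0, max(n - window, 0) + 1, step):
        end = start + window
        scores = {
            name: percent_identity(query[start:end], seq[start:end])
            for name, seq in references.items()
        }
        rows.append((start, scores))
    return rows


def closest_subtype_per_window(
    rows: List[Tuple[int, Dict[str, float]]]
) -> List[Tuple[int, str]]:
    """For each window, name the reference with the highest percent identity."""
    return [(start, max(scores, key=scores.get)) for start, scores in rows]
```

If the closest reference changes from one subtype to another along the genome, the sequence is a candidate recombinant and would warrant the phylogenetic and bootstrap analyses described above; a change in raw similarity alone does not establish recombination.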
Detailed phylogenetic characterization revealed that most of the recombinant viruses have a complex genome structure with multiple points of crossover (7,18,53,60). Some recombinants, like the “subtype E” viruses, which are in fact A/E recombinants (7,18), have a widespread geographic dissemination and are responsible for much of the Asian HIV-1 epidemic (69,70). In other areas, recombinants appear to be generated with increasing frequency, as many randomly chosen isolates exhibit evidence of mosaicism (4,8,31,66,71).
Since recombination provides the opportunity for evolutionary leaps with genetic consequences that are far greater than the steady accumulation of individual mutations, the impact of recombination on viral properties must be monitored. Full-length non-recombinant reference sequences for all major HIV-1 groups and subtypes are thus needed to map and characterize the extent of inter-subtype recombination.
Non-subtype B viruses cause the vast majority of new HIV-1 infections worldwide. Although their geographic dissemination is carefully monitored, their immunogenic and biological properties remain largely unknown, in part because well-characterized virological reference reagents are lacking. In particular, full-length clones and sequences are rare, since subtype classification is frequently based on small PCR-derived viral fragments. There are currently only five full-length, non-recombinant molecular clones available for viruses other than subtype B (45), and these represent only three of the proposed (group M) subtypes (A, C and D). Moreover, only three clones (all derived from subtype D viruses) are replication-competent and thus useful for studies requiring functional gene products (45,48,65). Given the unknown impact of genetic variation on correlates of immune protection, subtype-specific reagents are critically needed for phylogenetic, immunological and biological studies.