The principal etiological agent of acquired immunodeficiency syndrome (AIDS) is a non-transforming retrovirus belonging to the lentivirus genus (1). This virus, referred to as Human Immunodeficiency Virus type 1 (HIV-1), is now widely disseminated and constitutes a serious threat to health and productivity worldwide. Virtually all industrialized countries, as well as many in the developing world, now mandate the testing of blood donations to prevent further transmission of this virus and the spread of disease through the use of contaminated blood and blood products. A related but genetically distinct virus capable of inducing similar disease, which is less widespread and pathologically less aggressive, was reported in 1986 and is referred to as HIV-2 (2). While HIV-2 is found primarily in West Africa and is less widely disseminated than HIV-1, many countries require the screening of blood donations for antibodies to this virus as well. An HIV-1 is disclosed in EP-178 978, while an HIV-2 is disclosed in EP-0 239 425.
One feature which is characteristic of human immunodeficiency viruses is their sequence variability. The genomes of all HIVs encode the enzyme reverse transcriptase. This enzyme, which the virus requires to convert its RNA genome into its double-stranded DNA equivalent prior to integration into host cell DNA, is essential for virus replication. Unlike many polymerases, this Mg²⁺-dependent enzyme lacks a 3'→5' exonuclease activity which normally serves a proofreading function. As a consequence, this enzyme tends to be error-prone. Within any HIV-infected individual, many naturally occurring sequence variants of the virus can be found, not all of which are viable. This observation has given rise to the notion of the quasi-species, a term used to describe a particular strain of HIV infecting an individual as a collection of all its closely related, naturally occurring sequence variants (3).
In addition to naturally occurring sequence variants within an infected individual, phylogenetic analyses of HIV-1 strains collected from all over the world have demonstrated that these strains can be grouped into at least 9 types (A-I) based on the similarity of their sequences (4). The differences between the types are greater than the differences observed between individual virus variants within a single infected person, or the differences between other variants belonging to the same type. The geographical distribution of these HIV-1 types varies significantly, with certain types being prevalent in one particular geographic region but rare or absent in another. Collectively, these HIV-1 types may be considered to form a group, which is usually referred to as group M (major).
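Purely as an illustration of what grouping "based on the similarity of their sequences" involves, the following Python sketch computes an uncorrected pairwise distance (the fraction of aligned positions that differ) between sequences. It is not the method used in the cited analyses, which rely on model-corrected distances and tree-building algorithms. The sequence strings, the labels and the function names are hypothetical placeholders, not real HIV-1 data.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, gap-free positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def pairwise_distances(alignment: dict) -> dict:
    """Map each unordered pair of sequence names to its p-distance."""
    return {
        (n1, n2): p_distance(s1, s2)
        for (n1, s1), (n2, s2) in combinations(alignment.items(), 2)
    }


# Toy, made-up aligned sequences (hypothetical; not real viral data).
toy_alignment = {
    "typeX_variant1": "ATGAGAGTGAAGGAGA",
    "typeX_variant2": "ATGAGAGTGAAGGGGA",
    "typeY_variant1": "ATGAAAGCGATGGAGA",
}

if __name__ == "__main__":
    for pair, dist in pairwise_distances(toy_alignment).items():
        print(pair, round(dist, 4))
```

In this toy data set the two "typeX" sequences differ at fewer positions than either does from the "typeY" sequence, which mirrors the statement that differences between types exceed differences within a type.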
In 1987, a highly divergent variant of HIV-1 was isolated that was readily distinguishable immunologically from commonly encountered HIV strains (5). This variant is described in EP-0 345 375 and U.S. Pat. Nos. 5,304,466 and 5,567,603. This virus (ANT70) was antigenically closer to HIV-1 reference strains than it was to HIV-2, but was nevertheless clearly very different. The sequence of the entire provirus was subsequently determined (6). While the genome organization of this virus confirmed that this isolate was an HIV-1, a comparison of its sequence with those of many other reference strains showed that this virus was highly divergent, and phylogenetic analyses placed this isolate in its own unique branch of the HIV phylogenetic tree.
In 1991, a second, highly divergent HIV-1 strain (MVP5180) was isolated and described (7). This isolate was disclosed in EP-0 591 914, and was found to cluster phylogenetically with ANT70. The genetic distance between these two isolates was approximately as great as the distance between the virus types belonging to group M. Because these two isolates clustered outside the cluster of conventional HIV-1 isolates, together they defined a new group of HIV-1 isolates, usually referred to as group O (outlier).
In 1992, a third person was identified in France who was infected with a group O strain (8). The sequence of the immunologically important viral env gene was determined, and is described in WO 96/12809. Subsequently, several additional group O-infected patients were identified in France (9), and the sequences of portions of the viral env proteins of these isolates were also determined. These sequences have been described in PCT/FR96/00294. An analysis of all of the available sequences showed that they cluster together in the branch of the HIV-1 phylogenetic tree corresponding to group O. In contrast to group M, there seems to be little evidence for the existence of discrete virus types within the outlier group. With the exception of the French VAU isolate, virtually all of the group O isolates identified to date share a link to West-Central Africa. In this portion of Africa, it has been estimated that between 5% and 8% of all cases of HIV-1 infection are caused by group O variants; however, these percentages are strongly dependent on the specific geographical region (10, 11).
While there seem to be no significant differences between group M and group O strains in terms of pathology or disease progression, the detection of antibodies produced in response to a group O infection can be unreliable when the antigens used for serological testing are derived exclusively from group M strains (12, 13). Although antibodies raised against group O antigens will often cross-react with the corresponding group M antigens, the sensitivity for anti-group O antibodies can be significantly improved by incorporating a group O antigen into the test.
Although the existence of HIV-1 groups and types is well-established, an increasing number of isolates have been identified that cannot be conveniently assigned to a specific HIV-1 group M type. Through sequence analysis it has been possible to demonstrate that these isolates are the products of recombination between viruses belonging to two or more different types. In some cases, multiple recombination events must have occurred, giving rise to "mosaic" genomes. Multiple types have been shown to coexist within a single patient, and there have been reports of multiple group M types coexisting in a patient together with a group O strain (14, 15). Since the genomes of group M and group O strains also share regions which are very highly conserved, legitimate recombination could presumably occur between these viruses as well.
A preferred antigen for the detection of antibodies produced in response to HIV infection is the transmembrane portion of the viral envelope protein. This protein, referred to as gp41, is cleaved from a gp160 precursor in the infected cell by a cellular protease. It contains the viral fusion peptide at its N-terminus, which the virus needs in order to fuse with and penetrate a new host cell. It also provides an anchor for the surface envelope glycoprotein gp120, which is responsible for recognizing CD4 molecules and co-receptors for the virus on the surface of susceptible cells. The interaction between gp120 and gp41 is, however, non-covalent and somewhat labile. The gp41 protein is itself anchored in the viral or host cell membrane via a hydrophobic membrane-spanning region.
Little is known of the detailed three-dimensional structure of this protein. A limited amount of structural information concerning the extracellular domain of this protein is available from the Brookhaven Protein Data Bank. However, the immunologically most relevant portion of gp41 is absent, probably because it is too mobile to give rise to reflections. A comparison of viral gp41 amino acid sequences corresponding to this immunologically important region reveals the presence of several very highly conserved amino acids in what is presumably the top of a tight disulfide-stabilized loop, suggesting that these amino acids serve an essential structural and functional role.
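By way of illustration only, a comparison of aligned amino acid sequences for conserved positions can be sketched in Python as follows. The peptide strings below are invented placeholders and do not represent actual gp41 sequences; the function name and the frequency threshold are likewise assumptions made for this sketch.

```python
from collections import Counter


def conserved_positions(aligned, threshold=1.0):
    """Return 0-based column indices whose most common residue reaches
    the given frequency threshold (1.0 means identical in every sequence).
    Columns whose most common character is a gap ('-') are never reported."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    conserved = []
    for col in range(length):
        residues = Counter(seq[col] for seq in aligned)
        residue, count = residues.most_common(1)[0]
        if residue != "-" and count / len(aligned) >= threshold:
            conserved.append(col)
    return conserved


# Invented toy peptides (hypothetical; not real gp41 data).
toy_peptides = [
    "MKCAGWLCTR",
    "MRCSGWLCSR",
    "LKCAGWICTR",
    "MKCTGWLCAK",
]

if __name__ == "__main__":
    print(conserved_positions(toy_peptides))        # strictly invariant columns
    print(conserved_positions(toy_peptides, 0.75))  # columns conserved in >= 75%
```

With a threshold of 1.0 the function reports only strictly invariant columns; lowering the threshold also admits positions that tolerate occasional substitutions.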