The present invention is the isolation and purification of a newly discovered gene of the AIDS virus, HTLV-III, that encodes an immunogenic protein recognized by sera of some HTLV-III seropositive people. Furthermore, the gene is highly conserved among all known HTLV-III isolates and exhibits a polymorphism at the 3' end which distinguishes molecular clones of the HTLV-IIIB cell line from viral genomes of related viruses (e.g., other HTLV-III isolates, LAV, and ARV). Also, the gene or the gene product(s) may be suitable targets for antiviral therapy.
Four distinct isolates of acquired immune deficiency syndrome (AIDS) virus have been previously characterized in detail:
HTLV-III.sub.RF was obtained from a 25-year-old black Haitian man who immigrated to the United States in 1980. HTLV-III.sub.B denotes a group of closely related viruses obtained in 1983 from several different patients with AIDS or ARC (AIDS related complex) from the New York City area. LAV-1a was obtained from a biopsied lymph node of a French homosexual man with lymphadenopathy syndrome who had over 50 different sexual partners per year and had travelled to many countries including the United States. ARV-2 was isolated from the peripheral blood of a homosexual man from San Francisco in 1984, one month before the diagnosis of AIDS was established. Representative clones comprising the full-length genomes of each of these viruses have been described and the nucleotide sequences published (see, for example, Ratner et al., Nature, Vol. 313, pages 277-284, 1985).
Human T-cell Lymphotropic Virus Type III (HTLV-III), the etiological agent of Acquired Immune Deficiency Syndrome (AIDS) and now known generically as human immunodeficiency virus (HIV), is a member of the retrovirus family. However, the complexity of its genomic structure is unprecedented among retroviruses. In addition to the three structural genes (gag, pol, and env) in common with other retroviruses, three additional genes have already been identified (sor, 3'orf, and tat-III). Two of these, sor and 3'orf, were originally identified as open reading frames by nucleotide sequence studies and have been verified to encode serologically reactive proteins of 23 kd and 27 kd, respectively. Both of these genes appear to be dispensable for production of infectious cytopathic virions, although mutants lacking the sor gene are greatly compromised in the level of virus production. The transactivator gene (tat-III) was identified functionally by the capacity of its product to enhance expression of genes linked to the HTLV-III long terminal repeat (LTR). It is transcribed from three discontiguous segments of the HTLV-III genome into a 2.0 kb mRNA. The resultant protein (p14) is requisite for virus replication, and its level of expression directly correlates with the level of virus proteins produced, but not necessarily with viral mRNA expression. Recently, a seventh viral gene product was found to be encoded by the same spliced mRNA as tat-III, but in an alternate reading frame [Feinberg et al, Cell, 46:807-817 (1986); and Sodroski et al, Nature, 321:412-417 (1986)]. It is believed that the function of this gene is either to reverse an intrinsic block on the translation of HTLV-III gag and env mRNA into proteins, or to regulate the relative amounts of HTLV-III genomic and spliced subgenomic mRNA.
If the former function is correct, the gene will be named art (anti-repressor transactivator); if the latter is correct, the gene will be named trs (trans-acting regulator of splicing). While the major effects of both tat-III and art/trs are post-transcriptional, HTLV-III-infected cells also synthesize a transcriptional activator specific for transcription from its own LTR (Okamoto and Wong-Staal, Cell, in press). This viral factor, whose location on the viral genome is still unknown, may be distinct from tat-III.
Inspection of the nucleotide sequences of diverse HTLV-III isolates revealed at least two other open reading frames that can potentially encode proteins of 80-100 amino acids. One of these, referred to as R, is the subject of the present invention, and is highly conserved among HTLV-III isolates.
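By way of illustration only, the kind of open reading frame inspection described above can be sketched in a few lines of Python. This is a minimal sketch and not the procedure actually used: the function name find_orfs, the restriction to ATG-initiated frames on the forward strand, and the use of the standard stop codons (TAA, TAG, TGA) are all assumptions introduced for the example; only the 80-100 amino acid size window is taken from the text.

```python
# Hypothetical helper for illustration; not part of the original disclosure.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def find_orfs(seq, min_aa=80, max_aa=100):
    """Scan the three forward reading frames of a DNA sequence.

    Returns a list of (start, end, frame, n_aa) tuples, where start is the
    0-based index of the ATG, end is the index just past the stop codon,
    and n_aa is the number of codons before the stop (including the ATG).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa <= max_aa:
                    orfs.append((start, i + 3, frame, n_aa))
                start = None  # resume the search after this stop codon
    return orfs


if __name__ == "__main__":
    # Synthetic test sequence (not viral data): a 90-codon frame plus flanks.
    demo = "CC" + "ATG" + "GCT" * 89 + "TAA" + "GG"
    print(find_orfs(demo))
```

A fuller analysis would also scan the complementary strand and compare the candidate frames across isolates to assess conservation; both steps are omitted from this sketch.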
The HTLV-III genome is unusually complex for a retrovirus, possessing, in addition to the replicative genes (gag, pol, and env), at least three extra genes (sor, tat-III, and 3'orf). Of these, the transactivator gene of HTLV-III (tat-III) has been characterized and shown to be critical to virus replication. The sor and 3'orf genes, originally identified as open reading frames, have been shown to encode proteins which are immunogenic in vivo, but the functions of these genes are, as yet, unknown.