HIV-1 is an RNA virus of the family Retroviridae. The HIV genome encodes at least nine proteins, which are divided into three classes: the major structural proteins Gag, Pol and Env, the regulatory proteins Tat and Rev, and the accessory proteins Vpu, Vpr, Vif and Nef. The HIV genome exhibits the 5′LTR-gag-pol-env-LTR3′ organization of all retroviruses.
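The classification above can be captured in a small lookup structure. The following is only an illustrative sketch: the names `HIV1_PROTEIN_CLASSES`, `GENOME_ORDER` and `protein_class` are hypothetical and introduced here for convenience, not taken from any library.

```python
# Illustrative sketch: the three protein classes listed in the text.
HIV1_PROTEIN_CLASSES = {
    "structural": ["Gag", "Pol", "Env"],
    "regulatory": ["Tat", "Rev"],
    "accessory": ["Vpu", "Vpr", "Vif", "Nef"],
}

# Gene order between the two long terminal repeats, as stated in the text.
GENOME_ORDER = ["5'LTR", "gag", "pol", "env", "3'LTR"]


def protein_class(name: str) -> str:
    """Return the class of a protein named in the text (hypothetical helper)."""
    for cls, members in HIV1_PROTEIN_CLASSES.items():
        if name in members:
            return cls
    raise KeyError(f"{name!r} is not one of the proteins listed")


def total_proteins() -> int:
    """Count the proteins across all three classes."""
    return sum(len(members) for members in HIV1_PROTEIN_CLASSES.values())
```

For example, `protein_class("Nef")` returns `"accessory"`, and `total_proteins()` gives the nine proteins named above.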
The HIV envelope glycoprotein gp120 is the viral protein that is used for attachment to the host cell. This attachment is mediated by binding to two surface molecules of helper T cells and macrophages: CD4 and one of the two chemokine receptors, CCR-5 or CXCR-4. The gp120 protein is first expressed as a larger precursor molecule (gp160), which is then cleaved post-translationally to yield gp120 and gp41. The gp120 protein is retained on the surface of the virion by linkage to the gp41 molecule, which is inserted into the viral membrane.
The gp120 protein is the principal target of neutralizing antibodies, but unfortunately the most immunogenic regions of the protein (such as the V3 loop) are also its most variable parts. Therefore, the use of gp120 (or its precursor gp160) as a vaccine antigen to elicit neutralizing antibodies is thought to be of limited use for a broadly protective vaccine. The gp120 protein also contains epitopes that are recognized by cytotoxic T lymphocytes (CTL). These effector cells are able to eliminate virus-infected cells, and therefore constitute a second major antiviral immune mechanism. In contrast to the target regions of neutralizing antibodies, some CTL epitopes appear to be relatively conserved among different HIV strains. For this reason gp120 and gp160 may be useful antigenic components in vaccines that aim at eliciting cell-mediated immune responses (particularly CTL).
Non-envelope proteins of HIV-1 include, for example, internal structural proteins such as the products of the gag and pol genes, and other non-structural proteins such as Rev, Nef, Vif and Tat (Green et al., New England J. Med, 324, 5, 308 et seq (1991) and Bryant et al. (Ed. Pizzo), Pediatr. Infect. Dis. J., 11, 5, 390 et seq (1992)).
HIV Nef is an early protein; that is, it is expressed early in infection and in the absence of structural protein.
The Nef gene encodes an early accessory HIV protein which has been shown to possess several activities. For example, the Nef protein is known to cause the down-regulation of CD4, the HIV receptor, and of MHC class I molecules from the cell surface, although the biological importance of these functions is debated. Additionally, Nef interacts with the signal pathway of T cells and induces an active state, which in turn may promote more efficient gene expression. Some HIV isolates have mutations in this region that prevent them from encoding functional protein; such isolates are severely compromised in their replication and pathogenesis in vivo.
The Gag gene is translated as a precursor polyprotein that is cleaved by protease to yield products that include the matrix protein (p17), the capsid (p24), the nucleocapsid (p9), p6 and two spacer peptides, p2 and p1.
The Gag gene gives rise to the 55-kilodalton (kD) Gag precursor protein, also called p55, which is expressed from the unspliced viral mRNA. During translation, the N terminus of p55 is myristoylated, triggering its association with the cytoplasmic aspect of cell membranes. The membrane-associated Gag polyprotein recruits two copies of the viral genomic RNA along with other viral and cellular proteins, which triggers the budding of the viral particle from the surface of an infected cell. After budding, p55 is cleaved by the virally encoded protease (a product of the pol gene) during the process of viral maturation into four smaller proteins designated MA (matrix [p17]), CA (capsid [p24]), NC (nucleocapsid [p9]), and p6.
In addition to the 3 major Gag proteins, all Gag precursors contain several other regions, which are cleaved out and remain in the virion as peptides of various sizes. These peptides have different roles; for example, the p2 peptide has a proposed role in regulating the activity of the protease and contributes to the correct timing of proteolytic processing.
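The processing of the p55 precursor described above can be sketched as an ordered list of regions that the protease separates. This is a minimal illustration only. The N-to-C order used here (p17, p24, p2, p9, p1, p6) is the order in which HIV-1 Gag is commonly described and is an assumption not stated in the text; the names `GagRegion`, `P55_GAG`, `cleave` and `spacer_peptides` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GagRegion:
    label: str  # size-based name used in the text, e.g. "p17"
    role: str   # functional name, or "spacer" for the small peptides


# Assumed N-terminal to C-terminal order of regions within p55 Gag.
P55_GAG = (
    GagRegion("p17", "matrix (MA)"),
    GagRegion("p24", "capsid (CA)"),
    GagRegion("p2", "spacer"),
    GagRegion("p9", "nucleocapsid (NC)"),
    GagRegion("p1", "spacer"),
    GagRegion("p6", "p6"),
)


def cleave(polyprotein):
    """Return the labels of the products released by complete cleavage."""
    return [region.label for region in polyprotein]


def spacer_peptides(polyprotein):
    """Return only the small spacer peptides cleaved out of the precursor."""
    return [r.label for r in polyprotein if r.role == "spacer"]
```

Calling `cleave(P55_GAG)` lists all six products, while `spacer_peptides(P55_GAG)` returns the two small peptides, p2 and p1. The sketch models complete cleavage only and says nothing about the timing of the individual cleavage events.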
The p17 (MA) polypeptide is derived from the N-terminal, myristoylated end of p55. Most MA molecules remain attached to the inner surface of the virion lipid bilayer, stabilizing the particle. A subset of MA is recruited inside the deeper layers of the virion where it becomes part of the complex which escorts the viral DNA to the nucleus. These MA molecules facilitate the nuclear transport of the viral genome because a karyophilic signal on MA is recognized by the cellular nuclear import machinery. This phenomenon allows HIV to infect non-dividing cells, an unusual property for a retrovirus.
The p24 (CA) protein forms the conical core of viral particles. Cyclophilin A has been demonstrated to interact with the p24 region of p55 leading to its incorporation into HIV particles. The interaction between Gag and cyclophilin A is essential because the disruption of this interaction by cyclosporin A inhibits viral replication.
The NC region of Gag is responsible for specifically recognizing the so-called packaging signal of HIV. The packaging signal consists of four stem loop structures located near the 5′ end of the viral RNA, and is sufficient to mediate the incorporation of a heterologous RNA into HIV-1 virions. NC binds to the packaging signal through interactions mediated by two zinc-finger motifs. NC also facilitates reverse transcription.
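As an illustration of the two zinc-finger motifs mentioned above, the following sketch scans a protein sequence for the Cys-X2-Cys-X4-His-X4-Cys (CCHC) spacing typical of retroviral nucleocapsid zinc fingers. The CCHC pattern is general background rather than something stated in this text, and the example sequence `NC_LIKE_SEQUENCE` is an illustrative NC-like sequence that should be treated as an assumption, not as a reference sequence. The function name `find_cchc_motifs` is hypothetical.

```python
import re

# CCHC zinc-finger spacing: C, 2 residues, C, 4 residues, H, 4 residues, C.
CCHC_PATTERN = re.compile(r"C.{2}C.{4}H.{4}C")

# Illustrative NC-like sequence (assumption, not a curated reference).
NC_LIKE_SEQUENCE = "MQRGNFRNQRKIVKCFNCGKEGHTARNCRAPRKKGCWKCGKEGHQMKDCTERQAN"


def find_cchc_motifs(protein: str):
    """Return (start_index, matched_text) for each non-overlapping CCHC motif."""
    return [(m.start(), m.group()) for m in CCHC_PATTERN.finditer(protein)]
```

Applied to the example sequence, `find_cchc_motifs(NC_LIKE_SEQUENCE)` reports two motifs. A pattern match of this kind only identifies candidate zinc-binding residues; it does not by itself show RNA binding.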
The p6 polypeptide region mediates interactions between p55 Gag and the accessory protein Vpr, leading to the incorporation of Vpr into assembling virions. The p6 region also contains a so-called late domain which is required for the efficient release of budding virions from an infected cell.
The Pol gene encodes two proteins containing the two activities needed by the virus in early infection: the reverse transcriptase (RT), and the integrase protein needed for integration of viral DNA into cell DNA. The primary product of Pol is cleaved by the virion protease to yield the amino terminal RT peptide, which contains the activities necessary for DNA synthesis (RNA- and DNA-dependent DNA polymerase activity as well as an RNase H function), and the carboxy terminal integrase protein. HIV RT is a heterodimer of full-length RT (p66) and a cleavage product (p51) lacking the carboxy terminal RNase H domain.
RT is one of the most highly conserved proteins encoded by the retroviral genome. The two major activities of RT are the DNA polymerase (DNA Pol) and ribonuclease H (RNase H) activities. The DNA Pol activity of RT uses RNA and DNA as templates interchangeably and, like all known DNA polymerases, is unable to initiate DNA synthesis de novo, but requires a pre-existing molecule to serve as a primer (RNA).
The RNase H activity inherent in all RT proteins plays the essential role, early in replication, of removing the RNA genome as DNA synthesis proceeds. It selectively degrades the RNA from all RNA-DNA hybrid molecules. Structurally, the polymerase and RNase H occupy separate, non-overlapping domains, with the polymerase domain covering the amino terminal two thirds of RT.
The p66 catalytic subunit is folded into 5 distinct subdomains. The amino terminal 23 of these have the portion with RT activity. Carboxy terminal to these is the RNase H domain.
WO 03/025003 describes DNA constructs encoding HIV-1 p17/24 Gag, Nef and RT, wherein the DNA sequences may be codon optimized to resemble highly expressed human genes. These constructs are useful in DNA vaccines.
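To illustrate the general idea of codon optimization, the sketch below replaces each codon of a coding sequence with a single preferred synonymous codon, leaving the encoded amino acid sequence unchanged. This is a deliberately simplified "one preferred codon per amino acid" scheme and is not presented as the method of WO 03/025003. The `PREFERRED` table is an illustrative assumption about codons favoured in highly expressed human genes, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumed preferred human codon for each amino acid (illustrative only).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def _codons(dna: str):
    """Split an in-frame DNA coding sequence into upper-case codons."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[codon] for codon in _codons(dna))


def codon_optimize(dna: str) -> str:
    """Swap every codon for the assumed preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[codon]] for codon in _codons(dna))
```

For a toy input such as `"ATGGGTGCGAGAGCGTCA"`, `codon_optimize` changes the nucleotide sequence while `translate` gives the same protein before and after. Practical schemes typically match an overall codon usage profile rather than always choosing one codon, and may also avoid unwanted sequence features; none of that is modelled here.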
Fusion proteins containing multiple HIV antigens have been suggested as vaccine candidates for HIV, for example the Nef-Tat fusion described in WO 99/16884. However, fusion proteins are not straightforward to produce; there can be difficulties in expressing them because they do not correspond to native proteins. These difficulties can arise at the transcription level or further downstream. Fusion proteins may also not be straightforward to formulate into a pharmaceutically acceptable composition. Notably, the majority of approaches to HIV vaccines that involve multiple antigens fused together are DNA or live vector approaches rather than polypeptide fusion proteins.