1. HIV Infection
Human Immunodeficiency Virus-1 (HIV-1) is the etiological agent of acquired immune deficiency syndrome (AIDS) and related disorders. HIV-1 is an RNA virus of the Retroviridae family and exhibits the 5′LTR-gag-pol-env-LTR3′ organization of all retroviruses. In addition, HIV-1 comprises a handful of genes with regulatory or unknown functions, including the tat and rev genes. The env gene encodes the viral envelope glycoprotein that is translated as a 160-kilodalton (kDa) precursor (gp160) and then cleaved by a cellular protease to yield the external 120-kDa envelope glycoprotein (gp120) and the transmembrane 41-kDa envelope glycoprotein (gp41). Gp120 and gp41 remain associated and are displayed on the viral particles and the surface of HIV-infected cells. Gp120 binds to the CD4 receptor present on the surface of helper T-lymphocytes, macrophages and other target cells. After gp120 binds to CD4, gp41 mediates the fusion event responsible for virus entry.
Infection begins when gp120 on the viral particle binds to the CD4 receptor on the surface of T4 lymphocytes or other target cells. The bound virus merges with the target cell and reverse transcribes its RNA genome into double-stranded DNA. The viral DNA is incorporated into the genetic material in the cell's nucleus, where the viral DNA directs the production of new viral RNA, viral proteins, and new virus particles. The new particles bud from the target cell membrane and infect other cells.
Destruction of T4 lymphocytes, which are critical to immune defense, is a major cause of the progressive immune dysfunction that is the hallmark of HIV infection. The loss of target cells seriously impairs the body's ability to fight most invaders, but it has a particularly severe impact on the defenses against viruses, fungi, parasites and certain bacteria, including mycobacteria.
HIV-1 kills the cells it infects by replicating, budding from them and damaging the cell membrane. HIV-1 may kill target cells indirectly by means of the viral gp120 that is displayed on an infected cell's surface. Since the CD4 receptor on T cells has a strong affinity for gp120, healthy cells expressing CD4 receptor can bind to gp120 and fuse with infected cells to form a syncytium. A syncytium cannot survive.
HIV-1 can also elicit normal cellular immune defenses against infected cells. With or without the help of antibodies, cytotoxic defensive cells can destroy an infected cell that displays viral proteins on its surface. Finally, free gp120 may circulate in the blood of individuals infected with HIV-1. The free protein may bind to the CD4 receptor of uninfected cells, making them appear to be infected and evoking an immune response.
Infection with HIV-1 is almost always fatal, and at present there are no cures for HIV-1 infection. Effective vaccines for prevention of HIV-1 infection are not yet available. Because of the danger of reversion or infection, live attenuated virus probably cannot be used as a vaccine. Most subunit vaccine approaches have not been successful at preventing HIV infection. Treatments for HIV-1 infection, while prolonging the lives of some infected persons, have serious side effects. There is thus a great need for effective treatments and vaccines to combat this lethal infection.
2. Vaccines
Vaccination is an effective form of disease prevention and has proven successful against several types of viral infection. Determining ways to present HIV-1 antigens to the human immune system in order to evoke protective humoral and cellular immunity is a difficult task. To date, attempts to generate an effective HIV vaccine have not been successful. In AIDS patients, free virus is present in low levels only. Transmission of HIV-1 is enhanced by cell-to-cell interaction via fusion and syncytia formation. Hence, antibodies generated against free virus or viral subunits are generally ineffective in eliminating virus-infected cells.
Vaccines exploit the body's ability to "remember" an antigen. After first encounters with a given antigen the immune system generates cells that retain an immunological memory of the antigen for an individual's lifetime. Subsequent exposure to the antigen stimulates the immune response and results in elimination or inactivation of the pathogen.
The immune system deals with pathogens in two ways: by humoral and by cell-mediated responses. In the humoral response, lymphocytes generate specific antibodies that bind to the antigen, thus inactivating the pathogen. The cell-mediated response involves cytotoxic and helper T lymphocytes that specifically attack and destroy infected cells.
Vaccine development with HIV-1 virus presents problems because HIV-1 infects some of the same cells the vaccine needs to activate in the immune system (i.e., T4 lymphocytes). It would be advantageous to develop a vaccine which inactivates HIV before impairment of the immune system occurs. A particularly suitable type of HIV vaccine would generate an anti-HIV immune response which recognizes HIV variants and which works in HIV-positive individuals who are at the beginning of their infection.
A major challenge to the development of vaccines against viruses for which neutralizing and protective immune responses are desirable, particularly viruses with a high rate of mutation such as the human immunodeficiency virus, is the diversity of the viral envelope proteins among different viral isolates or strains. Because cytotoxic T-lymphocytes (CTLs) in both mice and humans are capable of recognizing epitopes derived from conserved internal viral proteins, and are thought to be important in the immune response against viruses, efforts have been directed towards the development of CTL vaccines capable of providing heterologous protection against different viral strains.
It is known that CD8+ CTLs kill virally-infected cells when their T cell receptors recognize viral peptides associated with MHC class I molecules. The viral peptides are derived from endogenously synthesized viral proteins, regardless of the protein's location or function within the virus. Thus, by recognition of epitopes from conserved viral proteins, CTLs may provide cross-strain protection. Peptides capable of associating with MHC class I for CTL recognition originate from proteins that are present in or pass through the cytoplasm or endoplasmic reticulum. In general, exogenous proteins, which enter the endosomal processing pathway (as in the case of antigens presented by MHC class II molecules), are not effective at generating CD8+ CTL responses.
Most efforts to generate CTL responses have used replicating vectors to produce the protein antigen within the cell or they have focused upon the introduction of peptides into the cytosol. These approaches have limitations that may reduce their utility as vaccines. Retroviral vectors have restrictions on the size and structure of polypeptides that can be expressed as fusion proteins while maintaining the ability of the recombinant virus to replicate, and the effectiveness of vectors such as vaccinia for subsequent immunizations may be compromised by immune responses against the vectors themselves. Also, viral vectors and modified pathogens have inherent risks that may hinder their use in humans. Furthermore, the selection of peptide epitopes to be presented is dependent upon the structure of an individual's MHC antigens and, therefore, peptide vaccines may have limited effectiveness due to the diversity of MHC haplotypes in outbred populations.
3. DNA Vaccines
Benvenisty, N., and Reshef, L. [PNAS 83, 9551-9555, (1986)] showed that CaPO4-precipitated DNA introduced into mice intraperitoneally (i.p.), intravenously (i.v.) or intramuscularly (i.m.) could be expressed. The i.m. injection of DNA expression vectors without CaCl2 treatment in mice resulted in the uptake of DNA by the muscle cells and expression of the protein encoded by the DNA. The plasmids were maintained episomally and did not replicate. Subsequently, persistent expression has been observed after i.m. injection in skeletal muscle of rats, fish and primates, and cardiac muscle of rats. The technique of using nucleic acids as therapeutic agents was reported in WO90/11092 (Oct. 4, 1990), in which naked polynucleotides were used to vaccinate vertebrates.
It is not necessary for the success of the method that immunization be intramuscular. The introduction of gold microprojectiles coated with DNA encoding bovine growth hormone (BGH) into the skin of mice resulted in production of anti-BGH antibodies in the mice. A jet injector has been used to transfect skin, muscle, fat, and mammary tissues of living animals. Various methods for introducing nucleic acids have been reviewed. Intravenous injection of a DNA:cationic liposome complex in mice was shown by Zhu et al. [Science 261:209-211 (Jul. 9, 1993)] to result in systemic expression of a cloned transgene. Ulmer et al. [Science 259:1745-1749 (1993)] reported on the heterologous protection against influenza virus infection by intramuscular injection of DNA encoding influenza virus proteins.
The need for specific therapeutic and prophylactic agents capable of eliciting desired immune responses against pathogens and tumor antigens is met by the instant invention. Of particular importance in this therapeutic approach is the ability to induce T-cell immune responses which can prevent infections or disease caused even by virus strains which are heterologous to the strain from which the antigen gene was obtained. This is of particular concern when dealing with HIV as this virus has been recognized to mutate rapidly and many virulent isolates have been identified [see, for example, LaRosa et al., Science 249:932-935 (1990), identifying 245 separate HIV isolates]. In response to this recognized diversity, researchers have attempted to generate CTLs based on peptide immunization. Thus, Takahashi et al., [Science 255:333-336 (1992)] reported on the induction of broadly cross-reactive cytotoxic T cells recognizing an HIV envelope (gp160) determinant. However, those workers recognized the difficulty in achieving a truly cross-reactive CTL response and suggested that there is a dichotomy between the priming or restimulation of T cells, which is very stringent, and the elicitation of effector function, including cytotoxicity, from already stimulated CTLs.
Wang et al. reported on elicitation of immune responses in mice against HIV by intramuscular inoculation with a cloned, genomic (unspliced) HIV gene. However, the level of immune responses achieved in these studies was very low. In addition, the Wang et al. DNA construct utilized an essentially genomic piece of HIV encoding contiguous Tat/rev-gp160-Tat/rev coding sequences. As is described in detail below, this is a suboptimal system for obtaining high-level expression of gp160. It also is potentially dangerous because expression of Tat contributes to the progression of Kaposi's Sarcoma.
WO 93/17706 describes a method for vaccinating an animal against a virus, wherein carrier particles are coated with a gene construct and the coated particles are accelerated into cells of an animal. In regard to HIV, essentially the entire genome, minus the long terminal repeats, was proposed to be used. That method represents substantial risks for recipients. It is generally believed that constructs of HIV should contain less than about 50% of the HIV genome to ensure safety of the vaccine; this ensures that enzymatic moieties and viral regulatory proteins, many of which have unknown or poorly understood functions, have been eliminated. Thus, a number of problems remain if a useful human HIV vaccine is to emerge from the gene-delivery technology.
The instant invention contemplates any of the known methods for introducing polynucleotides into living tissue to induce expression of proteins. However, this invention provides a novel immunogen for introducing HIV and other proteins into the antigen processing pathway to efficiently generate HIV-specific CTLs and antibodies. The pharmaceutical is effective as a vaccine to induce both cellular and humoral anti-HIV and HIV neutralizing immune responses. In the instant invention, the problems noted above are addressed and solved by the provision of polynucleotide immunogens which, when introduced into an animal, direct the efficient expression of HIV proteins and epitopes without the attendant risks associated with those methods. The immune responses thus generated are effective at recognizing HIV, at inhibiting replication of HIV, at identifying and killing cells infected with HIV, and are cross-reactive against many HIV strains.
4. Codon Usage and Codon Context
The codon pairings of organisms are highly nonrandom, and differ from organism to organism. This information is used to construct and express altered or synthetic genes having desired levels of translational efficiency, to determine which regions in a genome are protein coding regions, to introduce translational pause sites into heterologous genes, and to ascertain relationship or ancestral origin of nucleotide sequences.
The expression of foreign heterologous genes in transformed organisms is now commonplace. A large number of mammalian genes, including, for example, murine and human genes, have been successfully inserted into single celled organisms. Standard techniques in this regard include introduction of the foreign gene to be expressed into a vector such as a plasmid or a phage and utilizing that vector to insert the gene into an organism. The native promoters for such genes are commonly replaced with strong promoters compatible with the host into which the gene is inserted. Protein sequencing machinery permits elucidation of the amino acid sequences of even minute quantities of native protein. From these amino acid sequences, DNA sequences coding for those proteins can be inferred. DNA synthesis is also a rapidly developing art, and synthetic genes corresponding to those inferred DNA sequences can be readily constructed.
Despite the burgeoning knowledge of expression systems and recombinant DNA, significant obstacles remain when one attempts to express a foreign or synthetic gene in an organism. Many native, active proteins, for example, are glycosylated in a manner different from that which occurs when they are expressed in a foreign host. For this reason, eukaryotic hosts such as yeast may be preferred to bacterial hosts for expressing many mammalian genes. The glycosylation problem is the subject of continuing research.
Another problem is more poorly understood. Often translation of a synthetic gene, even when coupled with a strong promoter, proceeds much less efficiently than would be expected. The same is frequently true of exogenous genes foreign to the expression organism. Even when the gene is transcribed in a sufficiently efficient manner that recoverable quantities of the translation product are produced, the protein is often inactive or otherwise different in properties from the native protein.
It is recognized that the latter problem is commonly due to differences in protein folding in various organisms. The solution to this problem has been elusive, and the mechanisms controlling protein folding are poorly understood.
The problems related to translational efficiency are believed to be related to codon context effects. The protein coding regions of genes in all organisms are subject to a wide variety of functional constraints, some of which depend on the requirement for encoding a properly functioning protein, as well as appropriate translational start and stop signals. However, several features of protein coding regions have been discerned which are not readily understood in terms of these constraints. Two important classes of such features are those involving codon usage and codon context.
It is known that codon utilization is highly biased and varies considerably between different organisms. Codon usage patterns have been shown to be related to the relative abundance of tRNA isoacceptors. Genes encoding proteins of high versus low abundance show differences in their codon preferences. The possibility that biases in codon usage alter peptide elongation rates has been widely discussed. While differences in codon use are associated with differences in translation rates, direct effects of codon choice on translation have been difficult to demonstrate. Other proposed constraints on codon usage patterns include maximizing the fidelity of translation and optimizing the kinetic efficiency of protein synthesis.
Apart from the non-random use of codons, considerable evidence has accumulated that codon/anticodon recognition is influenced by sequences outside the codon itself, a phenomenon termed "codon context." There exists a strong influence of nearby nucleotides on the efficiency of suppression of nonsense codons as well as missense codons. Clearly, the abundance of suppressor activity in natural bacterial populations, as well as the use of "termination" codons to encode selenocysteine and phosphoserine, require that termination be context-dependent. Similar context effects have been shown to influence the fidelity of translation, as well as the efficiency of translation initiation.
Statistical analyses of protein coding regions of E. coli have demonstrated another manifestation of "codon context." The presence of a particular codon at one position strongly influences the frequency of occurrence of certain nucleotides in neighboring codons, and these context constraints differ markedly for genes expressed at high versus low levels. Although the context effect has been recognized, the predictive value of the statistical rules relating to preferred nucleotides adjacent to codons is relatively low. This has limited the utility of such nucleotide preference data for selecting codons to effect desired levels of translational efficiency.
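The kind of neighbor-nucleotide tally described above can be sketched as follows. This is a minimal illustration only, not the statistical method used in the cited analyses: the function name `next_base_counts` and the example sequence are hypothetical, and the sequence is assumed to be an in-frame coding sequence.

```python
from collections import Counter


def next_base_counts(coding_sequence, codon):
    """Count the first base of the codon that follows each in-frame
    occurrence of `codon` in `coding_sequence`."""
    seq = coding_sequence.upper()
    # Split into complete in-frame codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    target = codon.upper()
    return Counter(
        following[0]
        for current, following in zip(codons, codons[1:])
        if current == target
    )


# Hypothetical coding sequence, invented for illustration only.
example = "ATGCTGGAACTGGCTCTGAAATAA"
counts = next_base_counts(example, "CTG")
print(dict(counts))
```

Comparing such tallies between sets of highly and weakly expressed genes would be one way to look for the context differences mentioned above; any real analysis would need far more sequence data than this toy example.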
The advent of automated nucleotide sequencing equipment has made available large quantities of sequence data for a wide variety of organisms. Understanding those data presents substantial difficulties. For example, it is important to identify the coding regions of the genome in order to relate the genetic sequence data to protein sequences. In addition, the ancestry of the genome of certain organisms is of substantial interest. It is known that genomes of some organisms are of mixed ancestry. Some sequences that are viral in origin are now stably incorporated into the genome of eukaryotic organisms. The viral sequences themselves may have originated in another substantially unrelated species. An understanding of the ancestry of a gene can be important in drawing proper analogies between related genes and their translation products in other organisms.
There is a need for a better understanding of codon context effects on translation, and for a method for determining the appropriate codons for any desired translational effect. There is also a need for a method for identifying coding regions of the genome from nucleotide sequence data. There is also a need for a method for controlling protein folding and for ensuring that the product of a foreign gene will fold appropriately when expressed in a host. Genes altered or constructed in accordance with desired translational efficiencies would be of significant worth.
Another aspect of the practice of recombinant DNA techniques for the expression by microorganisms of proteins of industrial and pharmaceutical interest is the phenomenon of "codon preference". While it was earlier noted that the existing machinery for gene expression in genetically transformed host cells will "operate" to construct a given desired product, levels of expression attained in a microorganism can be subject to wide variation, depending in part on specific alternative forms of the amino acid-specifying genetic code present in an inserted exogenous gene. A "triplet" codon of four possible nucleotide bases can exist in 64 variant forms. That these forms provide the message for only 20 different amino acids (as well as translation initiation and termination) means that some amino acids can be coded for by more than one codon. Indeed, some amino acids have as many as six "redundant", alternative codons while some others have a single, required codon. For reasons not completely understood, alternative codons are not at all uniformly present in the endogenous DNA of differing types of cells and there appears to exist a variable natural hierarchy or "preference" for certain codons in certain types of cells.
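The arithmetic behind this redundancy can be checked with a short sketch. It enumerates the 4 x 4 x 4 = 64 triplets and counts how many codons specify each amino acid under the standard genetic code; the variable and function names are chosen here for illustration, and organisms that use variant genetic codes are not covered.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon with the bases taken in
# T, C, A, G order at each position; '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every DNA triplet to its amino acid (or '*').
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(table):
    """Return how many codons specify each amino acid (or stop, '*')."""
    return Counter(table.values())


counts = degeneracy(CODON_TABLE)
print(len(CODON_TABLE), "codons in total")
print("six codons each:", sorted(aa for aa, n in counts.items() if n == 6))
print("one codon each:", sorted(aa for aa, n in counts.items() if n == 1))
```

In the standard code, leucine, serine and arginine each have six codons, while methionine and tryptophan each have a single codon.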
As one example, the amino acid leucine is specified by any of six DNA codons including CTA, CTC, CTG, CTT, TTA, and TTG (which correspond, respectively, to the mRNA codons CUA, CUC, CUG, CUU, UUA and UUG). Exhaustive analysis of genome codon frequencies for microorganisms has revealed that the endogenous DNA of E. coli most commonly contains the CTG leucine-specifying codon, while the DNA of yeasts and slime molds most commonly includes a TTA leucine-specifying codon. In view of this hierarchy, it is generally held that the likelihood of obtaining high levels of expression of a leucine-rich polypeptide by an E. coli host will depend to some extent on the frequency of codon use. For example, a gene rich in TTA codons will in all probability be poorly expressed in E. coli, whereas a CTG-rich gene will probably be highly expressed. Similarly, when yeast cells are the projected transformation host cells for expression of a leucine-rich polypeptide, a preferred codon for use in an inserted DNA would be TTA.
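As a rough illustration of the leucine example, the following sketch converts the six DNA codons to their mRNA forms and measures how often each leucine codon appears in a coding sequence. The helper names and the example sequence are hypothetical; the sequence is assumed to be in frame and is not taken from any real gene.

```python
from collections import Counter

# The six leucine-specifying DNA codons listed in the text.
LEUCINE_DNA_CODONS = ("CTA", "CTC", "CTG", "CTT", "TTA", "TTG")


def to_mrna(dna_codon):
    """Return the mRNA form of a DNA codon (T replaced by U)."""
    return dna_codon.upper().replace("T", "U")


def leucine_codon_usage(coding_sequence):
    """Return the fraction of leucine codons of each type in an
    in-frame coding sequence."""
    seq = coding_sequence.upper()
    # Split into complete in-frame codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in LEUCINE_DNA_CODONS)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in LEUCINE_DNA_CODONS}


# Hypothetical sequence, invented for illustration only.
example = "ATGCTGTTACTGCTGCTCTAA"
print([to_mrna(c) for c in LEUCINE_DNA_CODONS])
print(leucine_codon_usage(example))
```

A gene whose usage profile is dominated by a codon the host rarely uses would, on the reasoning above, be a candidate for poor expression in that host.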
The implications of codon preference phenomena on recombinant DNA techniques are manifest, and the phenomenon may serve to explain many prior failures to achieve high expression levels of exogenous genes in successfully transformed host organisms: a less "preferred" codon may be repeatedly present in the inserted gene, and the host cell machinery for expression may not operate as efficiently. This phenomenon suggests that synthetic genes which have been designed to include a projected host cell's preferred codons provide a preferred form of foreign genetic material for practice of recombinant DNA techniques.
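The idea of designing a synthetic gene around a host's preferred codons can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure of the invention: the "preferred" codon for each amino acid is taken to be simply the most frequent one in a set of reference host genes, the reference genes and the foreign gene are hypothetical sequences invented for the example, and codon context is ignored.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in T, C, A, G order; '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(seq):
    """Split an in-frame coding sequence into codons."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in split_codons(seq))


def preferred_codons(reference_genes):
    """Return the most frequent codon per amino acid in the reference genes."""
    counts = defaultdict(Counter)
    for gene in reference_genes:
        for codon in split_codons(gene):
            counts[CODON_TABLE[codon]][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def recode(gene, preferred):
    """Replace each codon with the preferred synonymous codon; a codon is
    left unchanged when no preference is known for its amino acid."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in split_codons(gene))


# Hypothetical host genes and foreign gene, invented for illustration only.
host_genes = ["ATGCTGCTGAAACTGTAA", "ATGCTGAAAGAACTGCTGTAA"]
foreign = "ATGTTATTAAAGTTGTAA"
optimized = recode(foreign, preferred_codons(host_genes))
print(optimized)
print(translate(foreign) == translate(optimized))
```

Because only synonymous codons are exchanged, the recoded gene encodes the same amino acid sequence as the original; only the codon choice changes.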
5. Protein Trafficking
The diversity of function that typifies eukaryote cells depends upon the structural differentiation of their membrane boundaries. To generate and maintain these structures, proteins must be transported from their site of synthesis in the endoplasmic reticulum to predetermined destinations throughout the cell. This requires that the trafficking proteins display sorting signals that are recognized by the molecular machinery responsible for route selection located at the access points to the main trafficking pathways. Sorting decisions for most proteins need to be made only once as they traverse their biosynthetic pathways since their final destination, the cellular location at which they perform their function, becomes their permanent residence.
Maintenance of intracellular integrity depends in part on the selective sorting and accurate transport of proteins to their correct destinations. Over the past few years the molecular machinery for targeting and localization of proteins has been studied vigorously. Defined sequence motifs have been identified on proteins which can act as 'address labels'. A number of sorting signals have been found associated with the cytoplasmic domains of membrane proteins.
Synthetic polynucleotides comprising a DNA sequence encoding a peptide or protein are provided. The DNA sequence of the synthetic polynucleotides comprises codons optimized for expression in a nonhomologous host. The invention is exemplified by synthetic DNA molecules encoding HIV env as well as modifications of HIV env. The codons of the synthetic molecules include the projected host cell's preferred codons. The synthetic molecules provide preferred forms of foreign genetic material. The synthetic molecules may be used as a polynucleotide vaccine which provides effective immunoprophylaxis against HIV infection through neutralizing antibody and cell-mediated immunity. This invention provides polynucleotides which, when directly introduced into a vertebrate in vivo, including mammals such as primates and humans, induce the expression of encoded proteins within the animal.