The present invention relates to the determination and control of bimolecular interactions and, more particularly, to the exploitation of these interactions for the production of new pharmaceuticals, such as vaccines, and diagnostic or research assays, such as antibody-based assays.
Bimolecular interactions are important for a variety of biological processes, including pathological processes. Such interactions typically involve the recognition of a three-dimensional structure, such as a protein, carbohydrate or drug ligand, by another such structure. Nature performs with ease many such interactions, which so far have proven largely refractory to analysis. Such difficulty has had a negative impact on the fields of vaccine and drug development in particular, which have had to rely on a trial-and-error approach, in the absence of defined rules for the production of novel vaccines and other pharmaceuticals. However, such trial-and-error approaches are costly and inefficient. Clearly, new approaches are needed in these fields.
The problem of vaccine and drug development, which is associated with bimolecular interactions, can be narrowed to the interaction between specific epitopes on the two molecules involved. In the case of two proteins, these epitopes can be composed of particular peptides, or of peptides and carbohydrates. For a drug and its receptor, these epitopes may consist of peptides on the receptor and functional groups on the drug. Because these epitopes are made of such different materials, it would appear that each type of bimolecular interaction requires a different system for study. However, as described below, all of these different epitopes can be mimicked by peptides. Thus, a single system for screening large numbers of peptides could be employed to explore all of these types of epitope interactions, since each could be represented by a suitable type of peptide library. For example, peptides which mimic a carbohydrate could be found using a random peptide library, which contains all possible peptides of a given length. Alternatively, an antigen library could be used to represent peptides derived from the primary sequence of an antigen, such as a protein. If such an antigen library represented all possible peptides of a given length contained within the protein, the library could be said to represent a complete pepscan of the antigen.
Such a complete pepscan was reported by Baughn et al. [Baughn, R. E., Demecs, M., Taber, L. H. and D. M. Musher, Infection and Immunity, 1996, 64:2457–2466] for the 15-kDa lipoprotein of Treponema pallidum, which causes syphilis. Overlapping decapeptides (ten amino acids) were synthesized, each overlapping the next by nine amino acids, that is, offset from it by one amino acid, so that a complete set of decapeptides was obtained. These were then screened with sera from syphilitic rabbits in an ELISA (enzyme-linked immunosorbent assay), to find those peptides which reacted with antibodies against syphilis. The limitations of such an approach are immediately obvious, particularly since the synthesis of such a large number of peptides is both tedious and difficult to manage. Clearly, producing complete pepscans by peptide synthesis limits the approach to small proteins. Indeed, Baughn et al. note that their choice of protein was strongly influenced by size.
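The construction of overlapping peptides described above can be illustrated with a minimal sketch. The function name `pepscan` and the 15-residue toy sequence are hypothetical and chosen only for illustration; they are not taken from Baughn et al.

```python
def pepscan(sequence, length=10, offset=1):
    """Return every peptide of `length` residues, each shifted by `offset`.

    With length=10 and offset=1, consecutive peptides overlap by nine
    residues, as in the decapeptide scheme described in the text.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, offset)]


# Hypothetical 15-residue sequence, used only as an illustration.
toy_sequence = "MKKLLAVGSTPAQRW"
peptides = pepscan(toy_sequence)

for index, peptide in enumerate(peptides, start=1):
    print(index, peptide)
```

A sequence of N residues yields N - 9 decapeptides under this scheme, which is why the number of peptides to be synthesized grows directly with the size of the protein.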
Thus, a new approach to the exploration of bimolecular interactions, and by extension to the fields of vaccine and drug development, is required. This approach uses combinatorial phage display peptide libraries to quickly sort through a huge number of peptides to find those peptides of interest, by a screening assay which functionally selects for a particular behavior in a peptide, as described by G. P. Smith and J. K. Scott [Scott, J. K. and G. P. Smith, Science, 1990, 249:386–390 and Smith, G. P. and J. K. Scott, Methods in Enzymol., 1993, 217:228–257]. For example, to find peptides which bind a particular protein, a phage display peptide library can be affinity-purified using that protein, and then reinfected into bacteria to make more phage containing those peptides of interest. Thus, two problems are solved simultaneously. First, a huge number of peptides can be screened in a single assay. Second, those peptides of interest can be enriched simply by infecting bacteria with the phage containing those peptides, and using the biological machinery of the bacteria to make more phages of interest. Thus, combinatorial phage display peptide libraries can do quickly and easily what artificial laboratory techniques cannot.
Such phage display peptide libraries are typically constructed in the following manner. Phages consist of DNA surrounded by coat proteins, which enable the phage to infect host bacteria and replicate themselves, producing many copies of the phage. To exploit this property, DNA sequences coding for the peptide of interest are inserted into the gene coding for a phage coat protein. As long as these insertions do not interfere with the life cycle of the phage, these modified phages will have coat proteins which display the foreign peptide. Filamentous phages are the preferred vectors, because two of their coat proteins can be easily modified to display foreign peptides, and thus foreign epitopes, on their surface. In general, these modifications are well tolerated. However, even if the modifications are not tolerated, the phage can still be rescued by a variety of techniques, including co-infection with a wild-type phage, known in the art as a helper phage.
The two coat proteins used for display in filamentous phages such as M13, fd and f1 are known as pIII and pVIII. There are only five copies of pIII on the phage coat, while there are about 2700 copies of pVIII on the coat. However, pIII can generally tolerate large insertions of up to a few hundred amino acids in length, while pVIII can tolerate insertions of only five or six amino acids. As noted above, other techniques can be used to rescue phage with pVIII proteins containing larger insertions.
There are two divergent methods for selecting the group of peptides which are to be inserted into the phages to form the phage library. The first type of peptide group is selected according to a known DNA or protein sequence, and forms a series of overlapping peptides. These protein-derived peptides can be used to represent a protein epitope or an entire protein. The second type of peptide group is encoded by a partial or complete set of random oligonucleotides. The first group is clearly most useful for a defined problem; for example, the mapping of a particular epitope. The second group is clearly useful when an amino acid sequence for an epitope is unknown, discontinuous in the primary amino acid sequence, or, as in the case of complex carbohydrate epitopes, non-existent. In the last case, peptides, called mimotopes, have been found which mimic a selected carbohydrate epitope.
The use of each of these groups of peptides can be most easily demonstrated with reference to the field of vaccine development. Vaccinology is based upon the discovery of epitopes within the pathogen of interest which can be used to elicit an immune response which can neutralize that pathogen. Once these epitopes have been found, they can be presented to the immune system as an active vaccine, to prime the immune system against future infection, in most cases, without causing any infection or pathology themselves. Alternatively, antibodies which bind these epitopes can be isolated and administered as a passive vaccine. Thus, vaccinology depends upon the screening of large numbers of epitopes, in the hope of finding such “neutralizing epitopes”, and as such is clearly amenable to the phage display library approach.
One example is the screening of random peptide phage libraries with purified antibodies or sera from humans or animals which have been challenged with a particular pathogen or with an antigen of that pathogen. For example, sera from human patients immunized against a hepatitis B viral antigen, an envelope protein from the virus, were used to screen a random library of nonapeptides (peptides of nine amino acids) inserted into the coat protein pVIII. Phages were selected which contained a nonapeptide that was both an antigenic and an immunogenic mimic of the actual viral antigen [Folgori, A. et al., EMBO J., 1994, 13:2236–2243]. Such an approach has also been used in diseases where specific antigens are not known, in an effort to map those antigenic epitopes which react with antibodies in sera which have been raised against the pathogen itself.
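To give a sense of the scale of a random library, the following sketch counts the distinct peptides of a given length. It assumes the 20 standard amino acids and ignores practical effects such as codon bias and stop codons in real libraries; the function name `random_library_size` is hypothetical.

```python
AMINO_ACIDS = 20  # assumption: the 20 standard amino acids only


def random_library_size(peptide_length):
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** peptide_length


nonapeptides = random_library_size(9)
octapeptides = random_library_size(8)
print(f"nonapeptides: {nonapeptides:,}")
print(f"octapeptides: {octapeptides:,}")
```

Under these assumptions a complete nonapeptide library would hold 20^9, or about 5 x 10^11, distinct sequences, which suggests why a physical random library may represent the theoretical set only partially.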
The use of protein or antigen derived phage libraries represents a more pathogen-specific approach. The advantage of the antigen derived approach is that the peptides presented by the phage are all related to the pathogen of interest, unlike the random peptide approach, in which many of the peptides will not represent any portion of the pathogen. Thus, a higher proportion of the phages will contain peptides with potentially useful information.
An example of the use of antigen derived phage libraries to map an antigen is given by Wang et al. [Wang et al., J. Immun. Methods, 1995, 178:1–12] for the bluetongue virus outer capsid protein VP5. Bluetongue virus infects sheep and cattle. VP5 is a known antigen for this virus and is 526 amino acids in length. The VP5 gene was partially digested using DNase I, an enzyme which cuts DNA relatively randomly. The resulting DNA fragments were sorted by size, and those of about 100–200 bp were inserted into the phage pIII gene and expressed in a phage display library. This library was screened with a monoclonal antibody to find those peptides of the VP5 protein which bind to that antibody, and two different peptides were found. As estimated by the authors, this library contained about 200 different peptides, including those representing the vector itself, of which only about 70 represented the actual antigen. However, in order for every possible peptide of 30–70 amino acids contained within the VP5 protein to be represented, at least 450 different peptides would be required. Thus, this library did not even completely represent a single known antigen with overlapping peptides. A much more extensive library would be needed to represent all overlapping peptides of a given length within the antigen, thus generating a complete pepscan.
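The figure of at least 450 peptides can be checked with a short calculation. The sketch below counts the overlapping peptides of a single fixed length contained in a 526-residue protein; the function name `complete_pepscan_size` is hypothetical, and the three peptide lengths are sample points within the 30–70 amino acid range given in the text.

```python
def complete_pepscan_size(protein_length, peptide_length):
    """Number of overlapping peptides of one fixed length, offset by one residue."""
    if peptide_length > protein_length:
        return 0
    return protein_length - peptide_length + 1


VP5_LENGTH = 526  # length of VP5 in amino acids, as stated in the text

counts = {n: complete_pepscan_size(VP5_LENGTH, n) for n in (30, 50, 70)}
for peptide_length, count in counts.items():
    print(f"{peptide_length}-mers: {count} peptides")
```

Even for the longest peptides considered, a complete set requires 457 members, compared with the roughly 70 antigen-derived peptides estimated for the library.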
Phage libraries do not need to be used simply for mapping epitopes, however. Perham and colleagues [Greenwood, J., Willis, A. E. and R. N. Perham, J. Mol. Biol., 1991, 220:821–827; and Willis, A. E., Perham, R. N. and D. Wraith, Gene, 1993, 128:79–83] have suggested that peptide epitopes displayed by phage can act as antigens. Discrete peptides obtained from the major surface protein of the malaria parasite Plasmodium falciparum were injected into mice, and were successfully immunogenic, causing specific antibodies to be raised against these discrete peptides. However, the peptides used were few in number, and no suggestion was made that an entire phage library could be used as a vaccine.
An alternative to the use of phages to display peptides which correspond to inserted DNA fragments is to use the DNA fragments themselves in a DNA vaccine. DNA vaccines are, as their name suggests, composed of DNA which can stimulate antigen-specific immunity within an animal. The DNA in question must code for the antigen of interest. Such DNA vaccines include the expression library immunization system (ELI) or naked DNA vaccines. As noted in Ulmer et al. [Ulmer et al., Curr. Op. Immunol., 1996, 8:531–536], naked DNA vaccines have been shown to be effective against influenza virus in animals. A particular bonus of these naked DNA vaccines is that they can elicit cellular as well as antibody responses, which many conventional vaccines cannot.
The expression library immunization system (ELI) also uses naked DNA coding for those proteins expressed by a particular pathogen, as described by Barry et al. [Barry et al., Nature, 1995, 377:632–635]. Alternatively, proteins expressed in bacteria have been used to present recombinant proteins to the immune system, as described by Mougneau et al. [Mougneau et al., Science, 1995, 268:563–566]. However, both of these methods are somewhat limited in power as compared to complete pepscans, since complete pepscans can cover substantially every possible continuous epitope of a pathogen, while these other methods only present specific proteins from a pathogen. Furthermore, as noted by Barry et al., large amounts of naked DNA were required for immunization.
Phage can also be used to present peptides for non-vaccine related interactions. Random peptide phage libraries have been used to target organs in vivo by Pasqualini and Ruoslahti [Pasqualini, R. and E. Ruoslahti, Nature, 1996, 380:364–366]. In their experiments, an entire random peptide phage library was injected into mice, and the mice were sacrificed 1–4 minutes later. Thus, although phage carrying specific peptides successfully targeted particular organs, the time frame did not permit any immunogenic effect to be observed, nor was such a potential effect even mentioned or intended by the authors.
Random peptide phage libraries have an additional advantage over more specific complete pepscans of an antigen. Random phage libraries can be used to map discontinuous epitopes, while a complete pepscan can map only continuous epitopes of an antigen, because of the nature of the group of peptides represented. A continuous epitope can be defined as one in which the antigenic residues reside within a short sequence of amino acids, no longer than about ten to about fifteen residues. In any case, this sequence cannot be longer than the average peptide in the library. Thus, only those epitopes which are continuous will be mapped. However, many epitopes have been shown to be discontinuous. These epitopes are composed of peptides derived from different positions in the primary sequence, but which are adjacent in the three-dimensional structure of the protein. An antigen-derived peptide display library of relatively short peptides does not contain peptides which represent these epitopes.
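The distinction drawn above can be expressed as a simple test on residue positions: an epitope can be captured by a pepscan only if all of its residues fit inside a single peptide. The sketch below is illustrative only; the function names and the residue positions are hypothetical.

```python
def epitope_span(positions):
    """Length of the shortest stretch of primary sequence containing all residues."""
    return max(positions) - min(positions) + 1


def is_continuous(positions, peptide_length):
    """True if one library peptide of `peptide_length` can contain the whole epitope."""
    return epitope_span(positions) <= peptide_length


# Hypothetical residue positions (1-based) for two epitopes.
clustered = [45, 47, 50, 52]    # residues close together in the sequence
scattered = [45, 47, 210, 213]  # residues brought together only by folding

for name, positions in (("clustered", clustered), ("scattered", scattered)):
    print(name, epitope_span(positions), is_continuous(positions, 15))
```

With 15-residue peptides, the first epitope is contained in a single peptide, while the second spans 169 residues and so cannot be represented by any member of a short-peptide antigen library.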
An example of the importance of discontinuous epitopes is in the study of HIV, or human immunodeficiency virus. HIV causes almost invariably fatal disease in humans; so far, no cure or vaccine has been found. An important step in HIV infection is the binding of an HIV envelope protein, gp120, to the T-cell surface receptor CD4. CD4 may also be important in post-binding events in HIV infection. Indeed, it has been suggested that CD4 changes conformation in response to HIV binding, and that this altered conformation may also be responsible for post-binding infection events. Thus, the interaction of gp120 with CD4 is a natural target for vaccine design.
Part of the difficulty in finding such a vaccine, however, is that the major neutralizing epitope, the V3 region of gp120, is hypervariable, so that antibodies raised against this region tend to be specific for one type of HIV and not others. Antibodies have been found with much broader specificity, several of which clearly bind to discontinuous gp120 epitopes. Thus, such discontinuous epitopes are obvious targets for mapping, as an aid to vaccine design.
As noted above, random peptide phage libraries have been used to map discontinuous epitopes in a variety of systems. Cortese et al. [Cortese et al., Tibtech, 1994, 6:73–80] review a number of discontinuous epitopes which have been found using random peptide phage libraries. However, screening such libraries with sera has not always produced significant results, probably because of the low or incomplete representation of all discontinuous epitopes. In order to overcome this problem, a refinement of the random peptide phage library approach uses constrained peptides, in which amino acids inserted around the random peptide define a particular structure for the peptide to assume, for example a loop structure. However, this approach forces specific structures to be selected, if all random peptides are to be screened. Alternatively, the sequence of the peptide can be held constant, and those surrounding amino acids which determine the structure of the peptide can be varied. In either case, a great deal of the power of random peptide phage libraries, namely the ability to search a broad group of epitopes, is reduced. Thus, more refined discontinuous epitope mapping approaches are clearly needed, which combine the power of random peptide phage libraries with the specificity of antigen-derived phage libraries.
Carbohydrate-protein interactions have also been studied using random peptide phage libraries because of the ability of such libraries to potentially represent discontinuous epitopes of the carbohydrates themselves. These interactions are important for a number of biological processes, including lymphocyte migration and binding of the hemagglutinin protein of the human influenza virus to erythrocyte glycoproteins, an important step in infection by the virus. However, these interactions have typically been difficult to study, because of the difficulty in synthesizing complex carbohydrate ligands. To solve this problem, peptides can be found which mimic carbohydrate ligands. These peptides are also called “mimotopes” because of their mimicry of the carbohydrate epitopes. For example, Oldenburg et al. [Oldenburg et al., PNAS, 1992, 89:5393–5397] screened a random octapeptide (eight amino acid) phage library for peptides able to bind the carbohydrate-binding protein concanavalin A. A group of peptides was found which bound to the protein, although many of these peptides had no obvious sequence homology.
The interactions of many different molecules with proteins, including carbohydrates and drugs, could be much more easily elucidated if the three-dimensional structure, or tertiary native conformation, of the protein were known. Currently, these structures have generally been determined by using X-ray crystallography. However, as its name suggests, this method requires the protein to be capable of forming usable crystals, which are non-trivial to prepare. Indeed, many proteins do not form satisfactory crystals at all, including the vast majority of membrane-spanning proteins, such as neurotransmitter receptors. To overcome this barrier, a number of attempts have been made to use algorithms to predict the three-dimensional structure of a protein from its primary amino acid sequence, as described in Protein Folding, ed. by N. Jaenicke, p. 167–181, Amsterdam, Holland (1980), or Computer-Assisted Modeling of Receptor-Ligand Interactions, Theoretical Aspects and Applications to Drug Design, ed. by R. Rein and A. Golombek, 1989, Alan R. Liss, New York for example. Commercially available algorithms include those from MSI, United Kingdom, including Quanta, Delphi and Charmm. However, these algorithms have generally failed to adequately predict the three-dimensional structure of the protein, simply because there are many theoretical structures or conformers to examine, and the rules of protein folding are not completely known.
A compromise between these two approaches has been the use of laboratory experiments to obtain information about the protein itself, which can then be used to place constraints on such protein structure determinations. Such information can be obtained by NMR (Nuclear Magnetic Resonance), which provides information about the interactions of atoms within the protein in the form of distance constraints, although the distances between atoms must be relatively short (less than about 5 Å). However, NMR suffers from a lack of specificity; that is, the interactions of too many atoms are all presented simultaneously, making it difficult to decipher the behavior of individual atoms. Furthermore, NMR requires large amounts of highly purified protein and is suitable only for water-soluble proteins.
Alternatively, electron diffraction techniques can be used for a small number of proteins, specifically those membrane-spanning proteins which are highly concentrated within the membrane, such as bacteriorhodopsin.
One method which might be more generally applicable was described in a study by E. Haas [E. Haas, Computer-Assisted Modeling of Receptor-Ligand Interactions, Theoretical Aspects and Applications to Drug Design, ed. by R. Rein and A. Golombek, 1989, Alan R. Liss, New York, p. 157–170]. This method uses the interaction of fluorescent dye molecules attached to particular residues within the protein to obtain information about the structure of the protein. However, this method requires that the protein be purified and labelled with dye molecules, which are both non-trivial procedures.
Once these constraints have been obtained, algorithms are available which use this information to predict the three-dimensional structure of a protein, or at the very least to eliminate those theoretical structures which are not compatible with the experimental evidence. Obviously, as the number of constraints is increased, the predictive ability of these algorithms will be improved correspondingly. Furthermore, it has been noted that longer range distance constraints, or constraints between pairs of residues which are relatively further apart along the primary amino acid sequence, are more useful than short range distance constraints, such as those calculated by NMR [Wako, H. and H. A. Scheraga, Macromolecules, 1981, 14:961–969].
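The elimination of incompatible theoretical structures can be sketched as a simple filter. This is a toy illustration rather than any of the algorithms cited above: each candidate structure is reduced to one hypothetical (x, y, z) coordinate per residue, and each constraint is a hypothetical upper bound, in Å, on the distance between two residues.

```python
import math


def satisfies(structure, constraints):
    """True if every (residue_i, residue_j, max_distance) constraint is met."""
    return all(
        math.dist(structure[i], structure[j]) <= max_distance
        for i, j, max_distance in constraints
    )


def filter_conformers(conformers, constraints):
    """Keep only the candidate structures compatible with all constraints."""
    return {
        name: structure
        for name, structure in conformers.items()
        if satisfies(structure, constraints)
    }


# Hypothetical long-range constraint: residues 5 and 120 lie within 10 Å.
constraints = [(5, 120, 10.0)]

# Hypothetical candidates, each mapping residue number to coordinates in Å.
conformers = {
    "compact": {5: (0.0, 0.0, 0.0), 120: (6.0, 0.0, 0.0)},
    "extended": {5: (0.0, 0.0, 0.0), 120: (40.0, 0.0, 0.0)},
}

compatible = filter_conformers(conformers, constraints)
print(sorted(compatible))
```

Here a single constraint between residues far apart in the primary sequence already rules out the extended candidate, which illustrates why long-range constraints are considered the more informative kind.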
Clearly, such algorithms could be improved by finding constraints which both more accurately reflect partial structures of the protein, and which are more easily measured in a laboratory.
There is thus a widely recognized need for, and it would be highly advantageous to have, a system for the discovery of discontinuous epitopes, to be used as vaccines, for drug design, for diagnostic purposes and for the elucidation of three-dimensional protein structure. Specifically, it would be advantageous to have a system to map discontinuous epitopes which is complete, yet more specific than random phage libraries. It would also be advantageous to develop DNA vaccines which exploit the concept of overlapping peptides and/or discontinuous epitopes in a variety of expression systems. Finally, it would be advantageous to use discontinuous epitopes for preparing antibodies, as components of diagnostic tools, for preparing passive vaccines and for elucidating three-dimensional protein structure.