A significant recent development in pharmaceutical drug discovery and design has been the use of combinatorial chemistry to create chemical libraries of potential new drugs. Chemical libraries are intentionally created collections of different molecules; these molecules can be made by organic synthetic methods or biochemically. In the latter case, the molecules can be made in vitro or in vivo.
Combinatorial chemistry is a synthetic strategy in which the chemical members of the library are made according to a systematic methodology by the assembly of chemical subunits. Each molecule in the library is thus made up of one or more of these subunits. The chemical subunits may include naturally-occurring or modified amino acids, naturally-occurring or modified nucleotides, naturally-occurring or modified saccharides or other molecules, whether organic or inorganic. Typically, each subunit has at least two reactive groups, permitting the stepwise construction of larger molecules by reacting first one then another reactive group of each subunit to build successively more complex and potentially diverse molecules.
By creating synthetic conditions whereby a fixed number of individual building blocks, for example, the twenty naturally-occurring amino acids, are made equally available at each step of the synthesis, a very large array or library of compounds can be assembled after even a few steps of the synthesis reaction. Using amino acids as an example, at the first synthetic step the number of resulting compounds (N) is equal to the number of available building blocks, designated as b. In the case of the naturally-occurring amino acids, b = 20. In the second step of the synthesis, assuming that each amino acid has an equal opportunity to form a dipeptide with every other amino acid, the number of possible compounds is $N = b^{2} = 20^{2} = 400$.
For successive steps of the synthesis, again assuming random, equally efficient assembly of the building blocks to the resulting compounds of the previous step, $N = b^{x}$, where x equals the number of synthetic assembly steps. Thus it can be seen that for random assembly of only a decapeptide the number of different compounds is $20^{10}$, or approximately $1.02 \times 10^{13}$. Such an extremely large number of different compounds permits the assembly and screening of a large number of diverse candidates for a desired enzymatic, immunological or biological activity.
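The growth in library size described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name `library_size` is a hypothetical helper introduced only for illustration, and the calculation assumes, as the text does, that every building block is equally available at every step.

```python
def library_size(building_blocks: int, steps: int) -> int:
    """Return N = b**x, the number of distinct linear sequences obtainable
    from `building_blocks` subunits over `steps` assembly steps."""
    if building_blocks < 1 or steps < 1:
        raise ValueError("building_blocks and steps must be positive")
    return building_blocks ** steps


if __name__ == "__main__":
    # Twenty naturally-occurring amino acids, one to ten assembly steps.
    for x in (1, 2, 3, 10):
        print(f"x = {x:2d}  N = {library_size(20, x):,}")
```

For x = 10 the exact value is 10,240,000,000,000, which rounds to the figure quoted above.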
Biologically synthesized combinatorial libraries have been constructed using techniques of molecular biology in bacteria or bacteriophage particles. For example, U.S. Pat. Nos. 5,270,170 and 5,338,665 to Schatz describe the construction of a recombinant plasmid encoding a fusion protein created through the use of random oligonucleotides inserted into a cloning site of the plasmid. This cloning site is placed within the coding region of a gene encoding a DNA binding protein, such as the lac repressor, so that the specific binding function of the DNA binding protein is not destroyed upon expression of the gene. The plasmid also contains a nucleotide sequence recognized as a binding site by the DNA binding protein. Thus, upon transformation of a suitable bacterial cell and expression of the fusion protein, the protein will bind the plasmid which produced it. The bacterial cells are then lysed and the fusion proteins assayed for a given biological activity. Moreover, each fusion protein remains associated with the nucleic acid which encoded it; thus through nucleic acid amplification and sequencing of the nucleic acid portion of the protein:plasmid complexes which are selected for further characterization, the precise structure of the candidate compound can be determined. The Schatz patents are incorporated herein by reference.
In other biological systems, for example as described in Goedell et al., U.S. Pat. No. 5,223,408, nucleic acid vectors are used wherein a random oligonucleotide is fused to a portion of a gene encoding the transmembrane portion of an integral protein. Upon expression of the fusion protein it is embedded in the outer cell membrane with the random polypeptide portion of the protein facing outward. Thus, in this sort of combinatorial library the compound to be tested is linked to a solid support, i.e., the cell itself. A collection of many different random polypeptides expressed in this way is termed a display library because the cell which produced the protein "displays" the drug on its surface. Since the cell also contains the recombinant vector encoding the random portion of the fusion protein, cells bearing random polypeptides which appear promising in a preliminary screen can be lysed and their vectors extracted for nucleic acid sequencing, deduction of the amino acid sequence of the random portion of the fusion protein, and further study. The Goedell patent is incorporated herein by reference.
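The step of deducing the amino acid sequence of the random portion from the sequenced vector can be illustrated as follows. This is a minimal sketch in Python, assuming the standard genetic code and an insert that is already in frame; the function name `translate` and the example sequence are hypothetical and are not taken from the cited patents.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA insert into one-letter amino acid codes.

    Translation stops at the first stop codon; trailing bases that do not
    form a complete codon are ignored.
    """
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    # Hypothetical random insert recovered from a selected clone.
    print(translate("ATGGCTAAATGGTAA"))
```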
Similarly, bacteriophage display libraries have been constructed through cloning random oligonucleotides within a portion of a gene encoding one or more of the phage coat proteins. Upon assembly of the phage particles, the random polypeptides also face outward for screening. As in the previously described system, the phage particles contain the nucleic acid encoding the fusion protein, so that nucleotide sequence information identifying the drug candidate is linked to the drug itself. Such phage expression libraries are described in, for example, Sawyer et al., 4 Protein Engineering 947-53 (1991); Akamatsu et al., 151 J. Immunol. 4651-59 (1993), and Dower et al., U.S. Pat. No. 5,427,908. These patents and publications are incorporated herein by reference.
While synthesis of combinatorial libraries in living cells has distinct advantages, including the linkage of the compound to be tested with a nucleic acid capable of amplification by the polymerase chain reaction or another nucleic acid amplification method, there are clear disadvantages to using such systems as well. The diversity of a combinatorial library is limited by the number and nature of the building blocks used to construct it; thus modified or R-amino acids or atypical nucleotides may not be able to be used by living cells (or by bacteriophage or virus particles) to synthesize novel peptides and oligonucleotides. There is also a limiting selective process at play in such systems, since compounds having lethal or deleterious activities on the host cell or on bacteriophage infectivity or assembly processes will not be present or may be negatively selected for in the library. Importantly, only peptide or oligonucleotide compounds are made in such systems; thus the diversity of the library is restricted to peptide and polynucleotide macromolecules composed of naturally-occurring monomeric units.
Other approaches to creating molecularly diverse combinatorial libraries employ chemical synthetic methods to make use of atypical or non-biological building blocks in the assembly of the compounds to be tested. Thus, Zuckermann et al., 37 J. Med. Chem. 2678-85 (1994), describe the construction of a library using a variety of N-(substituted) glycines for the synthesis of peptide-like compounds termed "peptoids". The substitutions were chosen to provide a series of aromatic substitutions, a series of hydroxylated side substitutions, and a diverse set of substitutions including branched, amino, and heterocyclic structures. This publication is incorporated by reference herein.
Other workers have used small bi- or multifunctional organic compounds instead of, or in addition to, amino acids for the assembly of libraries or collections of compounds of medical or biological interest.
Using chemical synthetic methodologies to create large diverse libraries of potentially useful compounds permits the synthesis of compounds joined to a solid support of some kind. However, the use of such synthetic methods requires the ability, after synthesis, to identify the structure of the rare members of the library which are able to pass a screening process. Thus, such libraries must be rationally designed so as to permit such identification. This task becomes virtually overwhelming as the number of possible compounds grows multiplicatively.
To address this latter point, a number of attempts have been made to devise post-screening methods of "addressing" the specific compounds that the screening process indicates as candidates for further study. One class of such addressable libraries employs a strategy of linking the individual peptides of the library with the nucleic acids encoding them. Examples of such systems, including bacteriophage displaying the compounds of the library and plasmid-binding proteins fused to member compounds of the library, have been described above. However, this methodology is not limited to biological systems, and can be implemented by the co-polymerization of the test compound and a corresponding nucleotide sequence onto a single solid support.
Another strategy involves chemically synthesizing the combinatorial libraries on solid supports in a methodical and predetermined fashion, so that the placement of each library member gives information concerning the synthetic structure of that compound. Such methods are described, for example, in Geysen, U.S. Pat. No. 4,833,092, in which compounds are synthesized on functionalized polyethylene pins designed to fit a 96-well microtiter dish so that the position of the pin gives the researcher information as to the compound's structure. Similarly, Hudson et al., PCT Publication No. WO 94/05394, describe methods for the construction of combinatorial libraries of biopolymers, such as polypeptides, oligonucleotides and oligosaccharides, on a spatially addressable solid phase plate coated with a functionalized polymer film. In this system the compounds are synthesized and screened directly on the plate. Knowledge of the position of a given compound on the plate yields information concerning the nature and order of building blocks comprising the compound. Similar methods of constructing addressable combinatorial libraries may be used for the synthesis of compounds other than biopolymers.
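The idea that position alone identifies structure can be sketched in code. The following Python example is illustrative only: the row-by-row well assignment, the function name `layout_plate`, and the choice of dipeptides are assumptions made for the sketch and are not the layouts used in the cited references.

```python
from itertools import islice, product

ROWS = "ABCDEFGH"          # 8 rows of a 96-well plate
COLUMNS = range(1, 13)     # 12 columns


def layout_plate(building_blocks: str, length: int, plate_index: int = 0) -> dict:
    """Assign enumerated sequences to wells, row by row.

    Returns a mapping such as {"A1": "AA", "A2": "AC", ...} for the plate
    numbered `plate_index`. Because the assignment is deterministic, a well
    label is enough to recover the sequence synthesized at that position.
    """
    wells = [f"{row}{col}" for row in ROWS for col in COLUMNS]
    start = plate_index * len(wells)
    sequences = islice(product(building_blocks, repeat=length),
                       start, start + len(wells))
    return {well: "".join(seq) for well, seq in zip(wells, sequences)}


if __name__ == "__main__":
    amino_acids = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes
    plate = layout_plate(amino_acids, length=2)
    # A positive signal at well B3 identifies the dipeptide directly.
    print("B3 ->", plate["B3"])
```

With 400 possible dipeptides, this hypothetical layout would need five plates, the last only partly filled.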
Another approach has been the use of large numbers of very small derivatized beads, which are divided into as many equal portions as there are different building blocks. In the first step of the synthesis, each of these portions is reacted with a different building block. The beads are then thoroughly mixed and again divided into the same number of equal portions. In the second step of the synthesis each portion, now theoretically containing equal amounts of each building block linked to a bead, is reacted with a different building block. The beads are again mixed and separated, and the process is repeated as desired to yield a large number of different compounds, with each bead containing only one type of compound.
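The divide, react and mix cycle just described can be simulated to show why each bead ends up carrying a single compound. This is a minimal sketch in Python under idealized assumptions (perfect coupling, perfect mixing, near-equal portions); the function name `split_and_mix` and the three-letter building block set are hypothetical.

```python
import random


def split_and_mix(n_beads: int, building_blocks: str, steps: int, seed: int = 0) -> list:
    """Simulate split-and-mix synthesis.

    Each bead is represented by the string of building blocks coupled to it
    so far. At every step the beads are mixed, divided into one portion per
    building block, and each portion is reacted with a different block.
    """
    rng = random.Random(seed)
    beads = [""] * n_beads
    k = len(building_blocks)
    for _ in range(steps):
        rng.shuffle(beads)                            # mix thoroughly
        portions = [beads[i::k] for i in range(k)]    # divide into k near-equal portions
        beads = [
            sequence + block                          # couple one block per portion
            for block, portion in zip(building_blocks, portions)
            for sequence in portion
        ]
    return beads


if __name__ == "__main__":
    beads = split_and_mix(n_beads=1000, building_blocks="ABC", steps=3)
    print(len(beads), "beads,", len(set(beads)), "distinct compounds")
    print("first few beads:", beads[:5])
```

Every bead passes through exactly one reaction vessel per step, so its final string records a single synthetic history.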
This methodology, termed the "one-bead one-compound" method, yields a mixture of beads with each bead potentially bearing a different compound. Thus, in this method the beads themselves cannot be considered "addressable" in the same sense as in the solid phase supports and arrays described above, or as in the cellular or phage libraries. However, the compounds displayed on the surface of each bead can be tested for the ability to bind with a specific compound, and, if those (typically) few beads are able to be identified and separated from the other beads, a presumably pure population of compounds can be recovered and analyzed. Of course, this latter possibility depends upon the ability to load and extract enough information concerning the compounds on the surface of each bead to be susceptible to meaningful subsequent analysis. Such information may simply be in the form of an adequate amount of the compound of interest to be able to determine its structure. For example, in the case of a peptide, enough of the peptide must be synthesized on the bead to be able to perform peptide sequencing and obtain the amino acid sequence of the peptide.
For synthetic chemical libraries, not limited to the one-bead one-compound method, in which the compounds of interest are not naturally-occurring peptides or oligonucleotides, analysis can be a tedious and difficult undertaking. In these cases, a code made from easily synthesized and analyzed "tag" molecules (for example, amino acids or other small multifunctional molecules, such as halogenated aromatics) can be co-synthesized with the compounds comprising the library. After a screening procedure, the tag can be "uncoded" to elucidate the structure of the compounds of interest. The code can be relatively arbitrary, so that the structure of any test compound made of building blocks, in which the building block members are able to be designated as corresponding, for example, to an amino acid (or dipeptide, tripeptide etc.), can be determined in this way.
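The tagging scheme can be pictured as a lookup table between tag molecules and building blocks. The sketch below, in Python, is purely illustrative: the code table, the placeholder block names `BB1` to `BB3`, and the functions `encode` and `decode` are assumptions, since the text notes only that the code can be relatively arbitrary.

```python
# Hypothetical code table: one amino acid tag (one-letter code) per building block.
TAG_FOR_BLOCK = {"BB1": "G", "BB2": "A", "BB3": "L"}
BLOCK_FOR_TAG = {tag: block for block, tag in TAG_FOR_BLOCK.items()}


def encode(blocks: list) -> str:
    """Return the tag sequence co-synthesized alongside the given blocks."""
    return "".join(TAG_FOR_BLOCK[block] for block in blocks)


def decode(tag_sequence: str) -> list:
    """Recover the ordered building blocks from a sequenced tag."""
    return [BLOCK_FOR_TAG[tag] for tag in tag_sequence]


if __name__ == "__main__":
    compound = ["BB1", "BB3", "BB2"]
    tag = encode(compound)
    print("tag read from the selected bead:", tag)
    print("decoded structure:", decode(tag))
```

Because the tag is a peptide in this sketch, it could be read by ordinary peptide sequencing even when the test compound itself cannot.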
As described above, the construction of combinatorial libraries provides researchers the opportunity to construct a vast number of potential chemical candidates to answer basic and applied structure-function questions, such as, without limitation: the relationship between a ligand and its receptor, a given antibody and its antigen and an enzyme and substrate. However, the ability to generate large libraries of potential drug compounds overwhelms most available screening methods. Thus, a bottleneck of this emerging and powerful technology remains adequate high-throughput screening procedures to identify the few compounds which are potential candidates for further study from among the thousands, millions or billions of other compounds in the library.
When the combinatorial library is to be screened for the presence of therapeutic or diagnostic agents, candidate compounds are generally initially screened for their ability to bind to a particular member of biological binding partners. By "binding partners" is meant that two or more compounds are able to join under appropriate biological or in vitro conditions to form a specific complex. Examples of such binding partners are, without limitation, antibody and antigen, ligand and receptor, and enzyme and substrate. At times, either ligand or receptor, or both may be comprised of a complex of more than one compound or polypeptide chain. For example, in the case of tumor necrosis factor α (TNFα), the soluble ligand TNF appears to bind to its receptor in the form of a TNF homotrimer; each TNF trimer can bind three copies of the receptor and clustering of the TNF receptor is thought to be required for it to exert its biological effects. Each and all polypeptide chains involved in the binding of the TNF trimer to the clustered receptors are considered individual binding partners.
One common screening method currently applied consists of coating a solid support, such as the wells of a microtiter dish, with the specific molecule for which a binding partner is sought. The library member compounds are then labeled, plated onto the solid support, and allowed to bind the immobilized molecule. After a wash step, the binding partner complexes are then detected by detection of the label joined to the bound library members. This type of procedure is particularly well suited to combinatorial libraries wherein the member compounds are provided in a solution or medium. This method can be somewhat labor intensive and, in order to achieve the high throughput required to screen such large numbers of test compounds, may as a first step require screening pools of test compounds, followed by one or more rescreening steps in order to specifically identify the compound of interest. The situation can also be reversed, so that the library members are allowed to coat individual wells and are probed with the specific molecule.
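The pool-then-rescreen strategy mentioned above can be sketched to show how it trades assay count against extra rounds. The following Python example assumes an idealized assay with no false positives or negatives; the function `screen_in_pools`, the `is_hit` callback, and the numbers used are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def screen_in_pools(
    compounds: Sequence[int],
    is_hit: Callable[[int], bool],
    pool_size: int,
) -> Tuple[List[int], int]:
    """Screen pools first, then rescreen members of positive pools one by one.

    Returns the identified hits and the total number of assays performed.
    A pool is scored positive if any member binds, which stands in for a
    positive signal from the pooled well.
    """
    hits: List[int] = []
    assays = 0
    for start in range(0, len(compounds), pool_size):
        pool = compounds[start:start + pool_size]
        assays += 1                                   # one assay for the whole pool
        if any(is_hit(c) for c in pool):
            for compound in pool:                     # rescreen individually
                assays += 1
                if is_hit(compound):
                    hits.append(compound)
    return hits, assays


if __name__ == "__main__":
    library = list(range(960))                        # hypothetical compound identifiers
    true_binders = {17, 500}
    found, n_assays = screen_in_pools(library, lambda c: c in true_binders, pool_size=10)
    print("hits:", found)
    print("assays used:", n_assays, "versus", len(library), "for one-by-one screening")
```

In this example 96 pooled assays plus 20 individual rescreens identify both binders, at the cost of a second round of handling.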
In cases wherein the combinatorial library is to contain antibody analogs or peptides targeted to a given epitope, the library members may contain a portion of an antibody recognized by a secondary antibody able to be detected, for example in an enzyme-linked immunosorbent assay (ELISA) or by virtue of being directly or indirectly labeled, for example with a radionuclide, a chemiluminescent compound, a fluor, an enzyme or a dye.
Tawfik et al., 90 Proc. Natl. Acad. Sci. 373-77 (1993) describe a method of screening a library of antibodies (in this case, from a hybridoma library generated using a mimic of the transition state intermediate of an enzymatic reaction) for the presence of rare antibodies having a desired catalytic activity. The screening compound, in this case the enzyme substrate, was immobilized on 96 well microtiter dishes. Supernatants from each clone were placed into separate wells under conditions promoting the enzymatic reaction. The products of the enzymatic reaction, still immobilized to the microtiter dish, were assayed by the use of product-specific monoclonal antibodies. Again, this type of screening process is quite labor-intensive and may necessitate repetitive screening of pools of test compounds in order to achieve high throughput of large libraries.
In the cellular or phage display libraries and "one-bead one-compound" synthetic libraries described above the library members can be screened for the ability to bind a specific binding partner (e.g., a receptor) which is labeled with a detectable fluor, such as fluorescein or phycoerythrin. Because each particle (for example, a cell or a bead) displays only one species of test compound, the fluorescently labeled particles can be detected and sorted using a fluorescence activated cell sorter (FACS). An enriched population of positive beads or particles can then be rescreened, if necessary, and individually analyzed. This strategy can be employed using cells displaying the test compounds or beads on which the test compounds are synthesized. However, this method also suffers from a lack of ease of use, and is time intensive.
Whether screening is by the panning procedure previously described or by binding of labels to the solid phase bound test compounds, a common screening procedure is by competitive binding of the test compounds in the presence of a detectable control ligand, often the natural ligand for the specific binding partner to which the test compounds are intended to be directed. Again, this method can be quite labor-intensive and requires the generation of a standard curve and correlation of the data obtained from the competition experiments with the standard curve in order to generate meaningful data. Thus, competition assays are unable to yield easily interpreted and rapid results in an initial screen of thousands or millions of different library members.
ELISA and similar assay formats are useful when the library members are derivatives of antibodies and contain variable regions directed against known antigens. However, these methods may not be as useful in a non-competitive (i.e., direct) format where neither the specific binding partner nor the desired test compounds are antibodies or contain an available epitope against which a secondary antibody can be easily generated.
Biochemical tools have been generated consisting of chimeric peptides containing portions of a peptide ligand and specific domains of an antibody. Such agents have been devised mainly as therapeutic aids to the delivery of drugs within a patient's body. Especially in the case of peptide drugs, such as soluble agonists of cytokines and other such agents, therapeutic agents or drugs often have a short systemic half-life which reduces the stability of such drugs in vivo. This reduced stability may, in some cases, be counteracted by higher or more frequent dosages, but this may lead to such undesirable consequences as drug tolerance, toxic effects, and high cost of the drug to the patient.
One strategy for overcoming these shortcomings, particularly with regard to the use of systemic biochemical agonists, has been the use of fusion peptides, which have a longer half-life in the circulatory system. These fusion peptides generally contain a binding partner, such as a cytokine receptor, fused to part of an immunoglobulin chain. The immunoglobulin chain acts as molecular camouflage, reducing the opportunity for the binding partner to be recognized as a "foreign" antigen by the organism.
Thus, Shin, et al., 92 Proc Nat'l Acad. Sci. 2820-24 (1995) employed fusion peptides made by constructing recombinant vectors having the gene encoding human transferrin fused, in frame, to the 3' end of a chimeric mouse-human IgG3 gene encoding variable and constant regions. The resulting fusion molecules were able to bind antigen (dansyl) and the purified transferrin receptor, and were able to enter the brain parenchyma of rats using the transferrin receptor for transport from the circulatory system. The remaining variable region of the antibody could contain other optional specificities, thus the site is available for secondary targeting of the molecule, such as for therapeutic purposes, once across the blood-brain barrier.
Evans and coworkers, 180 J. Exp. Med. 2173-79 (1994), using molecular cloning techniques, reported the construction of a fusion protein containing extracellular portions of the p75 high affinity receptor or, alternatively, the p55 low affinity receptor, specific for tumor necrosis factor alpha (TNFα-R), fused to a constant region of human IgG. The soluble, non-fusion forms of the TNF receptors are known to be rapidly degraded in vivo. Cells were transformed with vectors expressing portions of heavy immunoglobulin chain fused to each of the TNF receptors. The fusion peptide was more stable than the soluble receptor in serum. Moreover, the fusion peptides were secreted as dimers containing two heavy chains bound by disulfide linkages. The dimers were able to bind the TNF trimers (a naturally-occurring conformation of TNFα) in two separate areas and thus with higher affinity than is possible when the fusion peptide is in the soluble monomeric form.
Other fusion proteins containing a ligand or receptor and an antibody portion have been used in the search for effective therapeutic agonists to humoral agents. In Fountoulakis et al., 270 J. Biol. Chem. 3958-64 (1995) the extracellular domain of the human interferon γ receptor was expressed as a fusion protein with the IgG hinge, $C_{H}2$ and $C_{H}3$ domains, and was shown to bind interferon, compete for interferon binding to the cell surface receptor of tissue culture cells, and inhibit interferon-mediated antiviral activity. Due to the immunoglobulin portion of the fusion protein, the protein was expressed in Chinese hamster ovary cells as a disulfide-linked homodimer. The dimer was able to bind interferon more strongly than the soluble receptor monomer.
In Pitti, et al., 31 Molec. Immunol. 1345-51 (1994) the human interleukin-1 (IL-1) receptor was expressed in transfected human cells as a fusion protein containing the hinge and Fc regions of the IgG heavy chain. This fusion peptide was reported to have an extended pharmacological half-life in the circulatory system of mice and to bind IL-1.
Crowe et al., 168 J. Immunol. Meth. 79-89 (1994) expressed a gene containing coding sequences of the extracellular domain of the human lymphotoxin a receptor fused to a gene segment encoding the constant portion of human IgG heavy chain. The fusion protein was cloned into a baculovirus vector and expressed in both insect cells and African green monkey kidney cells as a dimer. The IgG portion of the fusion peptide was used as a ligand for affinity purification of the fusion peptide, and also enabled disulfide-facilitated dimerization of the fusion peptides to provide a high-affinity ligand for lymphotoxin.
These latter five references are incorporated by reference herein.