1. Field of the Invention
This invention relates to the fields of molecular diversity and combinatorial chemistry, which includes general strategies for obtaining ligands that bind to a preselected receptor starting from a collection, or library, of compounds presented to the receptor as a mixture. This invention relates to a method for increasing the power of combinatorial strategies by increasing the number of compounds that can be explored in a combinatorial experiment. This invention provides a method for preparing molecules (ligands) that bind to a preselected receptor, where the receptor extracts two or more ligand fragments, and where the ligand fragments are joined covalently while bound to the receptor. More specifically, it describes a method whereby a biological receptor is presented with a mixture, or library, of compounds that represent widely different types of structures, where each of these compounds bears one or more reactive groups that can form a covalent bond by reaction with a reactive group from another compound in the library, where two or more compounds in the library are joined either while bound to the receptor itself, or in solution, where the composite ligand is then trapped by the receptor. The product of the receptor-assisted synthesis is isolated as a complex with the receptor, and then released and analyzed.
2. Description of the Related Art
Background
The majority of pharmaceutical agents are compounds that exert their biological activity by binding to a biological macromolecule, referred to here as a receptor. The discovery of ligands that can bind to a preselected receptor and thereby exert a biological effect is therefore a central task of medicinal chemists seeking to develop new human pharmaceutical agents.
Classically, ligands are discovered by three strategies. The first involves screening of a collection of chemicals whose structures have no deliberate connection with the structure or biology of the target receptor. This process is referred to as "random screening".
The second strategy requires information about the structure of the natural ligand for a receptor. Development of new ligands then is based on the deliberate synthesis of specific analogs of the natural ligand in the hope of discovering a ligand that retains or has increased affinity for the receptor, together with bioavailability, stability, and other properties desired for a human pharmaceutical.
The third strategy requires information about the structure of the receptor itself, obtained by crystallography, spectroscopy, or modeling. With this information, ligands are designed as structures that are complementary to the ligand binding site of the receptor.
The deficiencies of these three approaches are well known to those familiar with the art. Random screening often requires examination of tens of thousands of compounds before a single ligand has a chance of being identified. Analogs of the natural ligands often resemble the natural ligand in terms of bioavailability, stability, or other properties; often, these properties are undesirable in a human pharmaceutical. Further, while tremendous strides have been made in the science of molecular recognition over the past decade, it is still not possible to design a ligand for a receptor, even given a high resolution experimental structure for the receptor itself.
The difficulties in screening can be illustrated by the enormous number of compounds that are possible. For example, a single small molecule built from only carbon and hydrogen with a molecular mass of 282 and the molecular formula C20H42 can occur in 366,319 distinct isomers. Adding a single oxygen to this molecule to yield the molecular formula C20H42O increases the number of possible compounds to over 10^7. The number of possible molecules rapidly becomes astronomical as still more atom types are introduced, or as different ratios of atom types are permitted, or as larger molecules are considered. In principle, each of these might have a distinct affinity for a biological receptor.
One possible method to circumvent these problems comes under the title of "combinatorial chemistry". In a combinatorial experiment, a collection, or library, of molecules with diverse structures is presented to the receptor. The receptor binds to only a few molecules in the library. The binding is then used to extract from the mixture those molecules that bind, or to otherwise identify those molecules that bind, and the structures of the molecules are determined.
The execution of a combinatorial experiment requires the design of several interacting elements. The library must be designed, including both its degeneracy (how many different compounds are in the library) and its diversity (the range of different types of structures that a library contains). Methods must be designed to analyze the structure of the compound that binds to the receptor.
The decisions to choose a particular library diversity and library degeneracy depend, however, on an estimate of the probability that a library with a specified diversity and degeneracy will contain at least one compound that binds as a ligand to a preselected receptor with an affinity sufficient that the ligand can be detected by a receptor binding assay. Likewise, assessing the utility of a combinatorial tool depends on this estimate.
(a) Estimating the Degeneracy of a Useful Library
We now estimate the degeneracy of a library needed to provide useful ligands. To be useful, a library in general must contain at least one compound that binds to the preselected receptor with a dissociation constant of ca. 1 micromolar (µM) or less. Assuming that the library samples structural diversity randomly, we ask how big the library must be to contain at least one ligand that binds with this affinity. This question is difficult to answer analytically. However, a biological analog to combinatorial chemistry can be found in the immune system, which has evolved over several hundred million years to solve the complementary problem: to provide a library of receptors that contains at least one that binds to any given ligand with ca. 1 micromolar affinity. The immune system creates a combinatorial pool of receptors (antibodies) that has a high probability of including at least one that binds to any particular ligand. Thus, the immune system solves a problem that is in many respects the reciprocal of the problem that must be solved in a combinatorial experiment.
If we assume that the task of generating a combinatorial pool of receptors to find one that binds tightly to a single ligand, and the task of generating a combinatorial pool of ligands to find one that binds tightly to a single receptor, are governed by similar statistics, we may use the statistics of the primary immune response to estimate the degeneracy that a combinatorial library must have to contain a ligand with this affinity. The immune system has 10^7 to 10^8 mature B cells. When challenged for the first time with a ligand (the antigen), this repertoire contains some antibodies that bind the antigen with dissociation constants in the range of 100 nM to 10 µM. This is the "primary response" of the immune system. This suggests that a combinatorial library of 10^8 molecules is needed if one is to have a good chance of finding a ligand with a dissociation constant on the order of 1 µM within that library. This suggests that for a combinatorial library of small molecules to be useful, it must have this degeneracy.
The diversity of the library generated by the immune system is also defined. Antibodies are built from 20 natural amino acids containing a specific, and limited, range of functional groups (hydroxyl groups, amino groups, carboxyl groups, amide groups, and four types of aromatic groups). This suggests that for a combinatorial library of small molecules to be useful, it must have a similar range of functional group diversity.
(b) Designing the Receptor Binding Assay
Once a library with a degeneracy of 10^8 is chosen, certain constraints are placed on the design of a receptor binding assay. A typical receptor binding assay can conveniently recover a ligand only when either the concentration of the ligand or the concentration of the receptor is at or above the dissociation constant of the ligand-receptor complex. Further, a receptor binding assay must be based on the receptor-ligand interaction changing the behavior of either the receptor or the ligand in a way that is detectable. The choice of which component of a receptor binding assay to have in excess is determined in part by what detection system will be used to detect a ligand. Finally, the receptor binding assay must in some way produce either enough ligand or enough of an associated tag to permit subsequent chemical analysis to determine the structure of the high affinity ligand.
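The concentration requirement stated above can be illustrated with a minimal sketch. It assumes simple 1:1 equilibrium binding in which the partner present in excess remains essentially at its total concentration; this model and the function name are assumptions introduced here for illustration and are not taken from the text.

```python
# Fraction of the limiting partner that is bound at equilibrium for simple
# 1:1 binding, assuming the partner in excess stays at its total concentration.
# (Assumed textbook model; the function name is illustrative.)

def fraction_bound(excess_conc_molar: float, kd_molar: float) -> float:
    """Return bound fraction = [excess] / ([excess] + Kd)."""
    return excess_conc_molar / (excess_conc_molar + kd_molar)

KD = 1e-6  # dissociation constant of 1 micromolar, as discussed in the text

# Scan concentrations of the excess partner from well below to well above Kd.
for multiple in (0.01, 0.1, 1.0, 10.0):
    conc = multiple * KD
    print(f"excess partner at {multiple:g} x Kd -> "
          f"{fraction_bound(conc, KD):.1%} of the limiting partner bound")
```

Under this model half of the limiting partner is bound when the excess partner is at the dissociation constant, and only a small fraction is bound when it is far below it, which is why one of the two components must be supplied at or above that concentration.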
If 10^8 ligands must be present to have at least one with a dissociation constant (Kd) of 10^-6 M, then the total concentration of the ligand pool free in solution (the sum of the concentrations of all of the components in the library) must be 100 M for the single compound that is a ligand to be present at a concentration equal to its dissociation constant. This is, of course, not possible in any general way (pure water has a concentration of only 55 M).
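The arithmetic behind the 100 M figure can be written out as a short sketch. The numerical values are those given in the text; the function and variable names are illustrative only.

```python
# Total library concentration needed so that every member of the library is
# present at a concentration equal to the target dissociation constant.

def required_total_concentration(n_compounds: float, kd_molar: float) -> float:
    """Sum of concentrations when each of n_compounds is present at kd_molar."""
    return n_compounds * kd_molar

LIBRARY_SIZE = 1e8      # number of compounds in the library (from the text)
KD = 1e-6               # target dissociation constant in mol/L (from the text)
WATER_MOLARITY = 55.0   # approximate molarity of pure water (from the text)

total = required_total_concentration(LIBRARY_SIZE, KD)
print(f"required total concentration: {total:.0f} M")
print(f"ratio to pure water: {total / WATER_MOLARITY:.1f}x")
```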
The maximum total concentration of the library is roughly inversely proportional to the average molecular weight of the components of the library, and is further limited by the solubility of the library components. For an average mass of 200 daltons, the maximum concentration (representing a situation where the compounds in the library constitute 100% of the solution) is 5 molar. A total library concentration greater than ca. 1 molar will only in exceptional cases be feasibly presented in solution to a receptor; the maximum feasible total concentration of the average library is more likely to be ca. 100 mM.
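The 5 molar ceiling can be reproduced with the following sketch. It assumes a density of about 1 g/mL (1000 g/L) for the neat material; that density is an assumption made here, not a value stated in the text, and the function name is illustrative.

```python
# Upper bound on molar concentration for a neat (100%) material, given its
# molar mass. ASSUMPTION: density of 1000 g/L, which is not stated in the text.

ASSUMED_DENSITY_G_PER_L = 1000.0

def neat_molarity(molar_mass_g_per_mol: float,
                  density_g_per_l: float = ASSUMED_DENSITY_G_PER_L) -> float:
    """Moles per liter when the material itself fills the whole volume."""
    return density_g_per_l / molar_mass_g_per_mol

library_ceiling = neat_molarity(200.0)   # average library member, 200 daltons
water_check = neat_molarity(18.015)      # sanity check against pure water

print(f"neat library of 200 Da compounds: {library_ceiling:.1f} M")
print(f"pure water: {water_check:.1f} M")
```

The water value serves only as a consistency check against the figure of about 55 M quoted earlier; halving the average molar mass doubles the ceiling, which is the inverse proportionality described above.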
These facts make difficult the presentation of a useful library to a receptor in soluble form, where the library components saturate the receptor. In order to have the appropriate diversity, the library must contain 10^8 molecules. Yet the practical total concentration of the library cannot be higher than 1 M (more preferably 100 mM). Thus, the concentration of each component in the library can be only 10 nM (more preferably 1 nM). This concentration is well below the dissociation constant of the most tightly bound ligand likely to be found in a library of this size.
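The per-component concentrations quoted above follow from dividing the total concentration evenly among the library members, as the sketch below shows. Equal representation of every compound is an assumption made for illustration, and the names are hypothetical.

```python
# Concentration of each library member when a fixed total concentration is
# shared equally among all members (equal sharing is an assumption).

def per_component_concentration(total_molar: float, n_compounds: float) -> float:
    return total_molar / n_compounds

LIBRARY_SIZE = 1e8   # compounds in the library (from the text)
KD = 1e-6            # 1 micromolar target dissociation constant (from the text)

results = {}
for total_molar in (1.0, 0.1):   # 1 M and 100 mM total library concentration
    each = per_component_concentration(total_molar, LIBRARY_SIZE)
    results[total_molar] = each
    print(f"total {total_molar:g} M -> {each * 1e9:.0f} nM per compound, "
          f"{KD / each:.0f}-fold below a 1 micromolar Kd")
```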
One might, of course, immobilize a receptor on an affinity support and pass the library through a column of the support. Compounds that are ligands will be retarded as they pass through the column, as in a standard affinity chromatography experiment, with the degree of retardation depending on the effective concentration of the receptor. However, to separate the lagging ligand from the forward-running non-ligands in the column, either the ligand pool must be highly concentrated, or the column must have large dimensions.
It is possible to present the library in soluble form and the receptor in concentrations higher than the dissociation constant. Under these circumstances, all of the most tightly bound ligand will be recovered with the receptor. However, the amount recovered will be limited by the amount of the ligand in the volume used in the combinatorial experiment. This volume can, of course, be made arbitrarily large. However, this requires arbitrarily large amounts of receptor.
Calculations of this sort have convinced many skilled in the art that the only practical implementation of the combinatorial strategy requires that the library of ligands of random structure be presented to the receptor in immobilized form, either on a surface or on small beads, each bearing multiple copies of a single compound in the library. The receptor is presented to the library in a soluble form at a concentration higher than the dissociation constant. Most commonly, the receptor bears a fluorescent tag. The beads bearing the ligand therefore bind fluorescent receptor, can be identified by their fluorescence under ultraviolet light, and can be separated manually or by a cell sorter. The number of beads that can be examined in this manner is in principle unlimited. However, even assuming automatic sorting of the beads (using a cell sorter) at a rate of 100 per second, a library of greater than 10^8 compounds is not manageable, as it requires more than 10 days to sort through the beads. Presenting large numbers of beads to a receptor in high concentration also requires substantial amounts of receptor. Finally, artifacts associated with ligands covalently linked to a support are commonly encountered, where the receptor binds to the supported ligand but not to the same ligand free in solution.
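The sorting-time estimate can be checked with a short sketch. It assumes one bead per compound and uninterrupted operation of the sorter, neither of which is stated in the text; the names are illustrative.

```python
# Time to pass a bead library through a sorter at a fixed rate.
# ASSUMPTIONS: one bead per compound, continuous operation with no downtime.

SECONDS_PER_DAY = 24 * 60 * 60

def sorting_days(n_beads: float, beads_per_second: float) -> float:
    """Days needed to examine n_beads at beads_per_second."""
    return n_beads / beads_per_second / SECONDS_PER_DAY

days = sorting_days(1e8, 100.0)   # library size and sorting rate from the text
print(f"sorting 1e8 beads at 100 per second takes about {days:.1f} days")
```

Screening more than one bead per compound, as would ordinarily be needed to ensure that each compound is actually sampled, lengthens the time in proportion.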