1. Technical Field
The field of this invention is combinatorial chemistry, in which syntheses proceed through a plurality of stages, each stage involving a plurality of choices, so that large numbers of products having varying compositions are obtained.
2. Background of the Invention
There is substantial interest in devising facile methods for the synthesis of large numbers of diverse compounds which can then be screened for various possible physiological or other activities. Typically such a synthesis involves successive stages, each of which involves a chemical modification of the then existing molecule. For example, the chemical modification may involve the addition of a unit, e.g. a monomer or synthon, to a growing sequence or modification of a functional group. By employing syntheses where the chemical modification involves the addition of units, such as amino acids, nucleotides, sugars, lipids, or heterocyclic compounds where the units may be naturally-occurring, synthetic, or combinations thereof, one may create a large number of compounds. Thus, even if one restricted the synthesis to naturally-occurring units or building blocks, the number of choices would be very large, 4 in the case of nucleotides, 20 in the case of the common amino acids, and essentially an unlimited number in the case of sugars.
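The growth in the number of possible products can be illustrated with a short calculation. The following is a minimal sketch, assuming that the choices at each stage are independent, so that the number of distinct sequences is the product of the number of choices at each stage. The function name `library_size` and the chain lengths used are illustrative assumptions and are not taken from any cited method.

```python
from math import prod  # available in Python 3.8 and later


def library_size(choices_per_stage):
    """Number of distinct products when every stage offers independent choices."""
    return prod(choices_per_stage)


# Pentapeptides built from the 20 common amino acids: 20 choices at each of 5 stages.
print(library_size([20] * 5))   # 3200000

# Ten-unit oligonucleotides built from 4 nucleotides.
print(library_size([4] * 10))   # 1048576
```

Even short sequences therefore yield millions of possible compounds.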
One disadvantage heretofore inherent in the production of large numbers of diverse compounds, where at each stage of the synthesis there are a significant number of choices, is the fact that each individual compound will be present in a minute amount. While a characteristic of a particular compound, e.g. a physiological activity, may be determinable, it is usually impossible to identify the chemical structure of the particular compound present.
Moreover, physiologically-active compounds have historically been discovered by assaying crude broths using Edisonian or stochastic techniques, where only a relatively few compounds are assayed at a time, or where a limited number of structurally similar homologs of naturally-occurring physiologically-active compounds are assayed. Two major problems have been associated with the use of such crude broths, namely, the necessity to purify the reaction mixture into individual component compounds and the time-consuming effort required to establish the structure of the compound once purified.
To address these disadvantages and problems, techniques have been developed in which one adds individual units as part of a chemical synthesis sequentially, either in a controlled or a random manner, to produce all or a substantial proportion of the possible compounds which can result from the different choices possible at each sequential stage in the synthesis. However, for these techniques to be successful it is necessary for the compounds made by them to be amenable to methods which will allow one to determine the composition of a particular compound so made which shows a characteristic of interest.
One such approach involves using a chip which allows for separate analysis at physically separate sites on the surface of the chip (Fodor et al., Science 251: 767 [1991]). By knowing which reactant is added sequentially at each such site, one can record the sequence of events and thus the series of reactions. If one then subjects the chip to a screening method for a particular desired characteristic and detects the characteristic, one can readily determine the compound synthesized at the site which demonstrates that characteristic.
Another such technique involves the theoretical synthesis of oligonucleotides in parallel with the synthesis of oligopeptides as the compounds of interest (Brenner and Lerner, PNAS USA 89: 5381-5383 [1992]).
Further techniques are also disclosed in the following publications: Amato, Science (1992) 257, 330-331, discusses the use of cosynthesized DNA labels to identify polypeptides. Lam et al., Nature (1991) 354, 82-84, describe a method for making large peptide libraries. Houghten et al., Nature (1991) 354, 84-86, and Jung and Beck-Sickinger, Angew. Chem. Int. Ed. Engl. (1992) 31, 367-383, describe methodology for making large peptide libraries. Kerr et al., J. Amer. Chem. Soc. (1993) 115, 2529-31, teach a method of synthesizing oligomer libraries encoded by peptide chains. Finally, international applications WO 91/17823 and WO 92/09300 concern combinatorial libraries.
However, since methods such as the preceding typically require the addition of like moieties, there is substantial interest in discovering methods for producing compounds which are not limited to sequential addition of like moieties. Such methods would find application, for example, in the modification of steroids, antibiotics, sugars, coenzymes, enzyme inhibitors, ligands and the like, which frequently involve a multi-stage synthesis in which one would wish to vary the reagents and/or conditions to provide a variety of compounds. In such methods the reagents may be organic or inorganic reagents, where functionalities may be introduced or modified, side groups attached or removed, rings opened or closed, stereochemistry changed, and the like. (See, for example, Bunin and Ellman, JACS 114, 10997 [1992].) For such a method to be viable, however, there needs to be a convenient way to identify the structures of the large number of compounds which result from a wide variety of different modifications. Thus, there is a need to find a way whereby the reaction history may be recorded, and desirably, the structures of the resultant compounds identified.
Finally, as the size of a library of compounds so synthesized increases, known techniques of structure elucidation and product segregation introduce substantial inefficiencies and uncertainties which hinder the accurate determination of the structure of any compound identified as being of interest. Thus, there is a substantial need for new methods which will permit the synthesis of complex combinatorial chemical libraries which readily permit accurate structural determination of individual compounds within the library which are identified as being of interest.
Many of the disadvantages of the previously-described methods as well as many of the needs not met by them are addressed by the present invention which, as described more fully hereinafter, provides myriad advantages over these previously-described methods.
Methods and compositions are provided for encoded combinatorial libraries, whereby at each stage of the synthesis, a support, such as a particle, upon which a compound is being synthesized is uniquely tagged to define a particular event, usually chemical, associated with the synthesis of the compound on the support. The tagging is accomplished using identifier molecules which record the sequential events to which the supporting particle is exposed during synthesis, thus providing a reaction history for the compound produced on the support.
Each identifier molecule is characterized by being stable under the synthetic conditions employed, by remaining associated with the supports during the synthesis, by uniquely defining a particular event during the synthesis which reflects a particular reaction choice at a given stage of the synthesis, by being distinguishable from other components that may be present during assaying, and by allowing for detachment of a tag component which is discernible by a convenient, analytical technique.
The identifiers of this invention are used in combination with one another to form a binary or higher-order encoding system, permitting a relatively small number of identifiers to be used to encode a relatively large number of reaction products. For example, when used in a binary code, N identifiers can uniquely encode up to 2^N different compounds.
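The binary scheme can be sketched as follows. This is a minimal illustration, assuming that each identifier is simply either present or absent on a support, so that N identifiers give 2^N distinguishable combinations. The identifier names `T1` to `T3` and the functions `encode` and `decode` are hypothetical and serve only to show the counting argument.

```python
def encode(choice, identifiers):
    """Return the set of identifiers whose bit is set in the integer `choice`."""
    if not 0 <= choice < 2 ** len(identifiers):
        raise ValueError("choice does not fit in the available identifiers")
    return {tag for bit, tag in enumerate(identifiers) if (choice >> bit) & 1}


def decode(present, identifiers):
    """Recover the integer choice from the set of identifiers that were detected."""
    return sum(1 << bit for bit, tag in enumerate(identifiers) if tag in present)


identifiers = ["T1", "T2", "T3"]          # hypothetical identifier names, N = 3
codes = [encode(c, identifiers) for c in range(2 ** len(identifiers))]

print(len({frozenset(c) for c in codes}))     # 8 distinct combinations
print(sorted(encode(5, identifiers)))         # ['T1', 'T3']
print(decode({"T1", "T3"}, identifiers))      # 5
```

In this sketch the combination with no identifier present also counts as a code, which is how three identifiers reach the full 2^3 = 8 combinations.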
Moreover, the identifiers of this invention need not be bound serially through a previous identifier but rather are individually bound to the substrate, either directly or through the product being synthesized. The identifiers are not sequenceable. Furthermore, the identifiers contain a cleavable member or moiety which permits detachment of a tag component which can be readily analyzed.
Conveniently, the combinatorial synthesis employs definable solid supports upon which reactions are performed and to which the identifiers are bound. The individual solid supports or substrates carrying the final product compounds may be screened for a characteristic of interest and the reaction history determined by analyzing the associated identifier tags.
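The recovery of a reaction history from the tags found on a single support can be sketched as follows. This is a minimal illustration, assuming a two-stage synthesis in which each stage has its own pair of identifiers used as a binary code. All tag names, reagent labels, and function names are hypothetical.

```python
# Hypothetical two-stage synthesis; tag and reagent names are illustrative only.
STAGES = [
    {"tags": ["T1", "T2"], "choices": ["A", "B", "C", "D"]},
    {"tags": ["T3", "T4"], "choices": ["W", "X", "Y", "Z"]},
]


def tags_for_choice(stage, index):
    """Identifiers attached to a support that receives choice `index` at this stage."""
    return {tag for bit, tag in enumerate(stage["tags"]) if (index >> bit) & 1}


def synthesize(choice_indices):
    """Accumulate the identifiers a support picks up over all stages."""
    support_tags = set()
    for stage, index in zip(STAGES, choice_indices):
        support_tags |= tags_for_choice(stage, index)
    return support_tags


def reaction_history(detected_tags):
    """Decode the choice made at each stage from the detected identifiers."""
    history = []
    for stage in STAGES:
        index = sum(1 << bit for bit, tag in enumerate(stage["tags"])
                    if tag in detected_tags)
        history.append(stage["choices"][index])
    return history


support = synthesize([2, 1])        # choice "C" at stage 1, choice "X" at stage 2
print(sorted(support))              # ['T2', 'T3']
print(reaction_history(support))    # ['C', 'X']
```

Because every stage uses its own identifiers, the tags can be analyzed together as a set, and no ordering information among the tags is needed to reconstruct the sequence of choices.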