There is substantial interest in devising facile methods for the synthesis of large numbers of diverse compounds which can then be screened for various possible physiological or other activities. Typically such a synthesis involves successive stages, each of which involves a chemical modification of the then-existing molecule. For example, the chemical modification may involve the addition of a unit, e.g. a monomer or synthon, to a growing sequence, or the modification of a functional group. By employing syntheses where the chemical modification involves the addition of units, such as amino acids, nucleotides, sugars, lipids, or heterocyclic compounds, where the units may be naturally-occurring, synthetic, or combinations thereof, one may create a large number of compounds. Indeed, even if one restricted the synthesis to naturally-occurring units or building blocks, the number of choices at each stage would be substantial: 4 in the case of nucleotides, 20 in the case of the common amino acids, and essentially an unlimited number in the case of sugars.
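The growth in the number of possible compounds can be illustrated with a short calculation. The sketch below is illustrative only and assumes a simple linear synthesis in which every stage draws independently from the same set of units; the function name `library_size` and the stage counts are hypothetical and do not come from the text.

```python
def library_size(choices: int, stages: int) -> int:
    """Count distinct sequences when each of `stages` positions is
    filled independently from `choices` available units (assumption:
    linear synthesis, same unit set at every stage)."""
    return choices ** stages


# Unit counts taken from the paragraph above; stage counts are illustrative.
for label, choices in (("nucleotides", 4), ("common amino acids", 20)):
    for stages in (3, 6):
        print(f"{label}, {stages} stages: {library_size(choices, stages)} sequences")
```

Under these assumptions, six stages of amino acid addition already yield 64,000,000 possible sequences, which is why each individual compound is present only in a minute amount.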
One disadvantage heretofore inherent in the production of large numbers of diverse compounds, where at each stage of the synthesis there are a significant number of choices, is that each individual compound will be present in a minute amount. While a characteristic of a particular compound, e.g. a physiological activity, may be determinable, it is usually impossible to identify the chemical structure of the particular compound present.
Moreover, physiologically-active compounds have historically been discovered by assaying crude broths using Edisonian or stochastic techniques, where only relatively few compounds are assayed at a time, or where a limited number of structurally similar homologs of naturally-occurring physiologically-active compounds are assayed. Two major problems have been associated with the use of such crude broths, namely, the necessity of purifying the reaction mixture into its individual component compounds and the time-consuming effort required to establish the structure of each compound once purified.
To address these disadvantages and problems, techniques have been developed in which one adds individual units as part of a chemical synthesis sequentially, either in a controlled or a random manner, to produce all or a substantial proportion of the possible compounds which can result from the different choices possible at each sequential stage in the synthesis. However, for these techniques to be successful it is necessary for the compounds made by them to be amenable to methods which will allow one to determine the composition of a particular compound so made which shows a characteristic of interest.
One such approach involves using a chip which allows for separate analysis at physically separate sites on the surface of the chip (Fodor et al., Science 251: 767 [1991]). By knowing which reactant is added sequentially at each such site, one can record the sequence of events and thus the series of reactions. If one then subjects the chip to a screening method for a particular desired characteristic and detects that characteristic, one can readily determine the compound synthesized at the site which demonstrates it.
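The record-keeping idea behind such a site-addressable approach can be sketched as follows. This is a minimal illustration, not the method of the cited reference: the 2 x 2 grid, the reactant names, and the helper names `add_reactant` and `compound_at` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical record: site coordinate -> ordered list of reactants added there.
history = defaultdict(list)


def add_reactant(sites, reactant):
    """Record that `reactant` was added at every site in `sites`."""
    for site in sites:
        history[site].append(reactant)


def compound_at(site):
    """Reconstruct the compound at `site` from its recorded reaction history."""
    return "-".join(history[site])


# Illustrative two-stage synthesis on a 2 x 2 chip (assumed layout).
add_reactant([(0, 0), (0, 1)], "Gly")  # stage 1, top row
add_reactant([(1, 0), (1, 1)], "Ala")  # stage 1, bottom row
add_reactant([(0, 0), (1, 0)], "Leu")  # stage 2, left column
add_reactant([(0, 1), (1, 1)], "Phe")  # stage 2, right column

# If screening flags site (1, 0), the record identifies what was made there.
print(compound_at((1, 0)))
```

In this sketch the structure is never determined analytically; it is read back from the position of the site and the recorded order of additions.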
Another such technique involves the theoretical synthesis of oligonucleotides in parallel with the synthesis of oligopeptides as the compounds of interest (Brenner and Lerner, PNAS USA [1992] 89: 5381-5383).
Further techniques are also disclosed in the following publications: Amato, Science (1992) 257, 330-331, discusses the use of cosynthesized DNA labels to identify polypeptides. Lam et al., Nature (1991) 354, 82-84, describe a method for making large peptide libraries. Houghten et al., Nature (1991) 354, 84-86, and Jung and Beck-Sickinger, Angew. Chem. Int. Ed. Engl. (1992) 31, 367-383, describe methodology for making large peptide libraries. Kerr et al., J. Amer. Chem. Soc. (1993) 115, 2529-31, teach a method of synthesizing oligomer libraries encoded by peptide chains. Finally, international applications WO 91/17823 and WO 92/09300 concern combinatorial libraries.
However, since methods such as the preceding typically require the addition of like moieties, there is substantial interest in discovering methods for producing compounds which are not limited to sequential addition of like moieties. Such methods would find application, for example, in the modification of steroids, antibiotics, sugars, coenzymes, enzyme inhibitors, ligands and the like, which frequently involve a multi-stage synthesis in which one would wish to vary the reagents and/or conditions to provide a variety of compounds. In such methods the reagents may be organic or inorganic reagents, where functionalities may be introduced or modified, side groups attached or removed, rings opened or closed, stereochemistry changed, and the like. (See, for example, Bunin and Ellman, JACS 114, 10997 [1992].) For such a method to be viable, however, there needs to be a convenient way to identify the structures of the large number of compounds which result from a wide variety of different modifications. Thus, there is a need to find a way whereby the reaction history may be recorded, and desirably, the structures of the resultant compounds identified.
Finally, as the size of a library of compounds so synthesized increases, known techniques of structure elucidation and product segregation introduce substantial inefficiencies and uncertainties which hinder the accurate determination of the structure of any compound identified as being of interest. Thus, there is a substantial need for new methods which permit the synthesis of complex combinatorial chemical libraries and, at the same time, the ready and accurate structural determination of those individual compounds within the library which are identified as being of interest.
Many of the disadvantages of the previously-described methods as well as many of the needs not met by them are addressed by the present invention which, as described more fully hereinafter, provides myriad advantages over these previously-described methods.