A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyrights whatsoever.
The present invention generally relates to the field of combinatorial chemistry and in particular relates to encoding the library products of combinatorial synthesis.
Recent research into novel chemical, and especially pharmacological, agents has concentrated on the preparation of so-called “chemical libraries” as potential sources of new leads for drug discovery. Chemical libraries are intentionally created collections of differing molecules which can be prepared either synthetically or biosynthetically and screened for biological activity in a variety of different formats. One can have libraries of soluble molecules; libraries of compounds tethered to resin beads, silica chips or other solid supports; or recombinant peptide libraries displayed on bacteriophage or other biological display vectors. Chemical libraries are advantageously made by using techniques from the field of combinatorial chemistry. Combinatorial chemistry is a synthetic strategy which leads to large chemical libraries. It can be defined as the systematic and repetitive covalent connection of a set of different building blocks of varying structures to each other to yield a large array of diverse molecular entities.
Traditionally, new medicinal chemical lead structures have originated from the isolation of natural products from microbiological fermentations, plant extracts, and animal sources; from screening of pharmaceutical company compound databases; and more recently through the application of both mechanism-based and structure-based approaches to rational drug design. All of these methods are relatively expensive. Recent cost studies suggest that the average cost of creating a new molecular entity in a major pharmaceutical company is around $7,500 per compound, using traditional chemical synthesis technology that requires more or less constant hands-on manipulation of reagents and apparatus and the attention of a chemist. Furthermore, the advent of high throughput automated techniques has made possible the robotized screening of hundreds of thousands of individual compounds per year, per drug target. The availability of this capability, combined with the relatively high cost of more traditional hand crafted chemistry, has caused a global shift in emphasis toward the concept of mass production, an industrial concept that can be put into practice using the approach of combinatorial chemistry.
The inefficiencies of hand crafted chemistry are thus largely addressed by a switch to combinatorial chemical technologies for rapidly synthesizing compound collections. By employing a building block approach and by systematically assembling these blocks in many combinations using chemical procedures, it is possible to create chemical libraries as vast populations of molecules. An essential starting point for the generation of molecular diversity is an assortment of small, reactive molecules which may be considered chemical building blocks. The universe of structural diversity accessible through assembly of even a small set of building-block elements is potentially very large, and unleashing the power inherent in the building block approach is crucial to the success of the combinatorial method. The building block argument is easily illustrated as follows. Theoretically, the number of possible different individual compounds, N, prepared by an ideal combinatorial synthesis is determined by two factors: the number of blocks available for each step, B, and the number of synthetic steps in the reaction scheme, s. If an equal number of building blocks is used in each reaction step, then N = B^s. If the number of blocks that one desires for each step varies (e.g. b, c, d in a three-step synthesis), then N = bcd.
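The counting argument above, in which N equals B raised to the power s for equal block counts and the product b·c·d for unequal ones, can be checked with a short sketch. The function name library_size and the block counts used below are illustrative assumptions and do not come from the disclosure.

```python
from math import prod


def library_size(blocks_per_step):
    """Number of distinct products of an ideal combinatorial synthesis.

    blocks_per_step holds one entry per synthetic step, giving the number
    of building blocks available at that step (hypothetical values).
    """
    return prod(blocks_per_step)


# Equal number of blocks at every step: N = B^s with B = 10, s = 4.
print(library_size([10] * 4))      # 10000

# Different number of blocks at each of three steps: N = b * c * d.
print(library_size([5, 10, 20]))   # 1000
```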
From the above, it can be seen that a relatively conservative combinatorial synthesis procedure involving 20 blocks in a three step synthesis process will produce 20^3 = 8000 compounds. This relatively generous production output then raises the next question: how will the compounds be identified? A typical combinatorial synthesis technique is that of the split synthesis. As an example of split synthesis in the solid state synthesis of peptides, a batch of resin support (typically small resin beads) is divided into n fractions, a single monomer amino acid is coupled to each aliquot in a separate reaction, and all the resin particles are then thoroughly mixed together. Repeating this protocol for a total of x cycles can produce a stochastic collection of up to n^x different molecules, as governed by a hypergeometric distribution. To ensure representation of the majority of possible ligands one needs to begin with a multiplicity of beads. A typical value would be ten times as many beads as the desired number of ligands. Theoretically a set of every possible combination of the building blocks exists in the aliquots. In order to determine the composition of a particular compound which is found to be of interest, one could proceed with direct ligand structural analysis, preferably on a mass spectrometer, on a species-by-species basis. A combinatorial synthesis now typically takes place on a reaction plate having from 96 to 2,304 reaction wells. One identifies the product compounds of true interest by a positive response in an appropriate assay. However, even after assays have greatly reduced the number of compounds by eliminating those that were non-active, the problem with a conventional mass spectrometer analytical approach is that many individual analysis trials are required, there may only be very small quantities of material available after running a combinatorial synthesis, and overall turn around time may be quite lengthy.
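The split synthesis protocol described above (divide, couple, mix, repeat) may be illustrated by a small simulation. This is only a sketch under simplifying assumptions: beads are divided into equal fractions, every coupling succeeds, and monomers are represented by single letters. The function name split_synthesis and the four-monomer example are hypothetical.

```python
import random


def split_synthesis(monomers, cycles, n_beads, seed=0):
    """Simulate split synthesis; returns the sequence built on each bead."""
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]
    n = len(monomers)
    for _ in range(cycles):
        # Divide the beads into n fractions and couple one monomer
        # to every bead of a given fraction.
        for i, bead in enumerate(beads):
            bead.append(monomers[i % n])
        # Thoroughly mix all the beads before the next division.
        rng.shuffle(beads)
    return ["".join(bead) for bead in beads]


# n = 4 monomers and x = 3 cycles give at most 4^3 = 64 distinct ligands;
# ten times as many beads as ligands are used, as suggested in the text.
products = split_synthesis("ACDE", cycles=3, n_beads=640)
print(len(set(products)), "distinct ligands out of a possible", 4 ** 3)
```

Because the division is random, the number of distinct ligands actually obtained can fall short of n^x, which is why an excess of beads is started with.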
A need exists, then, to label compounds as they are going through their combinatorial steps. Where compounds are, for example, tethered to resin beads, prior art solutions to the problem have included attaching chemical identifier tags to the beads coincident with each block coupling step in the synthesis. The different chemical properties of each tag would then convey which building block was coupled in a particular step of the synthesis, and the overall structure of a ligand on any bead could be deduced by “reading” the set of tags on that bead, the bead in effect having been encoded.
Tags should ideally have a highly discrete information content, be amenable to very high sensitivity detection and decoding, and must be stable to reagents used in the ligand synthesis. Prior art tags attached onto beads have included nucleotides, peptides, or a combined series of hydrocarbon homologs and polychlorinated aromatics. In one technique, single stranded oligonucleotides are built on the resin beads upon which peptide synthesis is being performed, and are subsequently amplified through polymerase chain reaction and sequenced. Another technique is one where orthogonally differentiated diamine linkers are used in the construction of soluble chimeric peptides comprising a “binding” strand and a “coding” strand. As amino acid monomer building blocks are coupled to the binding strand, this is recorded by building an amino acid code onto the “coding” strand. The sequence of the coding strand is then resolved by Edman degradation. One problem with this approach is that it requires an extra chemical step for every step taken in the construction of the library. Another problem with this approach is that it requires orthogonal synthetic procedures for building up a tag in conjunction with synthesis of ligands, i.e., it requires the addition of like moieties, whereas there is very great interest in discovering methods for producing compounds which are not limited to sequential addition of like moieties. Such methods would find application, for example, in the modification of steroids, antibiotics, sugars, coenzymes, enzyme inhibitors, ligands, and the like, which frequently involve a multi-stage synthesis in which one would wish to vary the reagents and/or reaction conditions to provide a variety of compounds. In such methods the reagents may be organic or inorganic reagents, where functionalities may be introduced or modified, side groups attached or removed, rings opened or closed, stereochemistry changed, and the like.
For such a method to be viable, however, there needs to be a convenient way to identify the structures of the large number of compounds which result from a wide variety of different modifications.
A technique that is useful for the screening of nonsequenceable organic molecules prepared by multistep combinatorial synthesis uses a series of gas chromatographically resolvable halocarbon derivatives as molecular tags which, when appended to reactive groups on the bead surface, can constitute a binary code that reflects the chemical history of any member of a library. Instead of the oligonucleotide or peptide coding approaches where the order of assembly of the chemical building blocks for any library member is preserved in the sequence of a single cognate tagging molecule, the binary strategy uses a uniquely defined mixture of tags to represent each building block at each particular step of the synthesis. Thus a set of N tags can be used to encode the combinatorial synthesis of a library of 2^N different members. After assembly, the tags are photolysed and analyzed by electron capture capillary gas chromatography.
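The binary strategy can be pictured as follows. The sketch below is one possible way of assigning tag mixtures and is not taken from the prior art technique itself: it assumes that each synthesis step is given its own group of tags, and that the presence or absence of each tag in the group spells out the index of the building block in binary. The names bits_needed, encode and decode, and the use of numbered tags, are hypothetical.

```python
def bits_needed(n_blocks):
    """Smallest number of tags whose on/off patterns can label n_blocks."""
    return max(1, (n_blocks - 1).bit_length())


def encode(history, blocks_per_step):
    """Return the set of tag numbers present on a bead.

    history lists the 0-based index of the block coupled at each step.
    Block 0 is written as "no tag from this step's group", an assumption
    made here so that N tags cover all 2^N combinations.
    """
    tags, offset = set(), 0
    for block, n_blocks in zip(history, blocks_per_step):
        width = bits_needed(n_blocks)
        tags.update(offset + bit for bit in range(width) if (block >> bit) & 1)
        offset += width
    return tags


def decode(tags, blocks_per_step):
    """Recover the reaction history from the set of tags that were read."""
    history, offset = [], 0
    for n_blocks in blocks_per_step:
        width = bits_needed(n_blocks)
        history.append(sum(1 << bit for bit in range(width) if offset + bit in tags))
        offset += width
    return history


# Three steps with 8 blocks each need 3 + 3 + 3 = 9 tags,
# and 2^9 = 512 = 8 * 8 * 8 library members can be distinguished.
steps = [8, 8, 8]
tags = encode([5, 0, 7], steps)
print(sorted(tags))          # [0, 2, 6, 7, 8]
print(decode(tags, steps))   # [5, 0, 7]
```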
In all cases, the use of reporter tags complicates synthetic strategies, increases the risk of side reactions and by-products, and yields only indirect evidence of structure. Thus, there is a need to find a way whereby a compound's reaction history may be recorded, and the structure of the resulting compound identified.
Site-specific 13C labels on the ligand itself have also been used, in connection with 13C NMR spectroscopy, as a method of monitoring progress in solid state combinatorial synthesis.
Yet another method is that of using a chip which allows for separate analysis at physically separate sites on the surface of the chip. By knowing what reactant is added sequentially at each such site, one can record the sequence of events and thus the series of reactions. If one then subjects the chip to a screening method for a particular desired characteristic and detects the characteristic one can readily determine the compound synthesized at the site which demonstrates that characteristic.
A discrete sample-by-sample analysis will yield a great deal of extraneous information, as everything in the sample will be analyzed. However, in analyzing the results of a combinatorial synthesis, it is desirable to be able to track in linear terms, since all that is being tracked by a linear method is what is being added to the construct of interest in the synthesis, which will omit the presence of solvents, resin bits, side reactions and impurities.
In view of the above needs and shortcomings of the prior art, it is a primary object of the present invention to reduce the amount of time needed to read and de-code the products of a combinatorial synthesis. It is another object of the present invention to provide a method of encoding combinatorial constructs that does not require orthogonal chemistry, that is, chemistry that has been carefully selected so as not to interfere with the chemistry that is being executed as part of the combinatorial synthesis itself. Another object of the invention is to minimize the amount of capital investment needed to develop a coding strategy that requires no more than what is needed to initially develop a set of appropriate solid support links.
Unlike the methods of the prior art, the present invention embodies a method of isotopically rather than chemically encoding a monomer to read a synthetic history. Readable differences in the encoding moieties therefore rely on physical differentiation, rather than chemical differentiation. The isotopically encoded monomer is, however, chemically bonded to the ligand of interest during synthesis, in contrast to prior art identification methods in which differentiable isotopes are physically mixed into and interspersed with bulk chemicals or commodities to identify their manufacturing source. Nor does the invention rely on tagging a molecule, as other prior art approaches do.