This invention concerns reagents of the kind which comprise a product which is built up using stepwise reactions, often chemical reactions, and associated tag moieties which track the synthetic pathway and/or the reagents used. The product will often be an oligomer and the tags define the identity and position of at least one monomer residue in the oligomer. Such reagents are useful in assay methods in which they can generate much more information than can be generated by a simple labelled analyte. Sets and libraries of such reagents can be created by combinatorial chemistry and are valuable for screening large numbers of compounds, e.g. for biological activity. In preferred systems according to the invention, positively charged tag groups are generated for analysis by mass spectrometry by cleavage, e.g. photocleavage, of neutral molecules.
WO 95/04160 describes a reagent which comprises:    a) an analyte moiety comprising at least two analyte residues, and linked to    b) a tag moiety comprising one or more reporter groups adapted for detection by mass spectrometry, wherein a reporter group designates an analyte residue, and the reporter group at each position of the tag moiety is chosen to designate an analyte residue at a defined position of the analyte moiety. A plurality of such reagents, each comprising a different analyte moiety, provides a library of reagents which may be used in assay methods involving a target substance. Analysis of the tag moieties indicates the nature of the analyte moiety bound to the target substance.
WO 94/08051 describes a system used to make simultaneously a library of all oligomers each attached to a bead. Any individual bead made by a split and mix process carries a unique chemical product, and this is true of each bead which goes through the same synthetic pathway. Two coupling steps are used at each point in the process: one step affects the synthon or ligand; the other alters the structure of a tag which is also carried on the bead. Tags are designed to identify the steps through which the bead has been taken.
It is an object of this invention to provide a set or library of labelled compounds which may be synthesised on a support and may be used either attached to or separated from that support.
In one aspect the invention provides a method of making a set of labelled compounds, by the use of a support and a set of labels, which method comprises the steps:    a) at least one first or intermediate step comprising dividing the support into lots, performing a different chemical reaction on each lot of the support so as either to modify that lot of the support or to couple a chemical moiety to that lot of the support, tagging a fraction of each lot of the support with a different label, and combining the said lots of the support, and    b) at least one intermediate or final step comprising dividing the support into lots, performing a different chemical reaction on each lot of the support, so as either to modify that lot of the support or to couple a chemical moiety to that lot of the support, tagging a fraction of each lot of the support with a different label, whereby each different label is linked to a chemical moiety coupled to the support in a different step and forms with that chemical moiety a labelled compound which is separable from the support, and combining the said lots of the support.
The method uses a support which is repeatedly divided into lots which are then recombined. The support may be a massive support e.g. a flat sheet or silicon chip or microtitre plate which is divided e.g. by masking into regions for performing the different chemical reactions. The support may be a polymeric material which is soluble in some solvents and not in others, and which is separated into lots or recombined e.g. by precipitation or dissolution. Most usually the support will be particulate, e.g. pins or fibres or capillaries or preferably beads. Derivatised beads for performing combinatorial chemistry by a split and mix strategy are commercially available and can be used here. A preferred particulate support comprises beads having cleavable linkers, wherein each cleavable linker has one group for defined chemical procedures e.g. oligomer synthesis and another group for labelling. By this means it is possible at the end of the synthesis, to recover the labelled chemical products e.g. oligomers into solution.
The method of the invention involves performing at least one step a) and at least one step b), most usually at least three steps in all. Each step involves performing a reaction, generally but not necessarily a chemical reaction. An example of such a reaction might be the removal of a protective group so as to leave a primary amine or hydroxyl or carboxylic acid group. Most usually, the chemical reaction involves coupling a chemical moiety to the support. The chemical moiety will usually be an organic chemical group, for example as described in WO 94/08051. While successive chemical moieties may be attached to the support through separate linkers, more usually, successive chemical moieties will be joined to each other to form a chain extending from the support. Preferably the chemical moieties are monomer units which are built up to form oligomer chains.
In a preferred method according to the invention, the set of labelled compounds is a library of n^s labelled oligomers, where n is the number of different monomer units and s is the number of monomer units in each labelled oligomer, wherein step a) is performed once to couple a different monomer unit to each lot of the support, and step b) is performed s−1 times.
The oligomer may be for example an oligonucleotide or an oligopeptide. When the oligomer is an oligonucleotide or analogue, then n is generally 4. When the oligomer is an oligopeptide, then n is generally about 20 when only natural amino acids are used. But the principles of the invention are equally applicable to other oligomers formed from other polymerisable monomers. The value of s is not critical, and may typically be from 2-100 e.g. 3-20 or more.
The fraction of each lot that is tagged in each step is generally less than 50%. Preferably from 0.25% to 25% of each lot of the support is tagged in each step with a different label. Preferably the support has cleavable linkers, wherein each cleavable linker has at least one group for chemical reaction e.g. chemical synthesis and another group for labelling. Preferably each resulting labelled compound comprises a single label and at least one chemical moiety.
The method involves the use of a set of up to and including n×s different labels. Although the nature of the labels is not critical, it is a preferred feature of the invention that each different label be distinguishable by the analytical procedure used to detect the labels. Groups used as labels should be much more stable to acidic (or other chemical) treatment involved in oligomer synthesis compared to the protecting groups commonly used (e.g. DMT groups to provide 5′ or 3′-protection in nucleotide synthons). Preferred labels are those in which a charged group, preferably a positively charged group is generated by cleavage e.g. photocleavage of a neutral molecule for analysis by mass spectrometry. Examples of such preferred labels are discussed below.
In a preferred embodiment, a split and mix strategy requires a solid support carrying cleavable linkers with three arms—one to attach to the solid support through a cleavable bond; one to initiate synthesis of a chemical product e.g. oligomer; and a third for attachment of the tags. The sites for coupling of synthon and tag monomers will optionally be protected by removable groups. The process can be illustrated by the synthesis of oligomers on a particulate solid support.
At each stage in the synthetic route, the particles of the support are first combined and mixed, and then divided into n lots, where n is the number of different monomers (4 in the case of natural nucleotides), and each monomer is coupled to its site on one lot of the support. A unique tag representing the monomer just added and its position in the sequence is then coupled to a fraction of the support, corresponding approximately to the reciprocal of the number of monomers in the final oligomer (i.e. 1/s for an oligomer with s monomer units). Alternatively, a tag may be coupled to a fraction of the support before or simultaneously with, rather than after, the monomer which it represents. Partial coupling may be achieved in a number of different ways. For example, (i) a protecting group on the site may be partially removed; (ii) the coupling may be taken to a fraction of completion; (iii) a fraction of the support may be removed and coupling taken to completion before the fraction is returned to the pool. As the coupling steps proceed, the oligomer is extended one unit at a time, and the tags are added one at a time. The end result is a mixture of molecules on each particle; each molecule will carry the same sequence of monomers in the oligomer, but a fraction, 1/s for s-mers, will carry the tag added at any of the s coupling steps.
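The split and mix bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative simulation, not part of the described method: the function names, the `sites_per_bead` parameter and the representation of a bead as a dictionary are all assumptions made for the example. Each round assigns every bead to one lot, extends its sequence by that lot's monomer, and marks roughly 1/s of its linker sites with a tag keyed by (position, monomer).

```python
import random

def split_and_mix(num_beads, monomers, s, sites_per_bead=1200, seed=0):
    """Simulate s rounds of mix, split, couple and tag (hypothetical model)."""
    rng = random.Random(seed)
    beads = [{"sequence": "", "tags": {}} for _ in range(num_beads)]
    n = len(monomers)
    for position in range(1, s + 1):
        rng.shuffle(beads)                       # combine and mix the support
        lots = [beads[i::n] for i in range(n)]   # divide into n lots
        for monomer, lot in zip(monomers, lots):
            for bead in lot:
                bead["sequence"] += monomer      # couple the monomer
                # tag about 1/s of the linker sites with a label that is
                # unique to this (position, monomer) pair
                bead["tags"][(position, monomer)] = sites_per_bead // s
    return beads

def decode(tags):
    """Read the oligomer sequence back from the set of tags on one bead."""
    return "".join(monomer for _, monomer in sorted(tags))

beads = split_and_mix(num_beads=64, monomers="ACGT", s=3)
```

In this model every bead ends up with s tags, and sorting the tags by position recovers the sequence synthesised on that bead.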
An example of this embodiment is shown in FIG. 1 of the accompanying drawings. At A, is illustrated a solid support in the form of a bead derivatised with cleavable linkers each having two arms. At B, one of the two arms of each linker has been reacted with a trityl group carrying a succinimidyl substituent. At C, the other branch of each linker has been reacted with a nucleotide residue shown as G; and one portion of the NHS groups has been substituted by a label R1. At D, oligonucleotide synthesis has continued by formation of dimer chains GT; and a second portion of the NHS groups has been substituted by a second label R2. At E, oligonucleotide synthesis has continued by formation of chains GTT; and a third portion of the NHS groups has been substituted by a label R3. At F, ammonolysis of the beads has given rise to a pool of oligonucleotides of the same sequence, in which each one is attached to a different tag. At G, photolysis has detached three derivatised trityl groups for analysis by mass spectrometry. The split and mix approach ensures that all the oligonucleotides attached to any bead have the same s-mer sequence; and that the bead also carries a total of s different labels, each of which indicates the position and identity of one monomer residue of the oligomer.
An alternative way of partial coupling is to cap the extension of a fraction of the chemical compounds e.g. oligomers with a stable tag group at each extension step. For example, in the case of oligonucleotide synthesis, the coupling agents could include a small proportion of a phosphoramidite protected by one of the stable trityl groups described below as mass tags. Elongation will produce a major proportion with the desired base and a small fraction with a corresponding tag marking the nature and position of the base.
An example of this embodiment is illustrated in FIG. 2 of the accompanying drawings. Oligonucleotide synthesis is performed on derivatised beads A, the first, second and third stages of this synthesis being shown as B, C and D. Each of four phosphoramidite reagents contains a small fraction depending on the length of the oligomer, preferably less than 1/s, of a capping phosphoramidite bearing a very acid-stable NHS-substituted trityl group. After each stage of synthesis, all incorporated NHS groups are reacted with an amine, thereby attaching a label. For synthesis of longer oligonucleotides, O-methyl phosphoramidite could be used to withstand repetitive amination reactions. The three different labels used in B, C and D are shown in FIG. 2 as R1, R2 and R3. At the end of synthesis, the oligonucleotides are deprotected by treatment with ammonia, but remain attached to the beads. Thus each bead carries a plurality of s-mer oligomers of identical sequence, together with a total of s different substituted trityl labels each of which indicates the identity and position of a monomer unit of the oligomer. The beads are used in an assay procedure. Thereafter photolysis of a bead generates charged substituted trityl moieties E for detection by mass spectrometry. Alternatively the labelled oligonucleotides can be released into solution.
In another aspect, this invention provides a set of labelled compounds wherein a molecule of a compound of the set is tagged with a single label which identifies the nature and/or the position of a component of that molecule, and different molecules of the same compound are tagged with different labels. The set of labelled compounds may be releasably attached to a solid support e.g. beads; or may be mixed together in solution.
Also envisaged according to the invention is a library consisting of a plurality of the sets of the labelled compounds as herein defined, e.g. a library of n^s labelled oligomers, where n is the number of different monomer units and s is the number of monomer units in each labelled oligomer.
In another aspect (e.g. as illustrated in FIG. 2) the invention provides a reagent comprising a solid support which carries on its surface molecules of an oligomer, with different oligomer molecules having the same sequence wherein the oligomer molecules include some shorter oligomer molecules and a shorter oligomer molecule carries a label which identifies the nature and position of a monomer unit of the oligomer molecule. A library consists of a plurality of the said reagents, in which the solid supports are preferably beads.
Preferred features of the labels used herein are:
They should be attached by linkages which are stable to the chemical procedures used in the preparative method and those used to detach the resulting chemical compound e.g. oligomer from a solid support. The trityl residues described below are stable throughout the procedures used to synthesise oligonucleotides.
They should have properties which allow up to n×s labels to be distinguished by the analytical procedure used to detect them, as each chemical moiety or reaction is tagged uniquely. In an example below, it is shown how all 262144 nonanucleotides can be coded uniquely using 36 different tag monomers. This number is readily achieved using the trityl derivatives described below. Alternatively, but less preferably, the same number of 9-mers could be coded for by 18 binary tags or even by a unique combination of 9 tags as described in WO 94/08051.
On cleavage, e.g. by photocleavage, chemical cleavage using acidic conditions, or enzymatic methods, from the parent molecule, they should generate stable species, either neutral molecules or preferably charged ions, for analysis by mass spectrometry. Mass spectrometry is a preferred method of analysis, allowing for the simultaneous detection of hundreds of labels. This property, of generating a preferably charged group by photocleavage of a neutral molecule, ensures that the ions are brought into the vapour phase without the need for added matrix. Therefore it is not necessary to search for “hot spots” as is the case when matrix is added. Not having matrix present also allows for further biochemical processes e.g. oligonucleotide ligation. In certain cleavage methods such as those involving acid, the addition of matrix may enhance the sensitivity of detection.
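The figures quoted above for nonanucleotides follow from simple counting, which the short Python sketch below reproduces. The function names are illustrative only. It computes the library size n^s, the number of position-specific tags n×s, and the smallest number of binary (present or absent) tags able to give every library member a distinct code.

```python
def library_size(n, s):
    """Number of distinct s-mers that can be built from n monomers."""
    return n ** s

def positional_tag_count(n, s):
    """One tag per (monomer, position) pair."""
    return n * s

def binary_tag_count(n, s):
    """Smallest b such that 2**b >= n**s."""
    return (n ** s - 1).bit_length()

size = library_size(4, 9)
positional = positional_tag_count(4, 9)
binary = binary_tag_count(4, 9)
```

For n = 4 and s = 9 these evaluate to 262144, 36 and 18 respectively, matching the numbers given in the text.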
Bearing in mind these criteria, preferred labels according to the invention are groups of the formula R1R2R3C— where R1, R2 and R3 are the same or different and each is a monocyclic or fused ring aromatic group that is substituted or unsubstituted. These are groups of the trityl (triphenylmethyl) family. Other possible labels include troponium and those discussed in WO 97/27331. Trityl groups have the desirable property that they are readily cleaved by illumination with a laser in a mass spectrometer. Sensitivity of detection of trityl groups is high because of the stability of the positively charged carbonium ion. This sensitivity gives rise to a number of advantages, e.g. there are enough trityl groups in a molecular monolayer such as results if trityl labelled molecules are tethered covalently to a surface.
Preferably at least one of R1, R2 and R3 carries a substituent selected from C1-C20 alkoxy or hydrocarbyl either unsubstituted or substituted by carboxylic acid, sulphonic acid, nitro, cyano, hydroxyl, thiol, primary, secondary or tertiary amino, primary or secondary amido, anhydride, carbonyl halide or active ester. Hydrogen atoms in these substituents may be partly or wholly replaced by deuterium or halogen e.g. fluorine; this improves the range available for analysis by mass spectrometry.
Preferably each of R1, R2 and R3 is aryl, more preferably phenyl. While substituents may be present at any point in the aromatic (e.g. phenyl) ring, para-substituents are convenient and are preferred. The substituents may be present to confer desired physical or chemical properties on the trityl (or other) group. For example, electron withdrawing groups at ortho or para positions increase the stability of trityl groups to acid hydrolysis. Substituents may be present to alter the formula weight of the trityl (or other) group, so as to enable easy detection and discrimination by mass spectrometry. Non-radioactive isotopic substituents are suitable for this purpose, e.g. small alkyl groups containing 1, 2 or 3 deuterium atoms. Preferred substituents are amine or amide groups. There is a considerable number of amines having different molecular weights that are commercially available and that can be used to provide substituted trityl groups having distinctive formula weights, see for example Table 1 below.
The masses of the majority of commercially available amines lie in the range of 50-250 Da. For some applications it would be desirable to have up to a few hundred mass-tags. The resolution of the tags in TOF-mass spectrometry was found to be satisfactory with at least 4 Da difference between the masses of tags. Therefore, the above range of amines can only yield about 50 different tags. To increase this number using the same pool of amines, it is possible to incorporate two or four or even more amide substituents per trityl group, and this is illustrated in the experimental section below.
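The arithmetic behind "about 50 different tags" can be checked with a small Python sketch. The amine pool used here is purely hypothetical (one candidate at every integer mass from 50 to 250 Da) and stands in for a real catalogue; only the mass contributed by the amine substituents is modelled, since the common trityl core shifts every tag equally. The helper picks masses greedily so that neighbouring tags differ by at least 4 Da, first for one substituent per trityl group and then for two.

```python
from itertools import combinations_with_replacement

def pick_resolvable(masses, min_gap=4.0):
    """Greedily keep masses that are at least min_gap apart."""
    chosen = []
    for mass in sorted(set(masses)):
        if not chosen or mass - chosen[-1] >= min_gap:
            chosen.append(mass)
    return chosen

# Hypothetical pool: one candidate amine at each integer mass, 50 to 250 Da.
amine_pool = list(range(50, 251))

# One amide substituent per trityl group.
single = pick_resolvable(amine_pool)

# Two amide substituents per trityl group: the tag mass depends on the sum.
pair_sums = [a + b for a, b in combinations_with_replacement(amine_pool, 2)]
double = pick_resolvable(pair_sums)
```

Under these assumptions the single-substituent case gives 51 resolvable masses, and allowing two substituents roughly doubles the count because the accessible mass range doubles.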
The principle of the system is illustrated in FIG. 3 of the accompanying drawings. At A, an oligonucleotide has been synthesised on a CPG support. At B, a 5′-hydroxyl group of the oligonucleotide has been replaced by an NHS-substituted trityl group. At C, an amide group NHR has been introduced, in which R is chosen to have a characteristic formula weight. At D, the labelled oligonucleotide has been released into solution for use in an assay procedure. At E, the NHR-substituted trityl group has been volatilised by photolysis and has been detected by mass spectrometry.
The above mass spectrometry labels are useful in a variety of other biochemical methods and manipulations. Thus according to another aspect, the invention provides a nucleic acid determination method, which method comprises providing a labelled oligonucleotide or nucleic acid, and removing the label by cleavage to give a charged species which is subjected to mass spectrometry. Preferably nucleic acid analysis, e.g. sequencing or sequence difference analysis, is performed by the use of a labelled primer and/or labelled chain extending nucleotides and/or labelled chain terminating nucleotide analogues, wherein the label is as described above.
In another aspect, the invention provides an assay method in which a labelled probe is partitioned into two fractions of which one is determined, the probe comprising a ligand joined to a label by a link which is cleavable to give a charged species for analysis by mass spectrometry. The invention also includes a library of probes, each comprising a ligand joined to a label by a link which is cleavable to give a charged species for analysis by mass spectrometry, wherein each different probe has a different label. Preferably the labels are as described above.
Certain of the labels are envisaged as new compounds per se according to the invention. These are compounds of the formula R1R2R3CY; where Y is a leaving group e.g. halide or tosylate for displacement by a nucleophile e.g. a thiol, alcohol or amine group; and R1, R2 and R3 are as defined above, with the proviso that R1, R2 and R3 together carry at least two amide groups and/or at least two reactive groups for coupling e.g. N-hydroxysuccinimide ester groups.
In addition, the inventors have manufactured a disposable glass insert for use as a target surface for laser desorption ionisation mass spectrometry. The glass target may be used for analysis of samples spotted and dried directly on to the glass surface. The glass target may also be chemically activated and used as a solid support for immobilised nucleic acids or other compounds using methods already developed. Complementary nucleic acids, mass-tagged oligonucleotides or other compounds isolated and localised on the glass target may then be subjected to direct analysis by laser desorption ionisation mass spectrometry. One advantage of using a solid support is that it may be introduced directly into a mass spectrometer for subsequent detection and avoids unnecessary liquid handling of the sample. Organic polymeric surfaces such as polypropylene are possible alternatives to glass.
Any surface chemistry developed for attachment of compounds to glass may be used to immobilise these compounds directly on to a target for laser desorption ionisation mass spectrometry or matrix-assisted laser desorption ionisation mass spectrometry. For example, 3-mercaptopropyl silane derivatisation (Rogers, et al 1999) or amine derivatisation (Beattie, et al 1995; Chen, et al 1999) may be used for the attachment of nucleic acids. The glass inserts are significantly cheaper than conventional inserts and are truly disposable. Mass spectrometry performance is unaffected. (See Example 4 below).
The invention also provides an insert for use as a target for laser desorption ionisation mass spectrometry, which insert has a target surface of glass carrying an immobilised compound for analysis.
The invention also provides a kit comprising a mass spectrometer and a supply of inserts, for use as targets for laser desorption mass spectrometry, having target surfaces of glass.
In a preferred embodiment of the invention, a system for analysing nucleic acids comprises:
a solid support carrying an array of nucleic acids to act as targets for analysis or as probes to capture a target;
oligonucleotide reagents tagged with moieties suitable for analysis by mass spectrometry;
reagents and apparatus for biochemical procedures to allow specific interaction between the tagged oligonucleotides and the target;
a means to introduce the samples into a mass spectrometer;
a mass spectrometer.
In a more preferred embodiment of the invention, a system for analysing nucleic acids on a solid support comprises:
a solid support carrying an array of nucleic acids to act as targets for analysis or as probes to capture a target;
oligonucleotide reagents, tagged with moieties suitable for analysis by mass spectrometry;
reagents and apparatus for biochemical procedures to allow specific interaction between the tagged oligonucleotides and the target carried out on the solid support surface;
a means to introduce the solid support into a mass spectrometer;
a mass spectrometer.
In a further preferred embodiment of the invention, an automated system for analysing nucleic acids comprises:
oligonucleotide reagents, tagged with moieties suitable for analysis by mass spectrometry;
a mass spectrometer;
a computer to carry out the analysis;
software to interpret a mass spectrum.
Computer programs are provided for oligonucleotide sequence determination by mass spectrometry.
Each base and base position in an oligonucleotide is associated with a unique mass tag.
For oligonucleotides of length s, 4×s tags (four per base position) are needed to distinguish between all 4^s possible oligonucleotides.
Careful choice of tags ensures that all tags have sufficiently different masses to avoid ambiguity in tag assignment when analysing a mass spectrum.
The chemical formula for each tag is known, so each tag's monoisotopic mass can be calculated.
The isotopic abundances of the elements in the tag are also known, so a complete distribution of masses and abundances of all isotopic variants of each tag can be calculated.
For the tags used so far, the major heavy isotopes of a tag are those due to the presence of 13C, and a typical isotopic abundance distribution is that for C27H30O2N, with relative abundances of 73:22:3 for isotopic masses 400.228, 401.231 and 402.234 respectively. These abundance distributions characterise the tags' presence in a mass spectrum, and help to distinguish tags from other features in the spectrum.
The use of elements such as chlorine or bromine in mass tags further aids tag detection and identification, since these elements have markedly different isotopic abundances from that of carbon: for 35Cl:37Cl the abundance ratio is 76:24 and for 79Br:81Br it is 51:49.
Mass tags containing these elements will therefore have their own characteristic isotopic distributions. In general, the aim is to design tags with characteristic sets of masses which facilitate identification amongst a background of chemical ‘noise’.
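The characteristic patterns produced by chlorine or bromine can be illustrated with a binomial calculation. The sketch below is an illustration only: it uses the rounded abundance ratios quoted above (76:24 and 51:49), ignores every other element in the tag, and the function name is an assumption. It returns the relative heights of the peaks, which are spaced about 2 Da apart, for a tag carrying a given number of atoms of one halogen.

```python
from math import comb

def halogen_pattern(count, light_fraction):
    """Relative peak heights for `count` atoms of a two-isotope halogen.

    Entry k is the probability that exactly k atoms are the heavy isotope,
    i.e. the peak about 2*k Da above the lightest peak.
    """
    heavy_fraction = 1.0 - light_fraction
    return [
        comb(count, k) * light_fraction ** (count - k) * heavy_fraction ** k
        for k in range(count + 1)
    ]

one_bromine = halogen_pattern(1, 0.51)   # near-equal doublet
two_chlorines = halogen_pattern(2, 0.76) # three peaks
```

A single bromine gives a doublet of nearly equal height, while two chlorines give a triplet, both easily told apart from the carbon-dominated pattern of an unhalogenated tag.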
Mass standards are included in each spectrum to allow the spectrum's mass axis to be calibrated. Use of mass standards both within and on either side of the tag mass range ensures accurate mass measurement throughout this range. Ions representing a complete set of possible masses are often seen in mass spectra and these represent the ideal calibration set.
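One simple way to use such standards is a least-squares fit that maps measured masses on to their known values. The Python sketch below assumes a linear correction, which is an assumption made for illustration and not a statement of how any particular instrument is calibrated; real instruments may use a different functional form, for example one expressed in flight time. The function names and the example masses are hypothetical.

```python
def fit_calibration(measured, known):
    """Least-squares line known = slope * measured + intercept."""
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(known) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, known))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def calibrate(mass, slope, intercept):
    """Apply the fitted correction to one measured mass."""
    return slope * mass + intercept

# Hypothetical standards below, within and above the tag mass range.
measured_standards = [300.3, 400.4, 500.5]
known_standards = [300.0, 400.0, 500.0]
slope, intercept = fit_calibration(measured_standards, known_standards)
corrected = calibrate(450.45, slope, intercept)
```

Placing standards on both sides of the tag range, as the text recommends, means that the correction is interpolated rather than extrapolated.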
A program has been written to calculate the isotopic abundance distribution and corresponding isotopic masses of any mass tag, using the known masses and isotopic abundances of the elements in each tag. This information is calculated for all mass tags available for use in tagging oligonucleotides.
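A calculation of this kind can be sketched as follows. This is not the program referred to above but a minimal illustration under stated assumptions: the isotope masses and abundances in the table are commonly tabulated values entered by hand, the electron mass is ignored, peaks are grouped by nominal mass, and the function and variable names are hypothetical. The distribution is built by adding one atom at a time and tracking, for each nominal mass, the total probability and the probability-weighted mean mass.

```python
# (isotopic mass in Da, natural abundance); the first entry is the lightest isotope
ISOTOPES = {
    "C": [(12.0, 0.9893), (13.0033548, 0.0107)],
    "H": [(1.00782503, 0.999885), (2.01410178, 0.000115)],
    "N": [(14.0030740, 0.99636), (15.0001089, 0.00364)],
    "O": [(15.9949146, 0.99757), (16.9991317, 0.00038), (17.9991596, 0.00205)],
}

def isotope_pattern(formula, max_peaks=3):
    """Return [(mean mass, abundance)] for the first max_peaks nominal masses."""
    # peaks[k] = (probability, mean mass) for k nominal units above monoisotopic
    peaks = {0: (1.0, 0.0)}
    for element, count in formula.items():
        isotopes = ISOTOPES[element]
        lightest = round(isotopes[0][0])
        for _ in range(count):
            updated = {}
            for k, (prob, mean_mass) in peaks.items():
                for mass, abundance in isotopes:
                    k2 = k + round(mass) - lightest
                    if k2 >= max_peaks:
                        continue  # truncate the heavy tail
                    p2 = prob * abundance
                    old_p, old_pm = updated.get(k2, (0.0, 0.0))
                    updated[k2] = (old_p + p2, old_pm + p2 * (mean_mass + mass))
            peaks = {k: (p, pm / p) for k, (p, pm) in updated.items()}
    return [(peaks[k][1], peaks[k][0]) for k in sorted(peaks)]

pattern = isotope_pattern({"C": 27, "H": 30, "O": 2, "N": 1})
```

For C27H30O2N this lands close to the masses and the 73:22:3 pattern quoted earlier; small differences in the last digit depend on the abundance table used.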
A second program uses this information to determine the presence of mass tags and hence the sequence in the mass spectrum generated by an oligonucleotide, and works as follows. For each base position in the oligonucleotide, the four regions of the mass spectrum corresponding to the masses of the four possible tags (including their isotopic variants) are examined and compared with the expected tag spectrum. The comparison is done either by identification of spectral peak positions and amplitudes and their differences from those of the potential tag, or by measuring the sum of squares of residuals between the experimental spectrum and that expected from the potential tag. In either case, the four potential tags are ranked by the chosen measure and the best tag is used to assign a base to that base position.
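The sum-of-squares variant of this comparison might look like the following Python sketch. It is an illustration under several assumptions, not the program described: the spectrum is a list of (mass, intensity) peaks, the mass tolerance is arbitrary, intensities are scaled to the largest peak found among the four candidate regions for a position, and all tag masses in the example are hypothetical, with the 73:22:3 pattern reused for every tag.

```python
def observed_pattern(spectrum, expected, tol=0.3):
    """Largest intensity found within tol of each expected isotopic mass."""
    return [
        max((i for m, i in spectrum if abs(m - mass) <= tol), default=0.0)
        for mass, _ in expected
    ]

def call_bases(spectrum, tags_by_position, tol=0.3):
    """Pick, for each position, the tag with the smallest squared residual."""
    calls = []
    for candidates in tags_by_position:  # dict: base -> expected pattern
        observed = {b: observed_pattern(spectrum, p, tol) for b, p in candidates.items()}
        scale = max(max(v) for v in observed.values()) or 1.0
        scores = {}
        for base, pattern in candidates.items():
            top = max(a for _, a in pattern)
            scores[base] = sum(
                (o / scale - a / top) ** 2 for o, (_, a) in zip(observed[base], pattern)
            )
        calls.append(min(scores, key=scores.get))
    return "".join(calls)

def tag(mono):
    """Hypothetical tag: the 73:22:3 pattern at roughly 1 Da spacing."""
    return [(mono, 0.73), (mono + 1.003, 0.22), (mono + 2.006, 0.03)]

# Hypothetical tag masses for a 2-mer, plus a synthetic spectrum for "GT".
tags = [
    {"A": tag(400.0), "C": tag(410.0), "G": tag(420.0), "T": tag(430.0)},
    {"A": tag(440.0), "C": tag(450.0), "G": tag(460.0), "T": tag(470.0)},
]
spectrum = [
    (420.0, 730.0), (421.003, 220.0), (422.006, 30.0),
    (470.0, 365.0), (471.003, 110.0), (472.006, 15.0),
    (415.2, 12.0),  # an unrelated noise peak
]
called = call_bases(spectrum, tags)
```

Scaling within each position lets tags at different positions differ in absolute intensity while still penalising candidate regions that contain little or no signal.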
A more powerful approach is to examine each possible oligonucleotide in turn, obtaining a goodness of fit over all s tag regions by the method described above, and then ranking the oligonucleotide sequences by this measure.
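Ranking whole sequences can be sketched by exhaustive enumeration, which is feasible because even 9-mers give only 262144 candidates. In the illustrative Python below, the per-position residuals are assumed to have been computed already by some fitting step, the overall goodness of fit is taken to be their sum, and the function name and example values are hypothetical.

```python
from itertools import product

def rank_sequences(residuals, top=5):
    """Rank all sequences by summed residual.

    residuals: one dict per base position, mapping base -> residual
    (smaller means a better fit to the expected tag pattern).
    """
    bases = sorted(residuals[0])
    scored = []
    for seq in product(bases, repeat=len(residuals)):
        total = sum(residuals[i][base] for i, base in enumerate(seq))
        scored.append((total, "".join(seq)))
    scored.sort()
    return scored[:top]

# Hypothetical residuals for a 2-mer.
example = [
    {"A": 1.1, "C": 1.0, "G": 0.02, "T": 0.9},
    {"A": 0.8, "C": 1.2, "G": 1.0, "T": 0.05},
]
ranked = rank_sequences(example)
```

With a purely additive score the top-ranked sequence coincides with the per-position calls; the value of the ranking lies in the runners-up, which show how clearly the best sequence is separated from its alternatives.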