The present invention relates to novel compounds, e.g. modified base analogues, that can be used to label nucleic acids for use in a wide variety of molecular biology applications.
Nucleic acid molecules labelled with reporter groups have been used in many molecular biology techniques such as sequencing and hybridisation studies. The labelled nucleic acid molecules have been produced by a variety of methods, including the use of labelled nucleoside, deoxynucleoside or dideoxynucleoside triphosphates, labelled phosphoramidites, and direct coupling of labels to nucleic acids (Renz, EP 120376). The labelled nucleotides and labelled nucleic acid molecules produced can be used in hybridisation studies of nucleic acids and in nucleic acid sequencing. A wide variety of labels have been used in these techniques, including radioactive isotopes, e.g. 3H, 14C, 32P, 33P and 35S, haptens, biotin, mass tags and fluorescent labels.
There has been increasing interest in the use of modified base and nucleotide analogues in the labelling of nucleic acids. Some of these analogues are base specific and may be incorporated into nucleic acids in the place of a single natural base i.e. A, T, G or C. Other analogues have the potential to base pair with more than one natural base and hence be incorporated in the place of more than one natural base.
WO 97/28177 discloses nucleoside analogues containing the structure 
wherein X is O, S, Se, SO, CO or NR7,
R1, R2, R3 and R4 are the same or different and each is H, OH, F, NH2, N3, O-hydrocarbyl or a reporter group,
R5 is OH or mono-, di-, or tri-phosphate or thiophosphate or corresponding boranophosphate,
or one of R2 and R5 is a phosphoramidite,
Z is O, S, Se, SO, NR9 or CH2 
and R6, R7, R8, R9 are the same or different and each is H, alkyl, aryl or a reporter group.
WO 99/06422 discloses base analogues of the structure 
where X=O or NH or S
Y=N or CHR6 or CR6
W=N or NR6 or CHR6 or CR6
n=1 or 2
each R6 is independently H or O or alkyl or alkenyl or alkoxy or aryl or a reporter moiety; where necessary (i.e. when Y and/or W is N or CR6) a double bond is present between Y and W or W and W,
Q is a sugar or sugar analogue.
The present invention describes novel compounds of the formula 
wherein Q is H or a sugar or a sugar analogue or a nucleic acid backbone or backbone analogue; Y is O, S or NR10, where R10 is H, alkyl, alkenyl or alkynyl; and X is H, alkyl, alkenyl, alkynyl, aryl, heteroaryl or a combination thereof or, preferably, a reporter group.
In a first aspect, the present invention provides novel compounds of the formula (I), wherein Q is H or a sugar or a sugar analogue or a nucleic acid backbone or backbone analogue; Y is O, S or NR10, where R10 is H, alkyl, alkenyl or alkynyl; and X is H, alkyl, alkenyl, alkynyl, aryl, heteroaryl or a combination thereof or, preferably, a reporter group. The reporter group may be joined to the heterocycle via a suitable linker arm, which can be similar to the options already defined for X or may be larger. In this aspect, suitably X can comprise a chain of up to 30 atoms, more preferably up to 12 atoms. The reporter may contain more than 12 atoms. X may also contain a charged group, which imparts a net positive or negative charge to the nucleotide base.
Suitably, Q may be H or a group selected from 
where Z is O, S, Se, SO, NR9 or CH2, where R9 is H, alkyl, alkenyl, alkynyl or a reporter; R1, R2, R3 and R4 are the same or different and each is H, OH, F, NH2, N3, O-hydrocarbyl, NHR11 where R11 is alkyl, alkenyl, alkynyl, or a reporter group. Suitably, the hydrocarbyl group has up to 6 carbon atoms. R11 and R9 may comprise a chain of up to 30 atoms, preferably up to 12 atoms.
R5 is OH, SH or NH2 or mono-, di- or tri-phosphate or -thiophosphate, or corresponding boranophosphate, or one of R2, R4 and R5 is a phosphoramidite or other group for incorporation in a polynucleotide chain, or a reporter group;
or Q may be of one of the following modified sugar structures:
Acyclic sugars having structures (ii) or (iii) 
wherein R12 is C1-C4 alkyl, hydroxyC1-C4alkyl, or H, preferably methyl, hydroxymethyl or H, or sugars having structures (iv) to (vi) 
R14=C1-C6 alkyl, hydroxy C1-C6 alkyl, C1-C6 alkylamine, C1-C6 carboxyalkyl or preferably a reporter moiety.
(vii) or Q is a nucleic acid backbone consisting of sugar-phosphate repeats or modified sugar-phosphate repeats (e.g. LNA) (Koshkin et al, 1998, Tetrahedron 54, 3607-30) or a backbone analogue such as peptide or polyamide nucleic acid (PNA) (Nielsen et al, 1991, Science 254, 1497-1500) or a polycationic ribonucleic acid guanidine (RNG) (Bruice et al, 1995, PNAS 92, 6097), or pentopyranosyl oligonucleotides (HNA) (Eschenmoser A., 1999, Science 284, 2118-2124).
In one preferred embodiment, Q is H and the compounds are base analogues. In a second preferred embodiment, Q is a sugar or sugar analogue or a modified sugar, e.g. a group having a structure according to (i) to (vi), and the compounds are nucleotide analogues or nucleoside analogues. When Q is a nucleic acid backbone or a backbone analogue (vii), these compounds are hereinafter called nucleic acids or polynucleotides.
When Q is a group of structure (i), R1, R2, R3 and R4 may each be H, OH, F, NH2, N3, O-alkyl or a reporter moiety. Thus ribonucleosides, deoxyribonucleosides and dideoxyribonucleosides are envisaged together with other nucleoside analogues. These sugar substituents may contain a reporter group in addition to any that might be present on the base. Preferably, R1=H, R2=H, OH, F, N3, NH2, NH(CH2)nR13 or O-(CH2)nNH2 where n is 0-12, R4=H, OH, N3, NH2, F, OR3, R3=H, OH or OR13 where R13 is alkyl, alkenyl, alkynyl or a reporter. More preferably at least one of R1 and R2 and at least one of R3 and R4 is H.
R5 is OH, SH, NH2 or mono-, di- or tri-phosphate or thiotriphosphate or corresponding boranophosphate. When R5 is triphosphate, such triphosphate nucleotides may be incorporated into a polynucleotide chain by using a suitable template-primer together with a DNA polymerase or reverse transcriptase and appropriate dNTPs and, when necessary, ddNTPs. NTPs can be used with suitable RNA polymerases. The compounds of the present invention may be incorporated into a PCR product using standard techniques or used in the production of cDNA using a suitable RNA template, primer, dNTP mix and reverse transcriptase. Oligonucleotide and polynucleotide chains may also be extended by nucleotide analogues of the present invention by the use of terminal transferase.
Alternatively, one of R2, R4, and R5 may be a phosphoramidite or H-phosphonate or methylphosphonate or phosphorothioate or amide, or an appropriate linkage to a solid surface e.g. hemisuccinate controlled pore glass, or other group for incorporation, generally by chemical means, in a polynucleotide chain. The use of phosphoramidites and related derivatives in synthesising oligonucleotides is well known and described in the literature.
In another preferred embodiment, the nucleoside analogue or nucleotide analogue which contains a base analogue as defined is labelled with at least one reporter group. Suitable reporter moieties may be selected from various types of reporter. The reporter group may be a radioisotope by means of which the nucleoside analogue is rendered easily detectable, for example 32P or 33P or 35S incorporated in a phosphate or thiophosphate or phosphoramidite or H-phosphonate group, or alternatively 3H or 14C or an iodine isotope. It may be an isotope detectable by mass spectrometry or NMR. It may be a signal group or moiety e.g. an enzyme, hapten, fluorophore, chromophore, chemiluminescent group, Raman label or electrochemical label. Particularly preferred reporters are fluorescent dyes such as fluorescein, rhodamine, bodipy and cyanines.
The reporter group may comprise a signal group or moiety and a linker group joining it to the remainder of the molecule. The linker group may be a chain of up to 30 carbon, nitrogen, oxygen and sulphur atoms, rigid or flexible, saturated or unsaturated. Such linkers are well known to those skilled in the art. The linker group may have a terminal or other group, e.g. NH2, OH, COOH, SH, maleimido or haloacetyl, by which a signal moiety may be attached before or after incorporation of the nucleoside analogue into a nucleic acid chain. It is also possible to link the molecules of the present invention to a solid surface through a suitable linker group as described above.
It is also possible that molecules of the present invention may act as reporters themselves. Antibodies may be raised to the whole molecule or part of the molecule e.g. ring structure or modified sugar. The antibodies can carry labels themselves or be detected by second antibodies by methods well known in the art. These methods often use enzyme detection or fluorescence.
The nucleoside analogues of this invention can be used in any of the existing applications which use native nucleic acid probes labelled with haptens, fluorophores or other reporter groups. These include Southern blots, dot blots, polyacrylamide or agarose gel based methods, solution hybridisation assays and other assays in microtitre plates or tubes, and assays of oligonucleotides or nucleic acids on arrays on solid supports. The probes may be detected with antibodies targeted either against haptens which are attached to the analogue or against the analogues themselves. The antibody can be labelled with an enzyme or fluorophore. Fluorescent detection may also be used if the base analogue is itself fluorescent or if there is a fluorescent group attached to the analogue.
The different mass of the nucleoside analogue may also be used for detection, as may the addition of a specific mass tag identifier to it. Methods for the analysis and detection of oligonucleotides, nucleic acid fragments and primer extension products have been reported (U.S. Pat. No. 5,288,644 and WO 94/16101). These methods are usually based on MALDI ToF mass spectrometry. They measure the total mass of an oligonucleotide and from this the sequence of the oligonucleotide may be ascertained. In some cases the mass of the oligonucleotide or fragment may not be unique for a specific sequence. This will occur when different sequences contain the natural bases A, C, G and T in the same ratio. For example, the simple 4-mer oligonucleotide ACGT will have the same mass as the 23 other possible 4-mers containing one each of A, C, G and T, e.g. CAGT, CATG, CGTA etc.
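The mass degeneracy described above can be illustrated with a short sketch. It is an illustration only and not part of the reported methods: the residue masses are approximate average values for unmodified DNA nucleotide residues, held as integer hundredths of a dalton so that the sums are exact, and end-group corrections are ignored. The names RESIDUE_MASS_CDA and oligo_mass are hypothetical.

```python
from itertools import permutations

# Approximate average masses of DNA nucleotide residues, in hundredths of a
# dalton (assumed illustrative values; end groups are ignored).
RESIDUE_MASS_CDA = {"A": 31321, "C": 28918, "G": 32921, "T": 30420}


def oligo_mass(sequence):
    """Sum the residue masses; the result depends only on base composition."""
    return sum(RESIDUE_MASS_CDA[base] for base in sequence)


# Every ordering of one A, one C, one G and one T.
four_mers = ["".join(p) for p in permutations("ACGT")]
masses = {oligo_mass(s) for s in four_mers}

print(len(four_mers))  # 24 orderings in total
print(len(masses))     # 1 distinct mass among them
```

Because the total mass is a sum over residues, any reordering of the same bases gives the same value, which is why mass alone cannot distinguish these sequences.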
With longer nucleic acid fragments, it may be difficult to resolve differences in mass between two fragments due to a lack of resolution in the mass spectrum at higher molecular weights. Incorporation of base modified analogues according to the present invention can be used to help identify the specific oligonucleotide or nucleic acid fragment, as their masses are different from those of the natural bases. For example, the two sequences ACGT and CAGT can be identified in the presence of one another by mass spectrometry if one of the natural nucleotides in one of the sequences is replaced with one of the analogues of the present invention. For example, in the oligonucleotide CAGT, the T can be replaced with an analogue of the present invention with little effect on a specific application e.g. hybridisation or enzymatic incorporation. Yet the two sequences can be readily identified by mass spectrometry because of the change in mass due to the introduction of the analogue.
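The ACGT and CAGT example can be sketched as follows. This is a minimal illustration under stated assumptions: the residue masses are approximate average values in integer hundredths of a dalton, the symbol "X" stands for a hypothetical analogue replacing T, and the mass offset assigned to it is invented purely for illustration and is not a property of any compound disclosed here.

```python
# Approximate average residue masses in hundredths of a dalton (assumed values).
RESIDUE_MASS_CDA = {"A": 31321, "C": 28918, "G": 32921, "T": 30420}

# Hypothetical analogue "X" replacing T. The +50.00 Da offset is invented for
# illustration only.
ANALOGUE_OFFSET_CDA = 5000
MASS_WITH_ANALOGUE = dict(
    RESIDUE_MASS_CDA, X=RESIDUE_MASS_CDA["T"] + ANALOGUE_OFFSET_CDA
)


def oligo_mass(sequence, table=MASS_WITH_ANALOGUE):
    """Sum the residue masses of a sequence that may contain the analogue X."""
    return sum(table[base] for base in sequence)


natural_a = oligo_mass("ACGT")
natural_b = oligo_mass("CAGT")
labelled_b = oligo_mass("CAGX")  # T replaced by the hypothetical analogue

print(natural_a == natural_b)  # True: the natural sequences are isobaric
print(labelled_b - natural_a)  # 5000: the analogue separates them
```

The separation seen in a real spectrum would depend on the actual mass of the analogue used and on the resolution of the instrument.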
The modification can be made to the bases and also to the sugars or internucleotide linkage. For example, thio sugars or phosphorothioate linkages will also result in distinctive mass changes. A large variety of changes to the base, sugar or linker can yield a number of molecules of different masses which will be useful to define a specific sequence accurately by its mass, especially in multiplex nucleic acid hybridisation or sequencing applications.
RNA is an extremely versatile biological molecule. Experimental studies by several laboratories have shown that in vitro selection techniques can be employed to isolate, from RNA libraries, short RNA molecules that bind with high affinity and specificity to proteins not normally associated with RNA binding, including a few antibodies (Gold, Allen, Binkley, et al, 1993, 497-510 in The RNA World, Cold Spring Harbor Press, Cold Spring Harbor N.Y.; Gold, Polisky, Uhlenbeck and Yarus, 1995, Annu. Rev. Biochem. 64:763-795; Tuerk and Gold, 1990, Science 249:505-510; Joyce, 1989, Gene 82:83-87; Szostak, 1992, Trends Biochem. Sci. 17:89-93; Tsai, Kenan and Keene, 1992, PNAS 89:8864-8868; Doudna, Cech and Sullenger, 1995, PNAS 92:2355-2359). Some of these RNA molecules have been proposed as drug candidates for the treatment of diseases like myasthenia gravis and several other auto-immune diseases.
The basic principle involves adding an RNA library to the protein or molecule of interest, washing to remove unbound RNA, and then specifically eluting the RNA bound to the protein or other molecule of interest. This eluted RNA is then reverse transcribed and amplified by PCR. The DNA is then transcribed using modified nucleotides (2' modifications to give nuclease resistance, e.g. 2' F, 2' NH2, 2' OCH3, and/or C5 modified pyrimidines and/or C8 modified purines). Those molecules that are found to bind the protein or other molecule of interest are cloned and sequenced to look for common ("consensus") sequences. This sequence is optimised to produce a short oligonucleotide which shows improved specific binding and which may then be used as a therapeutic.
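The bind, wash, elute and amplify cycle can be sketched as a toy simulation. It is offered only as an aid to reading the steps and makes strong simplifying assumptions: binding is modelled by a hypothetical score (the number of positions matching an invented motif, TARGET_MOTIF), elution keeps a fixed top fraction of the pool, and amplification copies each eluted sequence equally without error. None of these choices is taken from the cited literature.

```python
import random

ALPHABET = "ACGU"
TARGET_MOTIF = "GGAUCC"  # hypothetical motif standing in for a binding site


def binding_score(rna):
    """Toy affinity: number of positions at which rna matches TARGET_MOTIF."""
    return sum(1 for a, b in zip(rna, TARGET_MOTIF) if a == b)


def selection_round(pool, keep_fraction=0.25):
    """Keep the best binders (bind, wash, elute), then amplify them equally."""
    keep = max(1, int(len(pool) * keep_fraction))
    eluted = sorted(pool, key=binding_score, reverse=True)[:keep]
    copies = len(pool) // keep
    return [rna for rna in eluted for _ in range(copies)]


def mean_score(pool):
    """Average toy affinity across the pool."""
    return sum(binding_score(rna) for rna in pool) / len(pool)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
library = [
    "".join(rng.choice(ALPHABET) for _ in range(len(TARGET_MOTIF)))
    for _ in range(400)
]

pool = library
for _ in range(3):
    pool = selection_round(pool)

print(mean_score(library), mean_score(pool))
```

In this toy model the mean score of the pool cannot fall from one round to the next, because each round retains only the highest-scoring sequences before copying them.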
The base analogues described here, when converted to the ribonucleoside triphosphate or ribonucleoside phosphoramidite, will significantly increase the molecular diversity available for this selection process. This may lead to oligonucleotides with increased binding affinity to the target that is not available using the current building blocks.
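The effect of an additional building block on library diversity can be put in numbers with a short calculation. This is a sketch under an assumed scenario: a fully randomised region of 10 positions, and a single analogue added as a fifth letter alongside A, C, G and U. Both figures are chosen for illustration and are not taken from the text.

```python
def library_size(alphabet_size, length):
    """Number of distinct sequences of the given length over the alphabet."""
    return alphabet_size ** length


natural = library_size(4, 10)   # A, C, G and U only
extended = library_size(5, 10)  # assumed: one analogue as a fifth letter

print(natural)   # 1048576
print(extended)  # 9765625
```

Under these assumptions a single extra building block increases the number of possible 10-mers more than ninefold, and the ratio grows with the length of the randomised region.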
The analogues of the present invention may have properties which are different to those of the native bases and have other important applications. They may find use in the antisense field. They may also be useful in the therapeutic field as antiviral (anti-HIV and anti-HBV etc) (WO 98/49177) and anticancer agents. Many nucleoside and nucleotide analogues have been developed as antiviral agents. They often act by inhibition of DNA polymerase and/or reverse transcriptase activity by a number of means. A number of nucleoside analogues, such as AZT, ddC, ddI, D4T and 3TC, are being used alone or in combination with other nucleoside or non-nucleoside analogues as anti-HIV agents. The analogues of the present invention may also have antiviral activities alone or in combination with other compounds. Since combination drug therapy is being used more frequently to treat viral infections, having an increased number of compounds available by including compounds of the present invention could enhance the possibility of successful treatments.
Particularly preferred compounds of the invention are compounds of formula (II) 
wherein X is as hereinbefore defined.
The invention will now be further described with reference to the following non-limiting examples.