Nucleic acids have been found to be useful analytes in the field of health care for determining the presence or absence of genes or microorganisms in human body fluids, food or the environment. Nucleic acid analysis has found widespread use since the introduction of nucleic acid amplification techniques such as the Polymerase Chain Reaction (PCR, see U.S. Pat. No. 4,683,202). As a result of amplification, sufficient amounts of nucleic acids are available from each sample. The nucleic acids can be determined from this pretreated sample using a variety of different techniques, depending on the particular purpose. Most assays require the use of a probe which is either immobilized or immobilizable or is labelled by attachment of one or more reporter groups.
A reporter group is either directly determinable itself or can be reacted with reagents that make the probe determinable via said reporter group. Thus, for example, probes that are labelled by reporter groups can be determined, as can hybrids that contain the probe and a nucleic acid to be determined. In the case of immobilized probes, the hybrid between the probe and the nucleic acid to be determined is determined at the solid phase to which the probe is bound. In a particular form of assay, not only one nucleic acid having a specific sequence, but a large number of nucleic acids of different sequences are determined. For this purpose, the probes are immobilized in tiny spots in an array on a flat surface such as a glass chip (EP-A-0 476 014 and TIBTECH (1997), Vol. 15, 465-469, WO89/10977, WO89/11548, U.S. Pat. Nos. 5,202,231, 5,002,867, WO 93/17126). Further development has provided methods for making very large arrays of oligonucleotide probes in very small areas (U.S. Pat. No. 5,143,854, WO 90/15070, WO 92/10092). Microfabricated arrays of large numbers of oligonucleotide probes, called “DNA chips”, offer great promise for a wide variety of applications (see e.g. U.S. Pat. Nos. 6,156,501 and 6,022,963).
However, nucleic acid determinations often suffer from the problem that the base pairing possibilities between the natural bases A and T and C and G have different stability. This can be attributed to the different capability of these bases to form hydrogen bonding. Thus, the dA-dT-base pair has two hydrogen bridges, while the dG-dC-base pair has three hydrogen bridges. This results in different melting temperatures (Tm) of hybrids, depending on the GC content [1-3]. The higher the GC content, the higher the Tm. The hybridisation strength or the degree of hybridization may be investigated by the measurement of the Tm of the resulting duplex. This can be done by exposing a duplex in solution to gradually increasing temperature and monitoring the denaturation of the duplex, for example, by absorbance of ultraviolet light, which increases with the unstacking of base pairs that accompanies denaturation. The Tm is generally defined as the temperature midpoint of the transition from a fully duplex structure to complete denaturation, i.e. the formation of two isolated single strands.
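The midpoint definition of the Tm given above can be illustrated with a minimal Python sketch. It assumes a simple two-state transition with flat lower and upper baselines and a monotonically rising absorbance; the function name `melting_temperature` and the sample readings are hypothetical and serve only as an illustration. Practical melting-curve analysis typically applies baseline correction or derivative methods, which are not shown here.

```python
def melting_temperature(temps, absorbances):
    """Estimate Tm as the temperature at which the normalized
    absorbance crosses 0.5 (midpoint of the melting transition).

    temps        -- temperatures in degrees C, in increasing order
    absorbances  -- UV absorbance readings at those temperatures
    """
    a_min, a_max = min(absorbances), max(absorbances)
    if a_max == a_min:
        raise ValueError("flat curve: no melting transition observed")

    # Normalize so that 0 = fully duplex and 1 = fully denatured.
    frac = [(a - a_min) / (a_max - a_min) for a in absorbances]

    # Find the first interval that brackets the midpoint and
    # interpolate linearly between the two readings.
    for i in range(1, len(frac)):
        lo, hi = frac[i - 1], frac[i]
        if lo < 0.5 <= hi:
            t0, t1 = temps[i - 1], temps[i]
            return t0 + (0.5 - lo) * (t1 - t0) / (hi - lo)

    raise ValueError("normalized absorbance never crosses 0.5")


# Hypothetical readings for illustration only.
example_temps = [40.0, 50.0, 60.0, 70.0]
example_abs = [1.00, 1.05, 1.15, 1.20]
example_tm = melting_temperature(example_temps, example_abs)
```

With these illustrative readings the midpoint falls halfway between the 50 and 60 degree measurements, since the normalized absorbance rises from 0.25 to 0.75 across that interval.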
Therefore in routine nucleic acid analysis, there is often the wish to change the Tm of a nucleic acid molecule. For example, for certain purposes it may be advantageous to equalize or harmonize the Tm of nucleic acids of the same length or to make it even independent from the length of the nucleic acid or the binding region in order to be in the position to apply similar hybridization conditions for all assays. This is particularly necessary for assays using arrays, as on such arrays the hybridizing conditions for each probe must be identical. One solution was the use of low hybridization temperatures. Under such conditions, many nucleic acids having a low degree of base sequence complementarity will bind to the probe. This is called unspecific binding which does not allow discrimination between similar sequences. Another proposal was directed to the use of chemical reagents in the hybridization mixture, for example the addition of tetramethylammonium chloride (TMAC). This reagent reduces the difference between the stability of dG-dC and dA-dT base pairs but the effect is insufficient for short oligonucleotides. Further the addition of salts such as TMAC may not be appreciated as it complicates the optimization of the assay. Another proposal was directed to the use of different concentrations of each different (immobilized) probe in one assay. This was found to be technically complex if not impossible on a chip surface. As a further option the substitution of ribonucleotides in an oligonucleotide composed of deoxyribonucleotides, and vice versa was applied for the adaptation of DNA stability, Hoheisel (1996), Nucleic Acids Res. 24, 430-432.
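The difficulty of applying identical hybridization conditions to probes of equal length can be made concrete with a rough estimate. The sketch below uses the so-called Wallace rule, a widely used rule of thumb for short oligonucleotides (about 2° C. per A/T pair and 4° C. per G/C pair); this rule is not taken from the present text and is only a coarse approximation. The function name `wallace_tm` and the three 12-mer sequences are hypothetical.

```python
def wallace_tm(sequence):
    """Rough Tm estimate (degrees C) for a short DNA oligonucleotide
    using the Wallace rule: 2 per A/T base plus 4 per G/C base."""
    seq = sequence.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Three hypothetical probes of identical length but different GC content.
probes = ["ATATATATATAT", "ATGCATGCATGC", "GCGCGCGCGCGC"]
estimates = {p: wallace_tm(p) for p in probes}

# Spread of estimated Tm values across probes of the same length.
tm_spread = max(estimates.values()) - min(estimates.values())
```

Under this approximation the three 12-mers differ by more than 20° C. in estimated Tm, which illustrates why a single hybridization temperature cannot suit all probes on an array unless the Tm values are harmonized.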
However, it may also be advantageous to increase the Tm of a given nucleic acid. This is of interest for nucleic acids used in antisense therapy, in mismatch discrimination and in diagnostics. The nucleic acids may be used as primers or probes. The aim is to allow a simpler design of primers and probes used in multiplex reactions and to synthesize shorter capture probes used on chips, as the chemical synthesis of oligonucleotides on a chip surface used for arrays is not as effective as routine oligonucleotide synthesis. The shorter an oligonucleotide is, the higher the relative contribution of each base pair to the melting temperature of a hybrid. In consequence, the difference in stability between a mismatch and a perfect match is higher for shorter oligonucleotides. However, short oligonucleotides hybridize weakly and, therefore, the hybridization reaction has to be performed at low stringency. In consequence, the potentially higher ability of shorter oligonucleotides to discriminate between different sequences can only be used under conditions of low stringency. It would be of considerable advantage to provide bases which make it possible to achieve a high level of mismatch discrimination under stringent conditions, in particular for short oligonucleotides at temperatures used e.g. in amplification reactions. Further, there is the desire in the state of the art to use short oligonucleotides with high discriminatory power in arrays, as the chemical synthesis of oligonucleotides on solid supports used for arrays is not as effective as routine synthesis. Therefore, the ability to use shorter oligonucleotides under stringent conditions would be of considerable advantage. If bases are found that lead to an increase of the Tm of an oligonucleotide hybridized to its complementary strand, other bases may then be used in the same oligonucleotide to further adjust the Tm according to the preferences of the test system to be used.
Theoretically, oligonucleotide duplexes forming other tridentate base pairs should exhibit a similar or higher stability, e.g. those with 2-aminoadenine opposite to thymine. Nevertheless, it has been shown that 2-aminoadenine-thymine/uracil base pairs exhibit only a low thermal stability [4-10]. From the data published so far one can conclude that the additional NH2-group of 2′-deoxyadenosin-2-amine (molecule 1 (see below); n2Ad) contributes very little to the base pair stability of a DNA duplex. The Tm-increase is in the range of only 1-2° C. Furthermore, this stabilization does not correspond to the total number of n2Ad-residues incorporated in the duplex instead of dA [11]. A stronger stabilization than that reported for duplex DNA is found for duplex RNA or for DNA-RNA hybrids [9] [10] [12]. A rather high base pair stability is observed when 2-aminoadenine is introduced into PNA [13] or hexitol nucleic acids [14]. Modified backbones other than those of DNA or RNA appear to enhance the stability of the 2-aminoadenine-thymine/uracil pair.
The unusual behavior of oligonucleotide duplexes containing n2Ad-dT residues is interesting for the development of an adenine-thymine recognition motif showing the same or even a higher stability than a guanine-cytosine base pair. In the following compounds the purine moiety of compound 1 is replaced by an 8-aza-7-deazapurine (pyrazolo[3,4-d]pyrimidine) or a 7-deazapurine (pyrrolo[2,3-d]pyrimidine) leading to nucleosides (2a [15], 2b, 2c or 3 [16], [17] see below).

Compounds of similar chemical architecture were investigated in the prior art. The synthesis of 7-substituted-7-deaza and 8-aza-7-deazapurine 2′-deoxyribonucleotides, their incorporation into oligonucleotides, and the stability of the corresponding duplexes have been investigated (Seela et al. (1997) Nucleosides & Nucleotides 16, 963-966). This document does not contain a disclosure of 7-substituted 7-deaza-8-aza-diamino-purines. Stabilization of duplexes by pyrazolopyrimidine base analogues has been reported (Seela et al. (1988) Helv. Chim. Acta 71, 1191-1198; Seela et al. (1988) Helv. Chim. Acta 71, 1813-1823; and Seela et al. (1989) Nucleic Acids Res. 17, 901-910).
Pyrazolo[3,4-d]pyrimidine residues in oligonucleotides are also useful as sites of attachment of various groups (WO90/14353). Oligonucleotides having incorporated one or more pyrazolo[3,4-d]pyrimidine have an enhanced tendency to form triplexes (Belousov et al. (1998). Nucleic Acids Res. 26, 1324-1328).
The compounds 7-iodo, 7-cyano and 7-propynyl-7-deaza-2-amino-2′-deoxyadenosine were synthesized by Balow et al. (1997, Nucleosides & Nucleotides 16, 941-944) and incorporated into oligonucleotide sequences. These oligonucleotides exhibit enhanced binding affinities to RNA complements relative to unmodified sequences. However, no corresponding 8-aza-compounds were made and investigated. Seela et al. (1999, Nucleosides & Nucleotides 18, 1399-1400) disclose 7-substituted 8-aza-7-deazapurine DNA, its synthesis and duplex stability. The authors do not address possible uses of the disclosed compounds.
WO 90/03370 discloses 3,4-disubstituted and 3,4,6-trisubstituted pyrazolo-[3,4-d]-pyrimidines, more particularly 4,6-diamino-pyrazolo-[3,4-d]-pyrimidines with a linker at the C3-position to which an intercalator, an electrophilic cross linker or a reporter group is attached. These compounds may be attached to sugars or incorporated into oligonucleotides and thereby used for the identification, isolation, localization and/or detection of complementary nucleic acid sequences of interest. U.S. Pat. No. 5,594,121 discloses novel oligomers with enhanced abilities to form duplexes or triplexes. The oligomers may contain 7-substituted 8-aza-7-deaza-diamino-purines with propinyl and aryls as substituents at the 7-position. Compositions containing these oligomers may be used for diagnostic purposes.
There is still a need to provide probes with a high discriminatory power and with a short length, the Tm of which is high under stringent conditions and which can be used in various methods useful in the field of diagnostics as e.g. in the Lightcycler® system (Roche, Mannheim, Germany), TaqMan® (WO92/02638 and corresponding U.S. Pat. Nos. 5,210,015, 5,804,375, 5,487,972) or other applications involving fluorescence energy transfer.
Terms and Definitions
Conventional techniques of molecular biology and nucleic acid chemistry, which are within the skill of the art, are explained fully in the literature. See, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Oligonucleotide Synthesis (M. J. Gait, ed., 1984); Nucleic Acid Hybridization (B. D. Hames and S. J. Higgins, eds., 1984); and a series, Methods in Enzymology (Academic Press, Inc.), all of which are incorporated herein by reference. All patents, patent applications, and publications mentioned herein, both supra and infra, are incorporated herein by reference.
The terms “nucleic acid” and “oligonucleotide” refer to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), and to any other type of polynucleotide which is an N glycoside of a purine or pyrimidine base, or modified purine or pyrimidine base. There is no intended distinction in length between the terms “nucleic acid” and “oligonucleotide”, and these terms will be used interchangeably. These terms refer only to the primary structure of the molecule. Thus, these terms include double- and single-stranded DNA, as well as double- and single-stranded RNA. The term “polynucleotide” shall be used interchangeably with “nucleic acid”.
The term “backbone” or “nucleic acid backbone” for a nucleic acid binding compound according to the invention refers to the structure of the chemical moiety linking nucleobases in a nucleic acid binding compound. The bases are attached to the backbone and take part in base pairing to a complementary nucleic acid binding compound via hydrogen bonds. This may include structures formed from any and all means of chemically linking nucleotides, e.g. the naturally occurring phosphodiester ribose backbone or unnatural linkages such as phosphorothioates, methyl phosphonates, phosphoramidates and phosphotriesters. Peptide nucleic acids have unnatural linkages. Therefore, a “modified backbone” as used herein includes modifications to the chemical linkage between nucleotides as described above, as well as other modifications that may be used to enhance stability and affinity, such as modifications to the sugar structure. For example, an α-anomer of deoxyribose may be used, where the base is inverted with respect to the natural β-anomer. In an embodiment, the 2′-OH of the sugar group may be altered to 2′-O-alkyl or 2′-O-alkyl-n(O-alkyl), which provides resistance to degradation without compromising affinity. An unmodified nucleotide sequence having a phosphodiester backbone is “comparable” to a nucleobase-containing sequence having a modified backbone if the two sequences have identical base sequences. Thus, the backbones of such sequences are also comparable.
The term “nucleic acid binding compound” refers to substances which associate with nucleic acids of any sequence and are able to function as binding partner to a substantially complementary nucleic acid. The binding preferably occurs via hydrogen bonding between complementary base pairs when the nucleic acid binding compound is in a single-stranded form. Preferably, non-natural bases, the subject of the invention, attached to the backbone of the nucleic acid binding compound may also be involved in hydrogen bonding; however, these may also be able to form hydrogen bonds to only some or all naturally occurring bases, as e.g. inosine. The expert in the field recognizes that the most well-known “nucleic acid binding compounds” are nucleic acids such as DNA or RNA.
The term “probe” refers to synthetic or biologically produced nucleic acids (DNA or RNA) which, by design or selection, contain specific nucleotide sequences that allow them to hybridize under defined predetermined stringencies, specifically (i.e., preferentially) to target nucleic acids. A “probe” can be identified as a “capture probe” meaning that it “captures” the target nucleic acid so that it can be separated from undesirable materials which might obscure its detection. Once separation is accomplished, detection of the captured target nucleic acid can be achieved using a suitable procedure. “Capture probes” are often already attached to a solid phase.
The term “hybridization” refers to the formation of a duplex structure by two single-stranded nucleic acids due to complementary base pairing. Hybridization can occur between fully complementary nucleic acid strands or between “substantially complementary” nucleic acid strands that contain minor regions of mismatch. Conditions under which only fully complementary nucleic acid strands will hybridize are referred to as “stringent hybridization conditions” or “sequence-specific hybridization conditions”. Stable duplexes of substantially complementary sequences can be achieved under less stringent hybridization conditions; the degree of mismatch tolerated can be controlled by suitable adjustment of the hybridization conditions. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length and base pair concentration of the oligonucleotides, ionic strength, and incidence of mismatched base pairs, following the guidance provided by the art (see, e.g., Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; and Wetmur, 1991, Critical Review in Biochem. and Mol. Biol. 26(3/4):227-259; both incorporated herein by reference).
The term “primer” refers to an oligonucleotide capable of acting as a point of initiation of DNA synthesis under conditions in which synthesis of a primer extension product complementary to a nucleic acid strand is induced, i.e., in the presence of four different nucleoside triphosphates and an agent for extension (e.g., a DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. As used herein, the term “primer” is intended to encompass the oligonucleotides used in ligation-mediated amplification processes, in which one oligonucleotide is “extended” by ligation to a second oligonucleotide which hybridizes at an adjacent position. Thus, the term “primer extension”, as used herein, refers to both the polymerization of individual nucleoside triphosphates using the primer as a point of initiation of DNA synthesis and to the ligation of two primers to form an extended product. A primer is preferably a single-stranded DNA. The appropriate length of a primer depends on the intended use of the primer but typically ranges from 6 to 50 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template nucleic acid, but must be sufficiently complementary to hybridize with the template. The design of suitable primers for the amplification of a given target sequence is well known in the art and described in the literature cited herein. Primers can incorporate additional features which allow for the detection or immobilization of the primer but do not alter the basic property of the primer, that of acting as a point of initiation of DNA synthesis. For example, primers may contain an additional nucleic acid sequence at the 5′ end which does not hybridize to the target nucleic acid, but which facilitates cloning of the amplified product.
The region of the primer which is sufficiently complementary to the template to hybridize is referred to herein as the hybridizing region.
The terms “target”, “target sequence”, “target segment”, “target region”, and “target nucleic acid” refer to a region or subsequence of a nucleic acid which is to be amplified or investigated.
As used herein, a primer is “specific” for a target sequence if the number of mismatches present between the primer sequence and the target sequence is less than the number of mismatches present between the primer sequence and non-target sequences which may be present in the sample. Hybridization conditions can be chosen under which stable duplexes are formed only if the number of mismatches present is no more than the number of mismatches present between the primer sequence and the target sequence. Under such conditions, the primer can form a stable duplex only with a target sequence. Thus, the use of target-specific primers under suitably stringent amplification conditions enables the specific amplification of those target sequences which contain the target primer binding sites. The use of sequence-specific amplification conditions enables the specific amplification of those target sequences which contain the exactly complementary primer binding sites.
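The mismatch-counting criterion for specificity described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: each candidate binding site is assumed to have the same length as the primer and to be written in the same sense as the primer, so that a perfectly matched site reads identically to the primer. The function names `count_mismatches` and `is_specific` and all sequences are hypothetical.

```python
def count_mismatches(primer, site):
    """Count positions at which the primer and a candidate binding
    site (written in the same sense as the primer) differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def is_specific(primer, target_site, non_target_sites):
    """Return True if the primer has fewer mismatches to the target
    than to every non-target site that may be present in the sample."""
    target_mm = count_mismatches(primer, target_site)
    return all(
        target_mm < count_mismatches(primer, site)
        for site in non_target_sites
    )


# Hypothetical sequences for illustration only.
primer = "ACGTACGT"
target = "ACGTACGA"          # one mismatch to the primer
non_targets = ["ACGTTCGA"]   # two mismatches to the primer

specific = is_specific(primer, target, non_targets)
```

In this example the primer counts as specific for the target because it has one mismatch to the target and two to the only non-target site. The number of mismatches to the target also indicates the maximum mismatch tolerance that the hybridization conditions may allow if stable duplexes are to form only with the target.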
Halogen means a fluoro, chloro, bromo or iodo group. The most preferred halogen groups are —I and —Br.
Alkyl groups are preferably chosen from alkyl groups containing from 1 to 10 carbon atoms, either arranged in linear, branched or cyclic form. The actual length of the alkyl group will depend on the steric situation at the specific position where the alkyl group is located. If there are steric constraints, the alkyl group will generally be smaller, the methyl and ethyl group being most preferred. All alkyl, alkenyl and alkynyl groups can be either unsubstituted or substituted. Substitution by hetero atoms as outlined above, will help to increase solubility in aqueous solutions.
Alkenyl groups are preferably selected from alkenyl groups containing from 2 to 10 carbon atoms. For the selections similar considerations apply as for alkyl groups. They also can be linear, branched and cyclic. The most preferred alkenyl group is the ethylene group.
Alkynyl groups have preferably from 2 to 10 carbon atoms. Again, those carbon atoms can be arranged in linear, branched and cyclic manner. Further, there can be more than one triple bond in the alkynyl group. The most preferred alkynyl group is the 3-propargyl-group.
Alkoxy groups preferably contain from 1 to 6 carbon atoms and are attached to the rest of the moiety via the oxygen atom. For the alkyl group contained in the alkoxy groups, the same considerations apply as for alkyl groups. The most preferred alkoxy group is the methoxy group.
By “aryl” and “heteroaryl” (or “heteroaromatic”) is meant a carbocyclic or heterocyclic group comprising at least one ring having physical and chemical properties resembling compounds such as an aromatic group of from 5 to 6 ring atoms and comprising 4 to 20 carbon atoms, usually 4 to 9 or 4 to 12 carbon atoms, in which one to three ring atoms is N, S or O, provided that no adjacent ring atoms are O—O, S—S, O—S or S—O. Aryl and heteroaryl groups include phenyl, 2-, 4- and 5-pyrimidinyl, 2-, 4- and 5-thiazolyl, 2-s-triazinyl, 2-, 4-imidazolyl, 2-, 4- and 5-oxazolyl, 2-, 3- and 4-pyridyl, 2- and 3-thienyl, 2- and 3-furanyl, 2- and 3-pyrrolyl, optionally substituted preferably on a ring C by oxygen, alkyl of 1-4 carbon atoms or halogen. Heteroaryl groups also include optional substitution on a ring N by alkyl of 1-4 carbon atoms or haloalkyl of 1-4 carbon atoms and 1-4 halogen atoms. Exemplary substituents on the aryl or heteroaryl group include methyl, ethyl, trifluoromethyl and bromo. Such substituted aryl and heteroaryl groups include benzyl and the like. “Heteroaryl” also means systems having two or more rings, including bicyclic moieties such as benzimidazole, benzotriazole, benzoxazole, and indole. Aryl groups are the phenyl or naphthyl moiety, either unsubstituted or substituted by one or more of amino, -cyano, -aminoalkyl, —O—(C1-C10)-alkyl, —S—(C1-C10)-alkyl, —(C1-C10)-alkyl, sulfonyl, sulfenyl, sulfinyl, nitro and nitroso. The most preferred aryl group is the phenyl group. The preferred arylalkyl group is the benzyl group. The preferred alkylamino group is the ethylamino group. The preferred —COO(C1-C4) alkyl group contains one or two carbon atoms in the alkyl moiety (methyl or ethyl esters). Other aryl groups are heteroaryl groups as e.g. pyrimidine, purine, pyrrole, or pyrazole. According to the present invention the term aryl shall also include all heteroaryls.
Aryloxy groups preferably contain from 6 to 20 carbon atoms. Those carbon atoms may be contained in one or more aromatic rings and further in side chains (for example, alkyl chains) attached to the aromatic moiety. Preferred aryloxy groups are the phenoxy and the benzoxy group.
A “protecting group” is a chemical group that is attached to a functional moiety (for example to the oxygen in a hydroxyl group or the nitrogen in an amino group, replacing the hydrogen) to protect the functional group from reacting in an undesired way. A protecting group is further defined by the fact that it can be removed without destroying the biological activity of the molecule formed, here the binding of the nucleic acid binding compound to a nucleic acid. Suitable protecting groups are known to a man skilled in the art. Especially preferred protecting groups for example for hydroxyl groups at the 5′-end of a nucleotide or oligonucleotide are selected from the trityl groups, for example dimethoxytrityl. Preferred protecting groups at exocyclic amino groups in formula I are acyl groups, most preferred the benzoyl group (Bz), phenoxyacetyl or acetyl or formyl, and the amidine protecting groups as e.g. the N,N-dialkylformamidine group, preferentially the dimethyl-, diisobutyl-, diisobutyryl and the di-n-butylformamidine group. Preferred O-protecting groups are the aroyl groups, the diphenylcarbamoyl group, the acyl groups, and the silyl groups. Among these most preferred is the benzoyl group. Preferred silyl groups are the trialkylsilyl groups, like, trimethylsilyl, triethylsilyl and tertiary butyl-dimethyl-silyl. Another preferred silyl group is the trimethylsilyl-oxy-methyl group (TOM)(Swiss Patent Application 01931/97). Further, preferred protecting groups are groups as ortho nitro-benzyl protecting groups like 2-(4-nitrophenyl)ethoxycarbonyl (NPEOC) or photoactivable compounds as 2-nitrophenylpropyloxycarbonyl (NPPOC) (Giegrich et al., Nucleosides & Nucleotides 1998, 17, 1987). According to the invention, also the phthaloyl group may be used as protecting group.
Any atom in the definitions within the formulae presented herein is not limited to a specific isotope. Thus, a phosphorus atom (P) can either mean the regular 31P or the radioactive 32P or a mixture thereof. The same applies for hydrogen (H/D/T), carbon (C), the halogens (Cl, Br, I) and nitrogen (N).
During chemical synthesis, any reactive groups as e.g. —OH, —SH, —NH2, —NH-alkyl, —NH-alkenylene, —NH-alkynylene, or —NH-aryl (including those groups in reporter groups) should be protected by suitable protecting groups, i.e. the present invention contemplates compounds for the synthesis of oligonucleotides wherein the formulas or substituents are chosen with the proviso that one or two hydrogen atoms of any —OH, —SH, —NH2, —NH-alkyl, —NH-alkenylene, —NH-alkynylene, or —NH-aryl group are substituted by a protecting group. Further, during chemical synthesis, the compound will be attached for convenience to a solid phase. In these cases, the definitions of the substituents given above will be selected accordingly.
Reporter groups are generally groups that make the nucleic acid binding compound as well as any nucleic acids bound thereto distinguishable from the remainder of the liquid, i.e. the sample (nucleic acid binding compounds having attached a reporter group can also be termed labeled nucleic acid binding compounds, labeled probes or just probes). The term reporter group and the specific embodiments preferably include a linker which is used to connect the moiety intended to be used (the actual solid phase or the fluorophoric moiety) to the position of attachment as the reporter group. The linker will provide flexibility such that the nucleic acid binding compound can bind the nucleic acid sequence to be determined without major hindrance by the solid phase. Linkers, especially those that are not hydrophobic, for example based on consecutive ethylenoxy units, for example as disclosed in DE 3943522 are known to a man skilled in the art.
By “array” is meant an arrangement of addressable locations on a device. The locations can be arranged in two dimensional arrays, three dimensional arrays, or other matrix formats. The number of locations can range from several to at least hundreds of thousands. Most importantly, each location represents a totally independent reaction site. Each location carries a nucleic acid binding compound which can serve as a binding partner for a second nucleic acid binding compound, a nucleic acid, in particular a target nucleic acid.
The term “building block” or “subunit” refers to a compound which can be used in oligonucleotide synthesis wherein subsequently single building blocks are chemically linked to form a more complex structure, i.e. an oligonucleotide precursor. Examples for building blocks are phosphoramidites or phosphonates.
The term “substituted compound” shall mean that a compound carries further chemical groups, moieties or substituents other than the compound itself. These substituents shall in principle include but are not limited to halogens or alkyl, alkenyl, alkynyl, or aryl compounds optionally substituted with further heteroatoms.