The present invention relates to a method for identifying accessible sites in an RNA molecule for antisense agents and to methods for identifying antisense oligonucleotides.
Most library strategies used for identifying oligonucleotides (ONs) that bind an RNA have relied on cleavage of the target mRNA by RNAse H as the key component of their selection process. This, however, only selects for ONs which bind their targets with normal Watson-Crick interactions. ONs that bind their target RNA can mediate cleavage by RNAse H and the resultant fragments can be isolated by high resolution gel electrophoresis. The precise sequence of the successful ON is determined by sequencing the fragments. Mishra and Toulmé (1) have developed an alternative selection procedure based on selective amplification of ODNs (oligodeoxynucleotides) that bind their target and have demonstrated non-Watson-Crick interactions in that binding. Their protocol requires just as much sequencing as the RNAse H strategy but they sequence the ODNs and not the target RNA fragments. Furthermore, most library strategies use libraries that are free in solution, which suffer from cross-hybridisation of ONs within the library. This can be solved by using multiple libraries of minimally cross-hybridising subsets, but this adds to the labour involved in the selection of ODNs with active antisense properties.
The present invention provides a method for identifying an antisense oligonucleotide capable of binding to a target mRNA, which comprises contacting the target mRNA with each member of an oligonucleotide library separately under hybridisation conditions, removing unhybridised material and determining which member or members hybridise; wherein the oligonucleotide library comprises a plurality of distinct nucleotide sequences of a predetermined common length, and wherein each nucleotide sequence comprises a known sequence of 4 to 8 bases and all possible combinations of the known sequence are present in the library.
Each member of the oligonucleotide library may be contacted with the target mRNA in a separate container which may be immobilised in the container. Alternatively, each member of the oligonucleotide library may be immobilised at a separate location on a hybridisation array. Preferably the length of the known sequence is from 4 to 6 bases.
Each member of the oligonucleotide library may comprise a set of nucleotide sequences. Each nucleotide sequence may have a window region comprising the known sequence and a flanking region of no more than 8 bases, wherein all possible combinations of bases in the flanking region are present in each set and the common length of the nucleotide sequences is no more than 12 bases, preferably no more than 10 bases. The nucleotide sequences may be made from DNA analogues.
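The requirement that all possible combinations of the known sequence are present can be illustrated with a minimal sketch. The function name window_library and the use of the four natural DNA bases are assumptions made for illustration only; they are not part of the specification.

```python
from itertools import product

# Assumption: probes are written with the four natural DNA bases.
BASES = "ACGT"

def window_library(k):
    """Return every possible known-sequence window of length k (4**k members)."""
    return ["".join(p) for p in product(BASES, repeat=k)]

if __name__ == "__main__":
    for k in (4, 5, 6):
        # Library size grows as 4**k: 256, 1024 and 4096 members.
        print(k, len(window_library(k)))
```

The count grows fourfold with each added base, which is why the shorter windows keep the number of separate containers or array locations modest.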
In one arrangement, the step of determining which member or members hybridises comprises determining the member or members which hybridise more rapidly.
The sequence of each member which hybridises may be compared with sequence information relating to the target mRNA so as to identify in the target mRNA one or more antisense binding sites.
This invention addresses the problem of identifying accessible sites within an mRNA. The object of the technique is to find accessible sites using a system that is applicable to any RNA molecule, that requires no sequencing and that is not biased by the use of RNAse H. The emphasis of this patent is on the use of libraries of short oligonucleotide (ON) probes, preferably 4-mers. This invention provides a method for determining the accessible sites of an RNA to short ON probes by following the hybridisation reactions of each of the probes in the library with the target. Probes are spatially isolated in the reactions, and either the probe or the target RNA is immobilised.
The rationale behind using probes with windows as short as 4-mers is the assumption that most occurrences of a given 4-mer will be sequestered in secondary or tertiary structure. The data gained from following each individual hybridisation is interpreted in terms of the primary sequence: hybridisation of several 4-mers to an accessible region will be marked by clustering of overlapping 4-mers in the primary sequence of the RNA. Ambiguities should be resolvable in part by analysis of the binding kinetics, so even if a 4-mer occurs more than once, it should still be possible in most cases to identify the most accessible regions in the RNA. The importance of using short probes is the ability to calibrate and normalise the hybridisation behaviour of all members of the library, since the library is not so large that this is impractical. A 4-mer library has 256 members, which is a modest number, and it is realistic to test such a library against a number of model molecules to calibrate the system. Calibration is important as most 4-mers have different binding energies for a given degree of accessibility. Identifying the regions of a molecule that are accessible on the basis of hybridisation kinetics requires that one be able to compare normalised kinetic data that take into account that discrete overlapping 4-mers will bind the same accessible site with different kinetics. Small arrays of probes are cheaper to construct and manipulate when spatial isolation of probes is desired. Furthermore, short probes are much less likely to have any secondary or tertiary structure themselves, which would complicate the analysis of the hybridisation reactions.
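The interpretation step described above, in which hybridising 4-mers are mapped back onto the primary sequence and overlapping hits are grouped, might be sketched as follows. This is an illustrative sketch only: the function names, the example RNA sequence and the rule that a cluster needs at least two consecutive start positions are assumptions, and only Watson-Crick pairing of DNA probes is considered.

```python
# Assumption: DNA probes pairing with RNA by Watson-Crick rules only.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}

def target_site(probe):
    """RNA site (5'->3') that a DNA probe (5'->3') is complementary to."""
    return "".join(COMPLEMENT[b] for b in reversed(probe))

def site_positions(rna, probe):
    """All start positions in the RNA where the probe's target site occurs."""
    site = target_site(probe)
    k = len(site)
    return [i for i in range(len(rna) - k + 1) if rna[i:i + k] == site]

def accessible_clusters(rna, hits, min_probes=2):
    """Group hit positions into runs of consecutive starts (overlapping windows).

    Returns (start, end) pairs, end exclusive, for runs of at least min_probes.
    All probes in hits are assumed to share one window length.
    """
    k = len(hits[0])
    starts = sorted({i for p in hits for i in site_positions(rna, p)})
    clusters, run = [], []
    for s in starts + [None]:  # None flushes the final run
        if run and (s is None or s != run[-1] + 1):
            if len(run) >= min_probes:
                clusters.append((run[0], run[-1] + k))
            run = []
        if s is not None:
            run.append(s)
    return clusters

if __name__ == "__main__":
    rna = "GGCAUCGAUUACGG"                    # hypothetical target
    hits = ["CGAT", "TCGA", "ATCG", "CGTA"]   # hypothetical hybridising probes
    print(accessible_clusters(rna, hits))
```

In this hypothetical example three overlapping probes mark one candidate region, while the isolated fourth hit is not reported as a cluster. A real analysis would also weight each hit by its normalised kinetics, as the text requires.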
Hybridisation times can be varied to derive kinetic data about each hybridisation reaction. The amount of probe ON hybridised after a given time will give quantitative data about the binding affinity of that probe for its target. By varying the hybridisation duration, a time course for the hybridisation reaction of each probe can be derived. The accessibility of any region of the structure will be inversely related to the time its complementary probe takes to bind; hence detailed information about secondary structure can be acquired by this approach. In combination with a Scintillation Proximity Assay (Amersham), discussed below, the system is still more effective in that no washing is needed, so shorter probes can be used and the hybridisation reaction of each probe can be followed in real time.
Molecular Probes of RNA Structure
Spatial isolation and immobilisation of probes avoids the cross-hybridisation problems associated with the use of oligonucleotide libraries free in solution, simplifying interpretation of results. This process does not necessarily determine the most effective cut sites for RNAse H but provides detailed structural information about the target RNA to allow rational design of targeted molecules, which can then be developed into effective antisense agents that will not necessarily operate on the basis of RNAse H mediated RNA degradation.
Advantages Over Previous Library Strategies
This invention uses random libraries of ONs since, in a random library of ONs of a given length, every possible sequence of that length is represented, hence every sub-sequence of that length that comprises a given target RNA is also represented. This means that a random library will be applicable to any RNA and will contain all the target subsequences within it. Since ONs are monitored independently, this means that this approach entails the targeted libraries discussed in a previous patent PCT/GB96/02275, and gives detailed kinetic behaviour about them. Explicit data concerning accessibility of sub-regions of the RNA to normal Watson-Crick interactions can thus be derived by this system.
Non-Watson-Crick interactions might well be less common than the normal complementary interactions but will also be represented in a system like this. One would expect interactions with regions that are partially complementary to the target RNA. If Watson-Crick interactions are the only potential interactions that are possible, such partial hybridisation would be expected to be weak but if non-Watson-Crick interactions play a role, as Mishra and Toulmé suggest (1), then subsets of these interactions might be expected to have higher strength interactions with the target RNA. Hence this strategy will identify potentially specific interactions that are non-Watson-Crick.
There appears to be a difference in the results of experiments performed with random libraries and those with targeted libraries. Different cut sites were identified by each library in the experiments performed in our previous patent application PCT/GB96/02275. In those experiments the targeted libraries covered only three specific sub-regions of the target RNA, TNFa. However, the sites identified by this library were not identified by the random library. This is probably due to the fact that targeted libraries have fewer members, each of which is at a higher relative concentration than might be the case for members of a random library. The random library, therefore, picked up more accessible sites in regions not covered by the targeted libraries but the ONs corresponding to those used in the targeted library did not pick up the same cut sites. The cut sites picked up by the targeted libraries may have been due to their relatively higher concentration, so these may not have been the most accessible. The problem of cross-hybridisation is also difficult to assess in these systems and the contribution of this effect is likely to be different for each type of library. This novel approach should normalise all of these effects, giving much more meaningful data.
Kinetic data should also be acquired by this approach, as discussed below, which will give detailed structural information about target RNAs.
Kinetic Data from Hybridisation Experiments
Hybridisation times can be varied to derive kinetic data about each hybridisation reaction. The amount of probe ON hybridised after a given time will give quantitative data about the binding affinity of that probe for its target. By varying the hybridisation duration, a time course for the hybridisation reaction can be derived. The accessibility of any region of the structure will be inversely related to the time its complementary probe takes to bind to a given degree; hence detailed information about secondary structure should be acquired by this approach.
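The time a probe takes to bind to a given degree can be estimated from such a time course. The following is a minimal sketch under stated assumptions: the function name, the choice of linear interpolation, the use of the final reading as the plateau and the example signal values are all hypothetical and are not taken from the specification.

```python
def time_to_fraction(times, signal, fraction=0.5):
    """Interpolate the time at which the signal first reaches fraction * plateau.

    Assumption: the last measurement is taken as the plateau (maximum binding).
    Returns None if the target level is never reached.
    """
    target = fraction * signal[-1]
    if signal[0] >= target:
        return times[0]
    points = list(zip(times, signal))
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if s0 < target <= s1:
            # Linear interpolation between the two bracketing time points.
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    return None

if __name__ == "__main__":
    times = [0, 5, 10, 20, 40]          # hypothetical hybridisation durations (min)
    exposed = [0, 60, 90, 100, 100]     # hypothetical signal, accessible site
    buried = [0, 10, 25, 60, 100]       # hypothetical signal, structured site
    print(time_to_fraction(times, exposed), time_to_fraction(times, buried))
```

A shorter time for one probe than another would, after calibration of the individual probes, be read as greater accessibility of the corresponding site.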
This procedure would be impossible if libraries of large ONs were used due to the number of ONs that would be required. Calibration of each of the probes in a large library would be a massive task, although with automated image analysis of combinatorially synthesised chips, it might be feasible. However it is not necessary or desirable to use large oligonucleotides.
Real Time Kinetic Assays
Radiolabelling is a favourable labelling scheme with desirable properties, particularly with radio-isotopes that produce low energy radiation with short path lengths, such as 33P. Radiolabelling permits the development of proximity assays. The radiation emitted can be detected by various means, including scintillation or Geiger counters. Proximity detection systems and corresponding proximity assays measure the intensity of a signal from a label, which gives a measure of the distance of the label from the detector. Scintillation proximity assays, for example, are based on the detection of radiation emitted from a radio-isotope. The amount of radiation reaching a scintillant surface is detected by photo-amplification of the scintillation. The mean path length of certain forms of radiation is fairly short. Beta radiation from 33P has a relatively low energy and short path length. This means a probe labelled with 33P will be detected only when it is relatively close to the detection surface. The further the source is from the surface, the lower the intensity that is measured.
If an RNA molecule is labelled with 33P and hybridised to an oligonucleotide probe immobilised on a scintillant-containing surface, or vice versa, one would expect the scintillation count to increase as the amount of RNA hybridised to the probe increases. One would expect there to be a background count from molecules free in solution that are close enough to the scintillant surface to be detected, so control reactions with labelled RNA or probe, depending on which is to be labelled in the actual assays, must be performed. In this sort of system, spatial resolution of probes is essential, as only one radiolabel can be used at a time, so only one probe or RNA could be labelled at a time. The primary benefit of a system based on radiolabelling is the ability to measure hybridisation reactions in real time to give detailed kinetic data. Real time analysis can be achieved by detecting scintillation with a photoamplification and detection system coupled to appropriate signal processing electronics, using for example the Amersham Cytostar-T scintillating microplates.
Hybridisation Probes
There are two complementary difficulties with the use of short ON libraries. To ensure that the hybridisation of probe to an immobilised RNA is strong enough, the oligonucleotides must be long enough to give a reasonable degree of hybridisation, but the longer the probe, the more massive the task of resolving the behaviour of individual library members. Since RNA/DNA hybrids have higher binding energies than RNA/RNA or DNA/DNA duplexes, they are more stable than either of the homo-duplexes, so one can use shorter probes than one might use for homo-duplex interactions. One can also use non-natural analogues that have higher binding affinity for natural nucleic acids than probes composed of natural nucleic acids.
Non-Natural Nucleic Acid Analogues
Since this approach does not require recognition by RNAse H, one has more flexibility in the use of non-natural backbones. One can use non-natural base analogues to increase binding energies of any interactions. For the purposes of this system one might use a backbone that is less charged than the natural phosphodiester linkages, such as methylphosphonodiester linkages or peptide nucleic acids. Increased energies of probe binding interactions might be useful to allow the immobilised probe oligonucleotides to invade double stranded regions more easily, the kinetics of which should reveal useful information about the structure of the region being invaded.
Increasing ‘Bound’ Time of ON Probes
The problem of weak hybridisation that faces this embodiment is quite acute in the light of the fact that the shorter the ON, the more readily the molecule will dissociate from its target RNA. In carrying out structural probes using short ONs, a large quantity of the probe oligonucleotide would be added to each well of immobilised RNA. This would drive the hybridisation equilibrium in favour of hybridisation, thus significant hybridisation might occur. Once, however, the unbound oligonucleotide is washed away, the equilibrium shifts dramatically in favour of dissociation of hybridised probe. Various measures can be taken to increase the ‘on-time’ of bound oligonucleotides by increasing the binding energy of the interactions. Similar considerations about non-natural nucleic acid analogues as discussed in the first embodiment above apply here. One might conceivably also include cross-linking effectors at this stage that are photo-activatable to ensure that hybridised probe is ‘fixed’.
Using libraries of short ON probes is a problem in that to get decent hybridisation one needs a nucleic acid of reasonable length, but each additional base added to a probe increases the number of ONs in a library exponentially. One can overcome the problem of weak hybridisation of short oligonucleotides indirectly by constructing oligonucleotides in sets such that each set is composed of a fixed number of bases of known sequence flanked on one side or the other, or both, by a further fixed number of bases where all possible combinations of nucleotides are represented. The flanking positions could alternatively be occupied by a universal base:
5′-NNXXXXNN-3′
The above example has 2 bases, labelled N for any base, flanking the known sequence ‘window’, XXXX, on both sides. Thus there would be 256 different sets of 8-mers which could be used to probe the immobilised RNA. This is effectively the same as probing with 4-mers but the flanking bases would increase the stability of interactions. In conjunction with non-natural nucleic acid analogues, this alternative could increase the stability of complexes sufficiently to allow washing and quantitative measurements to be made.
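The composition of one such set can be enumerated directly. The sketch below is illustrative only: the function name flanked_set, its parameters and the example window sequence are assumptions, and degenerate positions are expanded into explicit natural bases, not modelled as a universal base.

```python
from itertools import product

def flanked_set(window, left=2, right=2, bases="ACGT"):
    """All sequences sharing one known window with every possible flank.

    With left=2 and right=2 this expands 5'-NN<window>NN-3'.
    """
    return [
        "".join(l) + window + "".join(r)
        for l in product(bases, repeat=left)
        for r in product(bases, repeat=right)
    ]

if __name__ == "__main__":
    members = flanked_set("ACGT")   # hypothetical window sequence
    # One set holds 4**4 = 256 distinct 8-mers that all share the window.
    print(len(members), members[0], members[-1])
```

With a 4-base window and two degenerate bases on each side, each of the 256 sets is itself a mixture of 256 8-mers, so the number of separate wells or array locations stays at 256.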
Using probes with windows as short as 4-mers might be problematic in that they may appear too frequently in an RNA and thus might make resolving kinetic data more difficult, as there are likely to be more than 2 occurrences of any given 4-mer present in a single RNA. Using 5-mers or 6-mers as windows, with stabilising flanking regions, might be easier to resolve, as these are more likely to appear only once in any given RNA, provided automated systems were available to cope with the added labour.
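The trade-off between window length and frequency of occurrence can be estimated roughly. The sketch below assumes a random RNA in which the four bases are equally frequent and independent, which real RNAs only approximate; the function name and the example length of 1000 nucleotides are hypothetical.

```python
def expected_occurrences(rna_length, k):
    """Expected count of one specific k-mer in a random RNA of the given length.

    Assumption: each base is equally likely and independent of its neighbours.
    """
    return (rna_length - k + 1) / 4 ** k

if __name__ == "__main__":
    for k in (4, 5, 6):
        # For a hypothetical 1000-nucleotide RNA.
        print(k, round(expected_occurrences(1000, k), 2))
```

Under these assumptions a given 4-mer is expected several times in an RNA of this size, whereas a given 6-mer is expected less than once, which is consistent with the reasoning above.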
Probe Arrays
One can test oligonucleotides (ONs) individually by spatially isolating probes in separate wells on a microwell plate. One can immobilise a target RNA and challenge the RNA with individual, fluorescently labelled ONs. Thus for a library of ONs of 4 nt, an array of 256 wells would be required, into which equal quantities of RNA would be immobilised. Each well would be challenged with a different labelled ON. The ON is allowed to hybridise for a predetermined length of time and the unhybridised ONs can then be washed off. The quantity of ON hybridised is determined by measuring the fluorescence in each well. This approach will require significant quantities of RNA, which can readily be generated using, for example, the T7 phage RNA polymerase system. An alternative is to immobilise the ON probes and challenge these with labelled RNA.
A random oligonucleotide library can be constructed while immobilised on a glass surface such that distinct regions of the array carry distinct oligonucleotides within the library (2). A random library of ONs of 8 nt will have 4^8 (65,536) possible members if all possible sequences are represented. Since the ONs are all immobilised on the array there will be no cross-hybridisation.
An RNA for which the tertiary structure is unknown can be cloned and produced in quantity in vitro using, for example, the T7 phage RNA polymerase system. The RNA can be labelled with a fluorescent label. This can then be used to challenge the immobilised library under conditions that favour the adoption of the normal tertiary structure of the RNA, i.e. in vivo cytoplasmic conditions. The RNA is allowed to hybridise onto the array for a predetermined duration and then unhybridised RNAs are washed off. The locations at which hybridisation has occurred can be visualised by detecting fluorescence. The regions of the array from which fluorescence is detected will reveal which ONs the RNA will hybridise to.
These approaches face the problem of weak hybridisation of short ON probe; the minimum size of probe ON will be determined by the strength of interaction necessary to immobilise a large molecule of RNA to the array and allow the RNA to resist being washed off. This would have to be tested empirically but the problem of rapid dissociation should be less severe, when the RNA is the mobile element, as the size of the molecule ought to make it somewhat more sluggish than the small ONs used in the first embodiment. The same approach of creating ONs with short windows of known sequence and flanking regions of random sequence can be readily applied to this embodiment and similar considerations regarding non-natural nucleic acid analogues apply too. Using nuclease resistant analogues would have the additional advantage of being more reusable than phosphodiester linkages, as one can never completely ensure that there is no contamination by nucleases, etc. which would damage a DNA array.
This embodiment is also quantitative, as long as each RNA molecule bears a single fluorescent tag. Thus the regions of the array which give the most fluorescence will be those to which the RNA hybridises most strongly. By repeatedly challenging the array with the target RNA for varying but predetermined durations, one would be able to get detailed kinetic information about the hybridisation reaction occurring at each point on the array. This will reveal regions whose structure must be disrupted to allow hybridisation, as these will have much slower kinetics.
An alternative to the use of fluorescent labels would be the use of mass labels cleavably linked to the RNA used to probe the array. Mass labels that are derivatives of photo-excitable compounds of the kind discussed in GB 9700746.2, such as nicotinic, sinapinic or cinnamic acid, can be photocleaved and excited into the gas phase by application of appropriate frequencies of light, ideally using a laser. Such labels could be incorporated into an RNA molecule using, for example, a terminal transferase reaction or an end ligation. If an RNA molecule labelled with laser excitable mass labels of this kind were hybridised to an array of oligonucleotide probes, the degree of hybridisation of the RNA to distinct regions of the array, corresponding to individual oligonucleotides, could be determined by MALDI (Matrix Assisted Laser Desorption Ionisation) mass spectrometry. Simpler, non-excitable mass labels could also be used. In this embodiment the RNA would be hybridised to an array and, following hybridisation, the supernatant would be removed and the array would be embedded in an excitable agent of the kinds described above, and ionisation of cleaved labels could be mediated indirectly by excitation of the matrix.
Rational Drug Design with the Process
Tertiary Structure Determination
Secondary structure modelling of nucleic acids is well developed and, in conjunction with data from a system like this, it should be possible to develop decent models of tertiary structure of a target RNA. This will start to make so-called ‘rational’ drug design a much more quantitative process. In conjunction with other methods such as NMR, complete structure determination will probably be possible. At the very least, accurate secondary structure prediction should be achieved. Most likely only minimal further information from other methods would be needed to determine a structure for any given RNA.
Databases and Drug Selection
This system will, with use, provide a comprehensive database of information about RNA tertiary structure which will lead to better theoretical models of RNA structure and allow a much more specifically targeted approach to designing effective antisense agents. The evidence of Wagner et al (3) suggests that short oligonucleotides can show specificity for target RNAs. The qualification that must be made to this suggestion is that short ONs would have to be highly structure specific. With a comprehensive database of RNA structures a system like this might actually pinpoint potential drugs directly once it is established what structural features are widespread and which are rare. Directly searching for rare features will thus be possible making choice of drug candidates much simpler.
More General Antisense Targets
This system is not specific to detecting RNAse H sensitive sites and could be used for targeting more general antisense agents at various RNAs in cells. Thus this system has potential for more general applications in developing therapeutic agents and research tools for targeting more general functional RNAs in vivo such as ribosomes, splicing apparatus and some of the active RNAs that have recently been implicated in sex determination.
Automation
Both embodiments can be implemented with automated liquid handling systems as the procedure is simple and repetitive. The hybridisation array approach would probably be cheaper in terms of reagents but may be difficult to implement if the binding strength of the interaction between array and target RNA is too small to permit washing of the array. The first embodiment will have a much larger requirement for reagents, although micro-arrays of wells can be constructed to reduce this requirement.
(1) R. K. Mishra, J. J. Toulmé, C. R. Acad. Sci. Paris, Life Sciences 317, 977-982, 1994.
(2) W. Bains, Chemistry in Britain, February 1995, 122-125.
(3) R. W. Wagner et al, Nature Biotechnology 14, 840-844, 1996.
(4) Siew Peng Ho et al, Nucleic Acids Research 24, 1901-1907.