1. Field of the Invention
The present invention is related to the sequencing of nucleic acids by hybridization.
2. Description of the Related Art
There are currently three formats for sequencing by hybridisation (SBH).
Format 1 SBH [1] attaches the nucleic acid to be analysed to a solid support and then sequentially hybridises labelled oligonucleotides. Format 2 SBH [2] attaches an array of positionally encoded oligonucleotides to a solid support and then hybridises the labelled nucleic acid to be analysed to the array. Format 3 SBH [3] attaches an array of positionally encoded oligonucleotides to a solid support and then hybridises the nucleic acid to be analysed to the array in the presence of labelled oligonucleotides in free solution. A ligation reaction is then used in order to join the two oligonucleotides, giving greater specificity and information.
Format 1 SBH has been shown to work with short oligonucleotides [4]. 8 mers and even shorter oligonucleotides have been successfully employed [5]. Format 2 SBH requires the use of much longer oligonucleotides for success. 11 mer probes, or longer, are generally required. 20 mers are the norm [6], making the use of generic arrays of all N mers out of the question with current technology (an array of all 20 mers with the smallest pixels currently imaginable would be prohibitively large).
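The scale argument above follows from simple counting: a generic array of all N mers over the four DNA bases needs 4^N elements. The short Python sketch below is illustrative only; the function name array_size is an assumption introduced here and is not part of the described method.

```python
def array_size(n: int) -> int:
    """Number of distinct DNA sequences of length n over A, C, G and T."""
    return 4 ** n


# Compare the short probes discussed in the text with 11 mers and 20 mers.
for n in (8, 9, 11, 20):
    print(f"all {n} mers: {array_size(n):,} array elements")
```

An array of all 8 mers needs 65,536 elements, whereas an array of all 20 mers would need more than 10^12.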
A difficulty with performing format 2 SBH arises because target nucleic acids often have secondary structure which sterically hinders some parts of the target from hybridising with oligonucleotides immobilised in an array. To overcome this problem it has been proposed to chop the target nucleic acid into shorter segments, e.g. of length comparable to the immobilised oligonucleotides. In practice such chopping has proved difficult to achieve in a reliable and uniform manner. The present invention can be seen as providing an indirect way of achieving the same effect. The invention permits the advantages of both format 1 and format 2 SBH to be combined in the same method. In particular, the use of a format 2 positionally encoded array of all N mers or a subset thereof is made possible with arrayed oligonucleotides of length less than 11 mers. This method allows the rapid and facile characterisation of sequence differences between two or more nucleic acid species. The method may be used in order to determine the existence or otherwise of point mutational differences between one or more test nucleic acids and a reference nucleic acid. The method may also be used in order to characterise sequence differences arising from either small deletions or insertions.
In one aspect the invention provides a method of analysing a target nucleic acid by the use of a mixture of labelled oligonucleotides in solution and an array of immobilised oligonucleotides, which method comprises the steps of:
a) incubating under hybridisation conditions the target nucleic acid with the mixture of labelled oligonucleotides,
b) recovering those labelled oligonucleotides that hybridised in a) with the target nucleic acid,
c) incubating under hybridisation conditions the recovered labelled oligonucleotides from b) with the array of immobilised oligonucleotides,
d) observing distribution of the labelled oligonucleotides on the array and using the information to analyse the target nucleic acid.
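Steps a) to d) can be modelled in silico under strong simplifying assumptions: only perfectly matched duplexes form, every N mer is present in the labelled mixture, and secondary structure is ignored. The Python sketch below is one such idealised model, not a description of the laboratory procedure; the function names and the example target are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def captured_oligos(target: str, n: int) -> set:
    """Steps a) and b): labelled N mers perfectly complementary to the target."""
    return {reverse_complement(target[i:i + n])
            for i in range(len(target) - n + 1)}


def lit_array_elements(oligos: set) -> set:
    """Steps c) and d): array elements complementary to the recovered oligos."""
    return {reverse_complement(oligo) for oligo in oligos}


target = "ACGTTGCA"  # hypothetical target sequence
spots = lit_array_elements(captured_oligos(target, 4))
print(sorted(spots))
```

In this idealised model the array elements that carry label are exactly those whose sequences occur as N mer substrings of the target, which is why the label distribution on the array can be used to analyse the target.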
In another aspect the invention provides a method of determining differences between a target nucleic acid and a reference nucleic acid, by the use of a first mixture of oligonucleotides in solution labelled with a first label, a corresponding mixture of oligonucleotides in solution labelled with a second label distinguishable from the first label, and an array of immobilised oligonucleotides, which method comprises the steps of:
a) incubating under hybridisation conditions the target nucleic acid with the first mixture of labelled oligonucleotides; and incubating under hybridisation conditions the reference nucleic acid with the second mixture of labelled oligonucleotides,
b) recovering a mixture of those first labelled oligonucleotides and those second labelled oligonucleotides that hybridised in a) with the target nucleic acid or the reference nucleic acid,
c) incubating under hybridisation conditions the recovered mixture of first labelled oligonucleotides and of second labelled oligonucleotides from b) with the array of immobilised oligonucleotides,
d) observing distribution of first labelled oligonucleotides and of second labelled oligonucleotides on the array and using the information to determine differences between the target nucleic acid and the reference nucleic acid.
Preparation of Single Stranded Nucleic Acid
The target nucleic acids may be DNA, RNA, PNA [7], other nucleic acid mimetics or mixtures thereof. They may be single stranded or double stranded; linear, circular, relaxed or supercoiled. They may be of eukaryotic, prokaryotic, viral or archaebacterial origin and may range in size from oligomers to whole genomes.
The target nucleic acids are rendered single stranded. The most preferable method is to amplify the region of interest by PCR [8] and then capture one of the amplified strands using a solid support. Many methods will be obvious to those skilled in the art. The use of a biotinylated PCR primer followed by capture with streptavidin coated magnetic beads [9] is a preferred embodiment.
The PCR may be carried out using either conventional dNTPs or dNTP analogues that impart altered properties to the PCR product, such as reduced intramolecular secondary structure and thus improved access of short oligonucleotides to the PCR product in single stranded form. Example nucleotide analogues include dITP, 7-deaza-dGTP, 7-deaza-dATP, 7-deaza-dITP, 5-hydroxymethyl-dUTP and 4-methyl-dCTP, either singly or in combination. Many other analogues will be obvious to those skilled in the art. Some of these analogues may require the use of lower PCR annealing temperatures and/or longer PCR extension times for optimal incorporation.
The method of the invention involves use of a mixture of labelled oligonucleotides in solution. This is preferably a mixture of all or a subset of N mers where N is from 5 to 10, preferably 8 or 9. The labelling moieties may be detected by means of fluorescence (emission, lifetime or polarisation), absorption, colour, chemiluminescence, enzymatic activity, radioactive emission, mass spectroscopy or refractive index effects (e.g. surface plasmon resonance).
The N mers in solution may be DNA, RNA, PNA, other nucleic acid mimetics or mixtures thereof. They may be single stranded or partially double stranded. The N mers may also contain bases such as 5-nitroindole, 3-nitropyrrole or inosine that pair with all four usual DNA bases, improving the hybridisation properties of the N mers without increasing the nucleic acid sequence complexity. The N mers may likewise contain bases such as 2-aminopurine and 5-methylcytosine that again improve the hybridisation properties without increasing the nucleic acid sequence complexity.
Structures that can only (or preferentially) form A helices are of particular interest as conditions may be found (e.g. R-loop conditions) where the N mer/PCR product complexes are more stable than the internal secondary structure within the PCR product.
The N mers could also be molecular beacon [10] type 'panhandle' structures with stems comprising 5-nitroindole, 3-nitropyrrole, inosine, isodC:isodG [11], dk:dX [12] or dk:dp [13] hairpins. Other such structures will be obvious to those skilled in the art.
The method of the invention also involves use of an array of immobilised oligonucleotides. Each oligonucleotide is immobilised at a spaced location on a surface of a support. The array is preferably of all possible N mer sequences or a subset thereof where N is preferably from 5 to 10, particularly 8 or 9.
The array elements may be DNA, RNA, PNA, other nucleic acid mimetics or mixtures thereof. They may be single stranded or partially double stranded. The array elements may also contain bases such as 5-nitroindole, 3-nitropyrrole or inosine that pair with all four DNA bases, improving the hybridisation properties of the array without increasing its nucleic acid sequence complexity. The array elements may likewise contain bases such as 2-aminopurine and 5-methylcytosine that again improve the hybridisation properties of the array without increasing its nucleic acid sequence complexity.
Arrays may be employed on glass, plastic, silicon, supported membrane and supported gel substrates. A given substrate may have one or more test site arrays for use with the invention.
In step a) of the method, the target nucleic acid is incubated under hybridisation conditions with the mixture of labelled oligonucleotides. In step b), those labelled oligonucleotides that hybridised in a) with the target nucleic acid are recovered. Where the target nucleic acid has been immobilised on magnetic beads as discussed above, the captured oligonucleotides may readily be recovered by denaturation and removal of the magnetic beads.
In a preferred aspect, the method of the invention may be performed to determine differences between a target nucleic acid and a reference nucleic acid. In this case, the reference nucleic acid is incubated under hybridisation conditions with a second mixture of labelled oligonucleotides, and those members of that mixture that hybridised with the reference nucleic acid are recovered. The first mixture of labelled oligonucleotides in solution is distinguishable from the second mixture of labelled oligonucleotides in solution. For example, the labels used may be fluorescent dyes having different fluorescence characteristics. The labels are herein called label 1 and label 2. Preferably the two sets of captured oligonucleotides are mixed.
In step c) the recovered mixture of labelled oligonucleotides is incubated under hybridisation conditions with the array of immobilised oligonucleotides.
Upon hybridisation to the array, oligonucleotides of N bases that were captured by the test and reference nucleic acids and are complementary to array sequences will display the normal ratio of label 1 to label 2 upon detection where the test and reference nucleic acids have the same sequence, i.e. in the majority of cases.
Upon hybridisation to the array, oligonucleotides of N bases that were captured by the test and reference nucleic acids and are complementary to array sequences will display an altered ratio of label 1 to label 2 upon detection where the test and reference nucleic acids have different sequences, i.e. in the vicinity of a mutation.
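One way to pick out array elements with an altered ratio is to compare each element's log ratio of label 1 to label 2 against the median over the whole array, taking the median as the normal ratio. The Python sketch below assumes this median-based normalisation; the threshold, the pseudocount, the function name and the example intensities are all hypothetical choices made for illustration and are not specified in the text.

```python
import math
from statistics import median


def flag_altered(intensities, threshold=1.0, pseudocount=1.0):
    """Return {sequence: log2 ratio relative to the array median} for outliers.

    intensities maps an array element sequence to a pair
    (label 1 signal, label 2 signal). The pseudocount avoids division by zero.
    """
    log_ratios = {
        seq: math.log2((l1 + pseudocount) / (l2 + pseudocount))
        for seq, (l1, l2) in intensities.items()
    }
    normal = median(log_ratios.values())  # assumed to represent the normal ratio
    return {seq: r - normal
            for seq, r in log_ratios.items()
            if abs(r - normal) > threshold}


# Hypothetical readings: three unchanged elements and two near a mutation.
readings = {
    "ACGT": (100, 100),
    "CGTT": (98, 102),
    "GTTG": (105, 99),
    "TTGC": (3, 97),    # label 2 (reference) dominates
    "TTAC": (101, 2),   # label 1 (test) dominates
}
flagged = flag_altered(readings)
print(flagged)
```

The sign of each flagged value indicates which label dominates at that element. Because only the ratio is examined, differences in absolute hybridisation intensity between array elements cancel out.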
Difference Characterisation
By observing the sequences of array elements where the label 1 to label 2 ratio is different from the majority of hybridisation events and by observing which of the two labelled moieties dominates at each such complementary array element (of known sequence), one may deduce the sequence at and around any difference between the two nucleic acid species. In the simple case of a point mutational difference between the test and reference nucleic acid with an array of all possible N-mers, a region of 2N-1 bases will be characterised (the reference/mutated base and the N-1 bases to either side of this).
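The 2N-1 figure can be checked with a small example: a single base change alters exactly the N overlapping N mers that span it, and chaining those N mers by their N-1 base overlaps recovers the 2N-1 base window. The Python sketch below assumes perfect-match hybridisation, a single unambiguous overlap path, and ignores strand complementarity; the sequences and function names are hypothetical.

```python
def kmers(seq, n):
    """All overlapping substrings of length n, in order."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def assemble(unique_kmers, n):
    """Chain N mers overlapping by n-1 bases (assumes one unambiguous path)."""
    remaining = set(unique_kmers)
    suffixes = {k[1:] for k in remaining}
    # Start from the N mer that no other N mer in the set can precede.
    contig = next(k for k in sorted(remaining) if k[:-1] not in suffixes)
    remaining.remove(contig)
    while remaining:
        nxt = next((k for k in remaining if k[:-1] == contig[-(n - 1):]), None)
        if nxt is None:
            break
        contig += nxt[-1]
        remaining.remove(nxt)
    return contig


N = 4
reference = "GATTACAGGCTTAACG"  # hypothetical reference
test = "GATTACAGTCTTAACG"       # same sequence with one G to T change

test_only = set(kmers(test, N)) - set(kmers(reference, N))      # label 1 dominates
ref_only = set(kmers(reference, N)) - set(kmers(test, N))       # label 2 dominates

print(assemble(test_only, N))  # window around the mutated base
print(assemble(ref_only, N))   # corresponding reference window
```

Here each set contains N = 4 elements, and each assembles into a window of 2N-1 = 7 bases that differs only at the central position.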
Advantages of the Current Invention
A particular problem that is overcome by this approach arises where part of the amplified single stranded region of interest has significant internal secondary structure. This situation will deny access to short oligonucleotides in solution (or as part of a positionally encoded array on a solid support). It is essentially for this reason that success has not been achieved for format 2 SBH with arrayed oligonucleotides shorter than 11 mers (arrays of 20 mers are generally used). In this invention, nucleotide analogues may be used, whether in the PCR reaction, in the solution oligonucleotides or in the arrayed oligonucleotides, in order to circumvent problems with PCR product secondary structure.
This method has the advantage that, by detecting perturbations in the ratio between the labelling moieties upon detection, all hybridisation events are internally controlled for their absolute hybridisation intensities, a significant improvement over other SBH methods. Not only is information given that a difference exists between the two nucleic acid species but also the exact nature of the difference and the local sequence around this difference can be determined.
If four colour detection is implemented, the mutational event could be sequenced on both strands simultaneously, greatly improving the accuracy of an already very information-rich method.
The method does not use enzymes for the recognition of sequence differences. The method thus provides a more robust and reliable way to characterise nucleic acid sequence differences.
In addition to the above, a single array of, for example, all possible N-mers or a subset thereof, can be employed for the analysis of any nucleic acid system. Unlike other methods for sequence characterisation with arrays [6], a distinct sequence array does not need to be fabricated anew for every nucleic acid system that is to be characterised.
Unlike methods such as SSCP [14], where the optimal size for a PCR product is around 200 bp, this method allows the user to 'walk' along a genomic region of interest in much larger steps; fragments of 1-10 kb would probably be about optimum for this method.
This method allows for highly parallel analysis where the shorter labelled oligonucleotides allow better mismatch discrimination. Repeated cycles of N mer capture and denaturation can be used to improve the final detection signal. Optimal chemical intermediates can selectively overcome secondary structure. Incomplete arrays of (optimised) longer probes could be used with appropriate sequence reconstruction algorithms. Solution hybridisation to long probes and oligonucleotide hybridisation to the array should also be faster.
The present invention will now be illustrated in more detail in the Example below. However, it is important to note that the following Example represents only a specific embodiment of the present invention. Other embodiments are also possible and encompassed by the claims. Thus, the following Example should not be construed to limit the spirit and scope of the claims.