Detection of variation in DNA sequences forms the basis of many applications in modern genetic analysis: it is used in linkage analysis to track disease genes in human pedigrees or economically important traits in animal and plant breeding programmes; it forms the basis of fingerprinting methods used in forensic and paternity testing [Krawczak and Schmidtke, 1994]; and it is used to discover mutations in biologically and clinically important genes [Cooper and Krawczak, 1989]. The importance of DNA polymorphism is underlined by the large number of methods that have been developed to detect and measure it [Cotton, 1993]. Most of these methods depend on one of two analytical procedures, gel electrophoresis or molecular reassociation, to detect sequence variation. Each of these powerful procedures has its drawbacks. Gel electrophoresis has very high resolving power, and is especially useful for the detection of variation in the mini- and microsatellite markers that are used in linkage analysis and fingerprinting; it is also the method used to analyse variation in the triplet repeats that underlie the mutations now known to cause around ten genetic disorders in humans [Willems, 1994]. Despite its great success and widespread use, gel electrophoresis has proved difficult to automate: even the systems which automate data collection require manual gel preparation, and because samples are loaded by hand, it is easy to confuse them. The continuous-reading electrophoresis machines are expensive, and manual analysis is technically demanding, so that the use of the method is confined to specialised laboratories which have a high throughput. Furthermore, difficulties in measuring fragment size preclude rigorous statistical analysis of the results.
By contrast, oligonucleotide hybridisation lends itself to automation and to quantitative analysis [Southern et al., 1992], but it is not well suited to the analysis of variation in the number of repeats in the micro- and minisatellites, as the small fractional change in the number of repeats produces a barely detectable change in signal strength. Nor would it be possible to distinguish two alleles in the same sample, as each would contribute to a single intensity measurement; thus, many different combinations of alleles would produce the same signal. Present hybridisation methods are much better suited to analysing variation in the DNA due to point mutation (base substitutions, deletions and insertions), for which it is possible to design allele-specific oligonucleotides (ASOs) that recognise both the wild type and the mutant sequences [Conner et al., 1983]. Thus it is possible in principle, in a relatively simple test, to detect all possible genotypes. However, a problem that arises in practice in the use of oligonucleotide hybridisation is that in some cases the extent of reassociation is barely affected by a mismatched base pair.
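The ambiguity of a single intensity measurement for repeat-length alleles can be illustrated with a toy model. The sketch below is not part of the described methods: it assumes, purely for illustration, that the hybridisation signal of a diploid sample is the sum of the repeat counts of its two alleles, and the function name `genotypes_by_signal` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def genotypes_by_signal(repeat_counts):
    """Group diploid genotypes by a toy hybridisation signal.

    Assumption (illustrative only): the signal is the sum of the repeat
    counts of the two alleles, so both alleles feed one intensity reading.
    """
    groups = defaultdict(list)
    for a, b in combinations_with_replacement(sorted(repeat_counts), 2):
        groups[a + b].append((a, b))
    return dict(groups)


# Alleles with 10 to 14 repeats at a hypothetical locus.
groups = genotypes_by_signal(range(10, 15))

# Signals that more than one genotype can produce.
ambiguous = {s: g for s, g in groups.items() if len(g) > 1}
for signal, genotypes in sorted(ambiguous.items()):
    print(signal, genotypes)
```

Under this assumption, a genotype such as (10, 14) cannot be told apart from (11, 13) or (12, 12), which is the difficulty noted above for hybridisation alone.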
The invention describes a general approach which can be applied to all forms of variation commonly used as DNA markers for genetic analysis. It combines sequence-specific hybridisation to oligonucleotides, which in the preferred embodiment are tethered to a solid support, with enzymatic reactions which enhance the discrimination between matching and non-matching duplexes, and at the same time provide a way of attaching a label to indicate when or which reaction has taken place. Two enzymatic reactions, chain extension by DNA dependent DNA polymerases and DNA strand-joining by DNA ligases, are dependent on perfect matching of sequences at or around the point of extension or joining. As we shall show, there are several ways in which these enzymes can be used with sequence-specific oligonucleotides to detect variation in target sequences.
In all cases, the sequence to be analysed, the target sequence, will be available as a nucleic acid molecule, and may be a DNA molecule produced, for example, by the polymerase chain reaction. However, the methods are not confined to analysis of DNA produced in this way. In all applications, the target sequence is first captured by hybridisation to oligonucleotides which are preferably tethered to a solid support; for example, the oligonucleotides may be synthesised in situ as described [Maskos and Southern, 1992], or they may be presynthesised and then coupled to the surface [Khrapko et al., 1991].
In one aspect of the invention the novelty arises from the exploitation of enzymes in combination with substrates or primers tethered to solid supports. A further novelty exploits the observation that DNA ligases and polymerases can be used to distinguish sequence variants which differ in the number of units of a tandemly repeating sequence. This observation is surprising, as tandemly repeated sequences can form duplexes in any register; thus, in principle, length variants can form duplexes which match at the ends even when the two strands contain different numbers of repeat units. Although we demonstrate the application of this method in conjunction with tethered oligonucleotides, it should be evident that this reaction could be used to analyse VNTR (variable number tandem repeat) sequences in the liquid phase followed by some other method of analysis, such as gel electrophoresis.
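The point about register can be made concrete with a short sketch. It is an illustration under simplifying assumptions, not a model of the chemistry: both strands are written in the target's own coordinates (complementarity and strand orientation are ignored), and the function name `matching_registers` and the repeat unit "CAG" are chosen only for the example.

```python
def matching_registers(unit, probe_repeats, target_repeats):
    """Return the offsets at which a run of probe_repeats units lines up
    perfectly inside a run of target_repeats units.

    Simplification: sequences are compared as plain strings in the
    target's coordinates; base pairing itself is not modelled.
    """
    probe = unit * probe_repeats
    target = unit * target_repeats
    last_start = len(target) - len(probe)
    return [i for i in range(last_start + 1) if target.startswith(probe, i)]


# A 3-repeat run can sit in three different registers on a 5-repeat run.
registers = matching_registers("CAG", 3, 5)
print(registers)
```

Each returned offset is a register in which the shorter repeat run is fully matched, which is why end-matched duplexes between length variants are possible in principle.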
In one aspect the invention provides a method of analysis which comprises: providing a polynucleotide target including a nucleotide at a specified position, and an oligonucleotide probe, tethered to a support, said probe being complementary to the target and terminating at or close to the said specified position; and performing the steps:
a) incubating the target with the probe to form a duplex,
b) incubating the duplex under ligation conditions with a labelled oligonucleotide complementary to the target,
c) and monitoring ligation in b) as an indication of a point mutation at the specified position in the target.
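Steps a) to c) can be sketched as a toy simulation. The rule used here is an assumption made for illustration: ligation is taken to occur only when the bases on both sides of the junction are correctly paired with the target. Sequences are written aligned to the target position by position, strand orientation is ignored, and all names (`aligned_complement`, `ligation_expected`) and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def aligned_complement(seq):
    """Complement of seq, written position by position (not reversed)."""
    return "".join(COMPLEMENT[base] for base in seq)


def ligation_expected(target, probe, labelled, probe_start=0):
    """Toy rule: ligation needs correct pairing on both sides of the junction.

    Only the probe's terminal base and the first base of the labelled
    oligonucleotide are checked; the rest of the duplex is not modelled.
    """
    junction = probe_start + len(probe)
    if junction + len(labelled) > len(target):
        return False
    probe_end_ok = COMPLEMENT[target[junction - 1]] == probe[-1]
    label_start_ok = COMPLEMENT[target[junction]] == labelled[0]
    return probe_end_ok and label_start_ok


# Hypothetical wild-type and mutant targets differing at index 4.
target_wt = "ACGTTGCAAG"
target_mut = "ACGTAGCAAG"

# Tethered probes end at the variable position; one per expected variant.
probe_wt = aligned_complement(target_wt[:5])
probe_mut = aligned_complement(target_mut[:5])

# Labelled oligonucleotide pairs with the target next to the probe.
labelled = aligned_complement(target_wt[5:])

for name, target in (("wild type", target_wt), ("mutant", target_mut)):
    print(name,
          ligation_expected(target, probe_wt, labelled),
          ligation_expected(target, probe_mut, labelled))
```

In this sketch the label is retained only where the probe matches the target at the specified position, so the pattern of labelled probes indicates which variant is present.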
In another aspect the invention provides a method of analysis which comprises: providing a polynucleotide target having a variable number tandem repeat section and a flanking section, and an oligonucleotide probe having a section complementary to the repeat section and a flanking section of the target; and performing the steps:
a) incubating the target with the probe to form a duplex,
b) incubating the duplex with a labelled oligonucleotide and/or at least one labelled nucleotide under chain extension conditions,
c) and monitoring chain extension as an indication of the length of the variable number repeat section of the target.
A polynucleotide target is provided, in solution when the probe is tethered to a support, and may be DNA or RNA. This polynucleotide target is caused to hybridise with an oligonucleotide probe. The term oligonucleotide is used here as common terminology for the primers and substrates commonly utilised by polymerase and ligase enzymes. However, the term is used in a broad sense to cover any substance that serves as a substrate for the enzymes, including single stranded chains of short or moderate length composed of the residues of nucleotides or of nucleotide analogues, and also longer chains that would ordinarily be referred to as polynucleotides.
The probe may be tethered to a support, preferably by a covalent linkage and preferably through a 5′ or 3′ terminal nucleotide residue. An array of oligonucleotide probes may be tethered at spaced locations, for example on a derivatised glass surface or the surface of a silicon microchip, or alternatively on individual beads.
In another aspect the invention provides an array of oligonucleotides, for analysing a polynucleotide target containing a variable sequence, in which each component oligonucleotide i) comprises a sequence complementary to the target including an expected variant of the target, and ii) is tethered to a solid support in a chemical orientation which a) permits duplex formation with the target, and b) permits chain extension only when the sequence of the oligonucleotide matches the variable sequence of the target.
In another aspect the invention provides a set or array of oligonucleotides, for analysing a polynucleotide target containing a variable number tandem repeat sequence, in which each component oligonucleotide i) comprises a sequence complementary to a part of the target immediately adjacent the repeat sequence, ii) comprises a sequence complementary to the repeat sequence of the target and containing a number of repeats expected in the target, and iii) is configured in a way that a) permits duplex formation with the target, and b) permits chain extension only when the number of repeats in the oligonucleotide equals or is less than the number of repeats in the target.
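The read-out of such a set or array can be sketched as follows. The sketch takes the stated condition at face value (chain extension only when the probe has no more repeats than the target) and considers a single target allele; the function names and the range of probe repeat counts are assumptions made for the example.

```python
def labelled_spots(probe_repeats, target_repeats):
    """Probes expected to be extended (and so labelled) for one target.

    Rule taken from the text: extension occurs only when the probe's
    repeat count is equal to or less than the target's repeat count.
    """
    return [n for n in probe_repeats if n <= target_repeats]


def infer_target_repeats(probe_repeats, labelled):
    """Infer the target's repeat count from the labelled probes.

    Returns (count, is_lower_bound). If the probe with the most repeats
    is labelled, the target may be longer than the array can resolve,
    so the count is flagged as a lower bound only.
    """
    if not labelled:
        return None, False
    largest = max(labelled)
    return largest, largest == max(probe_repeats)


# Hypothetical array with probes carrying 8 to 12 repeats.
probes = range(8, 13)
spots = labelled_spots(probes, 10)
print(spots, infer_target_repeats(probes, spots))
```

Under this rule the probe with the largest repeat count that still carries label indicates the length of the target's repeat section, provided the target falls within the range covered by the array.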
In another aspect the invention provides an array of oligonucleotides in which different oligonucleotides occupy different locations and each oligonucleotide has a 3′ nucleotide residue through which it is covalently tethered to a support and a 5′ nucleotide residue which is phosphorylated.
The invention also provides a method of making an array of different oligonucleotides tethered to different locations of a support, which method comprises the steps of: providing a first intermediate oligonucleotide tethered to the support and a second intermediate oligonucleotide in solution, and a third oligonucleotide that is complementary to both the first and second intermediate oligonucleotides, forming a duplex of the third oligonucleotide with the first and second intermediate oligonucleotides, and ligating the first intermediate oligonucleotide with the second intermediate oligonucleotide; and repeating the steps with oligonucleotides tethered to different locations of the support.