The invention relates to methods and compositions for measuring the resolution of molecular separation systems, such as polyacrylamide gels.
Systems for separating and resolving nucleic acids employ many different technologies and are used for many different purposes. Separation technologies include, e.g., chromatographic methods such as High Performance Liquid Chromatography (HPLC) and mass spectrometry methods such as Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF), both of which are principally used for resolving small (<1000 base) nucleic acids. Another separation technology is electrophoresis. Electrophoresis can be further divided into low-voltage techniques, for resolving large molecules, typically employing agarose as a separation matrix; and high-voltage applications, for resolving small molecules, typically employing polyacrylamide derivatives as a separation matrix.
All of the above techniques separate different molecular species based on differential migration rates. The results of all of these techniques can be represented by a plot of the amount of a molecular species as a function of either migration time to a fixed or mobile reference point, or of distance migrated in a fixed time. In either case, the concentration reaches a peak at a characteristic time or distance for each different nucleic acid molecular species. Typically, the peak approximates a Gaussian curve, and the distance between peaks is measured from the highest point of each peak. The capability to distinguish molecules of different sizes is called “resolution.” Resolution (R) is usually expressed by the mathematical formula
R=d/w
where d is the distance from the signal peak representing a molecular species to the signal peak representing a molecular species differing in length by a single nucleotide, and w is the average full width at half maximum (FWHM) for the peaks.
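As an illustration of the formula, the following minimal Python sketch computes R from two adjacent peaks. The function name, argument names, and the example peak values are hypothetical and are not taken from any particular instrument or from the examples of this specification; positions and widths may be in time or distance units, provided both use the same units.

```python
def resolution(pos_a, pos_b, fwhm_a, fwhm_b):
    """Return R = d / w for two adjacent peaks.

    d is the distance between the two peak maxima and w is the
    average full width at half maximum (FWHM) of the two peaks.
    """
    d = abs(pos_b - pos_a)
    w = (fwhm_a + fwhm_b) / 2.0
    if w <= 0:
        raise ValueError("FWHM values must be positive")
    return d / w


# Hypothetical peaks for fragments differing by one nucleotide.
r = resolution(pos_a=100.0, pos_b=101.2, fwhm_a=1.0, fwhm_b=1.0)
print(round(r, 3))  # approximately 1.2
```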
In typical separation systems, d decreases and w increases as the separated molecules increase in size. Plots of d and w against molecular size generate two curves that cross at the point where d=w, and therefore R=1, defining the “crossover point” of the particular separation system employed.
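The crossing of the two curves can be located numerically. The sketch below assumes that peak spacing and peak width have been measured at a series of fragment sizes, and it interpolates linearly between the two sizes that bracket the crossing. The function name and the sample values are hypothetical, and linear interpolation is an assumption of this sketch rather than a requirement of the method.

```python
def crossover_point(sizes, spacing, width):
    """Estimate the size (in nucleotides) at which spacing equals width.

    sizes, spacing and width are equal-length sequences ordered by
    increasing size. Returns None if spacing never falls to width.
    """
    diffs = [d - w for d, w in zip(spacing, width)]
    for i in range(len(diffs) - 1):
        lo, hi = diffs[i], diffs[i + 1]
        if lo == 0:
            return float(sizes[i])
        if lo > 0 and hi <= 0:
            # Spacing drops to or below width inside this interval;
            # interpolate linearly to the zero of (spacing - width).
            frac = lo / (lo - hi)
            return sizes[i] + frac * (sizes[i + 1] - sizes[i])
    return None


# Hypothetical measurements: spacing shrinks and width grows with size.
sizes = [100, 200, 300, 400, 500, 600]
spacing = [2.0, 1.6, 1.3, 1.0, 0.8, 0.6]
width = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
print(crossover_point(sizes, spacing, width))
```

With these sample values the crossing falls between 400 and 500 nucleotides.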
In a separation application that results in even peak heights, if R>1, the sum of the signal in between two adjacent peaks must at some point be less than the height of either peak; thus there will be at least a small dip in the trace between the two peaks. Similarly, when R<1, two adjacent peaks will merge into an apparent single peak.
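The relationship between R and the dip can be checked numerically for idealized peaks. The sketch below assumes two Gaussian peaks of equal height and equal FWHM, sums them, and reports how far the signal at the midpoint falls below the highest point of the combined trace. The function names and sampling density are choices made for this illustration only.

```python
import math

# Conversion between FWHM and standard deviation for a Gaussian.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def gaussian(x, center, sigma):
    """Unit-height Gaussian peak."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def trace(x, d, w):
    """Summed signal of two equal peaks centered at 0 and d, each with FWHM w."""
    sigma = w / FWHM_PER_SIGMA
    return gaussian(x, 0.0, sigma) + gaussian(x, d, sigma)


def dip_depth(r, w=1.0, samples=2001):
    """Fractional drop of the midpoint signal below the trace maximum.

    Returns 0.0 when the two peaks have merged into a single maximum.
    """
    d = r * w
    mid = trace(d / 2.0, d, w)
    ys = [trace(d * i / (samples - 1), d, w) for i in range(samples)]
    top = max(max(ys), mid)
    return 1.0 - mid / top


for r in (0.5, 1.0, 1.5):
    print(r, round(dip_depth(r), 3))
```

At R=1 each peak contributes exactly half of its maximum at the midpoint, so the midpoint signal equals the height of a single isolated peak and the dip is shallow. In this idealized Gaussian model a very shallow dip persists slightly below R=1 and disappears near R of about 0.85, so the R=1 threshold is best read as a practical criterion for real traces with noise.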
Different applications have different resolution requirements. Some applications, such as DNA and RNA sequencing, high-accuracy genotyping, and some forms of mutation detection (Oeffner reference) require single-base resolution (i.e., R>1) in at least part of the separation range. Other applications, such as genotyping by determining the number of dinucleotide, trinucleotide or tetranucleotide repeats present at a locus, require minimum resolution of 2, 3, and 4 bases, respectively (R=0.5, 0.33, and 0.25) throughout their useful ranges.
In typical separations, R reaches its maximum value at small molecular sizes, and drops dramatically at larger molecular sizes. Furthermore, values of R achieved in a separation can be adversely affected by problems with equipment, reagents, and protocols. A reliable measure of system performance is the point at which R=1, which is often referred to as the “crossover point”. This value is typically expressed as the number of nucleotides corresponding to the position at which the curve for peak spacing, d, crosses the curve for peak width, w.
In developing, evaluating, and testing nucleic acid sequencing systems, determination of the crossover point is desirable, because the crossover point is directly related to the sequencing read length a system can deliver. Measurement of the crossover point is not usually performed using the output of a sequencing system in normal operation because of the difficulty in measuring FWHM values.
Instead, system performance is assayed by running sequencing reaction products of a DNA molecule of known sequence on the system and counting the number of high-confidence correct base determinations. This method produces results that are confounded by variations in DNA sequencing chemistry and reaction quality. In addition, the resulting quality measures cannot easily be compared across different locations and times, because they are particular to the computer software used to perform the base sequence determination.
The invention is based in part on the discovery of methods for measuring the crossover point of molecular separation systems, and of compositions useful for measuring the crossover point. The methods and compositions allow for rapid, routine, and reproducible estimations of the crossover point of a separation system. In addition, the invention provides a method of assessing the quality of electrophoretic separation that is independent of any particular chemistry, reaction conditions, or software analysis program.
In one aspect, the invention provides a method for estimating the crossover point of a polymer separation system by electrophoresing a plurality of polynucleotide pairs, which can be alternatively referred to as crossover standards, through a polymer separation system. Each polynucleotide pair includes a first polynucleotide and a second polynucleotide. A signal associated with the first polynucleotide and a signal associated with the second polynucleotide in each polynucleotide pair are then detected. Next, a first polynucleotide pair is identified in which the signal associated with the first polynucleotide of the pair is not resolved from, or is coincident with, the signal associated with the second polynucleotide of the pair. A second polynucleotide pair is also identified in which the signal associated with the first polynucleotide of the pair is resolved from the signal associated with the second polynucleotide of the pair. Next, a region in the polymer separation system corresponding to that part of the system containing components migrating between the first polynucleotide pair and the second polynucleotide pair is identified. This region corresponds to the location of the crossover point in the polymer separation system.
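The identification step of this method can be sketched in a few lines of Python. The sketch assumes that each polynucleotide pair has already been scored as resolved or not resolved (for example, by inspecting the trace for a dip between the two signals), and that resolution decreases with increasing size, so that the resolved pair is the smaller of the two bracketing pairs. The function name, the tuple layout, and the pair sizes are hypothetical.

```python
def bracket_crossover(pairs):
    """Return (largest resolved size, next unresolved size) or None.

    pairs is an iterable of (size_nt, resolved) tuples, one per
    polynucleotide pair, where resolved is True when the two signals
    of the pair are distinguishable. The crossover point is taken to
    lie between the two returned sizes.
    """
    ordered = sorted(pairs)
    for (size_a, res_a), (size_b, res_b) in zip(ordered, ordered[1:]):
        if res_a and not res_b:
            return (size_a, size_b)
    return None


# Hypothetical crossover standards and their observed status.
observed = [(100, True), (300, True), (500, True), (700, False), (900, False)]
print(bracket_crossover(observed))
```

For these hypothetical observations the crossover point would be estimated to lie between 500 and 700 nucleotides. More closely spaced pairs would narrow the estimate.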
In a preferred embodiment, the invention includes a method for estimating the crossover point of a polyacrylamide or polyacrylamide derivative-based separation system by electrophoresing a plurality of polynucleotide pairs through a polyacrylamide separation system. Each polynucleotide pair consists of a first labeled polynucleotide consisting of a core sequence and a second labeled polynucleotide consisting of the core sequence and an extension sequence, e.g., a one nucleotide extension sequence.
Also provided by the invention are compositions and kits that include polynucleotide pairs useful for estimating a crossover point in a separation system.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In the case of conflict, the present Specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description and claims.