DNA sequencing is a method for determining the order of the nucleotide bases (adenine, guanine, cytosine, and thymine) in a sample including DNA. With one technique, the DNA is lysed, and fragments thereof containing alleles, or short tandem repeat (STR) sequences of the four nucleotides, are replicated through polymerase chain reaction (PCR). A reference substance containing synthesized fragments of known fragment sizes is added to the sample containing the replicated fragments. These fragments of known size have been referred to as internal sizing standards or internal lane standard (ILS) fragments.
The DNA and ILS fragments are labeled with different target specific fluorescent dyes (e.g., one for each nucleotide base and one for the ILS fragment) and separated through electrophoresis. In one instance, this includes applying a high voltage across negative and positive electrodes of a capillary carrying the labeled fragments. A net electric field exerts an electrostatic force on the surface charge of the labeled fragments, and the labeled fragments migrate through the capillary at a speed that depends on the size of the fragment and/or other factors.
Theoretically, fragments of the same size should migrate and arrive at a reading region at about the same time. For reading, a light beam having a wavelength within a predetermined wavelength range irradiates and excites the dyes of the fragments, and an optical reader senses characteristic fluorescent light emitted by the dyes and generates electrical signals indicative of the characteristic fluorescent light, including DNA and ILS signals. The characteristic fluorescent light allows for separating the fragments by nucleotide base and ILS substance.
However, the migration time of the DNA and ILS fragments varies from lane to lane of a biochip and from run to run across different biochips for various reasons. As a consequence, the fragment sizes cannot be determined simply from the acquisition time of the peaks in the signal. On the other hand, the ILS substance contains only fragments of known sizes and migrates in the same manner as the DNA fragments. As such, the ILS signal can be used to translate the acquisition times of the DNA fragments into fragment sizes. Unfortunately, the ILS signal may include false peaks and/or missing peaks, which may lead to erroneous translation and sequencing.
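The translation from acquisition time to fragment size can be illustrated with a minimal sketch. It assumes the ILS peaks have already been detected and correctly assigned to their known sizes, and it uses simple piecewise-linear interpolation between neighboring ILS peaks. The interpolation scheme, the function name `time_to_size`, and the example numbers are illustrative assumptions and are not taken from the text above.

```python
from bisect import bisect_right


def time_to_size(t, ils_times, ils_sizes):
    """Translate an acquisition time into a fragment size.

    ils_times: acquisition times of the ILS peaks, strictly increasing.
    ils_sizes: known fragment sizes of those ILS peaks, in the same order.
    Uses piecewise-linear interpolation (an assumption made for this sketch).
    """
    if len(ils_times) != len(ils_sizes) or len(ils_times) < 2:
        raise ValueError("need at least two ILS peaks with matching sizes")
    if any(b <= a for a, b in zip(ils_times, ils_times[1:])):
        raise ValueError("ILS peak times must be strictly increasing")
    if not ils_times[0] <= t <= ils_times[-1]:
        raise ValueError("time lies outside the range covered by the ILS")

    # Index of the ILS peak at or just before t, clamped to the last interval.
    i = min(bisect_right(ils_times, t) - 1, len(ils_times) - 2)
    t0, t1 = ils_times[i], ils_times[i + 1]
    s0, s1 = ils_sizes[i], ils_sizes[i + 1]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


# Hypothetical ILS with three fragments of known size.
ils_times = [100.0, 200.0, 400.0]
ils_sizes = [50.0, 100.0, 200.0]
size = time_to_size(150.0, ils_times, ils_sizes)
```

A false or missing ILS peak shifts the pairing between `ils_times` and `ils_sizes`, so every size computed from the affected intervals is wrong, which is the failure mode described above.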
Furthermore, there is a small offset between the calculated fragment size and the true DNA fragment size, and this offset may differ among alleles and from run to run. To correct for these offsets, a substance that contains virtually all possible DNA fragments is processed alone in a separate lane. The signal of this substance is called an allelic ladder signal, and it is detected like the DNA sample. Each peak in the ladder signal indicates the expected peak position for a DNA fragment, and the allele number can be accurately determined by matching the peak in the DNA signal to the peaks in the ladder signal. However, similar to the ILS signal, the allelic ladder signal may include false peaks and/or missing peaks, which can lead to erroneous matching.
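The matching of a DNA peak to the ladder peaks can be sketched as a nearest-neighbor lookup. The sketch assumes the ladder has already been sized and labeled, and it accepts a match only within a tolerance window. The function name `call_allele`, the 0.5-base default tolerance, and the example ladder values are hypothetical choices for illustration and are not taken from the text above.

```python
def call_allele(sample_size, ladder, tolerance=0.5):
    """Return the allele label of the ladder peak closest to sample_size.

    ladder: list of (allele_label, ladder_peak_size) pairs.
    tolerance: largest accepted size difference (assumed value, in bases).
    Returns None when no ladder peak lies within the tolerance.
    """
    best = min(ladder, key=lambda entry: abs(entry[1] - sample_size), default=None)
    if best is None or abs(best[1] - sample_size) > tolerance:
        return None
    return best[0]


# Hypothetical ladder for one locus: (allele number, measured ladder size).
ladder = [("9", 120.1), ("10", 124.2), ("11", 128.3)]
allele = call_allele(124.4, ladder)
```

Because the sample and the ladder are sized in the same way, a run-specific offset affects both and largely cancels in the comparison. A missing or false ladder peak, however, can leave a sample peak unmatched or matched to the wrong allele.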