DNA genotyping is the process of determining the sequence of DNA nucleotides at a genetic locus, that is, at the position on a chromosome of a gene or other chromosomal marker. For the purpose of identifying a human, certain genetic loci have been selected as the standard markers to characterize the DNA. Each marker is a DNA fragment containing repetitions of a certain nucleotide sequence. Generally, there are 13 core markers and several other standard markers accepted by the security authorities. These markers contain a short sequence of four nucleotides repeated a number of times (e.g., roughly 5 to 40 times). They belong to the class of DNA sequences known as Short Tandem Repeats (STR).
The repetition numbers at these markers vary rather randomly from person to person. The specific form of the DNA sequence at a genetic locus is called an allele, and the alleles across the markers provide sufficient differentiation among people. The STR sequence is inherited from the parents' DNA. At each marker there are two alleles, one from each parent. If the two alleles have different STR numbers, the marker is heterozygous; if they have the same STR number, it is homozygous. If all 13 core markers were heterozygous, a person would have 26 allele numbers. Assuming each number is evenly distributed over a range of 10 values, the likelihood of two people having the same allele numbers at these 13 markers is extremely small.
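As a rough illustration of this estimate, the following sketch computes the chance that two unrelated people match at all 13 markers. It assumes, as the text does, that each allele number is uniformly distributed over 10 values. It further assumes that the two alleles at a marker are independent and unordered and that the markers are independent of one another; real allele frequencies are not uniform. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def locus_match_probability(num_alleles: int) -> Fraction:
    """Probability that two random people share the same genotype at one marker.

    Assumes every allele number is equally likely and that a person's two
    alleles are drawn independently (a simplification, not population data).
    """
    p = Fraction(1, num_alleles)
    total = Fraction(0)
    # A genotype is an unordered pair of allele numbers (a, b) with a <= b.
    for a, b in combinations_with_replacement(range(num_alleles), 2):
        # Homozygous pairs arise one way, heterozygous pairs two ways.
        genotype_prob = p * p if a == b else 2 * p * p
        total += genotype_prob ** 2  # both people carry this genotype
    return total


def profile_match_probability(num_loci: int = 13, num_alleles: int = 10) -> Fraction:
    """Match probability across markers that are assumed independent."""
    return locus_match_probability(num_alleles) ** num_loci


if __name__ == "__main__":
    print(locus_match_probability(10))        # exact per-marker fraction
    print(float(profile_match_probability())) # 13-marker probability
```

Under these assumptions the per-marker match probability is 19/1000, and the 13-marker probability is on the order of 10^-23.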
To measure the allele numbers, a DNA fragment containing all the STR nucleotides and the adjacent sections of nucleotides at each locus is copied from the DNA sample and replicated by a technique called polymerase chain reaction (PCR). The fragment size is measured in base pairs, where a base pair is the size of one pair of DNA nucleotides. The sample is placed in a capillary of a sample carrier, and the fragments are separated by size through electrophoresis, in which fragments of the same size arrive at a destination at about the same time and fragments of different sizes arrive at different times.
A modern apparatus for DNA analysis uses a rigid sample carrier called a biochip, which contains multiple parallel capillaries to run multiple samples simultaneously. To detect the fragments, a fluorescent dye is attached to the fragments, and the sample is excited by a narrow-beam light source at a fixed spot on the capillary. The fluorescent dye is also called a fluorophore, and attaching it to the fragments is also referred to as labeling the fragments. Following the excitation, fluorescent light is emitted from the dye almost instantaneously, typically within one microsecond.
The sizes of the fragments at a DNA locus are known to be within a certain range. It is possible to find a number of loci in which the fragment sizes of one locus do not overlap with those of the other loci. Furthermore, it is possible to divide the whole set of loci into several groups such that, within each group, the fragment sizes of each locus are separated from those of the other loci; such a group is called a color group. The fragment size is measured in DNA base pairs and ranges from 100 to 400 base pairs in the figure. For each color group, a dye with a distinct fluorescent color is attached to the fragments of all loci in the group. Usually, the dye is attached to a molecule called a primer at one end of the fragment. The fragments are separated by the electrophoresis process and detected by an optical system as a digital signal. A fragment is detected as a peak in the signal, and the detection time of the peak can be used to determine the fragment size.
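A minimal sketch of this last step is given below. It assumes the signal is a list of evenly sampled intensity values and that a hypothetical size standard (fragments of known size run with the sample) supplies time-to-size calibration points. The text does not describe how the conversion is performed, so the threshold, the calibration table, and the use of linear interpolation are illustrative assumptions.

```python
from bisect import bisect_right


def find_peaks(signal, threshold):
    """Return sample indices of local maxima whose height exceeds threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]:
            peaks.append(i)
    return peaks


def time_to_size(t, calibration):
    """Linearly interpolate fragment size (base pairs) from detection time.

    calibration: at least two (time, size) pairs sorted by time, taken from a
    hypothetical size standard.
    """
    times = [point[0] for point in calibration]
    if not times[0] <= t <= times[-1]:
        raise ValueError("detection time outside the calibrated range")
    # Index of the calibration point at or just after t, clamped to the last.
    j = min(bisect_right(times, t), len(times) - 1)
    (t0, s0), (t1, s1) = calibration[j - 1], calibration[j]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


# Toy data: time is expressed as the sample index (assumed unit sample period).
signal = [0, 1, 5, 1, 0, 2, 9, 2, 0]
calibration = [(0, 100.0), (8, 300.0)]  # hypothetical (time, size in bp) pairs
peak_times = find_peaks(signal, threshold=3)
peak_sizes = [time_to_size(t, calibration) for t in peak_times]
```

With this toy data, the peaks are found at sample indices 2 and 6 and map to 150 and 250 base pairs.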
Based on the non-overlapping ranges of the loci in a color group, the measured fragment size identifies the locus of the fragment. With other supporting data, the measured fragment size can be used to identify the fragment as one of the DNA fragments of the locus with a known STR number. The sample is prepared with multiple dyes, one dye for each color group. When the sample is excited by the light source, the emitted fluorescent light is a mixture of the colors of these dyes. It is necessary to use optical filters to separate the fluorescent colors. Each filtered fluorescent color is measured in a detection channel as an electrical signal. Typically, a photomultiplier tube (PMT) or another detector, such as a charge-coupled device (CCD) camera, is used in each detection channel.
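The size-based lookup within one color group can be sketched as follows. The locus names, the size ranges, and the rule for converting a size into an STR number (a fixed flanking length plus four base pairs per repeat) are hypothetical values chosen for illustration, not data from this document.

```python
# Hypothetical color group: locus name -> (min_size_bp, max_size_bp, flank_bp).
# The ranges are chosen so that no two loci overlap.
COLOR_GROUP = {
    "LOCUS_A": (100, 180, 80),
    "LOCUS_B": (200, 290, 160),
    "LOCUS_C": (310, 400, 250),
}
REPEAT_UNIT_BP = 4  # the markers repeat a four-nucleotide sequence


def identify_locus(size_bp, group=COLOR_GROUP):
    """Return the locus whose size range contains size_bp, or None."""
    for name, (low, high, _flank) in group.items():
        if low <= size_bp <= high:
            return name
    return None


def str_number(size_bp, group=COLOR_GROUP):
    """Return (locus, estimated repeat count), or None if no locus matches.

    Assumes size = flank + 4 * repeats and rounds to the nearest whole repeat.
    """
    locus = identify_locus(size_bp, group)
    if locus is None:
        return None
    flank = group[locus][2]
    return locus, round((size_bp - flank) / REPEAT_UNIT_BP)


if __name__ == "__main__":
    print(str_number(131.8))  # a measured size inside the LOCUS_A range
    print(str_number(190.0))  # falls in the gap between two loci
```

Because the ranges do not overlap, each measured size matches at most one locus; a size that falls in a gap matches none.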
Ideally, the emission spectrum of each dye would be narrow, such that the spectra of the multiple dyes in the sample do not overlap each other. If that were the case, and if each optical filter could also be narrow-band so as to pass only one dye, then each of the detected signals would contain only one dye color. In this hypothetical ideal case, each signal measures one and only one color group, and a DNA fragment peak appears in only one of the detected signals. By finding and identifying the peaks in these signals, the complete set of STR numbers at all loci of interest can be determined.
However, the emission spectra of the dyes overlap with each other substantially. As a result, each detected signal contains fluorescent signals from all dyes. This has been referred to as color-bleed, and it is similar to the cross-talk problem in electronic instruments. With conventional systems, the degree of color-bleed can be severe, and it is necessary to know the degree of color-bleed from each dye as accurately as possible. These degrees are expressed as a set of color-bleed factors, which are used to determine signals that each correspond to only one distinct dye color through a process referred to as color separation. Unfortunately, an inaccurate set of color-bleed factors can lead to false peaks and/or amplitude-diminished true peaks, which may lead to uncertainty in determining the STR numbers.
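Color separation can be viewed as solving a small linear system at each sample time. If the detected channel values are modeled as a color-bleed matrix applied to the pure dye intensities, the dye intensities are recovered by solving that system. The sketch below uses this linear model with a hypothetical two-dye bleed matrix; the model, the matrix values, and the function names are assumptions made for illustration, not the method of any particular instrument. It also shows how an inaccurate factor produces a false peak and a diminished true peak.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [b] for row, b in zip(matrix, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("color-bleed matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def mix(bleed, dyes):
    """Detected channel values produced by the given pure dye intensities."""
    return [sum(b * d for b, d in zip(row, dyes)) for row in bleed]


def separate(bleed, channels):
    """Recover dye intensities from channel values using the bleed factors."""
    return solve(bleed, channels)


# bleed[i][j]: fraction of dye j's light seen in channel i (hypothetical values).
TRUE_BLEED = [[1.0, 0.3],
              [0.4, 1.0]]
# Same matrix, but the bleed of dye 0 into channel 1 is underestimated.
WRONG_BLEED = [[1.0, 0.3],
               [0.3, 1.0]]

detected = mix(TRUE_BLEED, [100.0, 0.0])  # only dye 0 is actually present
good = separate(TRUE_BLEED, detected)     # recovers dye 0 alone
bad = separate(WRONG_BLEED, detected)     # leaves a residue in dye 1
```

In this example the underestimated factor (0.3 instead of 0.4) leaves a spurious dye 1 peak of about 11 units and lowers the true dye 0 peak from 100 to about 96.7.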