DNA genotyping is the process of determining the sequence of DNA nucleotides at a locus (the position on a chromosome of a gene or other chromosomal marker). For the purpose of identifying a human, certain genetic loci have been selected as standard markers to characterize the DNA. Each marker is a DNA fragment containing repetitions of a certain nucleotide sequence. Generally, there are thirteen (13) core markers and several other standard markers accepted by security authorities. These markers contain a short sequence of four nucleotides repeated a number of times (e.g., roughly five (5) to forty (40) times), referred to as a Short Tandem Repeat (STR).
The repetition numbers at these markers vary randomly from person to person. The specific form of the DNA sequence at a genetic locus is called an allele, and the alleles at these markers provide sufficient differentiation among people. The STR sequence is inherited from the parents' DNA. At each marker, there may be two different alleles, one from each parent, in which case the marker is heterozygous. If the alleles from both parents have the same STR number, the marker is homozygous. If all thirteen (13) core markers are heterozygous, a person has twenty-six (26) allele numbers. Assuming each number is evenly distributed over a range of ten (10) values, the likelihood of two people having the same allele numbers at these thirteen (13) markers is extremely small.
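The arithmetic behind this estimate can be sketched as follows. The sketch takes the simplifying assumptions stated above at face value (twenty-six (26) independent allele numbers, each uniform over ten (10) values). Real allele frequencies are neither uniform nor independent, and the function name is hypothetical.

```python
from fractions import Fraction


def match_probability(num_alleles: int = 26, values_per_allele: int = 10) -> Fraction:
    """Chance that two unrelated people share every allele number.

    Assumes each allele number is independent and uniformly distributed
    over `values_per_allele` values (a simplification, not a population model).
    """
    return Fraction(1, values_per_allele) ** num_alleles


p = match_probability()
print(f"match probability = 1 in {p.denominator:.3e}")
```

Under these assumptions the probability is one in ten to the twenty-sixth power, which is why a coincidental match across all thirteen (13) markers is treated as extremely unlikely.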
A DNA analyzer has been used to determine allele numbers in DNA samples. For this, the DNA sample is introduced into a micro-channel of a rigid sample carrier called a biochip, which generally includes multiple micro-channels in parallel to process multiple samples simultaneously. A DNA fragment containing all STR nucleotides and adjacent sections of nucleotides at each locus is copied from the DNA sample and replicated through polymerase chain reaction (PCR). The fragments are labeled with target-specific fluorescent dyes (or fluorophores) that emit radiation having different principal emission spectra in response to being excited by light. The labeled fragments are separated by size through electrophoresis.
The fragment size is measured in units of base pairs (e.g., 100 to 400), where a base pair is the size of one pair of DNA nucleotides. The separated fragments are excited by an excitation light. In response thereto, the dyes emit their characteristic radiation. A detection system includes multiple detection channels, each configured to detect radiation having emission spectra in a different emission spectrum range corresponding to a different one of the dyes. The channels detect the emission spectra and output signals with peaks indicative thereof. The peaks are used to locate fragments in the signal. The peak detection time determines the fragment size, and the fragment size identifies the locus of the fragment and identifies the fragment as a DNA fragment at that locus with a known STR number.
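The chain from peak to detection time to fragment size to locus can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the calibration points relating detection time to size, the size ranges, and the locus names (`locus_A`, `locus_B`) are hypothetical, and linear interpolation is only one possible way to convert time to size.

```python
from bisect import bisect_right

# Hypothetical calibration points: (detection time in seconds, size in base pairs).
CALIBRATION = [(600.0, 100.0), (900.0, 200.0), (1250.0, 300.0), (1650.0, 400.0)]

# Hypothetical size ranges (base pairs) assigned to placeholder loci.
LOCUS_RANGES = {"locus_A": (100.0, 180.0), "locus_B": (200.0, 280.0)}


def find_peaks(signal, threshold):
    """Return indices of local maxima whose amplitude exceeds the threshold."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def time_to_size(t):
    """Linearly interpolate fragment size (base pairs) from detection time."""
    times = [point[0] for point in CALIBRATION]
    if not times[0] <= t <= times[-1]:
        raise ValueError("detection time outside the calibrated range")
    k = min(bisect_right(times, t), len(times) - 1)
    (t0, s0), (t1, s1) = CALIBRATION[k - 1], CALIBRATION[k]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


def size_to_locus(size_bp):
    """Return the placeholder locus whose size range contains the fragment."""
    for name, (low, high) in LOCUS_RANGES.items():
        if low <= size_bp <= high:
            return name
    return None


# A short, made-up stretch of one detection channel's output signal.
sample_times = [700.0, 725.0, 750.0, 775.0, 800.0]
signal = [0.1, 0.4, 1.0, 0.3, 0.1]

for index in find_peaks(signal, threshold=0.5):
    size = time_to_size(sample_times[index])
    print(sample_times[index], size, size_to_locus(size))
```

Because the size is derived from the peak position and the fragment is accepted or rejected based on the peak amplitude, any error in the detected peaks carries through to the identified fragments.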
Unfortunately, the emission spectra of the dyes partially overlap. As a result, a detection channel output signal will include peaks originating from dyes attached to fragments and with principal emission corresponding to the detection channel, as well as peaks originating from dyes attached to fragments and with principal emission corresponding to other detection channels. This has been referred to as color-bleed. The output signal will also include a cluster peak of free dye peaks, where free dyes are dyes that are not attached to any fragment. The detection channel output signal will also include an offset signal, which includes an optical and detection system offset and background signals from fluorescent emission from the sample carrier and non-fluorescent excitation light scatter.
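Color-bleed can be pictured as a linear mixing of the dye emissions into the detection channels. The following sketch assumes three dyes and three channels, a hypothetical matrix of bleed factors, and a constant offset. The free-dye cluster peak is left out for brevity, and none of the values are taken from a real instrument.

```python
# Hypothetical bleed factors: BLEED[c][d] is the fraction of dye d's emission
# seen by detection channel c (1.0 on the diagonal marks the principal channel).
BLEED = [
    [1.00, 0.20, 0.00],
    [0.15, 1.00, 0.25],
    [0.00, 0.10, 1.00],
]


def channel_outputs(dye_amplitudes, offsets):
    """Mix dye amplitudes into detection-channel readings at one time sample."""
    return [
        sum(factor * amplitude for factor, amplitude in zip(row, dye_amplitudes)) + offset
        for row, offset in zip(BLEED, offsets)
    ]


# One fragment labeled with dye 1 (amplitude 100) and an assumed offset of 5
# in every channel: channels 0 and 2 also show a smaller peak at the same time.
readings = channel_outputs([0.0, 100.0, 0.0], [5.0, 5.0, 5.0])
print(readings)
```

In this toy case a single fragment produces a peak in all three channels, and every reading is raised by the offset.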
FIG. 1 shows an example portion of an output signal 102 of one of the detection channels. In FIG. 1, a y-axis 104 represents signal amplitude and an x-axis 106 represents time. The output signal 102 includes peaks 108 and 110, which originate from dyes attached to fragments and with principal emission corresponding to the detection channel, peaks 112 and 114, which originate from other dyes attached to other fragments and with principal emission corresponding to other detection channels, a cluster peak 116, which is a superposition of the free dye peaks, and an offset signal 118 representing optical and detection system offset and background signals. As shown, the cluster peak 116 has a tail 120 that gradually decays over time and adds to and raises the amplitudes of the peaks 108-114, and the offset signal 118 raises the amplitudes of the cluster peak 116 and the peaks 108-114.
The raised amplitudes of the peaks 108-114 may introduce error in the color separation process. One approach to attempt to return the amplitudes to their pre-raised state has been to measure the offset signal 118 and the cluster peak 116 and subtract them from the output signal 102. FIG. 2 shows an example signal 202, which is the output signal 102 (FIG. 1) with a measured offset signal 118 and cluster peak 116 subtracted therefrom. As shown, the example signal 202 includes the peaks 108-114 and a portion 204 of the cluster peak 116, which was not fully removed in this example. Unfortunately, it is difficult to accurately measure the tail 120 of the cluster peak 116, especially in the presence of the peaks 108-114. Consequently, the amplitudes of the peaks 108-114 in the example signal 202 may deviate from their true amplitudes.
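The effect of an inaccurately measured tail can be illustrated numerically. The sketch below assumes, purely for illustration, an exponentially decaying tail, a Gaussian fragment peak, and a constant offset. The shapes, the parameter values, and the function names are hypothetical and are not taken from the figures.

```python
import math

OFFSET = 10.0  # assumed constant offset signal


def cluster_tail(t, amplitude=500.0, decay=50.0):
    """Hypothetical exponentially decaying tail of the free-dye cluster peak."""
    return amplitude * math.exp(-t / decay)


def fragment_peak(t, center=100.0, height=80.0, width=3.0):
    """Hypothetical Gaussian fragment peak with a true height of 80."""
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)


def output_signal(t):
    """Detection channel output: fragment peak raised by the tail and the offset."""
    return fragment_peak(t) + cluster_tail(t) + OFFSET


def corrected_signal(t, estimated_decay):
    """Subtract the offset and an estimated tail from the output signal."""
    return output_signal(t) - OFFSET - cluster_tail(t, decay=estimated_decay)


exact = corrected_signal(100.0, estimated_decay=50.0)   # tail measured exactly
too_short = corrected_signal(100.0, estimated_decay=45.0)  # decay underestimated
too_long = corrected_signal(100.0, estimated_decay=55.0)   # decay overestimated

print(exact, too_short, too_long)
```

With an exact tail estimate the peak returns to its true height. An underestimated decay leaves part of the tail under the peak, and an overestimated decay removes too much, so the corrected amplitude ends up above or below the true value.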
As a result, the color separation process may introduce pull-up artifact, which may be positive or negative (pull-down), into the color-separated signals, even with accurate color bleed factors. FIG. 3 shows an example of such artifact for three color-separated fragments and five dyes. In FIG. 3, each signal 302, 304, 306, 308 and 310 represents a different dye. A first fragment peak 312 corresponds to the signal 304, is measured accurately, and does not introduce artifact into the other signals. A second fragment peak 314 corresponds to the signal 306, includes error, and introduces pull-up artifacts 316, 318, 320 and 322 into the other signals. A third fragment peak 324 corresponds to the signal 304, includes error, and introduces pull-up artifacts 330 and 332 in signals 308 and 310 and pull-down artifacts 326 and 328 in signals 302 and 306.
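How amplitude error turns into pull-up and pull-down can be sketched with a small linear model. The sketch assumes that color separation is performed by solving a linear system built from the bleed factors, which is one common way to do it and is not stated here as the method used by any particular analyzer. The three-dye matrix, the amplitudes, and the error values are all hypothetical.

```python
# Hypothetical bleed factors: BLEED[c][d] is the fraction of dye d's emission
# seen by detection channel c. They are treated as exactly known.
BLEED = [
    [1.00, 0.20, 0.00],
    [0.15, 1.00, 0.25],
    [0.00, 0.10, 1.00],
]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Channel readings for one fragment labeled with dye 1 at amplitude 100,
# after a perfect offset and tail removal.
clean_readings = [20.0, 100.0, 10.0]

# The same readings with an assumed residual error left in channels 0 and 2
# (for example, from an imperfectly removed cluster-peak tail).
biased_readings = [23.0, 100.0, 8.0]

separated_clean = solve(BLEED, clean_readings)
separated_biased = solve(BLEED, biased_readings)

print(separated_clean)   # only dye 1 is present
print(separated_biased)  # dye 0 shows a positive artifact, dye 2 a negative one
```

Even though the bleed factors are exact, the residual error in the readings appears in the separated signals of the other dyes as a small positive peak (pull-up) and a small negative peak (pull-down) at the time of the true fragment peak.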
Pull-up and pull-down artifacts can be detrimental to identifying fragments from the signals 302-310. That is, pull-up artifact may lead to false peaks being identified as true peaks (e.g., by pull-up artifact of sufficient amplitude) in the color-separated signals, and pull-down artifact may lead to true peaks being suppressed (e.g., by pull-down artifact of sufficient negative amplitude) in the color-separated signals and missed. The foregoing adds uncertainty to the detection and identification of fragments in the sample and, thus, to the determination of STR numbers in DNA samples. Furthermore, the nature of the pull-up artifact makes it difficult to effectively and reliably correct for the artifact after color separation, and such a correction can substantially distort the signal, which can further add to the uncertainty.
Therefore, there is an unresolved need for other approaches to removing the tail 120 of the cluster peak 116 from the peaks 108 and 110 that do not introduce pull-up artifact into the color-separated signals or that reduce such artifact.