Genotyping is an important technique in genetic research for mapping a genome and localizing genes that are linked to inherited characteristics such as genetic diseases. In genetic disease studies, for example, a library of genetic markers is screened against DNA samples from affected individuals and their families. Resulting data are analyzed to find chromosomal regions whose transmission from parent to child is correlated with (i.e., linked to) transmission of a particular disease.
The result of screening a single subject-marker pair is a "genotype," which comprises one or more so-called alleles that are determined by the subject's DNA sequence at a marker location. Alleles are alternate forms of the DNA sequence at a genetic locus, that is, the position on a chromosome of a gene or other chromosome marker. Persons may be homozygous or heterozygous depending on the number of different alleles they possess for a given marker. Heterozygous persons have two different alleles (one from each parent) for a given marker, whereas homozygous persons inherit the same allele from both parents for a given marker.
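The distinction between homozygous and heterozygous genotypes can be illustrated with a minimal sketch. It assumes a genotype is represented as a pair of allele labels (here, fragment sizes in base pairs); the function name and the example sizes are hypothetical and are not taken from any particular genotyping system.

```python
def zygosity(genotype):
    """Classify a genotype given as a pair of allele labels.

    Assumption: each genotype holds exactly two alleles, one inherited
    from each parent, and equal labels denote the same allele.
    """
    allele_from_one_parent, allele_from_other_parent = genotype
    if allele_from_one_parent == allele_from_other_parent:
        return "homozygous"
    return "heterozygous"


# Illustrative allele sizes (base pairs) for a single marker.
same_alleles = (152, 152)
different_alleles = (152, 156)
```

Under these assumptions, `zygosity(same_alleles)` yields "homozygous" and `zygosity(different_alleles)` yields "heterozygous".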
As researchers study more complex traits and diseases, the number of genotypes required to detect linkages to such traits and diseases grows significantly. Accurate, high-throughput genotyping therefore becomes an important factor in genetic analysis research.
One approach for genotyping uses a large family of microsatellite markers that enable selective amplification. The typical amplification process used is the so-called "polymerase chain reaction" (PCR) technique, which involves the use of a heat-stable enzyme to catalyze the synthesis of nucleic acids on pre-existing nucleic acid templates. PCR uses the polymerase enzyme and two primers, one complementary to each strand, at the ends of the sequence to be amplified to produce synthesized DNA strands. The synthesized DNA strands serve as templates for the same primer sequences, thus permitting successive iterations of primer annealing, strand elongation, and dissociation to produce rapid and highly specific amplification of the desired sequence. The PCR technique is applied to short segments of an individual's chromosome that are known to contain a variable-length tandem repeat (i.e., a marker). Each possible length corresponds to a distinct allele for the particular marker. The length of the allele is measured by separating the amplified DNA segments by length in lanes on an electrophoretic gel. Because these alleles are transmitted from parent to child, they can be used to trace the inheritance of chromosomal regions.
Processing an electrophoretic gel is time consuming. To increase throughput, often several markers are multiplexed in each lane of the gel. Markers with overlapping size ranges are tagged with different colored dyes so that their alleles can be distinguished. The same dye can be used for multiple markers, as long as their size ranges do not overlap. A DNA sequencer is used to scan the gel and produce a pixelmap color-coded image in machine-readable format. The pixel information is stored as a file that can be accessed by a gene scanner to produce individual traces. Alleles are determined from these individual traces.
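The multiplexing constraint described above (markers sharing a dye must not overlap in size range) can be checked with a short sketch. The marker names, dye names, size ranges, and function names below are hypothetical, chosen only for illustration; size ranges are assumed to be inclusive and expressed in base pairs.

```python
from itertools import combinations


def ranges_overlap(range_a, range_b):
    """Return True if two inclusive (min_size, max_size) ranges overlap."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


def dye_conflicts(lane):
    """Find marker pairs in one lane that share a dye and overlap in size.

    lane maps a marker name to (dye, (min_size, max_size)).
    Pairs are reported in alphabetical order of marker name.
    """
    conflicts = []
    for (name_a, (dye_a, size_a)), (name_b, (dye_b, size_b)) in combinations(
        sorted(lane.items()), 2
    ):
        if dye_a == dye_b and ranges_overlap(size_a, size_b):
            conflicts.append((name_a, name_b))
    return conflicts


# Hypothetical lane: M1/M2 share a dye but do not overlap, and M1/M3
# overlap but use different dyes, so only M3/M4 violate the constraint.
lane = {
    "M1": ("blue", (100, 140)),
    "M2": ("blue", (150, 190)),
    "M3": ("green", (120, 160)),
    "M4": ("green", (155, 200)),
}
conflicts = dye_conflicts(lane)
```

An empty result would indicate that every allele in the lane can be attributed to a single marker from its dye color and size alone.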
One conventional approach for determining alleles uses a genotyping system that presents traces to human "callers," i.e., highly trained people who visually examine the traces to determine whether peaks in particular traces correspond to alleles. Often two different allele callers examine traces in double-blind fashion. If both callers agree that a particular peak or pair of peaks in a trace corresponds to alleles, the genotype is "called," or identified. If the callers do not agree, the trace may be deemed uncallable.
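The agreement rule for two independent callers can be sketched as follows. This is an illustrative model only: it assumes each caller reports either a collection of allele sizes or None when that caller finds the trace uncallable, and the function name is hypothetical.

```python
def consensus_call(call_a, call_b):
    """Combine two independent allele calls for one trace.

    Each call is a collection of allele sizes, or None if that caller
    judged the trace uncallable. Returns the agreed genotype as a sorted
    tuple, or None when the callers disagree or either declined to call.
    """
    if call_a is None or call_b is None:
        return None
    genotype_a = tuple(sorted(call_a))
    genotype_b = tuple(sorted(call_b))
    return genotype_a if genotype_a == genotype_b else None


# Callers agree (order of reported peaks does not matter).
agreed = consensus_call((152, 156), (156, 152))
# Callers disagree on one peak, so the trace is left uncalled.
disputed = consensus_call((152, 156), (152, 158))
```

Here `agreed` is the genotype `(152, 156)`, while `disputed` is None, corresponding to an uncallable trace.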