Because cancer and many other diseases have a genetic basis, and because a patient's response to a particular drug or treatment can be influenced by the patient's genetic makeup, researchers have undertaken the task of genetically characterizing patient tissue samples (including tumor samples), cell lines, and xenografts.
A population of organisms can contain several variants (alleles) of a given gene. Alleles can differ from one another at a single base pair. These single base pair differences are called single nucleotide polymorphisms (SNPs), and several can be present in a single gene. High-throughput genotyping methods using high-density nucleic acid arrays and other methods can accurately genotype (i.e., determine the nucleotide(s) present at) hundreds or thousands of SNPs in a genetic sample in parallel to provide an unequivocal molecular fingerprint of the genetic sample. The somatic cells of diploid organisms have two copies of each autosomal gene, and many high-throughput genotyping techniques are sensitive enough to analyze a SNP to determine whether an individual is homozygous for a first allele, homozygous for a second allele, or heterozygous (i.e., possesses one copy of each allele). The results of this analysis can be used for research and for making diagnostic and therapeutic decisions.
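By way of illustration only, the notion of a SNP-based molecular fingerprint can be sketched in a few lines of Python. The sketch assumes biallelic SNPs and a simple genotype encoding ("AA" for homozygous for the first allele, "BB" for homozygous for the second allele, "AB" for heterozygous, and None for a failed call). The SNP names, the encoding, and the function name `concordance` are hypothetical and are not taken from any particular genotyping platform.

```python
from typing import Dict, Optional

# Hypothetical encoding: "AA" = homozygous first allele,
# "BB" = homozygous second allele, "AB" = heterozygous, None = no call.
Genotype = Optional[str]
Fingerprint = Dict[str, Genotype]


def concordance(sample_1: Fingerprint, sample_2: Fingerprint) -> Optional[float]:
    """Fraction of SNPs called in both samples whose genotypes agree.

    Returns None when the two samples share no called SNPs.
    """
    shared = [
        snp
        for snp, call in sample_1.items()
        if call is not None and sample_2.get(snp) is not None
    ]
    if not shared:
        return None
    matches = sum(1 for snp in shared if sample_1[snp] == sample_2[snp])
    return matches / len(shared)


# Made-up SNP identifiers and calls, for illustration only.
tumor = {"snp1": "AA", "snp2": "AB", "snp3": "BB", "snp4": None}
normal = {"snp1": "AA", "snp2": "AB", "snp3": "BB", "snp4": "AB"}
other = {"snp1": "AB", "snp2": "AB", "snp3": "AA", "snp4": "BB"}

print(concordance(tumor, normal))  # the three SNPs called in both samples agree
print(concordance(tumor, other))   # only one of the three shared SNPs agrees
```

A high concordance between two fingerprints is consistent with a common origin, while a low concordance suggests that the samples come from different individuals. Any threshold used to make that call would need to be set empirically and is not specified here.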
The value of genetic analysis for research, diagnosis, and treatment depends on the accuracy of the analysis. For example, potential therapeutic agents for treatment of cancer are sometimes identified by screening in cancer cell lines. Thus, it is crucial that the cell lines used in such analysis are properly identified. In addition, normal and tumor samples taken from cancer patients are used in staging, diagnosis, selecting treatments, and monitoring the effect of treatments. Thus, the accurate identification of samples is extremely important because misidentification can lead to an incorrect diagnosis or selection of a sub-optimal therapy.
Methods are available for genotyping hundreds or thousands of SNPs in parallel, and these methods can be used for proper identification of samples. However, particularly where a large number of samples must be analyzed, the genotyping of hundreds or thousands of SNPs and the analysis of the resulting data can prove to be time-consuming and complex. Thus, it would be desirable to identify a manageable number of SNPs that, taken together, have the power to discriminate among and identify genetic samples.
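As a rough illustration of why a modest number of SNPs may suffice, the following Python sketch estimates the probability that two unrelated individuals share the same genotype at every SNP in a panel. The sketch rests on assumptions that are not stated in the text above: each SNP is biallelic, genotype frequencies follow Hardy-Weinberg proportions, and the SNPs are inherited independently of one another. The function names and the allele frequencies used are hypothetical.

```python
from typing import Iterable


def single_snp_match_probability(p: float) -> float:
    """Chance that two unrelated individuals share a genotype at one SNP.

    Assumes a biallelic SNP with first-allele frequency p and
    Hardy-Weinberg genotype frequencies p^2, 2pq, and q^2.
    """
    q = 1.0 - p
    homozygous_first = p * p
    heterozygous = 2.0 * p * q
    homozygous_second = q * q
    # Both individuals must fall in the same genotype class.
    return homozygous_first**2 + heterozygous**2 + homozygous_second**2


def panel_match_probability(allele_frequencies: Iterable[float]) -> float:
    """Chance of a full-panel genotype match, assuming independent SNPs."""
    probability = 1.0
    for p in allele_frequencies:
        probability *= single_snp_match_probability(p)
    return probability


# Hypothetical panels of SNPs, each with a first-allele frequency of 0.5.
for panel_size in (1, 10, 20, 40):
    chance = panel_match_probability([0.5] * panel_size)
    print(f"{panel_size:>2} SNPs: match probability {chance:.3e}")
```

Under these assumptions the chance of a coincidental match falls geometrically as SNPs are added, and SNPs with allele frequencies near 0.5 contribute the most discrimination per SNP. Real panels would also need to account for genotyping error, population structure, and linkage between SNPs, none of which is modeled in this sketch.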