The current post-genome era has introduced an unparalleled level of complexity in data analysis, driven by computationally intensive high-throughput technologies (including microarrays), large-scale computing, data-mining-oriented genomic research (e.g., the international HapMap project), genome-wide association studies, genomic profiling applications in pharmacogenetics and translational therapeutics, and the development of drugs and medical treatments. In particular, the widespread use of microarrays (e.g., those available from Affymetrix Inc., Santa Clara, Calif.) has made high-throughput research routine, and such research requires highly sophisticated data analysis.
Currently used microarray methods are capable of genotyping single nucleotide polymorphisms (SNPs) based upon probe label intensities, or of identifying differentially expressed genes using data obtained from oligonucleotide microarrays. The disadvantages of these methods stem primarily from their dependence upon probe label intensities from oligonucleotide arrays, which rests on the belief that the intensity of perfect match sequences is accurately proportional to the reference gene copy number. A further misleading belief is that mismatch sequences provide only background information resulting from non-specific binding. Such beliefs ignore the fact that probe label intensities depend on factors other than gene copy number alone. As a result, an efficient and accurate estimate of the unobservable copy numbers, as needed to determine SNP genotypes from oligonucleotide microarrays, is difficult to obtain using these existing techniques.
What is needed is a method for determining genetic variations, both for DNA genotyping and for detecting differential gene expression, that uses gene copy numbers calculated not only from probe label intensity data but also from other factors, such as probe binding affinity to the reference sequence.