The disclosed subject matter relates generally to systems and methods for factor selection, including factors useful in gene expression analysis.
The expression levels of thousands of genes, measured simultaneously using DNA microarrays, can provide information useful for medical diagnosis and prognosis. However, gene expression measurements have not provided significant insight into the development of therapeutic approaches. This can be partly attributed to the fact that traditional gene selection techniques typically produce a “list of genes” that are correlated with disease, but do not reflect the interrelationships among those genes.
Gene selection techniques based on microarray analysis often rank genes individually according to a numerical score that measures the correlation of each gene with particular disease types. The expression levels of the highest-ranked genes tend to be either consistently higher in the presence of disease and lower in its absence, or vice versa. For such genes, the joint expression levels corresponding to diseased tissues and those corresponding to healthy tissues can usually be cleanly separated into two distinct clusters. These techniques are therefore convenient for classifying samples as diseased or healthy, or for distinguishing between different disease types. However, they do not identify cooperative relationships or synergy among multiple interacting genes.
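By way of illustration only, the individual gene ranking described above can be sketched in Python as follows. The sketch assumes Pearson correlation as the numerical score, which is one possible choice and is not specified in this disclosure. The gene names, expression values, and function names are hypothetical toy data and helpers, not measurements or a disclosed implementation. In this toy data, gene_B and gene_C are each uncorrelated with disease when considered alone, yet together they determine the disease label exactly, so a per-gene ranking places them last.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def rank_genes(expression, labels):
    """Score each gene on its own and sort by absolute correlation, highest first."""
    scores = {gene: pearson(values, labels) for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: 1 = diseased tissue, 0 = healthy tissue.
labels = [0, 1, 1, 0, 0, 1, 1, 0]

expression = {
    # Consistently higher in diseased samples, so it ranks first.
    "gene_A": [1.0, 5.2, 4.8, 1.3, 0.9, 5.5, 4.9, 1.1],
    # Neither gene tracks disease alone, but disease is present
    # exactly when one of the two (and not the other) is expressed.
    "gene_B": [0, 0, 1, 1, 0, 0, 1, 1],
    "gene_C": [0, 1, 0, 1, 0, 1, 0, 1],
}

ranking = rank_genes(expression, labels)
for gene, score in ranking:
    print(f"{gene}: {score:+.3f}")
```

Under these assumptions, gene_A receives a score near +1, while gene_B and gene_C each receive a score of zero despite jointly accounting for the phenotype, which is the kind of cooperative relationship that per-gene ranking does not surface.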
There is therefore a need to analyze genes in terms of the cooperative, as opposed to independent, nature of their contributions to a phenotype.