The disclosed subject matter relates generally to techniques for factor selection, including factors useful in gene expression analysis.
The expression levels of thousands of genes, measured simultaneously using DNA microarrays, may provide information useful for medical diagnosis and prognosis. However, gene expression measurements have not provided significant insight into the development of therapeutic approaches. This can be partly attributed to the fact that while traditional gene selection techniques typically produce a “list of genes” that are correlated with disease, they do not reflect any interrelationships of the genes.
Gene selection techniques based on microarray analysis often rank genes individually according to a numerical score measuring the correlation of each gene with particular disease types. The expression levels of the highest-ranked genes tend to be either consistently higher in the presence of disease and lower in the absence of disease, or vice versa. Such genes usually have the property that their joint expression levels corresponding to diseased tissues and their joint expression levels corresponding to healthy tissues can be cleanly separated into two distinct clusters. These techniques are therefore convenient for classifying tissue as diseased or healthy, or for distinguishing between different disease types. However, they do not identify systems of multiple interacting genes whose joint expression state predicts disease.
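The limitation described above can be illustrated with a minimal sketch. It assumes binarized expression levels (1 for high, 0 for low) and uses the Pearson correlation as the ranking score. The gene names, the data, and the helper `pearson` are hypothetical and chosen only for illustration; they are not taken from any actual measurement or from the disclosed technique.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical disease labels for eight tissue samples (1 = diseased).
disease = [0, 1, 1, 0, 0, 1, 1, 0]

# Hypothetical binarized expression levels, one list per gene.
expression = {
    "marker": [0, 1, 1, 0, 0, 1, 1, 1],  # individually tracks disease
    "gene_x": [0, 0, 1, 1, 0, 0, 1, 1],  # disease == gene_x XOR gene_y
    "gene_y": [0, 1, 0, 1, 0, 1, 0, 1],
    "noise":  [0, 0, 0, 0, 1, 1, 1, 1],  # unrelated to disease
}

# Individual gene ranking: score each gene on its own, sort by |score|.
scores = {gene: pearson(levels, disease) for gene, levels in expression.items()}
ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# The joint state of gene_x and gene_y determines disease in every sample,
# even though each gene alone has zero correlation with the labels.
joint_predicts = all(
    (x ^ y) == d
    for x, y, d in zip(expression["gene_x"], expression["gene_y"], disease)
)

for gene, score in ranked:
    print(f"{gene}: {score:+.3f}")
print("gene_x XOR gene_y matches disease in all samples:", joint_predicts)
```

In this toy data set, individual ranking places `marker` first and scores `gene_x` and `gene_y` no higher than `noise`, although the pair jointly determines the disease label in every sample.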
There is therefore a need for an approach that identifies, from gene expression data, modules of genes that are jointly associated with disease. There is also a need for an approach that will provide insight into the underlying biomolecular logic by producing a logic function connecting the combined expression levels in a gene module with the presence of disease.