The ability to automatically classify objects into categories of interest has applications across a wide range of industries and scientific fields, including biology, social sciences, and finance. One particular application of interest is the classification of biological cells according to cell phenotype. Due to the large number of features (i.e., numeric properties) that typically must be calculated and considered in such classification, this process can be difficult, computationally intensive, and time-consuming.
For example, in the classification of cell phenotypes, typically hundreds of texture and morphology features of cells depicted in images are calculated and used for automated cell classification. First, training is performed whereby imaged cells are identified by a user as belonging to one of two or more categories. Many texture and morphology features are computed for each of these identified cells, and the system determines algorithms for distinguishing between the categories on the basis of these features. Then images containing cells of unidentified type can be analyzed to automatically determine cell type, based on those algorithms.
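The train-then-classify workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say which algorithm the system derives, so a nearest-centroid rule is swapped in as a stand-in. The two-element feature vectors, the labels `phenotype_a` and `phenotype_b`, and the function names are hypothetical.

```python
from statistics import fmean


def train_centroids(feature_rows, labels):
    """Average the feature vectors of the user-labeled cells in each category."""
    grouped = {}
    for row, label in zip(feature_rows, labels):
        grouped.setdefault(label, []).append(row)
    # One mean vector (centroid) per category, computed feature by feature.
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def classify(row, centroids):
    """Assign an unlabeled cell to the category with the nearest centroid."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(row, centroids[label]))


# Hypothetical training data: two features per cell (real systems use hundreds).
training_features = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
training_labels = ["phenotype_a", "phenotype_a", "phenotype_b", "phenotype_b"]

centroids = train_centroids(training_features, training_labels)
predicted = classify([0.85, 0.15], centroids)
```

With hundreds of features per cell, both the feature computation and the distance evaluation grow accordingly, which is the cost that motivates feature selection.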
To improve the speed of this classification process, it is desirable to reduce the number of texture and morphology features used for classification to a manageable subset. Identifying a small subset of features that is most effective for distinguishing a particular cell phenotype of interest is a complex problem. Unfortunately, there is no universally accepted, safe, and fast way to select a few relevant features and omit those that do not in fact contribute to classification of cell phenotype.
Selection of a few relevant features out of hundreds of initially calculated features is a scientific problem that has several approaches but no commonly accepted solution. For example, some features may be useful or relevant for classification only when considered in combination with one or more other features. As a further complication, the number of possible combinations of features is extremely high, and trying them all is not practical. Current approaches to the classification of cell phenotype, such as artificial neural networks, utilize methods that are time-inefficient and non-transparent.
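The combinatorial point above can be made concrete with a short calculation. The sketch below counts candidate feature subsets. The feature count of 300 is an assumed stand-in for "hundreds", and the function names are hypothetical.

```python
import math


def subsets_of_size(num_features, subset_size):
    """Number of distinct feature subsets of exactly the given size."""
    return math.comb(num_features, subset_size)


def nonempty_subsets(num_features):
    """Number of distinct non-empty feature subsets of any size."""
    return 2 ** num_features - 1


# Assumed feature count; the text says only "hundreds".
NUM_FEATURES = 300

for size in (2, 5, 10):
    print(f"subsets of size {size}: {subsets_of_size(NUM_FEATURES, size):,}")

print(f"all non-empty subsets: about 10^{len(str(nonempty_subsets(NUM_FEATURES))) - 1}")
```

Even when the search is restricted to subsets of five features, the count runs to tens of billions, so evaluating a classifier on every candidate subset is impractical.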
Thus, improved systems and methods are needed for fast identification of relevant features of objects depicted in images for classification or regression.