In information processing apparatuses for appearance inspection and the like, there is a scheme in which a group of various feature amounts, such as the mean and variance of pixel values, is extracted from picked-up images of inspection target objects, and in which each inspection target object is determined to be non-defective or defective (classification into two classes: non-defective products and defective products). However, when all of a large number of feature amounts are used, the feature space becomes high-dimensional. This gives rise to problems specific to high-dimensional spaces (such as the curse of dimensionality) and to an increase in processing time caused by the extraction of redundant feature amounts.
Accordingly, emphasis is being placed on schemes in which appropriate feature amounts are selected, so that problems specific to high-dimensional spaces are less likely to occur and the speed of arithmetic processing is increased.
Hereinafter, a scheme disclosed in Non Patent Document 1 will be described. Non Patent Document 1 discloses a scheme in which an evaluation value for evaluating a degree of separation is determined for each feature amount, and in which the feature amounts are selected on the basis of the evaluation values, in order starting from the most favorable. Specifically, the scheme is a feature selection scheme in which the selection criteria are Bayes-error-probability estimation values or ratios of within-class variances to between-class variances.
A Bayes-error-probability estimation value will be described in detail. For example, in the case of a two-class problem, when the two classes are denoted by w1 and w2 and the features that are observed are denoted by xO=[x1, x2, . . . , xk, . . . , xN], the probabilities that a pattern with the observed feature xk belongs to w1 and to w2 are denoted by P(w1|xk) and P(w2|xk), respectively. In this case, the Bayes-error-probability estimation value is represented by the following equation.
$$\mathrm{Bayes} = \int \min\{P(w_1 \mid x_k),\, P(w_2 \mid x_k)\}\, dx_k \qquad \text{(Equation 1)}$$
The Bayes-error-probability estimation value is determined for each of the feature amounts. A lower Bayes-error-probability estimation value indicates that the feature amount is better suited to separating the two classes from each other. Accordingly, the feature amounts can be selected in ascending order of the Bayes-error-probability estimation values.
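As a minimal sketch of this ranking, the following Python code estimates the value for each feature amount from class-wise histograms and sorts the feature amounts in ascending order. The function names, the `bins` parameter, and the use of histograms are assumptions made for illustration and are not taken from Non Patent Document 1. The sketch also weights each bin by its estimated density, which is the usual form of a Bayes error estimate and is likewise an assumption with respect to Equation 1 as written.

```python
def bayes_error_estimate(values_w1, values_w2, bins=10):
    """Histogram estimate of the Bayes error for one feature amount (hypothetical helper)."""
    all_values = list(values_w1) + list(values_w2)
    lo, hi = min(all_values), max(all_values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal

    def histogram(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
            counts[idx] += 1
        return counts

    h1, h2 = histogram(values_w1), histogram(values_w2)
    n = len(all_values)
    # In each bin, the minority class is what a Bayes classifier would misclassify.
    return sum(min(a, b) for a, b in zip(h1, h2)) / n


def rank_by_bayes_error(samples_w1, samples_w2, bins=10):
    """Return (feature indices in ascending order of estimated error, per-feature errors)."""
    num_features = len(samples_w1[0])
    errors = [
        bayes_error_estimate([s[k] for s in samples_w1],
                             [s[k] for s in samples_w2], bins)
        for k in range(num_features)
    ]
    order = sorted(range(num_features), key=lambda k: errors[k])
    return order, errors
```

Here each sample is assumed to be a list of feature amounts, so `samples_w1[i][k]` is the value of xk for the i-th pattern of class w1. The estimate depends on the bin count, so the ranking should be read as approximate.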
Next, a ratio of a within-class variance to a between-class variance will be described in detail. For example, in the case of a two-class problem, when the two classes are denoted by w1 and w2 and the features that are observed are denoted by xO=[x1, x2, . . . , xk, . . . , xN], a ratio of a within-class variance to a between-class variance associated with the feature amount xk is determined. A set of patterns belonging to the class wi (i = 1, 2) is denoted by Ai. The number of patterns included in Ai is denoted by ni. The mean of xk over the patterns belonging to the class wi is denoted by mi. Furthermore, the number of all patterns is denoted by n, and the mean of xk over all of the patterns is denoted by m. In this case, the within-class variance (σ_W^2) and the between-class variance (σ_B^2) are represented by the following equations.
$$\sigma_W^2 = \frac{1}{n} \sum_{i=1,2} \sum_{x_k \in A_i} (x_k - m_i)^2 \qquad \text{(Equation 2)}$$

$$\sigma_B^2 = \frac{1}{n} \sum_{i=1,2} n_i (m_i - m)^2 \qquad \text{(Equation 3)}$$
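A ratio of the within-class variance to the between-class variance can be represented by the following expression, that is, the between-class variance divided by the within-class variance, so that a larger value indicates better separation of the two classes.

$$\sigma_B^2 / \sigma_W^2$$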
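The computation in Equations 2 and 3 can be sketched for a single feature amount as follows. This is an illustrative Python sketch only; the function names are hypothetical, and the handling of a zero within-class variance is an assumption that the text does not address.

```python
def variance_ratio(values_w1, values_w2):
    """Return sigma_B^2 / sigma_W^2 for one feature amount (hypothetical helper)."""
    classes = [list(values_w1), list(values_w2)]   # A_1 and A_2
    n = sum(len(a) for a in classes)               # number of all patterns
    m = sum(sum(a) for a in classes) / n           # mean over all patterns
    within = 0.0
    between = 0.0
    for a in classes:
        n_i = len(a)
        m_i = sum(a) / n_i                         # class mean
        within += sum((x - m_i) ** 2 for x in a)   # inner sum of Equation 2
        between += n_i * (m_i - m) ** 2            # summand of Equation 3
    within /= n
    between /= n
    if within == 0.0:
        # Assumption: a feature with no within-class spread is treated as
        # perfectly separating if the class means differ, else as useless.
        return float("inf") if between > 0.0 else 0.0
    return between / within


def rank_by_variance_ratio(samples_w1, samples_w2):
    """Return feature indices in descending order of the variance ratio."""
    num_features = len(samples_w1[0])
    ratios = [
        variance_ratio([s[k] for s in samples_w1], [s[k] for s in samples_w2])
        for k in range(num_features)
    ]
    return sorted(range(num_features), key=lambda k: ratios[k], reverse=True)
```

For example, with values 1, 2, 3 in class w1 and 7, 8, 9 in class w2, the within-class variance is 4/6, the between-class variance is 9, and the ratio is 13.5.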
In this manner, the ratios of the within-class variances to the between-class variances are determined, and the feature amounts are selected in descending order of these ratios.
Furthermore, Non Patent Document 2 discloses a scheme in which combinations of two feature amounts are generated, in which each combination is evaluated in a two-dimensional feature space, and in which the feature amounts are selected in units of two. It is described that, with the scheme in which combinations of two feature amounts are generated, features can be selected with higher accuracy than when the feature amounts are selected one by one.
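The idea of evaluating feature amounts in units of two can be sketched as follows. The evaluation criterion used here, a two-dimensional Fisher criterion based on the pooled within-class scatter matrix, and the greedy choice of non-overlapping pairs are assumptions made for illustration; they are not asserted to be the criterion or procedure of Non Patent Document 2, and all function names are hypothetical.

```python
from itertools import combinations


def pair_score(samples_w1, samples_w2, j, k):
    """Two-dimensional Fisher criterion d^T Sw^-1 d for feature amounts j and k."""
    def mean2(samples):
        n = len(samples)
        return (sum(s[j] for s in samples) / n, sum(s[k] for s in samples) / n)

    mu1, mu2 = mean2(samples_w1), mean2(samples_w2)
    # Pooled within-class scatter matrix Sw = [[a, b], [b, c]].
    a = b = c = 0.0
    for samples, mu in ((samples_w1, mu1), (samples_w2, mu2)):
        for s in samples:
            dj, dk = s[j] - mu[0], s[k] - mu[1]
            a += dj * dj
            b += dj * dk
            c += dk * dk
    det = a * c - b * b
    if det <= 1e-12:
        return 0.0  # assumption: a singular scatter matrix is simply skipped
    d0, d1 = mu1[0] - mu2[0], mu1[1] - mu2[1]
    # d^T Sw^-1 d using the closed-form inverse of a 2x2 matrix.
    return (c * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det


def select_feature_pairs(samples_w1, samples_w2, num_pairs):
    """Greedily pick up to num_pairs best-scoring pairs that share no feature amount."""
    num_features = len(samples_w1[0])
    scored = sorted(
        ((pair_score(samples_w1, samples_w2, j, k), j, k)
         for j, k in combinations(range(num_features), 2)),
        reverse=True,
    )
    selected, used = [], set()
    for _score, j, k in scored:
        if len(selected) == num_pairs:
            break
        if j not in used and k not in used:
            selected.append((j, k))
            used.update((j, k))
    return selected
```

A pair can score well even when one of its feature amounts has identical class means, because the joint distribution in the two-dimensional feature space may still separate the classes. This is the kind of case in which a one-by-one evaluation would rank that feature amount poorly.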
Documents of Related Art
Non Patent Documents
Non Patent Document 1:
Kenichiro Ishi, Naonori Ueda, Eisaku Maeda, You Murase, “Easy-to-understand pattern recognition”, Ohmsha, Tokyo, 1998.
Non Patent Document 2:
Trond Hellem Bo and Inge Jonassen, “New feature subset selection procedures for classification of expression profiles,” Genome Biology 2002, vol. 3, no. 4: research.