1. Field of the Invention
The present invention relates to an image conversion method and apparatus and a pattern identification method and apparatus which are robust against a variation in the brightness or contrast of an image caused by a difference of illumination environment or the like.
2. Description of the Related Art
As image conversion methods robust against a variation in the brightness or contrast of an image, for example, LBP (Local Binary Pattern) of reference 1 and CT (Census Transform) of reference 2 have received attention. These methods fundamentally convert the value at a position corresponding to a pixel of interest into a sequence of a plurality of numerical values or one scalar value calculated from them based on the result of luminance value comparison between the pixel of interest and each neighboring pixel. Improved methods (reference 3) and pattern identification methods using these conversion methods have also been proposed (Japanese Patent Laid-Open Nos. 2007-188504 and 2007-241418).
In CT, the obtained numerical value sequence is directly used as a converted value. In LBP, one scalar value calculated based on the numerical value sequence is used as a converted value. This will be described using a detailed example. FIGS. 3A to 3C show local images used in LBP, each having 3×3 pixels with a pixel of interest at the center, and the luminance values of the pixels. In LBP, a pixel of interest is sequentially selected from a conversion target image in the raster scan order, and a converted value for the selected pixel of interest is obtained.
For example, in FIG. 3A, the luminance value of a pixel 30a of interest is compared with those of eight neighboring pixels 31a to 38a. If the luminance value of a neighboring pixel is larger than that of the pixel of interest, 1 is obtained. Otherwise, 0 is obtained. The obtained values are simply arranged. The result is “00000010”. In LBP, this sequence is regarded as an 8-bit numerical value so that the value “00000010”=2 is obtained as a converted value. At this time, since the absolute magnitude of the difference between the compared values is not taken into consideration, robustness against a variation in brightness or the like is ensured. In this method, however, the numerical value “2” after conversion is merely an index, and the value “2” itself has no particular significance.
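The comparison described above can be sketched as follows. This is a minimal illustration rather than the reference implementation of LBP: the neighbor ordering (clockwise from the upper-left pixel), the names `NEIGHBOR_OFFSETS`, `lbp_bits`, and `lbp_value`, and the luminance values are assumptions chosen so that the bit sequence "00000010" is reproduced; the figures define their own ordering and values.

```python
# Offsets (row, col) of the eight neighbors, visited clockwise starting at
# the upper-left pixel. This ordering is an assumption for illustration.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_bits(patch):
    """Return the eight comparison bits for the center pixel of a 3x3 patch."""
    center = patch[1][1]
    # 1 if the neighbor is brighter than the pixel of interest, otherwise 0.
    return [1 if patch[1 + dr][1 + dc] > center else 0
            for dr, dc in NEIGHBOR_OFFSETS]


def lbp_value(patch):
    """Read the bit sequence as one 8-bit integer (first bit is the MSB)."""
    value = 0
    for bit in lbp_bits(patch):
        value = (value << 1) | bit
    return value


# Hypothetical luminance values: only the lower-left neighbor is brighter
# than the pixel of interest at the center.
patch = [[10, 10, 10],
         [10, 50, 10],
         [90, 10, 10]]

# The same patch under a different contrast and brightness.
brighter = [[2 * v + 5 for v in row] for row in patch]

print(lbp_bits(patch), lbp_value(patch), lbp_value(brighter))
```

Because only the sign of each comparison is used, the rescaled patch `brighter` yields the same converted value as `patch`, which is the robustness property described above.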
FIG. 3B shows a pattern obtained by rotating the pattern in FIG. 3A clockwise by 90°. In LBP, the converted value corresponding to the pattern in FIG. 3A is 2, and that corresponding to the pattern in FIG. 3B is 128. That is, the value changes greatly. In LBP, a converted value is only a symbol describing each pattern, so a calculation between converted values, for example taking their difference, has no significance. Additionally, the pattern in FIG. 3B, which is only a slight variation of the pattern in FIG. 3A, yields a converted value 64 times as large. As described above, a conversion method such as LBP may change the converted value greatly in response to a pattern variation such as rotation, and is therefore considered less robust.
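The jump from 2 to 128 can be reproduced with a short sketch. It assumes that the eight bits are ordered around the pixel of interest so that each 45° step of clockwise rotation corresponds to a one-position circular right shift of the bit string; that correspondence and the helper name `rotate_right` are assumptions made for illustration, not something stated about the figures.

```python
def rotate_right(bits, steps):
    """Circularly shift a bit string to the right by the given number of steps."""
    steps %= len(bits)
    if steps == 0:
        return bits
    return bits[-steps:] + bits[:-steps]


pattern_a = "00000010"                      # bit string given for FIG. 3A
pattern_b = rotate_right(pattern_a, 2)      # 90 degrees = two 45-degree steps

value_a = int(pattern_a, 2)
value_b = int(pattern_b, 2)

print(pattern_b, value_a, value_b, value_b // value_a)
```

Under this assumption a two-position shift moves the single set bit from the second-lowest position to the highest one, so a modest geometric change becomes a 64-fold change in the scalar value.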
In CT, an 8-bit numerical value is regarded as an eight-dimensional vector value. The values of the patterns in FIGS. 3A and 3B are “00000010” and “10000000”, respectively. In this case, the Euclidean distance between the converted values in FIGS. 3A and 3B is √2. That is, in CT, a converted value, or the result of a calculation between converted values, is more significant than in normal LBP.
In CT, however, the Euclidean distance between the converted value (“00010000”) corresponding to the pattern shown in FIG. 3C and each of the converted values corresponding to the patterns in FIGS. 3A and 3B is also √2. That is, the converted values corresponding to the three patterns in FIGS. 3A to 3C stand in the same relationship to one another in every pairing. From the viewpoint of a pattern identification method using a converted image, as described in Japanese Patent Laid-Open No. 2007-241418, it is considered preferable that the converted values of similar patterns be similar, and those of dissimilar patterns be different. When viewed from the pattern in FIG. 3A, the pattern in FIG. 3B is the result of a 90° clockwise rotation, and the pattern in FIG. 3C is the result of a 135° counterclockwise rotation. That is, the pattern in FIG. 3C differs from the pattern in FIG. 3A more than the pattern in FIG. 3B does. For this reason, the converted value corresponding to the pattern in FIG. 3A is preferably farther from the converted value corresponding to the pattern in FIG. 3C than from that corresponding to the pattern in FIG. 3B.
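The pairwise relationships among the three converted values can be checked numerically. The following is a minimal sketch that uses only the bit strings quoted in the text; the helper names `to_vector` and `euclidean` and the dictionary layout are illustrative choices.

```python
import math
from itertools import combinations


def to_vector(bits):
    """Treat an 8-character bit string as an eight-dimensional vector (CT view)."""
    return [int(b) for b in bits]


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Bit strings quoted in the text for FIGS. 3A, 3B and 3C.
patterns = {"3A": "00000010", "3B": "10000000", "3C": "00010000"}

# CT view: distance between every pair of eight-dimensional vectors.
distances = {
    (p, q): euclidean(to_vector(patterns[p]), to_vector(patterns[q]))
    for p, q in combinations(patterns, 2)
}

# LBP view: the same bit strings read as scalar values.
lbp_values = {name: int(bits, 2) for name, bits in patterns.items()}

for pair, d in distances.items():
    print(pair, round(d, 4))
print(lbp_values)
```

Every pair of CT vectors is separated by the same distance √2, so the larger rotation of FIG. 3C is not reflected. Read as LBP scalars, the values are 2, 128, and 16, so the pattern in FIG. 3C appears numerically closer to that in FIG. 3A than the pattern in FIG. 3B does, which is the opposite of the preferred ordering.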
In addition, since a converted value corresponding to a pixel of interest in CT is a high-dimensional value, the problem known as the “curse of dimensionality” is likely to arise in pattern identification using a converted image.
That is, LBP and CT are robust against a variation in brightness or the like. However, since a converted value either has little significance as a numerical value or is a high-dimensional value, it cannot adequately reflect the difference between conversion source patterns.
LBP or CT can be regarded as a kind of vector quantization in a broad sense. Vector quantization is sometimes used in the field of pattern identification, as in reference 4.
The technique described in reference 4 identifies a pattern using the frequency histogram of vectors that match representative vectors after vector quantization. In this technique, as in LBP, when the indices corresponding to two patterns are handled as numerical values, a calculation between them, for example taking the difference between the indices, yields a result of little significance.
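The general idea of a vector quantization histogram can be sketched as follows. This is an illustration of the concept only, not the procedure of reference 4: the codebook, the input vectors, and the function names `nearest_index` and `index_histogram` are hypothetical.

```python
def nearest_index(vector, codebook):
    """Index of the representative vector closest to `vector`."""
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def index_histogram(vectors, codebook):
    """Count how often each representative vector is the best match."""
    histogram = [0] * len(codebook)
    for vector in vectors:
        histogram[nearest_index(vector, codebook)] += 1
    return histogram


# Hypothetical two-dimensional codebook and input vectors.
codebook = [[0, 0], [10, 0], [0, 10]]
vectors = [[1, 1], [9, 1], [8, -1], [0, 9]]

histogram = index_histogram(vectors, codebook)
print(histogram)
```

The indices 0, 1, and 2 here are labels fixed by the order of the codebook. Reordering the codebook changes every index without changing the patterns, which is why arithmetic on the indices carries little meaning.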
In the method using vector quantization, a representative vector itself that matches is also usable as a converted value. In this case, however, the converted value is a vector value of relatively high dimensions (the same dimensions as those of a conversion source pattern), as in CT.
[Reference 1] T. Ojala, M. Pietikainen, D. Harwood, “A Comparative Study of Texture Measures with Classification Based on Feature Distributions”, Pattern Recognition, Vol. 29, pp. 51-59, 1996
[Reference 2] R. Zabih, J. Woodfill, “A Non-parametric Approach to Visual Correspondence”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996
[Reference 3] S. Marcel, Y. Rodriguez, G. Heusch, “On the Recent Use of Local Binary Patterns for Face Authentication”, International Journal on Image and Video Processing Special Issue on Facial Image Processing, 2007
[Reference 4] Koji Kotani, Chen Qiu, Tadahiro Ohmi, “Face Recognition Using Vector Quantization Histogram Method”, International Conference on Image Processing, Vol. 2, pp. II-105 to II-108, 2002
[Reference 5] J. C. Gower, “Some Distance Properties of Latent Root and Vector Methods used in Multivariate Analysis”, Biometrika, Vol. 53, pp. 325-338, 1966
[Reference 6] Robert W. Floyd, “Algorithm 97: Shortest Path”, Communications of the ACM, Vol. 5, Issue 6, p. 345, 1962
[Reference 7] H. Jin, Q. Liu, H. Lu, X. Tong, “Face Detection Using Improved LBP under Bayesian Framework”, International Conference on Image and Graphics, pp. 306-309, 2004
[Reference 8] T. Maenpaa, M. Pietikainen, T. Ojala, “Texture Classification by Multi-Predicate Local Binary Pattern Operators”, International Conference of Pattern Recognition, Vol. 3, pp. 951-954, 2000
[Reference 9] T. Ojala, M. Pietikainen, T. Maenpaa, “Multiresolution Gray-scale and Rotation Invariant Texture Classification with Local Binary Patterns”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, pp. 971-987, 2002
[Reference 10] Joshua B. Tenenbaum, Vin de Silva, John C. Langford, “A Global Geometric Framework for Nonlinear Dimensionality Reduction”, Science, Vol. 290, pp. 2319-2323, 2000
[Reference 11] Teuvo Kohonen, “The Self-Organizing Map”, Proceedings of the IEEE, Vol. 78, No. 9, pp. 1464-1480, 1990
[Reference 12] Kenichi Maeda and Sadakazu Watanabe, “Pattern Matching Method with Local Structure”, IEICE(D), Vol. J68-D, No. 3, pp. 345-352, 1985
[Reference 13] George Arfken, Hans Weber, “Gram-Schmidt Orthogonalization”, Mathematical Methods for Physicists, 6th Edition, Academic Press, pp. 642-648, 2005
[Reference 14] Michael J. Swain, Dana H. Ballard, “Color Indexing”, International Journal of Computer Vision, Vol. 7, No. 1, pp. 11-32, 1991
[Reference 15] Yossi Rubner, Carlo Tomasi, Leonidas J. Guibas, “The Earth Mover's Distance as a Metric for Image Retrieval”, International Journal of Computer Vision, Vol. 40, No. 2, pp. 99-121, 2000
[Reference 16] Stuart P. Lloyd, “Least Squares Quantization in PCM”, IEEE Transactions on Information Theory, Vol. IT-28, No. 2, pp. 129-137, 1982
[Reference 17] Jianbo Shi, Jitendra Malik, “Normalized Cuts and Image Segmentation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, pp. 888-905, 2000
[Reference 18] Sam T. Roweis, Lawrence K. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding”, Science, Vol. 290, pp. 2323-2326, 2000
[Reference 19] Xiaofei He, Partha Niyogi, “Locality Preserving Projections”, Advances in Neural Information Processing Systems, Vol. 16, pp. 585-591, 2003
[Reference 20] Guoying Zhao, Matti Pietikainen, “Dynamic Texture Recognition Using Volume Local Binary Patterns with an Application to Facial Expressions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 6, pp. 915-928, 2007
Note that references 5 to 20 are referred to in “DESCRIPTION OF THE EMBODIMENTS”.
As described above, to implement pattern identification robust against a variation in brightness or the like, an image conversion method is demanded in which converted values of relatively low dimensionality appropriately express the difference between conversion source patterns, while robustness against a variation in brightness or the like is maintained.