Field of the Invention
The invention relates, in general, to the field of image processing, computer vision and pattern recognition, particularly to the field of multi-class segmentation, and more particularly to an image processing apparatus and method for classifying each region in an image.
Description of the Related Art
Multi-class segmentation is a method to segment an image to different regions. Each region of the image is classified to a predefined class, such as Sky, Green, Body and Others. The method is helpful to parse the scenes of image. FIG. 1 is a schematic view of multi-class segmentation. As shown in FIG. 1, each segmented region belongs to a predefined class.
Richard Socher (reference can be made to Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. Proceeding of the 28th International Conference on Machine Learning, Bellevue, Wash., USA, 2011) proposed a multi-class segmentation method named Recursive Neural Network (RNN). FIG. 2 schematically shows a flowchart of the RNN segment method.
As shown in FIG. 2, firstly, the method segments the image to regions at step 210. Then, at step 220, it calculates the classification confidence of every class for each region based on the extracted features and trained model. The classification confidence represents the probability of a region belonging to a predefined class and also called as score. Finally, at step 230, the region is classified to the class with the highest score.
Because of the scores are calculated from the features and trained model, if the highest score of a region is not enough higher than the second highest score of the region, it means that the feature of one class is not obvious to others. Then, it may be difficult to distinguish one class from the other when the scores of two classes are close. As described above, this method chooses the class with the highest confidence score as the classification result and if the highest score of one class for a region is not obvious to the others, the classification result will more probably to be wrong. For example, the illustration of the RNN segmentation is shown in FIG. 3.
As can be seen from the upper picture of FIG. 3, for the region B, the score of Green is far higher than those of the other classes. Then, the region B is classified to Green without doubt. Similarly, the region C is classified to Sky. Those regions B and C are the obvious regions for a class.
However, as can be seen, for the region A, the score of Others is only a little higher than that of Green. According to the RNN segmentation, the region A is classified to Others as shown on the lower picture of FIG. 3. However, from the original image (i.e., the upper picture of FIG. 3), it can be seen that the region A should belong to classification Green. In this regard, the unobvious score leads to a wrong classification result, which is also not self-adaptive inside an image.
In view of above, it is desired to provide a new image processing apparatus and image processing method, which are capable of precisely classifying all the regions, particularly, the non-obvious regions, to predetermined classes.