Field of the Invention
The present invention relates to an image processing technique for processing an inputted image.
Description of the Related Art
A segmentation method is a technique for dividing an image into a plurality of areas within each of which an attribute such as color, pattern, or brightness is uniform. Because subsequent processing such as area recognition or encoding can be performed in units of the divided areas, the processing amount can be reduced compared to a case of processing the image at the pixel level. In recent years, cases in which image processing is performed on a high-resolution image in an embedded device are increasing, and it is considered that processing images in units of areas after segmentation will make complicated real-time processing of high-resolution images possible even in an embedded device.
Several methods for realizing real-time segmentation processing have been proposed. Among these, a technique for dividing an image into areas by clustering pixel data using 5-dimensional information (color space (R, G, B) and coordinate space (X, Y)) is known. R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, “SLIC Superpixels,” tech. rep., EPFL, 2010 is a document that discloses this technique (hereinafter referred to as document 1). The method disclosed in document 1 is referred to as Simple Linear Iterative Clustering (SLIC). First, representative points, which are the centers of the respective clusters, are arranged in a reticular pattern in an image. The representative points in the SLIC method comprise 5-dimensional information (a color space (R, G, B) and a coordinate space (X, Y)). The representative points are also referred to as seeds, cluster centroids, or the like. Clustering in the SLIC method is based on the k-means method, and each pixel constituting the image is clustered to one of the representative points arranged in the reticular pattern. A characteristic of the SLIC method is that the coordinate space clustered to each representative point is limited to a predetermined area. The collection of pixels clustered to a representative point is a segmented area. Segmented areas are referred to as Superpixels. Although this method involves iterative processing, its calculation amount is small and grows only in proportion to the image size.
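Purely as an illustration, the clustering described above may be sketched as follows. This is a minimal sketch and not the implementation of document 1: the function name `slic_sketch`, the parameters `grid_step`, `compactness`, and `iterations`, and the particular weighting of the color and coordinate distances are assumptions introduced here for explanation.

```python
import math


def slic_sketch(image, grid_step, compactness=10.0, iterations=5):
    """Cluster pixels to representative points holding (R, G, B, X, Y).

    image: list of rows, each pixel an (R, G, B) tuple.
    Returns a label map (list of rows of cluster indexes).
    All parameter names are hypothetical.
    """
    height, width = len(image), len(image[0])

    # Arrange the representative points in a reticular (grid) pattern.
    centers = []
    for cy in range(grid_step // 2, height, grid_step):
        for cx in range(grid_step // 2, width, grid_step):
            r, g, b = image[cy][cx]
            centers.append([float(r), float(g), float(b), float(cx), float(cy)])

    labels = [[-1] * width for _ in range(height)]
    weight = (compactness / grid_step) ** 2  # assumed color/space balance

    for _ in range(iterations):
        best = [[math.inf] * width for _ in range(height)]

        # Assignment step: each representative point only competes for
        # pixels inside a limited window around its own coordinates.
        for k, (r, g, b, cx, cy) in enumerate(centers):
            x0 = max(0, int(cx) - grid_step)
            x1 = min(width, int(cx) + grid_step + 1)
            y0 = max(0, int(cy) - grid_step)
            y1 = min(height, int(cy) + grid_step + 1)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    pr, pg, pb = image[y][x]
                    d_color = (pr - r) ** 2 + (pg - g) ** 2 + (pb - b) ** 2
                    d_space = (x - cx) ** 2 + (y - cy) ** 2
                    distance = d_color + weight * d_space
                    if distance < best[y][x]:
                        best[y][x] = distance
                        labels[y][x] = k

        # Update step: move each representative point to the mean of
        # the 5-dimensional values of the pixels clustered to it.
        sums = [[0.0] * 5 for _ in centers]
        counts = [0] * len(centers)
        for y in range(height):
            for x in range(width):
                k = labels[y][x]
                if k < 0:
                    continue
                r, g, b = image[y][x]
                for i, value in enumerate((r, g, b, x, y)):
                    sums[k][i] += value
                counts[k] += 1
        for k, n in enumerate(counts):
            if n:
                centers[k] = [s / n for s in sums[k]]

    return labels
```

Because each representative point examines only a window of fixed size, the work per iteration in this sketch is proportional to the number of pixels and does not depend on the number of clusters.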
Also, a technique for optimizing the method of document 1 by implementing it on a GPU (Graphics Processing Unit) is known. C. Y. Ren and I. Reid. gSLIC: a real-time implementation of SLIC superpixel segmentation. University of Oxford, Department of Engineering, Technical Report, 2011 (hereinafter referred to as document 2) is a document that discloses such a technique. In document 2, a hierarchical clustering scheme is used to implement high-speed processing on a GPU. With this, real-time segmentation processing of a high-resolution image is realized.
Meanwhile, there is a technique for performing segmentation based on Superpixel unification (area unification). Iwane, Yoshida, “landscape recognition of in-vehicle camera using segmentation based on superpixel unification”, Japanese Fuzzy System Symposium, 2011, Iwane, Yoshida, “Landscape recognition of in-vehicle camera views based on graph-based segmentation”, 27th Fuzzy System Symposium, 2011 (hereinafter referred to as document 3) is a document that discloses such a technique. In document 3, Superpixels are generated based on graphs. Then, a discriminator generated by Adaboost is applied to the Superpixels, and unification is performed by adding area labels to the Superpixels. This unification processing unifies a plurality of Superpixels, replacing them with one new Superpixel. For example, an image captured by an in-vehicle camera can be divided by the discriminator into three areas: sky, ground, and vertical objects. This is referred to as semantic segmentation, and is processing in which each area has a meaning.
In the graph-based processing of document 3, Superpixels are generated, and then, at a subsequent stage, Superpixel unification is performed using the same graph. Meanwhile, in order to perform graph-based unification of Superpixels generated by a clustering scheme such as that disclosed in document 1, it is necessary to generate a graph that represents the adjacency relationship of the Superpixels prior to the unification processing.
Explanation is given for this graph generation processing using FIGS. 10A-10D. FIG. 10A illustrates a label map 801 for Superpixels generated by a clustering scheme. The label map 801 manages a label value for each pixel of the input image, and the label values are the indexes of the Superpixels generated by the clustering. For example, the index “2” is assigned to a Superpixel 802 by the clustering, and “2” is stored as the label value in the area of the Superpixel 802 on the label map 801. FIG. 10A illustrates a case in which there are 9 Superpixels, to which the label values 0-8 are assigned.
In the graph generation processing, the label map 801 is read, the adjacency relationship of the Superpixels is investigated, and an adjacency graph such as in FIG. 10B is generated. The areas adjacent to the area for which the label value is “0” are the areas for which the label values are “1” and “3”. In order to obtain the adjacency relationship from the label map 801, boundary portions between label values are detected, and an adjacency list is generated by making the sets of label values at the boundary portions into a list. The sets of label values may be “0” and “1”, and “0” and “3”, for example. Because the same set of label values is obtained many times along a boundary, the adjacency list is generated with duplicate label value sets excluded. An adjacency list such as is illustrated in FIG. 10C is generated for the adjacency graph of FIG. 10B. By the above process, a representative point 803 and a representative point 804, and information of an edge 805 that connects these, can be obtained.
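The graph generation processing described above may be sketched, for illustration only, as follows. The label map below is a hypothetical stand-in for FIG. 10A (nine rectangular areas labeled 0 to 8), and the function name `build_adjacency_list` is an assumption introduced here. Each pixel is compared with its right and lower neighbors during a raster scan, and a label value set is appended only if a search of the list finds no identical set.

```python
def build_adjacency_list(label_map):
    """Return a list of unique (smaller_label, larger_label) sets.

    label_map: list of rows of integer label values.
    """
    height, width = len(label_map), len(label_map[0])
    adjacency = []
    for y in range(height):          # raster scan over the label map
        for x in range(width):
            here = label_map[y][x]
            # Compare with the right and lower neighbors only, so that
            # every pair of neighboring pixels is examined exactly once.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny >= height or nx >= width:
                    continue
                there = label_map[ny][nx]
                if here == there:
                    continue         # not a boundary portion
                pair = (min(here, there), max(here, there))
                # Search the list so that duplicate sets are excluded.
                if pair not in adjacency:
                    adjacency.append(pair)
    return adjacency


# Hypothetical label map: nine areas with label values 0 to 8.
LABEL_MAP = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [3, 3, 4, 4, 5, 5],
    [3, 3, 4, 4, 5, 5],
    [6, 6, 7, 7, 8, 8],
    [6, 6, 7, 7, 8, 8],
]

edges = build_adjacency_list(LABEL_MAP)
neighbors_of_0 = sorted(b if a == 0 else a for a, b in edges if 0 in (a, b))
print(neighbors_of_0)
```

In this hypothetical map, the area with the label value 0 has the areas with the label values 1 and 3 as its neighbors, which matches the example given in the text. The `pair not in adjacency` test is a sequential search of the list, performed once for every boundary pixel pair detected.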
Next, explanation is given for a simple example of Superpixel unification. Feature amounts of the Superpixels on both sides of each edge are investigated based on the adjacency list of FIG. 10C, and if they are similar, processing for unifying the Superpixels is performed. For a feature amount, information such as a color average or a histogram of the pixels belonging to a Superpixel is used. The similarity can be determined by a difference in color averages or by a histogram intersection value. FIG. 10D is a label map after the Superpixels are unified. As is illustrated in the same figure, the Superpixels having label values of “0”, “1” and “3” are unified into the Superpixel having the label value of “0”. Likewise, the Superpixels having label values of “2”, “5” and “8” are unified into the Superpixel having the label value of “2”, and the Superpixels having label values of “4”, “6” and “7” are unified into the Superpixel having the label value of “4”.
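The simple unification example above may be sketched as follows, under stated assumptions. The 3x3 label map, the color averages, the threshold, and the name `unify_superpixels` are all hypothetical and chosen only so that the result resembles the unification described for FIG. 10D. The sketch uses a difference in color averages as the similarity and, as a simplification, does not recompute the feature amount of a Superpixel after it has been unified.

```python
def unify_superpixels(label_map, mean_colors, adjacency, threshold):
    """Unify adjacent Superpixels whose color averages are similar.

    mean_colors: dict mapping label value -> (R, G, B) color average.
    adjacency:   list of (label_a, label_b) edges.
    Returns a new label map in which unified areas share the smallest
    label value of the group.
    """
    parent = {label: label for label in mean_colors}

    def find(label):
        # Follow parent links to the representative label of the group.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for a, b in adjacency:
        # Similarity: sum of absolute differences of the color averages.
        diff = sum(abs(ca - cb) for ca, cb in zip(mean_colors[a], mean_colors[b]))
        if diff <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Keep the smaller label value as the unified label.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    return [[find(label) for label in row] for row in label_map]


# Hypothetical data: one pixel per Superpixel, for brevity.
label_map = [[0, 1, 2],
             [3, 4, 5],
             [6, 7, 8]]
mean_colors = {
    0: (200, 10, 10), 1: (205, 12, 8), 3: (198, 9, 11),   # reddish
    2: (10, 200, 10), 5: (12, 198, 9), 8: (9, 203, 12),   # greenish
    4: (10, 10, 200), 6: (8, 12, 205), 7: (11, 9, 198),   # bluish
}
adjacency = [(0, 1), (1, 2), (0, 3), (1, 4), (2, 5), (3, 4),
             (4, 5), (3, 6), (4, 7), (5, 8), (6, 7), (7, 8)]

unified = unify_superpixels(label_map, mean_colors, adjacency, threshold=30)
for row in unified:
    print(row)
```

With this hypothetical data, the areas 0, 1 and 3 receive the label value 0, the areas 2, 5 and 8 receive the label value 2, and the areas 4, 6 and 7 receive the label value 4.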
In the graph generation processing, in order to investigate the adjacency relationship of the areas, the label map is read in a raster scan or the like, and boundaries between areas are detected. It is then necessary to extract the label value sets from the detected boundaries and to generate an adjacency list without duplicates. Because this processing reads out the label map and, for each sequentially detected label value set, searches whether or not that set already exists in the adjacency list, it is necessary to perform random access on the memory in which the adjacency list is stored, and optimization is therefore difficult.