A process of obtaining a binary mask image, which specifies an object region and a background region in an image, is referred to as an image segmentation process. A graph cut method is known as one method for this image segmentation process (see NPL 1).
In this graph cut method, the entire image is expressed as a graph structure whose energy is defined from information about the color distribution and the edges of the image. The corresponding maximum flow problem (Maxflow) is solved, and labels of zero and one are assigned so that the energy is minimized, thereby obtaining a binary mask image.
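The labeling step described above can be illustrated with a minimal sketch. It is not the implementation of NPL 1. It assumes a tiny graph, integer energies, and a plain Edmonds-Karp maximum flow solver, and the names `graph_cut_labels`, `source_cap`, `sink_cap`, and `neighbor_cap` are hypothetical. Here `source_cap[i]` is the cost paid if node i is labeled zero (background), `sink_cap[i]` is the cost paid if it is labeled one (object), and `neighbor_cap` holds the cost of assigning different labels to two connected nodes.

```python
from collections import deque


def graph_cut_labels(source_cap, sink_cap, neighbor_cap):
    """Assign 0/1 labels that minimize the cut energy (illustrative sketch).

    source_cap[i]: cost if node i is labeled 0 (edge source -> i is cut)
    sink_cap[i]:   cost if node i is labeled 1 (edge i -> sink is cut)
    neighbor_cap:  {(i, j): w}, cost if nodes i and j receive different labels
    """
    n = len(source_cap)
    s, t = n, n + 1                      # indices of the source and sink nodes
    size = n + 2
    cap = [[0] * size for _ in range(size)]
    for i in range(n):
        cap[s][i] = source_cap[i]
        cap[i][t] = sink_cap[i]
    for (i, j), w in neighbor_cap.items():
        cap[i][j] += w                   # neighbor links are undirected
        cap[j][i] += w

    def search_from_source():
        # Breadth-first search over edges with remaining capacity.
        parent = [-1] * size
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(size):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    max_flow = 0
    while True:
        parent = search_from_source()
        if parent[t] == -1:              # no augmenting path is left
            break
        # Find the bottleneck capacity along the path from sink back to source.
        bottleneck = None
        v = t
        while v != s:
            u = parent[v]
            if bottleneck is None or cap[u][v] < bottleneck:
                bottleneck = cap[u][v]
            v = u
        # Push the flow and update the residual capacities.
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        max_flow += bottleneck

    # Nodes still reachable from the source are on the object side (label 1).
    labels = [1 if parent[i] != -1 else 0 for i in range(n)]
    return labels, max_flow


if __name__ == "__main__":
    # Four nodes in a row: the first two resemble the object, the last two
    # resemble the background. The numbers are made up for illustration.
    print(graph_cut_labels([9, 8, 2, 1], [1, 2, 8, 9],
                           {(0, 1): 3, (1, 2): 3, (2, 3): 3}))
```

With the made-up capacities in the usage lines, the minimum cut separates the first two nodes from the last two, so the labels are 1, 1, 0, 0 and the minimized energy equals the maximum flow of 9. Practical implementations use specialized maximum flow solvers rather than a dense capacity matrix.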
However, holding this graph structure in memory for all the pixels requires a very large amount of memory, which limits the resolution of an image that can be processed.
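The scale of the problem can be estimated with simple arithmetic. The following sketch counts the nodes and links of a per-pixel graph. The function name `pixel_graph_size` and the per-node and per-edge byte costs are assumptions made only for illustration, since the actual cost depends on the data structure used.

```python
def pixel_graph_size(width, height, connectivity=4,
                     bytes_per_node=16, bytes_per_edge=24):
    """Count nodes and links of a per-pixel graph (illustrative estimate).

    bytes_per_node and bytes_per_edge are assumed values, not measured ones.
    """
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    nodes = width * height
    # Horizontal and vertical links between neighboring pixels.
    n_links = width * (height - 1) + height * (width - 1)
    if connectivity == 8:
        # Two diagonal links for every 2x2 block of pixels.
        n_links += 2 * (width - 1) * (height - 1)
    # Every pixel is linked to both the source and the sink.
    t_links = 2 * nodes
    approx_bytes = nodes * bytes_per_node + (n_links + t_links) * bytes_per_edge
    return nodes, n_links, t_links, approx_bytes


if __name__ == "__main__":
    print(pixel_graph_size(4000, 3000))
```

For a 4000 x 3000 image with 4-connectivity, this gives 12,000,000 nodes and 23,993,000 links between neighboring pixels, in addition to 24,000,000 links to the source and sink.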
One method for solving such a problem is to perform the graph cut method at a granularity slightly coarser than units of pixels, for example, in units of small regions, each of which is a set of pixels.
The graph cut method itself places no restriction on the connection state among the nodes of a graph, and arbitrary connections are permitted. Thus, the graph cut method can be applied as is by freely forming small regions and replacing the pixels with the small regions. Note, however, that the energy set to the nodes and edges of the graph needs to be calculated by a method different from that used in units of pixels.
A method of defining such a set of small regions and applying the graph cut method to them is proposed in NPL 2 and PTL 1. In this method, an adjacent energy and a likelihood energy between pixels are calculated using an average color of a small region.
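As a rough illustration of how energies might be derived from average colors, the following sketch computes the average color of each small region from a label map and an adjacent energy for each pair of neighboring regions. The function name `region_graph`, the parameter `beta`, and the exponential form of the energy are assumptions chosen because they are common in the graph cut literature. They are not taken from NPL 2 or PTL 1, and the likelihood energy is omitted.

```python
import math


def region_graph(image, labels, beta=0.01):
    """Build average colors and adjacent energies for small regions (sketch).

    image:  rows of (r, g, b) tuples
    labels: rows of region ids, same shape as image
    beta:   assumed scale factor for the color difference
    """
    height, width = len(labels), len(labels[0])
    sums, counts = {}, {}
    for y in range(height):
        for x in range(width):
            k = labels[y][x]
            acc = sums.setdefault(k, [0.0, 0.0, 0.0])
            for c in range(3):
                acc[c] += image[y][x][c]
            counts[k] = counts.get(k, 0) + 1
    mean = {k: tuple(v / counts[k] for v in sums[k]) for k in sums}

    # Two regions are adjacent if any right or lower neighbor pixel differs.
    adjacent = set()
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            for dy, dx in ((0, 1), (1, 0)):
                yy, xx = y + dy, x + dx
                if yy < height and xx < width and labels[yy][xx] != a:
                    b = labels[yy][xx]
                    adjacent.add((min(a, b), max(a, b)))

    # Assumed form: similar average colors give a high cost for cutting.
    energy = {}
    for a, b in adjacent:
        dist2 = sum((p - q) ** 2 for p, q in zip(mean[a], mean[b]))
        energy[(a, b)] = math.exp(-beta * dist2)
    return mean, energy


if __name__ == "__main__":
    image = [[(10, 10, 10)] * 2 + [(20, 20, 20)] * 2 for _ in range(2)]
    labels = [[0, 0, 1, 1], [0, 0, 1, 1]]
    print(region_graph(image, labels))
```

The resulting dictionary of adjacent energies has one entry per pair of neighboring regions, so the graph contains one node per small region instead of one node per pixel.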
In this method, the graph cut method is performed in units of small regions, the number of which is smaller than the number of pixels. Accordingly, memory usage is reduced and processing speed is increased, and the amount of calculation for the two types of energy is also reduced.