In the related art, systems are known that search video captured by a monitoring camera for an image including a captured image of a specific person. Such a system is used, for example, for searching for lost children and missing persons or for analyzing behavioral patterns of consumers by person tracking.
For example, a system disclosed in Japanese Laid-open Patent Publication No. 2009-199322 may search for an image including a captured image of a specific person from accumulated video. When the system records the video, it extracts feature information of a person's face and feature information of the person's clothes (a color histogram and the like) and stores such information in a database. Then, the system extracts, from the database, an image including a captured image of a person who is similar to a person in a query image.
Specifically, the system compares the feature information of a face and the feature information of clothes, which are stored in the database, with the feature information of the face and the feature information of the clothes, which are extracted from the query image, and searches the database for an image with similarity that is equal to or greater than a threshold. The system disclosed in Japanese Laid-open Patent Publication No. 2009-199322 includes a face feature extraction unit that extracts a face region and extracts features from the extracted face region, and a clothing feature extraction unit that extracts a clothing region and extracts features from the extracted clothing region.
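The threshold-based comparison described above can be illustrated with a minimal sketch. The cited publication is not quoted here as specifying a similarity measure or a way of combining the face and clothing scores, so the cosine similarity, the equal weighting, the record layout, and the names `cosine_similarity` and `search` are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(database, query, threshold=0.9):
    """Return (id, score) pairs whose combined similarity is >= threshold.

    Each record and the query are dicts with "face" and "clothes" feature
    vectors; this layout is hypothetical.
    """
    hits = []
    for record in database:
        face_sim = cosine_similarity(record["face"], query["face"])
        clothes_sim = cosine_similarity(record["clothes"], query["clothes"])
        # Assumption: face and clothing similarity are weighted equally.
        score = 0.5 * face_sim + 0.5 * clothes_sim
        if score >= threshold:
            hits.append((record["id"], score))
    # Best matches first.
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits
```

Because every record at or above the threshold is returned, the search tolerates features that do not coincide exactly with those of the query.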
Here, color information (a color histogram or the like) is typically used as feature information. This is because the quality of video captured by a monitoring camera is low, which makes detailed features difficult to recognize, so that determination based on colors is effective. Color features are more stable than other features and have the advantage that they are not easily affected by the facing direction of a person or by outside light. Therefore, an image including a captured image of a person in similar clothes is searched for from video that is captured by the monitoring camera by comparing the color information.
In a case of comparing color information of the “entire” clothing region of the person in the query image with color information of a clothing region of a person in an image that is registered in the database, there is a possibility that an image of a person in clothes different from those of the person in the query image is returned from the database as a search result. Specifically, there is a possibility that an image including a captured image of a person in a black jacket and white pants is returned from the database in response to a query image including a captured image of a person in a white jacket and black pants. This is because a certain range is permitted in the determination of similarity, so that an image with similarity that is equal to or greater than a threshold is returned even in cases other than a case in which two pieces of color information completely coincide with each other. In addition, color information of the entire clothing region, such as a color histogram, does not record where in the region each color appears.
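The white-jacket and black-pants example can be reproduced with a small sketch. The quantized RGB histogram, the histogram-intersection measure, the tiny synthetic regions, and all function names are assumptions chosen for illustration; they are not taken from the cited publications.

```python
def color_histogram(pixels, bins=4):
    """Normalized histogram over RGB triples quantized to bins**3 cells."""
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def outfit(upper, lower, rows_each=4, width=3):
    """Synthetic clothing region: rows of `upper` color above rows of `lower`."""
    top = [[upper] * width for _ in range(rows_each)]
    bottom = [[lower] * width for _ in range(rows_each)]
    return top + bottom


def flatten(region):
    """All pixels of a region, row by row."""
    return [pixel for row in region for pixel in row]


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
query = outfit(WHITE, BLACK)      # white jacket, black pants
candidate = outfit(BLACK, WHITE)  # black jacket, white pants

whole_similarity = histogram_intersection(
    color_histogram(flatten(query)),
    color_histogram(flatten(candidate)),
)
print(whole_similarity)  # 1.0: the two outfits are indistinguishable
```

In this synthetic case both regions are half white and half black, so the whole-region histograms are identical and any threshold is satisfied, even though the garments are swapped.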
Thus, there is a technique of dividing the clothing region and extracting feature information from each of the divided regions in order to further narrow down the search results. A system disclosed in International Publication Pamphlet No. 2011/046128 extracts a person region from video and then separates a clothing region of the person into a plurality of portions. Specifically, the system determines a discontinuity (separation position) in the clothing based on a variation in luminance in a longitudinal direction of the clothing region and extracts color information from each of an upper region above the separation position and a lower region below the separation position. Then, the extracted color information of each of the regions is accumulated in a database. The system receives a query text, for example, “a white jacket and blue pants”, as a query and searches for an image including a captured image of a person in clothes corresponding to the query text.
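The separation step can be sketched as follows. This is only an illustration under stated assumptions: the separation position is taken to be the row boundary with the largest jump in mean row luminance, the color information of each region is reduced to a mean color, and the query text is assumed to be already parsed into two color names. The names `find_separation`, `split_features`, `NAMED_COLORS`, and `matches` are hypothetical and do not come from the cited publication.

```python
def luminance(pixel):
    """Approximate luminance of an RGB pixel (Rec. 601 luma weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def find_separation(region):
    """Index of the first row of the lower region.

    `region` is a list of pixel rows, top to bottom, with at least two rows.
    Assumption: the separation lies where mean row luminance changes most.
    """
    row_lum = [sum(luminance(p) for p in row) / len(row) for row in region]
    jumps = [abs(row_lum[i + 1] - row_lum[i]) for i in range(len(row_lum) - 1)]
    return jumps.index(max(jumps)) + 1


def mean_color(rows):
    """Mean RGB color over all pixels in the given rows."""
    pixels = [p for row in rows for p in row]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def split_features(region):
    """Color information for the regions above and below the separation."""
    cut = find_separation(region)
    return {"upper": mean_color(region[:cut]), "lower": mean_color(region[cut:])}


# Hypothetical color vocabulary for matching a parsed query text.
NAMED_COLORS = {
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "blue": (0, 0, 255),
    "red": (255, 0, 0),
}


def nearest_name(color):
    """Name of the vocabulary color closest to `color` in RGB space."""
    return min(
        NAMED_COLORS,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(NAMED_COLORS[name], color)),
    )


def matches(features, upper_name, lower_name):
    """True if both regions map to the requested color names."""
    return (nearest_name(features["upper"]) == upper_name
            and nearest_name(features["lower"]) == lower_name)


# Synthetic region: five white rows (jacket) above three blue rows (pants).
region = [[(255, 255, 255)] * 3 for _ in range(5)] + [[(0, 0, 255)] * 3 for _ in range(3)]
features = split_features(region)
print(matches(features, "white", "blue"))  # True
print(matches(features, "blue", "white"))  # False
```

Because the upper and lower regions are compared separately, a swapped combination of the same two colors no longer satisfies the query.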