For example, to search for a desired information item in a collection of multimedia data items, such as images, an annotation technology has been developed in which meta data (e.g., an object name) is attached to the image of each object included in a multimedia data item.
For example, T. Malisiewicz and A. A. Efros, “Recognition by association via learning per-exemplar distances,” CVPR, 2008, discusses an annotation technology in which, using the result of recognizing the face images of a plurality of persons included in a still image, a tag indicating the name of a person is attached to each of the face images. The tag attached to the image of each object included in an image is determined on the basis of, for example, the similarity between the color or the shape of that object and the color or the shape prepared in advance for each of the objects that the user wants to recognize.
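The similarity-based tag determination described above can be illustrated by the following minimal sketch. It is not the method of the cited reference; the feature layout (mean color and aspect ratio), the reference values, the distance-based similarity measure, and the threshold are all assumptions chosen for illustration, as are the names `REFERENCE_FEATURES`, `similarity`, and `assign_tag`.

```python
import math

# Hypothetical reference features prepared for each object to be recognized:
# (mean red, mean green, mean blue, aspect ratio), each scaled to [0, 1].
REFERENCE_FEATURES = {
    "apple": (0.80, 0.10, 0.10, 0.50),
    "banana": (0.90, 0.85, 0.20, 0.15),
    "ball": (0.20, 0.30, 0.90, 0.50),
}


def similarity(a, b):
    """Map the Euclidean distance between two feature tuples to (0, 1].

    Identical features give 1.0; the value falls as the distance grows.
    """
    return 1.0 / (1.0 + math.dist(a, b))


def assign_tag(features, references=REFERENCE_FEATURES, threshold=0.8):
    """Return the tag of the most similar reference, or None.

    None is returned when no reference reaches the (assumed) threshold,
    so that an unknown object is left untagged rather than mislabeled.
    """
    best_tag, best_score = None, 0.0
    for tag, reference in references.items():
        score = similarity(features, reference)
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag if best_score >= threshold else None
```

In this sketch, a detected object whose features lie close to the prepared "apple" features receives the tag "apple", whereas an object that resembles none of the prepared objects receives no tag.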
In addition, Takayuki Baba and Tsuhan Chen (Cornell Univ.), “Object-Driven Image Group Annotation”, Proceedings of 2010 IEEE 17th International Conference on Image Processing (ICIP2010), pp. 2641-2644, Sep. 26-29, 2010, discusses a technology in which the scene of a still image is recognized on the basis of, for example, information regarding a combination of objects included in the still image. In this technology, information regarding the objects included in a plurality of images pre-classified by the user is compared against correspondence information, prepared by a user, that associates object combination information with meta data indicating the type of a scene, and the combination information that is the same as the object information is detected. Thereafter, the meta data indicating the type of the scene corresponding to the detected combination information is attached to each of the plurality of images.
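The lookup described above can be illustrated by the following minimal sketch. It is not the implementation of the cited reference; the table contents, the exact-match rule, and the names `SCENE_BY_COMBINATION` and `annotate_group` are assumptions made for illustration.

```python
# Hypothetical correspondence information prepared by a user:
# a combination of objects -> meta data indicating the type of a scene.
SCENE_BY_COMBINATION = {
    frozenset({"cake", "candle", "person"}): "birthday party",
    frozenset({"sand", "sea", "umbrella"}): "beach",
}


def annotate_group(image_objects, table=SCENE_BY_COMBINATION):
    """Attach one scene label to every image in a pre-classified group.

    image_objects maps an image identifier to the set of object names
    recognized in that image. The objects of all images in the group are
    merged, and the merged set is looked up in the table. An exact match
    is assumed here; when no combination matches, the label is None.
    """
    group_objects = frozenset().union(*image_objects.values())
    scene = table.get(group_objects)
    return {image_id: scene for image_id in image_objects}
```

For example, a group in which one image contains a cake and a person and another contains a candle yields the combination {cake, candle, person}, so both images receive the meta data "birthday party".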
Furthermore, Japanese Laid-open Patent Publication No. 2008-181515 discusses a technology for moving image data items, such as movies. In this technology, for a moving image data item that a user has separated into parts of predetermined time spans, a region including a partial image indicating a person or an object specified by the user is identified. Thereafter, meta data predetermined for the partial image is attached to the region including the partial image.