Conventionally, technology has been proposed that recognizes a person from a face area included in moving image data captured by a monitoring camera or the like, based on feature information of previously stored facial images, and that retrieves a facial image of a specific person. In this technology, feature information is extracted from a face area included in the moving image data, and facial images for which an index (similarity) indicating the resemblance between the extracted feature information and the previously stored feature information is high are retrieved from among the previously stored facial images and are output.
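The retrieval step described above can be illustrated with a minimal sketch. It assumes that feature information is a fixed-length numeric vector and that the similarity index is cosine similarity. The source specifies neither, and the names `cosine_similarity`, `retrieve_similar_faces`, `stored`, and `query` are hypothetical, introduced only for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors.

    Cosine similarity is an assumed stand-in for the similarity index;
    the source does not say which measure is used.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_similar_faces(query_feature, stored_features, top_k=3):
    """Rank stored faces by similarity to the query, highest first."""
    scored = [
        (face_id, cosine_similarity(query_feature, feature))
        for face_id, feature in stored_features.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical previously stored feature information, keyed by face ID.
stored = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
    "person_c": [1.0, 1.0, 0.0],
}

# Hypothetical feature vector extracted from a face area in the video.
query = [1.0, 0.1, 0.0]

ranked = retrieve_similar_faces(query, stored, top_k=2)
for face_id, similarity in ranked:
    print(f"{face_id}: {similarity:.3f}")
```

In this sketch the output is simply the `top_k` highest-scoring entries, so nothing in the result itself indicates how many of them an operator should check.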
However, with the above-described technology, it has been difficult to determine how far down the list of facial images retrieved as having high similarity needs to be confirmed. For example, when a condition such as the face direction of a person imaged by a monitoring camera is unfavorable for collation with the previously stored feature information, a facial image of a person different from the imaged person may be output as a higher-ranked result, and the facial image that should actually be retrieved may be ranked lower. Accordingly, if it is made easy to understand how far down the retrieved facial images are to be confirmed, overlooking can be prevented.