In recent years, it has become common practice to install security cameras in shops, downtown areas, housing complexes and the like, and to install drive recorders or the like in commercial vehicles. The number of cases in which video images are used as material evidence has increased. Currently, when a video image or a sound is used as evidence, a video tape, an image file or the like is submitted as is. However, when an image and a sound are digitized and stored, they can be easily altered and edited. Therefore, when a video image or a sound is used as evidence, third-party authentication such as a digital signature or a timestamp is necessary. Services and products that record the voices of telephone operators with timestamps are already on the market. Demand for such techniques is expected to increase in the future.
As a technique for detecting alteration performed by a third party, there is a method of dividing the contents of a digital document into a plurality of partial data, calculating summary information for each of the partial data, and adding a digital signature to the group of summary information. Here, the summary information is hash information calculated using a cryptographic one-way hash function and is also called a message digest. When this technique is applied to video image data, it is possible to ensure the originality of the video image data and to extract a portion of the digitally signed data while protecting privacy (for example, Japanese Laid-open Patent Publication No. 2008-178048).
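The per-part digest scheme described above can be illustrated with a minimal sketch. It is not the method of the cited publication: SHA-256 is assumed as the hash function, an HMAC stands in for a real digital signature (which would use an asymmetric key pair), and the names `digest_chunks`, `sign_digests` and `verify_extract` are hypothetical.

```python
import hashlib
import hmac

def digest_chunks(chunks):
    """Return the SHA-256 message digest of each partial datum."""
    return [hashlib.sha256(c).digest() for c in chunks]

def sign_digests(digests, key):
    # ASSUMPTION: HMAC-SHA256 over the concatenated digests is only a
    # stand-in for a digital signature over the group of digests.
    return hmac.new(key, b"".join(digests), hashlib.sha256).digest()

def verify_extract(index, chunk, digests, signature, key):
    """Check one disclosed chunk against the signed list of digests."""
    list_ok = hmac.compare_digest(sign_digests(digests, key), signature)
    chunk_ok = hmac.compare_digest(hashlib.sha256(chunk).digest(), digests[index])
    return list_ok and chunk_ok

# Hypothetical partial data, e.g. one entry per video segment.
chunks = [b"frame-0", b"frame-1", b"frame-2"]
key = b"demo-key"
digests = digest_chunks(chunks)
signature = sign_digests(digests, key)
```

Because only the digests are signed, a single chunk can be disclosed and checked with `verify_extract(1, chunks[1], digests, signature, key)` while the other chunks are withheld, which is how extraction and privacy protection can coexist in this kind of scheme.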
In addition, since video image data is large, there are various techniques for compressing it. One of these compression techniques is inter-frame prediction. For example, inter-frame prediction is applied to video image data so that the video image data is compressed into Moving Picture Experts Group-1 (MPEG-1) format. Video image data compressed in MPEG-1 format includes three types of images: I pictures, P pictures and B pictures. Each I picture holds all of the image data needed to display it as a video image. Each P picture holds the difference between the P picture and the I or P picture that precedes it. Each B picture holds the difference between the B picture and a P or I picture preceding it and the difference between the B picture and a P or I picture succeeding it. Since each P picture holds only the difference between the current image and the previous image, and each B picture holds only the differences between the current image and the images preceding and succeeding it, the data can be compressed at a high compression ratio.
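The idea of storing differences instead of full images can be shown with a toy sketch. It is not MPEG-1: real encoders use motion compensation, transform coding and lossy quantization, and B pictures are omitted here. Frames are assumed to be short lists of pixel values, and `encode`, `decode` and the `gop` parameter are hypothetical names.

```python
def encode(frames, gop=3):
    """Encode frames as ('I', pixels) or ('P', difference from previous frame)."""
    encoded = []
    for i, frame in enumerate(frames):
        if i % gop == 0:
            # I picture: keeps the complete image.
            encoded.append(("I", list(frame)))
        else:
            # P picture: keeps only the change from the preceding frame.
            prev = frames[i - 1]
            encoded.append(("P", [c - p for c, p in zip(frame, prev)]))
    return encoded

def decode(encoded):
    """Rebuild every frame; a P picture needs the frame decoded before it."""
    frames = []
    for kind, data in encoded:
        if kind == "I":
            frames.append(list(data))
        else:
            prev = frames[-1]
            frames.append([p + d for p, d in zip(prev, data)])
    return frames

frames = [[10, 10, 10, 10], [10, 11, 10, 10], [10, 11, 12, 10], [50, 50, 50, 50]]
encoded = encode(frames)
```

In this sketch the P pictures are mostly zeros, which is what makes them cheap to store, and `decode(encoded)` reproduces the original frames only by working forward from each I picture.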
Decompressing video image data compressed by the inter-frame prediction technique requires a large amount of processing. To address this, Japanese Laid-open Patent Publication No. 2006-74690 discloses a technique for extracting frames (I pictures) and encoding them into still images on a frame basis, so that video image data can be reproduced quickly.
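As a rough illustration of why extracting I pictures speeds up reproduction, the sketch below indexes the I pictures of a stream and finds the one from which a given frame could be reached. It is not the method of the cited publication: the stream is assumed to be a list of `(kind, data)` tuples, the still-image encoding step is left out, and `index_i_pictures` and `nearest_i_picture` are hypothetical names.

```python
from bisect import bisect_right

def index_i_pictures(stream):
    """Return the positions of I pictures in a list of (kind, data) tuples."""
    return [pos for pos, (kind, _) in enumerate(stream) if kind == "I"]

def nearest_i_picture(i_positions, target):
    """Return the position of the last I picture at or before target."""
    k = bisect_right(i_positions, target)
    if k == 0:
        raise ValueError("no I picture at or before target")
    return i_positions[k - 1]

# Hypothetical stream: only the I pictures can be shown without other frames.
stream = [("I", b"a"), ("P", b"b"), ("B", b"c"), ("I", b"d"), ("P", b"e")]
i_positions = index_i_pictures(stream)
```

Since an I picture is self-contained, a player that shows only the indexed positions can skip the difference decoding that P and B pictures require.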