The success of digital imaging and video has led to wide use of this technology in many fields of everyday life. Technology to edit, alter or modify digital images or video sequences is commercially available and allows the contents of said images or videos to be modified without leaving traces. For a variety of applications, such as evidential imaging in law enforcement (e.g. from security cameras), medical documentation, damage assessment for insurance purposes, etc., it is necessary to ensure that an image or video has not been modified and is congruent with the image or video originally taken. This has led to the development of signal authentication systems, an example of which is shown in FIG. 1, wherein a signature is created at 1.20 for an audio-visual signal, such as an image or video, which is acquired at 1.10. The signature is embedded, e.g. as a watermark, into the signal at 1.30. Thereafter the signal is processed or tampered with at 1.40, played, recorded or extracted at 1.50, and finally verified at 1.60 in order either to prove the authenticity of the signal or to reveal modifications of the signal.
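By way of illustration only, the chain of FIG. 1 may be sketched as follows. The sketch assumes an unkeyed SHA-256 hash as the signature and least-significant-bit (LSB) substitution as the watermark; neither choice is taken from this document, all function names are hypothetical, and a practical system would use a keyed or public-key signature so that a tamperer cannot simply recompute it.

```python
import hashlib

SIG_BITS = 256  # length of a SHA-256 digest in bits (assumed signature size)


def create_signature(frame: bytes) -> bytes:
    """Step 1.20: hash the frame, ignoring the LSBs that will carry the watermark."""
    carrier = bytes(p & 0xFE for p in frame[:SIG_BITS])  # LSBs of carrier pixels masked out
    return hashlib.sha256(carrier + frame[SIG_BITS:]).digest()


def embed_watermark(frame: bytes, signature: bytes) -> bytes:
    """Step 1.30: write the signature bits into the LSBs of the first SIG_BITS pixels."""
    if len(frame) < SIG_BITS:
        raise ValueError("frame too small to carry the signature")
    out = bytearray(frame)
    for i in range(SIG_BITS):
        bit = (signature[i // 8] >> (7 - i % 8)) & 1
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_watermark(frame: bytes) -> bytes:
    """Step 1.50: read the signature bits back from the carrier LSBs."""
    sig = bytearray(SIG_BITS // 8)
    for i in range(SIG_BITS):
        sig[i // 8] |= (frame[i] & 1) << (7 - i % 8)
    return bytes(sig)


def verify(frame: bytes) -> bool:
    """Step 1.60: the frame is accepted if the embedded and recomputed signatures agree."""
    return extract_watermark(frame) == create_signature(frame)


# Step 1.10: a synthetic 1024-pixel, 8-bit frame stands in for an acquired image.
frame = bytes((i * 7) % 256 for i in range(1024))
signed = embed_watermark(frame, create_signature(frame))

# Step 1.40: tampering with a single pixel of the signed frame.
tampered = bytearray(signed)
tampered[500] ^= 0x10

print(verify(signed))           # True
print(verify(bytes(tampered)))  # False
```

Because the signature is computed with the carrier LSBs masked out, embedding the watermark does not itself invalidate the signature.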
Embedding data into a video signal is known from U.S. Pat. No. 6,211,919, wherein an analogue video signal is converted to a digital video signal into which data is embedded and which is then converted back to an analogue video signal. Error correction across frames is implemented in order to compensate for transmission losses. The solution disclosed therein is technically complex and requires large buffer memories for storing an entire frame or several frames of the video signal. These memories are expensive, and it is therefore desirable to minimize the amount of memory needed.
Furthermore, especially for the above-mentioned signature-based authentication applications, it is important that each video frame be capable of authenticating itself, because not all frames of a sequence are necessarily stored. In the above-mentioned security camera application, for example, only every fiftieth frame may be stored; likewise, in medical imaging, perhaps only a subset of the images is retained. In general it is not known in advance which frames will be recorded and which will be discarded. Consequently, all information required to authenticate a given frame of a video sequence must be available in, and derivable from, the frame itself. This is not possible when authentication of a frame depends on preceding or subsequent frames, as in the above document.
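The effect of an inter-frame dependency can be illustrated by a minimal sketch. It assumes SHA-256 tags that are stored alongside the frames rather than embedded in them, and it uses a hypothetical chained scheme merely as a stand-in for any dependency on neighbouring frames; it is not the scheme of the above document, and all names are illustrative.

```python
import hashlib


def frame_tag(frame: bytes) -> bytes:
    """Self-contained tag: depends on this frame only."""
    return hashlib.sha256(frame).digest()


def chained_tags(frames):
    """Hypothetical dependent scheme: each tag also covers the previous frame's tag."""
    tags, prev = [], b""
    for f in frames:
        prev = hashlib.sha256(prev + f).digest()
        tags.append(prev)
    return tags


# 200 synthetic frames stand in for a video sequence.
frames = [bytes([n]) * 64 for n in range(200)]
self_tags = [frame_tag(f) for f in frames]
chain = chained_tags(frames)

# The recorder retains only every fiftieth frame (and its tag).
kept = list(range(0, len(frames), 50))

# Self-contained tags: every retained frame verifies on its own.
self_ok = all(frame_tag(frames[i]) == self_tags[i] for i in kept)

# Chained tags: the verifier can only recompute the chain over the retained
# frames, so every frame after the first no longer matches its stored tag.
recomputed = chained_tags([frames[i] for i in kept])
chain_ok = [r == chain[i] for r, i in zip(recomputed, kept)]

print(self_ok)   # True
print(chain_ok)  # [True, False, False, False]
```

Only the first retained frame survives in the chained case, because the tags of the discarded frames on which the later tags depend are no longer available to the verifier.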
The signature calculation and embedding have to take place as soon as possible after the generation of the video signal in order to prevent the video from being tampered with before authentication information is stored in it. It is therefore an advantage if the signature calculation and embedding are located close to the image capturing device, e.g. inside a security camera, and take place in real time on the video stream generated. Today's solutions, as disclosed in the above document, are technically complicated and expensive.
Finally, according to the prior art, in order to embed the signature bits calculated at 1.20 for an audio-visual signal, such as a digital image, inside the audio-visual signal itself as a watermark at 1.30, an entire frame of the audio-visual signal has to be buffered in a large memory while the signature bits for the frame are calculated, the watermark having the signature bits as a payload is constructed, and finally said watermark is embedded inside said frame. The amount of memory needed renders such solutions expensive.
Thus, the problem to be solved by the invention is defined as how to provide low-cost real-time generation of an audio-visual signal with self-authenticating frames.