In recent years there has been a proliferation of video devices capturing images of a scene. The captured images may be still images or video images, and they may or may not be in the visible spectrum. Some of these devices are for private use, such as webcams; others are for security purposes, such as surveillance cameras. In many cases these images are stored for later use. One example would be the use of images in a court of law. For legal purposes it is crucial to maintain a chain of custody over video evidence and to be able to verify that the evidence has not been tampered with since it was captured. A failure in the chain of custody may lead to the inadmissibility of the video evidence.
In a case of, for example, child abduction from a shopping centre, where in the subsequent trial a surveillance video plays a crucial part in the conviction of the offender, it is vital that there be no possibility left open for the defence to claim a compromise in the video evidence itself. Currently, there is no quantitative and impartial means of establishing video integrity. This is particularly relevant because of the recent proliferation of video surveillance. The ubiquitous presence of security cameras and the sheer volume of video and image data make it both more important and more difficult to provide an audit trail that establishes whether or not the video evidence is reliable.
Currently digital image, audio, video and related media are vulnerable to theft, misuse or manipulation. A way to guard against this is watermarking technology. One of the first references to using a digital watermark for security is found in A. Z. Tirkel, G. A. Rankin, R. M. Van Schyndel, W. J. Ho, N. R. A. Mee, C. F. Osborne, “Electronic Water Mark”, DICTA 93, Macquarie University, pp. 666-673, (1993). This paper introduced the concept of a spread spectrum technique to embed and recover hidden messages in images. This paper was based on a seminar presented by A. Z. Tirkel at the Department of Mathematics, Melbourne University, on Feb. 22, 1993.
The “hidden message” has become known as a digital watermark, which is defined as an embedded message, difficult to detect, but which can be recovered from the watermarked image, without access to the original unwatermarked image, by using correlation (or other) techniques and a template of the watermark. The known techniques involve the addition of a binary number sequence carrying a hidden message to pixels in a still image. The message can be embedded within the cyclic shift of the sequence. This message is recovered by correlation with all cyclic shifts of a reference sequence. The sequence used has the unique property that the correlation is high for only the zero cyclic shift.
Throughout this application, it is assumed that the term correlation denotes the dot product of a sequence or array with a complex conjugate of another (or the same) sequence or array. Where the same sequence or array is involved, it is called autocorrelation, otherwise it is called cross-correlation. Either of the sequences or arrays may be subjected to shifts. It is also assumed that all shifts are cyclic, or periodic in all the dimensions of the array. In this context, correlation is a measure of similarity, with a high value indicating greater similarity.
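The shift-keyed embedding and correlation-based recovery described above can be illustrated with a minimal sketch. The length-15 m-sequence, the flat host row of pixel value 128, the unit embedding amplitude, and all function names below are assumptions made for illustration only; they are not taken from the cited references. Because the sequence is real-valued, the complex conjugate in the definition of correlation has no effect here.

```python
# Illustrative sketch only: a length-15 m-sequence and shift-keyed embedding.
# The recurrence, the flat host row, and every name here are assumptions
# chosen for this example, not details from the cited references.

def m_sequence_15():
    """One period of the binary m-sequence a[k+4] = a[k+1] XOR a[k]
    (characteristic polynomial x^4 + x + 1), mapped to +1/-1."""
    bits = [1, 1, 1, 1]                     # any non-zero seed works
    while len(bits) < 15:
        bits.append(bits[-3] ^ bits[-4])    # a[n] = a[n-3] XOR a[n-4]
    return [1 - 2 * b for b in bits]        # bit 0 -> +1, bit 1 -> -1

def cyclic_shift(seq, k):
    """Return seq delayed cyclically by k positions."""
    n = len(seq)
    return [seq[(i - k) % n] for i in range(n)]

def correlation(a, b):
    """Dot product of two real sequences of equal length."""
    return sum(x * y for x, y in zip(a, b))

def embed(host_row, seq, message):
    """Add the sequence, cyclically shifted by `message`, to a row of pixels."""
    return [h + w for h, w in zip(host_row, cyclic_shift(seq, message))]

def recover(marked_row, seq):
    """Correlate against every cyclic shift; the best-matching shift is the message."""
    scores = [correlation(marked_row, cyclic_shift(seq, k))
              for k in range(len(seq))]
    return scores.index(max(scores))

seq = m_sequence_15()
# Periodic autocorrelation: 15 at zero shift and -1 at every other shift.
autocorrelation = [correlation(seq, cyclic_shift(seq, k)) for k in range(15)]

host_row = [128] * 15                       # assumed flat host, for clarity
marked_row = embed(host_row, seq, message=9)
recovered = recover(marked_row, seq)        # the embedded shift, 9
```

Note that recovery uses only the marked row and the reference sequence, not the original unwatermarked row. A real image row is not flat, so its own correlation with the reference sequence acts as interference; practical schemes would need filtering or a stronger embedding, which this sketch does not attempt.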
Since 1993, the area of digital watermarking has undergone an explosion in activity. See, for instance, A. Z. Tirkel, R. G. van Schyndel, C. F. Osborne, “A Two-Dimensional Digital Watermark”, DICTA'95, University of Queensland, Brisbane, pp. 378-383, (1995). Digital watermarks have been applied to still images, audio, video, text, sheet music, etc. Watermarking techniques have been used to provide copyright protection, access control, audit trails, traitor tracing, certificates of authenticity, etc. Watermark embedding and recovery techniques have been studied extensively and have been tailored to exploit the masking effects of the human visual system and human auditory system. Almost all of these advances have occurred in the applications domain. Major advances have occurred in protecting watermarks against unintentional distortions (compression, cropping, geometrical effects, etc.) and against deliberate cryptographic attack. New forms of attack have emerged as a result of these advances.
By contrast, the generators or sequences used to carry the message have not changed significantly. As a consequence, watermarks can benefit significantly by using families of sequences or arrays with good auto and cross-correlation. This is because multiple sets of such sequences or arrays can be embedded as composite watermarks. Such composite watermarks have three significant advantages: they are more secure against cryptographic attack, they can carry more information, and where the watermarks are used as fingerprints, composite watermarks can have immunity to collusion attack.
One popular watermarking technique that has been developed uses a statistical method to generate the watermark patterns, employing a random number generator or a noisy physical process. It is simple and effective, easy to implement, and can be made resistant to standard compression methods. Its weakness is that it cannot specify a probability that the watermarks generated by this process are “unique”, or at least sufficiently dissimilar, so as never to be confused. This is not a problem for proof of ownership or copyright applications, where there are few watermarks needed, and many recipients of the media receive the same watermark. This is not true for video surveillance cameras, nor for audit trail applications, where a large number of watermarks are required. It should be noted that the statistical method can be adapted, so that any similar watermarks are “filtered out”. However, this only applies to a single node of watermarking, and is difficult or impossible to implement in a distributed watermarking system, such as a network of surveillance cameras.
By contrast, the watermark method developed by Tirkel et al., mentioned above, is based on an algebraic construction. Originally, it used m-sequences to embed watermark information line by line in an image. It was primitive, difficult to implement, and difficult to make resistant to compression and attack. It also suffered from visibility problems, because each watermark was embedded in a small portion of the image: a single line. However, it was free from the weakness of other methods, in that the probability of missed or mistaken detection could be specified for a set of watermarks generated using this method.
While many video watermarking solutions have been proposed, few of them are appropriate for hardware implementation. In addition, most are implemented as post-processing steps after the initial video was obtained. This means that an unwatermarked version of the image or data already exists, and that constitutes a security vulnerability.
U.S. Pat. No. 6,625,295 teaches that two and three dimensional arrays are necessary or desirable. The examples in U.S. Pat. No. 6,625,295 are all based on one dimensional binary m-sequences folded into two dimensional patterns, using the Chinese Remainder Theorem. These binary m-sequences are restricted to lengths of 2^n−1, and the arrays they yield are limited by pairs of relatively prime factors of 2^n−1. In addition, sets of m-sequences with good cross-correlation (maximal connected sets) [A. Z. Tirkel, C. F. Osborne, N. Mee, G. A. Rankin, A. McAndrew, “Maximal Connected Sets—Application to CDMA”. International Journal of Digital and Analog Communication Systems 1994, vol. 7. p. 29-32.] are small, so that very few arrays built from them can be overlaid before they interfere with each other. Embedding multiple arrays in one image or other media is desirable, because it increases the information payload, or cryptographic security of the watermark. It is possible to substitute Gold Codes or Kasami sequences to overcome the mutual interference problem, but the sizes and aspect ratios are restricted, just as in the case of the m-sequences. The sequence folding technique can be extended to three or more dimensions, but the restrictions on sizes and aspect ratios become worse or untenable. U.S. Pat. No. 6,625,295 introduces the concept of three dimensional cross-correlation, without indicating how three-dimensional arrays are to be constructed.
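The folding construction and its size restriction can be illustrated with a short sketch. It assumes the common Chinese Remainder Theorem indexing in which element i of the sequence is placed at row i mod R and column i mod C, with R and C relatively prime. The particular length-15 m-sequence and all function names are hypothetical choices for this example, not details taken from U.S. Pat. No. 6,625,295.

```python
# Illustrative sketch only: folding a length-15 m-sequence into a 3 x 5 array.
# The indexing rule (i mod rows, i mod cols) and all names are assumptions.
from math import gcd, isqrt

def m_sequence_15():
    """One period of the m-sequence a[k+4] = a[k+1] XOR a[k], mapped to +1/-1."""
    bits = [1, 1, 1, 1]
    while len(bits) < 15:
        bits.append(bits[-3] ^ bits[-4])
    return [1 - 2 * b for b in bits]

def coprime_factor_pairs(length):
    """Array sizes (rows, cols) with rows * cols == length and gcd(rows, cols) == 1."""
    return [(a, length // a) for a in range(2, isqrt(length) + 1)
            if length % a == 0 and gcd(a, length // a) == 1]

def fold(seq, rows, cols):
    """Place seq[i] at (i mod rows, i mod cols); valid only for coprime sizes."""
    if rows * cols != len(seq) or gcd(rows, cols) != 1:
        raise ValueError("rows and cols must be coprime factors of len(seq)")
    array = [[0] * cols for _ in range(rows)]
    for i, value in enumerate(seq):
        array[i % rows][i % cols] = value
    return array

def autocorrelation_2d(array, dr, dc):
    """Periodic autocorrelation of a real array at cyclic shift (dr, dc)."""
    rows, cols = len(array), len(array[0])
    return sum(array[r][c] * array[(r + dr) % rows][(c + dc) % cols]
               for r in range(rows) for c in range(cols))

array = fold(m_sequence_15(), 3, 5)
peak = autocorrelation_2d(array, 0, 0)
off_peak = {autocorrelation_2d(array, dr, dc)
            for dr in range(3) for dc in range(5) if (dr, dc) != (0, 0)}

# Available array sizes for n = 4, 5, 6 and 8 (lengths 15, 31, 63, 255).
sizes = {length: coprime_factor_pairs(length) for length in (15, 31, 63, 255)}
```

Under this indexing, every two-dimensional cyclic shift corresponds to exactly one cyclic shift of the underlying sequence, so the folded array inherits the two-valued autocorrelation: a peak of 15 and −1 at every other shift. The `sizes` dictionary shows the restriction noted above: length 15 admits only 3 × 5, length 63 only 7 × 9, and length 31, being prime, cannot be folded at all.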
A similar situation arises in video watermarking, where two-dimensional images (frames) are arranged in time (the third dimension). The pioneering work by Mobasseri [B. G. Mobasseri, “Direct Sequence Watermarking of Digital Video using m-frames”, ICIP (2), pp. 399-403, (1998)] uses an m-sequence to select specific frames, which are then watermarked using the same, or another m-sequence. This method has the shortcoming that the unmarked frames are vulnerable to attack, and the ordering of frames can also be tampered with. A watermark in three or more dimensions would address these issues. A multidimensional watermark is also relevant when video is accompanied by audio: it can be used to detect tampering in the video, in the audio, or in the synchronization between the video and audio streams. Tampering with synchronization is easy to achieve and has been responsible for evidence being ruled inadmissible in a court of law.
A recent high-profile leak of sensitive information through WikiLeaks compromised military security and caused political havoc around the world. The material accessed could have been in any format: image, audio, video or metadata. Had the data been watermarked uniquely upon access, the source of the leak could have been identified immediately.
Therefore, new constructions of families of multi-dimensional arrays with desirable properties are essential. These properties are: low off-peak autocorrelation, low cross-correlation, balance, large family size, and availability in a variety of suitable sizes. A desirable, but not essential, property is that the arrays be binary. Watermarks can be designed for single-user or multi-user applications.