In the field of video and audio signal processing, there are a number of applications where a media signal processor has to perform temporal synchronization of the media signal. Temporal synchronization generally refers to a process for finding a reference coordinate system in the time domain of the signal. Typically, video signals consist of a stream of image frames at discrete time intervals. In many applications, however, a media processor may receive a collection of frames without knowing the temporal coordinate system for the frames. For an arbitrary frame, the time point of that frame within a given program, scene or other semantic grouping is often unknown. Frames may be lost or inserted due to transmission errors in the communication channel or due to intentional attacks on the stream. Further, the sampling rate may vary due to speed changes, bandwidth fluctuations, or format conversions. This lack of temporal synchronization can lead to improper interpretation and handling of the video.
One specific example where temporal synchronization plays a role is video digital watermarking, where auxiliary data is hidden within a video stream and extracted without the original video signal. When it receives a sequence of frames starting at an arbitrary point in a given video stream, the digital watermark detector is often unable to extract the auxiliary data accurately without establishing temporal coordinates relative to the embedding coordinates. A similar synchronization problem can occur in the process of interpreting video transmitted in a network protocol over a network communication channel, or in the process of interpreting video in a compressed format received over a transmission channel or read from storage.
To understand temporal synchronization, it is useful to consider an example application such as digital watermarking. Digital watermarking is a process for modifying physical or electronic media to embed a hidden machine-readable code into the media. The media may be modified such that the embedded code is imperceptible or nearly imperceptible to the user, yet may be detected through an automated detection process. Most commonly, digital watermarking is applied to media signals such as images, audio signals, and video signals. Temporal synchronization is relevant to digital watermarking applications for media signals with a temporal component such as video and audio. For the sake of illustration, we focus on video signals, but our techniques are applicable to other temporal signals, such as audio. Our examples can be extended to audio by considering temporal blocks of audio as frames.
Digital watermarking systems typically have two primary components: an encoder that embeds the watermark in a host media signal, and a decoder that detects and reads the embedded watermark from a signal suspected of containing a watermark (a suspect signal). The encoder embeds a watermark by subtly altering the host media signal such that the watermark is imperceptible or nearly imperceptible to a human, yet automatically detectable with appropriate knowledge of the embedding function. The decoder analyzes a suspect signal to detect whether a watermark is present. In applications where the watermark encodes information, the decoder extracts this information from the detected watermark. The embedding and reading functions employ parameters, commonly referred to as a key or keys, that identify the attributes of the host signal that are changed to embed a watermark signal and that define how those attributes are to be interpreted to carry hidden message symbols. Several particular digital watermarking techniques have been developed for signals with a temporal component. The reader is presumed to be familiar with the literature in this field. Particular techniques for embedding and detecting imperceptible watermarks in media signals are detailed in the assignee's U.S. Pat. No. 6,122,403, which is herein incorporated by reference.
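The role of the key in the embedding and reading functions can be illustrated with a minimal sketch. It uses a generic additive, correlation-based watermark chosen only for illustration; it is not the technique of the patent cited above. The names `pattern`, `embed`, and `detect`, the strength and threshold values, and the use of an integer seed as the key are all assumptions.

```python
import random

def pattern(key, n):
    """Pseudo-random +/-1 watermark pattern derived from the key (assumed scheme)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(host, key, strength=2.0):
    """Encoder: add a low-amplitude keyed pattern to the host samples."""
    return [h + strength * w for h, w in zip(host, pattern(key, len(host)))]

def detect(suspect, key, threshold=1.0):
    """Decoder: correlate the suspect signal with the keyed pattern."""
    mean = sum(suspect) / len(suspect)
    p = pattern(key, len(suspect))
    corr = sum((s - mean) * w for s, w in zip(suspect, p)) / len(suspect)
    return corr > threshold

# Illustrative host "frame": 4096 noisy luminance-like samples.
host_rng = random.Random(0)
host = [host_rng.gauss(128.0, 10.0) for _ in range(4096)]
marked = embed(host, key=42)
```

In this sketch the decoder finds the watermark only when it correlates with the pattern generated from the embedding key, which is why a detector that does not know which key applies to a given frame must first synchronize.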
The challenge of temporal synchronization in video can be viewed as an analysis of a suspect video stream to determine its temporal coordinate system. For many applications, it is sufficient that the coordinate system is defined relative to neighboring sets of frames as opposed to some absolute reference point. One way to address the challenge is to search for the coordinate system, or some attribute of the hidden auxiliary data that indicates the coordinate system. In video watermarking, the watermark signal in each frame may be independent of the watermark in every other frame. For example, the key or embedding function may change with each frame. While this makes the watermark difficult to detect by unauthorized processes or attackers, it tends to complicate the synchronization process.
Alternatively, the watermark may be time invariant. For example, the watermark may be the same in every frame. While this degree of redundancy greatly reduces the search space for the watermark, it also simplifies unauthorized detection. Defined generally, temporal redundancy in watermarking refers to the degree to which the watermark signal can be deduced given the history of the watermark signal. As temporal redundancy increases, the watermark is easier to deduce by a detector and attacker alike. Preferably, there should be an appropriate balance between ease of synchronization and security in watermarking.
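The trade-off can be made concrete with a small, idealized simulation of an attacker who averages frames to estimate the watermark. This is a sketch under strong assumptions that are not part of this disclosure: the host frames are modeled as independent zero-mean noise (real video frames are correlated), the watermark is an additive +/-1 pattern, and all names and constants are hypothetical.

```python
import random

def pattern(seed, n):
    """Pseudo-random +/-1 pattern (assumed watermark model)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def estimate_by_averaging(frames):
    """Attacker's estimate: average co-located samples across frames."""
    count = len(frames)
    return [sum(f[i] for f in frames) / count for i in range(len(frames[0]))]

def correlation(a, b):
    return sum(x * y for x, y in zip(a, b)) / len(a)

N, FRAMES = 1024, 200
host_rng = random.Random(1)

def host_frame():
    # Assumption: independent zero-mean noise stands in for video content.
    return [host_rng.gauss(0.0, 10.0) for _ in range(N)]

fixed = pattern(7, N)
# Time-invariant watermark: the same pattern in every frame.
invariant = [[h + w for h, w in zip(host_frame(), fixed)] for _ in range(FRAMES)]
# Frame-varying watermark: a different pattern (key) in every frame.
varying = [[h + w for h, w in zip(host_frame(), pattern(100 + t, N))]
           for t in range(FRAMES)]

corr_invariant = correlation(estimate_by_averaging(invariant), fixed)
corr_varying = correlation(estimate_by_averaging(varying), pattern(100, N))
```

Under these assumptions, averaging reinforces the time-invariant pattern while the host averages out, so `corr_invariant` comes out close to 1, whereas `corr_varying` stays close to 0 because the per-frame patterns do not reinforce one another.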
This disclosure provides a protocol for temporal synchronization of media signals with temporal components. While the disclosure focuses on temporal synchronization of video in a watermark application, the protocol is applicable to other applications and media types.
The synchronization protocol achieves initial synchronization by finding an initial synchronization key through analysis of a temporal media signal stream. It then uses features of the stream and a queue of one or more keys from previous frames to derive subsequent keys to maintain synchronization. If synchronization is lost due to channel errors or attacks, for example, the protocol uses the initial synchronization key to re-establish synchronization. In digital watermarking applications, the synchronization protocol is agnostic to the digital watermark embedding and reading functions.
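The key-tracking behavior summarized above can be sketched as follows. This is an illustrative model, not the implementation described in the detailed description. It assumes that the embedder restarts from the initial key every `PERIOD` frames, that a truncated hash stands in for a robust frame feature, and that `detects` stands in for whatever watermark reader is used. All names and constants are hypothetical.

```python
import hashlib
from collections import deque

INITIAL_KEY = b"initial-sync-key"  # hypothetical initial synchronization key
QUEUE_LEN = 2                      # hypothetical number of previous keys kept
PERIOD = 4                         # assumption: embedder restarts every PERIOD frames

def frame_feature(content):
    """Stand-in for a robust frame feature (a plain hash is NOT robust)."""
    return hashlib.sha256(content).digest()[:4]

def next_key(queue, feature):
    """Derive the next key from the queued previous keys and a frame feature."""
    h = hashlib.sha256()
    for key in queue:
        h.update(key)
    h.update(feature)
    return h.digest()[:8]

def embed_stream(contents):
    """Embedder side: pair each frame with the key used to watermark it."""
    out, queue = [], None
    for i, content in enumerate(contents):
        if i % PERIOD == 0:
            queue = deque([INITIAL_KEY], maxlen=QUEUE_LEN)
        out.append((content, queue[-1]))
        queue.append(next_key(queue, frame_feature(content)))
    return out

def detects(frame, key):
    """Stand-in for a watermark reader: true if the frame was marked with key."""
    return frame[1] == key

def track(stream):
    """Detector side: return, per received frame, whether it is synchronized."""
    queue, status = None, []
    for frame in stream:
        if queue is not None and detects(frame, queue[-1]):
            status.append(True)                 # expected key found: stay in sync
        elif detects(frame, INITIAL_KEY):
            queue = deque([INITIAL_KEY], maxlen=QUEUE_LEN)
            status.append(True)                 # (re)synchronized on the initial key
        else:
            queue = None
            status.append(False)                # lost: wait for the initial key
            continue
        queue.append(next_key(queue, frame_feature(frame[0])))
    return status

contents = [bytes([i]) * 4 for i in range(12)]
received = embed_stream(contents)
del received[5]                                 # simulate a dropped frame
status = track(received)
```

In this model, the two frames received after the dropped frame fail to verify, and tracking resumes when the detector next encounters a frame marked with the initial key. Because `track` only calls `detects`, the sketch does not depend on how the watermark itself is embedded or read.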
Further features of the synchronization protocol will become apparent with reference to the following detailed description and accompanying drawings.