Video image sequences can be encoded or compressed in many different formats, with one of the most common compression schemes being MPEG (Moving Picture Experts Group). A typical MPEG data stream consists of a sequence of compressed images with appropriate header information. The compressed images may be of three kinds: (1) I-Pictures or Intra-coded images, which are compressed using spatial compression of the present image alone. I-Pictures may therefore be completely de-compressed (or reconstructed) without reference to any other image. (2) P-Pictures or Predictive-coded images, which use all or part of a previous image as a reference. Compression of these images exploits temporal redundancy between the reference image and the present image to achieve higher compression. However, a P-Picture requires its reference image in order to be de-compressed. (3) B-Pictures or Bi-directionally predictive-coded images. These images can draw on more than one reference image to exploit temporal redundancy and so can be compressed more efficiently.
A sequence of images or pictures is typically divided into Groups of Pictures (referred to as GOPs), with each GOP typically consisting of a sequence of I, P, and B pictures. A typical picture sequence might consist of I, B, B, P, B, B, P, B, B, P, B, B, P, B etc. To limit error propagation and facilitate editing functions, each GOP typically has a duration of approximately 0.5 seconds.
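The GOP structure described above can be illustrated with a minimal sketch. It assumes, for illustration only, that each picture is represented by a single letter (I, P, or B) and that a new GOP begins at every I-Picture; the function names and the 24 pictures-per-second frame rate are hypothetical and do not come from the MPEG standards.

```python
def split_into_gops(picture_types):
    """Split a sequence of picture-type letters into GOPs.

    Assumption for this sketch: a new GOP starts at every I-Picture.
    """
    gops = []
    for picture_type in picture_types:
        if picture_type not in ("I", "P", "B"):
            raise ValueError(f"unknown picture type: {picture_type!r}")
        # Open a new GOP at each I-Picture (or at the very first picture).
        if picture_type == "I" or not gops:
            gops.append([])
        gops[-1].append(picture_type)
    return gops


def gop_duration_seconds(gop, frame_rate):
    """Duration of one GOP at the given frame rate (pictures per second)."""
    return len(gop) / frame_rate


if __name__ == "__main__":
    stream = list("IBBPBBPBBPBB" * 2)
    for gop in split_into_gops(stream):
        print("".join(gop), gop_duration_seconds(gop, 24.0))
```

Under these assumptions, a 12-picture GOP at 24 pictures per second lasts 0.5 seconds, which matches the approximate GOP duration mentioned above.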
It is unusual to have two P-Pictures in succession in such an MPEG sequence, as this often does not yield optimum compression. However, such sequences are equally valid within the MPEG standards. A scheme to encode additional information into an MPEG data stream based upon varied sequencing of the I, P, and B Pictures within a GOP has been proposed by Philips. This scheme is commonly referred to as PTY (i.e. picture-type). The scheme requires the encoder to sometimes make the unusual choice of encoding two consecutive P-Pictures and uses this choice as the basis for an alphabet. Such embedded data could be used, for example, to carry the copyright status of the video content. The embedded data is tamper resistant in that altering or corrupting it requires modifying the I, P, B sequence of the GOP. This cannot be readily accomplished without de-compressing and re-compressing at least some of the video sequence, which represents a significant barrier to tampering.
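The idea of using consecutive P-Pictures as an alphabet can be sketched as follows. The actual PTY alphabet is not specified here, so the mapping below is purely hypothetical: one bit per GOP, where a GOP containing two consecutive P-Pictures stands for 1 and a regular GOP stands for 0. The two GOP patterns and both function names are assumptions made for illustration.

```python
# Hypothetical one-bit-per-GOP alphabet (NOT the actual PTY alphabet).
REGULAR_GOP = "IBBPBBPBBPBB"  # no consecutive P-Pictures: carries bit 0
MARKED_GOP = "IBBPPBBPBBPB"   # contains "PP": carries bit 1


def encode_bits(bits):
    """Return the picture-type sequence an encoder would be asked to follow."""
    return "".join(MARKED_GOP if bit else REGULAR_GOP for bit in bits)


def decode_bits(picture_types):
    """Recover the bits by checking each GOP for two consecutive P-Pictures.

    Assumption for this sketch: a new GOP starts at every I-Picture.
    """
    bits = []
    gop = ""
    for picture_type in picture_types:
        if picture_type == "I" and gop:
            bits.append(1 if "PP" in gop else 0)
            gop = ""
        gop += picture_type
    if gop:
        bits.append(1 if "PP" in gop else 0)
    return bits


if __name__ == "__main__":
    sequence = encode_bits([1, 0, 1, 1])
    print(sequence)
    print(decode_bits(sequence))
```

In this sketch the payload is read from the picture types alone, so changing a bit means changing the picture type of at least one picture, which in turn requires re-compressing that part of the video sequence.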
A significant limitation exists, however, with such prior data embedding schemes if the compressed data is also scrambled. One such scrambling scheme is CSS (Content Scramble System) for use with DVD drives. Scrambling typically makes the embedded data unavailable for use unless the data stream is first descrambled. This limits the use of embedded data in applications such as conditional play control and record control. Moreover, in applications where the embedded data might be used to convey copy protection information, scrambling might be used to obscure such information and thus defeat copy protection.
Hence, what is needed in the field is a robust method of carrying additional information in a compressed data stream in a manner that makes it available even if the compressed data stream is scrambled.