A multimedia bitstream organizes data into groups, referred to as packets, for easy parsing, fast searching, error resilience, etc. A packet includes header fields and data fields. A packet starts with a unique marker that indicates the start of the packet, and may end with another unique marker that indicates the end of the packet. Markers are a set of special binary strings that are reserved in a multimedia format. To facilitate identification of each packet, data codes are carefully designed so that no marker is emulated in a data field. Otherwise, a parser may mistake data for a marker, parse the bitstream incorrectly, and generate an improper result.
For example, in the JPEG 2000 image coding standard, the compressed bitstream that the coding passes of a code-block contribute to a packet must not contain any value in the range of hexadecimal 0xFF90 through 0xFFFF in any two consecutive bytes of coded data. JPEG 2000 also does not allow a data bitstream to end with a byte of hexadecimal 0xFF. In another example, the MPEG-4 Fine Granularity Scalability (FGS) video coding standard groups compressed bit-plane data in the enhancement layer into packets separated by a bit-plane start code denoted as fgs_bp_start_code or, if the flag fgs_resync_marker_disable is set to 0, a resynchronization marker denoted as fgs_resync_marker. Both markers are byte-aligned, i.e., start at a byte boundary. The marker fgs_bp_start_code starts with 23 bits of 0 followed by 0xA (binary 1010) plus another five bits to indicate which bit-plane the data belongs to. The marker fgs_resync_marker is 22 bits of 0 followed by a bit of 1. Therefore, compressed bit-plane data in a packet must not contain 22 consecutive bits of 0 that start at a byte boundary.
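The two restrictions described above can be illustrated with a minimal Python sketch. The function names violates_jpeg2000 and violates_fgs are hypothetical and are not taken from either standard. The sketch assumes that each check is applied to the data field of a single packet, and it does not consider bit patterns that straddle the boundary between two packets.

```python
def violates_jpeg2000(data: bytes) -> bool:
    """Return True if data breaks either JPEG 2000 rule stated above.

    Rule 1: no two consecutive bytes may form a value in 0xFF90..0xFFFF.
    Rule 2: the data must not end with the byte 0xFF.
    """
    if data and data[-1] == 0xFF:
        return True
    for first, second in zip(data, data[1:]):
        # A pair is in 0xFF90..0xFFFF exactly when the first byte is 0xFF
        # and the second byte is at least 0x90.
        if first == 0xFF and second >= 0x90:
            return True
    return False


def violates_fgs(data: bytes) -> bool:
    """Return True if data holds 22 zero bits starting at a byte boundary.

    22 bits = two zero bytes followed by a byte whose top six bits are zero.
    """
    for i in range(len(data) - 2):
        if data[i] == 0 and data[i + 1] == 0 and (data[i + 2] & 0xFC) == 0:
            return True
    return False
```

For instance, under these assumptions the byte sequence 0x12 0xFF 0x90 is rejected by the first check, while 0x00 0x00 0x04 passes the second check because it contains only 21 leading zero bits.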
Multimedia is often protected to prevent unauthorized consumption. Typical protection is to encrypt multimedia data and to restrict access to the decryption key(s) to only authorized users. This approach is widely used in multimedia Digital Rights Management (DRM), which provides persistent protection for content from creation to consumption. A good cipher applied to multimedia data produces “random” ciphertext, which may emulate the markers that the original syntax is carefully designed to avoid. Conventional methods to ensure correct decryption and decoding of encrypted multimedia content add information to unencrypted header fields of a packet (e.g., the length of the ciphertext or the number of occurrences of marker emulation in the data field). However, the resulting bitstream may not be syntax compliant, because the spurious markers that remain in the ciphertext typically destroy its syntax compliance.
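To give a rough sense of how often “random” ciphertext may emulate a marker, the following Python sketch estimates the expected number of forbidden JPEG 2000 byte pairs (0xFF90 through 0xFFFF) in a ciphertext of a given length. It assumes, as an idealization only, that ciphertext bytes are independent and uniformly distributed; the function name expected_emulations is hypothetical.

```python
from fractions import Fraction

# Two-byte values forbidden in JPEG 2000 coded data: 0xFF90 through 0xFFFF.
FORBIDDEN_PAIRS = 0xFFFF - 0xFF90 + 1  # 112 of the 65536 possible pairs


def expected_emulations(num_bytes: int) -> Fraction:
    """Expected count of forbidden byte pairs in num_bytes of random data.

    Assumes independent, uniformly distributed bytes (an idealized model
    of ciphertext). By linearity of expectation, each of the
    num_bytes - 1 overlapping pairs contributes 112/65536.
    """
    if num_bytes < 2:
        return Fraction(0)
    pair_probability = Fraction(FORBIDDEN_PAIRS, 256 * 256)
    return (num_bytes - 1) * pair_probability
```

Under this model, a 1024-byte ciphertext is expected to contain roughly 1.7 forbidden pairs, which suggests that marker emulation is a routine event rather than a rare corner case.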
A syntax-noncompliant approach to encrypting multimedia data has several drawbacks. First, the encrypted bitstream may not be backward compatible with a corresponding decoder. For example, adding non-standard header fields to a packet may lead a compliant but encryption-unaware decoder to parse the packet incorrectly and thereby produce undesired results. Second, syntax-noncompliant encryption may impair fast random access to encrypted multimedia, a desirable feature, for example, when playing long audiovisual content. Third, syntax-noncompliant encryption may cause incorrect parsing and false synchronization when errors or data loss occur, which degrades error resilience.
In view of the above, given a syntax that does not allow certain strings to appear in a bitstream and given arbitrary syntax-compliant plaintext, systems and methods that encrypt the plaintext into syntax-compliant ciphertext containing no illegal strings are highly desired.