As is known, modern video display devices are capable of displaying individual dots of light, or “pixels”, of various colors. The term “frame” has been employed to refer to a matrix of pixels at a given resolution. For example, a frame may comprise a 640 by 480 rectangle of pixels containing 480 rows having 640 pixels each. In an uncompressed state, the amount of data required to represent a frame is equal to the number of pixels multiplied by the number of bits used to represent the color of each pixel. Thus, in a pure black and white image lacking any grayscale shades, a pixel could be represented by one bit where “1” represents white and “0” represents black. More typically, in modern full-color displays a single pixel is represented by 8 bits, 16 bits or 32 bits. Thus, a single uncompressed 32-bit frame at a resolution of 640 by 480 would require (32*640*480) approximately 9.8 million bits, or approximately 1.2 Megabytes of data.
The representation of digital video involves the display of a series of frames in sequence (e.g., a motion picture is composed of 24 frames displayed every second). Thus, one second of uncompressed 32-bit frames at a pixel resolution of 640 by 480 requires approximately 29.5 Megabytes of data (i.e., 24 frames of approximately 1.23 Megabytes each). As a consequence of the large amount of data associated with uncompressed digital video, various compression techniques have been employed in an effort to reduce the bandwidth required to transmit digital video.
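The arithmetic above may be checked with a short sketch. The function names are illustrative only and do not appear in the source; the figures are those of the 640 by 480, 32-bit, 24 frames-per-second example.

```python
def frame_bits(width, height, bits_per_pixel):
    """Bits needed for one uncompressed frame: pixel count times bits per pixel."""
    return width * height * bits_per_pixel


def bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Bytes needed for one second of uncompressed frames."""
    return frame_bits(width, height, bits_per_pixel) // 8 * frames_per_second


bits = frame_bits(640, 480, 32)
print(bits)                                # 9830400 bits (about 9.8 million)
print(bits // 8)                           # 1228800 bytes (about 1.2 Megabytes)
print(bytes_per_second(640, 480, 32, 24))  # 29491200 bytes (about 29.5 Megabytes)
```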
Existing digital video compression techniques are complex processes which rely upon a variety of techniques in transforming (i.e., “encoding”) a unit of uncompressed video data into an encoded form. Such encoding permits fewer bits to be used in representing the content of the original uncompressed video data. The resultant encoded data is capable of being transformed using a reverse process (i.e., “decoding”) yielding a digital video unit of data that is either visually similar or identical to the original data. Encoding techniques which enable recovery of an identical version of the original data are characterized as “lossless”, while those that yield only visually similar versions are categorized as “lossy”.
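The distinction between lossless and lossy encoding may be illustrated with a minimal sketch. It is not a video codec: the sample values are hypothetical, the lossless path uses the general-purpose `zlib` module from the Python standard library, and the lossy path is a toy quantizer with an assumed step size of 16.

```python
import zlib

original = bytes([10, 11, 12, 13] * 64)  # hypothetical sample values

# Lossless: decoding recovers the original data exactly.
encoded = zlib.compress(original)
decoded = zlib.decompress(encoded)

# Lossy (toy quantizer): each sample is reduced to a 4-bit level, so decoding
# can only return an approximation of the original value. The levels are kept
# one per byte here for simplicity; a real encoder would pack them.
STEP = 16  # assumed quantization step
lossy_encoded = bytes(v // STEP for v in original)
lossy_decoded = bytes(q * STEP + STEP // 2 for q in lossy_encoded)

print(decoded == original)        # lossless round trip is identical
print(lossy_decoded == original)  # lossy round trip is only similar
```

In this sketch the lossy reconstruction differs from each original sample by no more than half the step size, which is the sense in which the decoded data is "similar" rather than identical.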
Modern techniques of digital video compression can achieve very high levels of compression with relatively low loss of visual quality. As a general rule, modern techniques of digital video compression are very computationally intensive, with the degree of computational intensity varying directly with the extent of compression. Anything that adds to computational intensity over and above the processing overhead associated with the applicable decoding process is undesirable, since this leads to increased system complexity and expense. In particular, in the most efficient forms of modern compression the amount of data in each compressed video frame will vary, sometimes to a great extent. This maximizes compression, but at the cost of making the processing power needed to decode the frames inconsistent from frame to frame.
Turning now to FIG. 1, a block diagram is provided of a conventional digital video encoder 125. As mentioned above, digital video encoders have been used to reduce the size of a stream of uncompressed digital video data. The digital video encoder 125 is comprised of a video processing unit 110 and an entropy compression unit 115. Digital video encoder 125 is configured to generate compressed video output by using motion estimation and motion compensation to exploit temporal redundancy in certain of the uncompressed video frames 120.
During operation of video encoder 125, video processing unit 110 accepts uncompressed video frames 120 and applies one or more video and signal processing techniques to such frames. These techniques may include, for example, motion compensation, filtering, two-dimensional (“2D”) transformation, block mode decisions, motion estimation, and quantization. The associated 2D event matrices include some or all of a skipped blocks binary matrix, a motion compensation mode (e.g. intra/forward/bi-directional) matrix, a motion compensation block size and mode matrix (e.g. 16×16 or 8×8 or interlaced), a motion vectors matrix, and a matrix of transformed and quantized block coefficients. Practical implementations of the video processing unit 110 and entropy compression unit 115 generally operate in accordance with one of the accepted video compression standards of the type discussed below.
In the special case of a video encoder employing lossy compression, these video and signal processing techniques aim to retain image information that is important to the human eye. The video processing unit 110 produces intermediate data streams 124 that are more suitable for use by the entropy encoding algorithms executed by the entropy compression unit 115 than are the uncompressed video frames 120. Conventionally, these intermediate data streams 124 would comprise transform coefficients with clear statistical redundancies and motion vectors. As an example, video processing unit 110 may apply a block discrete cosine transform (DCT) or other transform function to the output of motion compensation and quantize the resulting coefficients.
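The transform-and-quantize step mentioned above may be sketched as follows. This is a direct, unoptimized orthonormal 2D DCT-II followed by uniform quantization; the 8 by 8 block size, the quantization step of 10, and the function names are assumptions made for illustration and are not taken from any particular standard.

```python
import math

N = 8  # block size; 8x8 is an assumption for illustration


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct, unoptimized)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization: divide by the step and round to the nearest integer."""
    return [[round(c / step) for c in row] for row in coeffs]


# A perfectly flat block: all of its energy lands in the single DC coefficient.
flat_block = [[100] * N for _ in range(N)]
levels = quantize(dct_2d(flat_block), step=10)
print(levels[0][0])  # 80
```

For the flat block, every coefficient other than the first quantizes to zero, which is the kind of statistical redundancy that the entropy compression stage is described as exploiting.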
An entropy coding technique such as Huffman Coding may then be applied by entropy compression unit 115 to the data streams 124 in order to produce a compressed stream 130. Entropy compression unit 115 may compress the data streams 124 with no loss of information by exploiting the statistical redundancies therein. The compressed stream 130 output by entropy compression unit 115 is of significantly smaller size than both the uncompressed video frames 120 and the intermediate data streams 124.
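Huffman coding, named above as one example of an entropy coding technique, may be sketched as follows. The sketch builds a prefix code from symbol frequencies and assumes a non-empty input; the sample "coefficients" are hypothetical, and practical codecs typically use code tables defined by the applicable standard rather than building a code for each block.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # the two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:  # prefix property: the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical quantized coefficients dominated by zeros.
coefficients = [0] * 12 + [1] * 3 + [-1] * 2 + [5]
code = huffman_code(coefficients)
bits = encode(coefficients, code)
print(len(bits))                             # 27 bits, versus 36 at 2 bits per symbol
print(decode(bits, code) == coefficients)    # True: no information is lost
```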
As shown in FIG. 2, a conventional digital video decoder 230 may be bifurcated into two logical components: entropy decompression unit 235 and video processing unit 240. Entropy decompression unit 235 receives the compressed data stream and outputs data streams 250, which typically comprise motion vectors and transform (or quantized) coefficients. Video processing unit 240 receives the data stream output 250 from decompression unit 235 and performs operations such as motion compensation, inverse quantization, and inverse 2-D transformation in order to reconstruct the uncompressed video frames.
The Moving Picture Experts Group (MPEG) and the International Organization for Standardization (ISO) have produced international standards specifying the video compression and decompression algorithms of the type implemented by the encoder 125 and decoder 230, respectively. These and related standards include MPEG-1, MPEG-2, MPEG-4, H.261, and H.263, and permit equipment and software from different manufacturers to exchange compressed video formatted in accordance with such standards.
FIG. 3 shows a graph 300 displaying an approximate representation of the relative processing power expected to be required in connection with decoding of frames of different sizes. For example, FIG. 3 shows that certain frames (see, e.g., frame 304) require much more processing power than other frames (see, e.g., frame 302). Any processing of frames required in addition to decoding (e.g., decryption) consumes yet further processing resources.
As is known, various types of encryption schemes may be used to protect data. In the digital realm, encryption is often implemented by using a collection of bits of some length known as a “key” to execute a predictable transform on a unit of data. This yields another unit of data that cannot be “read” without knowledge of the key used to execute the transform. The process of encryption is only easily reversible to the extent the encrypting key or its counterpart (e.g., the “private” key paired with a “public” encrypting key) is available for use in transforming or “decrypting” the encrypted data back into the original form. Video data is often encrypted using a symmetric block cipher conforming to, for example, the Data Encryption Standard (DES) or Advanced Encryption Standard (AES).
Turning now to FIG. 4, a graphical representation 400 is provided of the processing power required to both decrypt and decode a sequence of frames. FIG. 4 also depicts graph 300, which illustratively represents the relatively smaller amount of processing power required to decode unprotected (i.e., unencrypted) frames. As may be appreciated by reference to FIG. 4, the processing power required to both decrypt and decode a frame increases in proportion to its size. As a consequence, adequate processing power needs to be provided to ensure that even the largest frames expected to be received may be successfully decrypted and decoded. This requirement may significantly increase system cost and complexity, even though only a relatively small percentage of received frames may necessitate use of the full extent of available peak processing power. Accordingly, a need exists for an adequately secure technique for bounding the resources consumed during decryption, thereby reducing peak processing requirements.
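The provisioning problem described above may be illustrated numerically. The frame sizes and per-byte cost constants below are hypothetical, and the linear cost model simply follows the statement that processing power increases in proportion to frame size; the sketch is not derived from FIG. 3 or FIG. 4.

```python
# Hypothetical compressed frame sizes in bytes; one frame is much larger than the rest.
frame_sizes = [4000, 5000, 4500, 60000, 5200, 4800]

# Assumed cost constants, in arbitrary work units per byte.
DECODE_COST_PER_BYTE = 3
DECRYPT_COST_PER_BYTE = 2


def frame_cost(size, encrypted):
    """Work to process one frame under a linear, size-proportional cost model."""
    cost = size * DECODE_COST_PER_BYTE
    if encrypted:
        cost += size * DECRYPT_COST_PER_BYTE
    return cost


def peak_and_average(sizes, encrypted):
    costs = [frame_cost(s, encrypted) for s in sizes]
    return max(costs), sum(costs) / len(costs)


peak_plain, avg_plain = peak_and_average(frame_sizes, encrypted=False)
peak_enc, avg_enc = peak_and_average(frame_sizes, encrypted=True)
print(peak_plain, peak_enc)  # 180000 300000
```

Under these assumed numbers, a single large frame sets the peak that the system must be provisioned for, and decryption raises that peak even though the average load remains far lower.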