An H.264 video encoder/decoder (CODEC) is configured, as part of the main profile, to perform context adaptive binary arithmetic coding (CABAC). During encoding, binary symbols (bins) are encoded using arithmetic coding to produce a series of bits. Video encoding is done in real time, with limits placed on the bit-rate and binary symbol rate separately. The encoding of each binary symbol takes a single arithmetic encoding cycle.
An arithmetic encoding cycle is complex and forms a bottleneck. During each arithmetic encoding cycle, 0 to 6 data bits can be produced. The data bits are either (i) output into a bit stream as a “0” or a “1” or (ii) delayed (i.e., by accumulating the data bits in a variable, e.g., bitsOutstanding). When the data bits are output into the bit stream, a data bit is output, followed by bitsOutstanding copies of the complement of that bit. The variable bitsOutstanding is subsequently cleared. Depending upon the data encoded, the amount of delayed data bits (i.e., the value of the variable bitsOutstanding) can grow without any practical bound. For example, the H.264 recommendation limits the variable bitsOutstanding only by the size of the picture.
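The delayed-bit mechanism described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the reference implementation: the class name BitWriter and the method names put_bit and delay_bit are hypothetical, and details such as the standard's special handling of the very first output bit are left out.

```python
class BitWriter:
    """Toy model of the output stage with delayed (outstanding) bits.

    Hypothetical helper for illustration; names are not from the source.
    """

    def __init__(self):
        self.bits = []             # bits written to the stream so far
        self.bits_outstanding = 0  # count of delayed, still-unresolved bits

    def delay_bit(self):
        # Case (ii): the bit value is not yet known, so only count it.
        self.bits_outstanding += 1

    def put_bit(self, b):
        # Case (i): a resolved bit is written, followed by the complement
        # of that bit once for every delayed bit; the counter is then cleared.
        self.bits.append(b)
        self.bits.extend([1 - b] * self.bits_outstanding)
        self.bits_outstanding = 0


writer = BitWriter()
for _ in range(3):
    writer.delay_bit()   # three bits are delayed, nothing is written yet
writer.put_bit(1)        # one resolved bit triggers four writes at once
print(writer.bits)       # [1, 0, 0, 0]
```

The single call to put_bit writes one bit plus one bit per delayed bit, which is why a large value of bitsOutstanding turns into a burst on the output path.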
The lack of a limit other than the picture size causes wasted arithmetic encoding cycles. For example, as bits are encoded, yet not output, a point is reached where the bits must suddenly be output, taking time during which the arithmetic encoder is idle. When the variable bitsOutstanding (i.e., the number of delayed data bits) grows over multiple binary symbols, the output bit-rate is interrupted. Furthermore, while the variable bitsOutstanding is expanded into the output bit stream, the binary symbol rate is interrupted. Large delays are statistically rare. However, a series of small delays can significantly reduce the efficiency of the arithmetic encoder.
A conventional solution, provided by the H.264 JM software reference model, performs the operations serially as a single thread. The conventional solution has a disadvantage of being inefficient to implement in hardware for real-time applications. When directly ported to hardware for real-time video processing, the conventional solution can result in (i) idle cycles in the output bit-rate data path when the variable bitsOutstanding is accumulated and (ii) idle cycles in the arithmetic encoder when the variable bitsOutstanding is expanded into the output data path. The effective binary symbol rate and bit-rate throughput of the conventional arithmetic encoder are limited. To gain back real-time performance with the conventional solution, multiple arithmetic encoders must be employed. However, using multiple arithmetic encoders is practical only if the image is sliced and each slice is processed independently. Although slices can be used in H.264, quality and/or compression can be adversely affected.
Referring to FIG. 1, a block diagram of an arithmetic encoder 10 is shown. The arithmetic encoder 10 has an input that receives a context index (CTX) and binary symbols (BIN) from a binary symbol/context calculation logic 12. The binary symbol/context calculation logic 12 generates the context index and binary symbols in response to a compressed video syntax stream. The arithmetic encoder 10 has an output that presents a series of bits to an output buffer 14. The output buffer 14 is implemented as a dynamic random access memory (DRAM). The output buffer 14 generates a bit stream in response to the series of bits.
The arithmetic encoder 10 includes an arithmetic encoder/decoder (CODEC) 16 and an arithmetic output logic 18. The arithmetic encoder/decoder 16 generates a first signal (i.e., CODIRANGE) and a second signal (i.e., CODILOW) in response to the context index CTX and the binary symbols BIN received from the binary symbol/context calculation logic 12. The arithmetic output logic 18 generates the series of bits (i.e., BITS) in response to the signals CODIRANGE and CODILOW.
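How an output stage might derive bit decisions from CODIRANGE and CODILOW can be sketched as follows. The sketch is modeled on the renormalization procedure of the H.264 recommendation, using its thresholds of 256 and 512, and is not a description of the arithmetic output logic 18 itself. The function name renormalize and the decision labels "0", "1" and "delay" are hypothetical. Expanding delayed bits into the stream is deliberately omitted here.

```python
def renormalize(cod_i_range, cod_i_low):
    """Shift range/low until the range is at least 256 again.

    Returns the updated (range, low) and one decision per shift:
    "0" or "1" for a resolved bit, "delay" for a bit whose value is
    not yet known (it would increment bitsOutstanding).
    Illustrative sketch; names are assumptions, not from the source.
    """
    decisions = []
    while cod_i_range < 256:
        if cod_i_low < 256:
            decisions.append("0")          # resolved: a 0 bit
        elif cod_i_low >= 512:
            cod_i_low -= 512
            decisions.append("1")          # resolved: a 1 bit
        else:
            cod_i_low -= 256
            decisions.append("delay")      # unresolved: delay the bit
        cod_i_range <<= 1
        cod_i_low <<= 1
    return cod_i_range, cod_i_low, decisions


# A range that is already at least 256 needs no shift: 0 bits this cycle.
print(renormalize(300, 100))   # (300, 100, [])

# A very small range needs six shifts: 6 bit decisions in one cycle.
print(renormalize(6, 0))       # (384, 0, ['0', '0', '0', '0', '0', '0'])

# A low value in the middle band yields a delayed bit.
print(renormalize(128, 300))   # (256, 88, ['delay'])
```

The first two calls illustrate the two ends of the 0 to 6 bits per cycle noted earlier, and the third shows the case that feeds bitsOutstanding.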
It would be desirable to have a method and/or apparatus for sustained real time binary arithmetic coding.