Referring to FIG. 1, a diagram of a conventional context-adaptive binary arithmetic decoder 80 is shown. An arithmetic decoder performs a read-modify-write on context variables stored in a single-port context memory. Due to delays internal to the memory, read data is registered in output flip-flops before being presented back to the arithmetic decoder. Therefore, a two-cycle latency commonly exists between when a read address is decided by the arithmetic decoder and when the context data is available to the arithmetic decoder. However, the next read address depends on a current decoded binary value, at least indirectly through a current state and neighbor data. The two-cycle latency thus prevents the arithmetic decoder from running at full speed. A single binary value is decoded every other cycle because the next context variable is not immediately available after decoding the current binary value. If the next read address depends directly on the current decoded binary value, which happens from time to time, a state machine inserts two wait cycles. Without the wait cycles, a timing path 82 from the output flip-flops in the context memory, through the arithmetic decoder and back to the address port of the context memory would usually prevent timing closure. The timing closure issue is further aggravated because the setup time on the address port of the context memory is longer than the setup time on the data ports of the output flip-flops.
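By way of illustration only, the throughput penalty described above may be approximated with a simple cycle-count model. The following Python sketch is not part of the conventional decoder 80. The function name `estimate_decode_cycles`, its parameters, and the assumption that the two wait cycles are added on top of the baseline of two cycles per binary value are hypothetical and are chosen only to make the arithmetic concrete.

```python
from typing import Iterable


def estimate_decode_cycles(
    direct_dependency: Iterable[bool],
    cycles_per_bin: int = 2,
    wait_cycles: int = 2,
) -> int:
    """Estimate clock cycles needed to decode a sequence of binary values.

    Assumptions (illustrative, not taken from any real design):
      * every binary value costs `cycles_per_bin` cycles, reflecting one
        binary value decoded every other cycle;
      * when the next read address depends directly on the binary value
        just decoded, `wait_cycles` extra cycles are added.

    `direct_dependency` holds one flag per binary value; True marks a
    binary value that triggers the wait cycles.
    """
    total = 0
    for depends_directly in direct_dependency:
        total += cycles_per_bin      # baseline cost of the two-cycle latency
        if depends_directly:
            total += wait_cycles     # wait cycles inserted by the state machine
    return total


if __name__ == "__main__":
    # Four binary values, the third of which has a direct dependency.
    flags = [False, False, True, False]
    cycles = estimate_decode_cycles(flags)
    print(f"{len(flags)} bins in {cycles} cycles "
          f"({len(flags) / cycles:.2f} bins per cycle)")
```

Under these assumptions, four binary values with one direct dependency take 10 cycles, compared with 4 cycles for an idealized decoder that completes one binary value per cycle.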
It would be desirable to implement a context-adaptive binary arithmetic decoder with low latency.