Existing coder-decoder systems that use entropy transcoding to reduce complexity and enable real-time encoding and decoding must use quite large transcode buffers, and must also provide additional room in the input buffer. The fundamental reason for this extra data storage is that the transcode engines of prior systems must operate on complete network abstraction layer (NAL) units (i.e., maxTranscodeUnit = max NAL size in the figure above) and cannot perform any other action until processing of the current NAL unit completes. If the transcode engine is to run efficiently without stalling, its input buffers must be able to hold at least one complete NAL unit, plus buffer space for building up the next complete NAL unit, so that the transcode engine can do something else (e.g., process another stream) until a complete NAL unit has been formed for a particular stream. This storage gives the processes feeding the transcode engine enough room to fill the next NAL unit while the transcode engine operates on the previous one, and it minimizes the risk of the transcoder stalling, at the cost of the additional buffer storage.
The exact storage size will vary, but these additional buffers typically require several maximum NAL sizes. The transcode buffer may need to be even larger, to account for expansion from context adaptive binary arithmetic coding (CABAC) to context adaptive variable length coding (CAVLC). The additional data storage required can be as great as the amount needed to store 2.5 uncompressed pictures.
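The sizing relationship described above can be illustrated with a minimal sketch. The function names, the `nal_slots` parameter, and the `expansion_ratio` parameter are hypothetical and are not taken from any particular system; the sketch only assumes that each buffer holds a whole number of maximum-size NAL units, with one unit being consumed by the transcode engine while at least one more is being formed.

```python
import math


def input_buffer_bytes(max_nal_bytes: int, nal_slots: int = 2) -> int:
    """Input buffer size for an engine that only accepts complete NAL units.

    nal_slots is a hypothetical parameter: one slot holds the NAL unit the
    engine is currently processing, the rest hold NAL units being formed.
    """
    if nal_slots < 2:
        raise ValueError("need one slot in use by the engine and at least one being filled")
    return max_nal_bytes * nal_slots


def transcode_buffer_bytes(max_nal_bytes: int, expansion_ratio: float,
                           nal_slots: int = 2) -> int:
    """Transcode buffer size, allowing for growth of the data during transcoding.

    expansion_ratio is an assumed worst-case growth factor (>= 1.0) for the
    CABAC-to-CAVLC expansion; the actual value is system dependent.
    """
    expanded_nal = math.ceil(max_nal_bytes * expansion_ratio)
    return expanded_nal * nal_slots
```

With more streams sharing the engine, or with a larger assumed expansion ratio, both results grow in proportion to the maximum NAL size, which is why the prior-art buffers become so large.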
The transcode engine can be a single hardware unit that performs multiple simultaneous encode and decode operations. Many consumer applications typically combine one high definition (HD) television decoder with one standard definition (SD) television encoder, or two decoders (SD or HD) with one encoder (SD or HD).
In the prior art, the transcode operation cannot be interrupted while a NAL unit is being processed. The advantage is that no state information needs to be stored when switching between streams, since each NAL unit can be entropy decoded/transcoded independently. The disadvantage is the large buffer memories. In contrast, decoding from CAVLC is typically performed on a host processor that can be interrupted, with its state stored, and later restarted, with its state restored, on a per-row-of-macroblocks basis (for example). The size of the decode transcode buffer is therefore dictated only by the smallest unit that the decode transcode engine can process (one expanded maximum NAL size). It need not be made significantly larger to allow for the decode process being unable to store and restore its state in the middle of a row of macroblocks, because the maximum size of a compressed row of macroblocks is much smaller than the maximum size of a compressed picture.
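The non-interruptible, NAL-at-a-time behavior of the prior-art transcode engine can be sketched as a simple scheduler. This is an illustrative model only, under the assumption of a round-robin policy; `StreamBuffer`, `feed`, and `run_engine` are hypothetical names, and byte counts stand in for real NAL data.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StreamBuffer:
    """Input buffer for one stream (hypothetical model, sizes in bytes)."""
    name: str
    complete_nals: deque = field(default_factory=deque)  # fully buffered NAL units
    partial_bytes: int = 0  # bytes of the NAL unit still being formed

    def feed(self, nbytes: int, ends_nal: bool) -> None:
        """Add bytes from the feeding process; queue the NAL unit once complete."""
        self.partial_bytes += nbytes
        if ends_nal:
            self.complete_nals.append(self.partial_bytes)
            self.partial_bytes = 0


def run_engine(streams):
    """Process whole NAL units in round-robin order (assumed policy).

    The engine only starts on a stream that has a complete NAL unit buffered
    and always finishes that unit before switching, so no per-stream
    transcode state is saved or restored.
    """
    log = []
    progress = True
    while progress:
        progress = False
        for stream in streams:
            if stream.complete_nals:
                size = stream.complete_nals.popleft()
                log.append((stream.name, size))  # entire NAL unit handled in one go
                progress = True
    return log


hd = StreamBuffer("hd")
sd = StreamBuffer("sd")
hd.feed(600, ends_nal=False)
hd.feed(400, ends_nal=True)   # first HD NAL unit complete
sd.feed(300, ends_nal=True)   # first SD NAL unit complete
hd.feed(500, ends_nal=False)  # next HD NAL unit still being formed
processed = run_engine([hd, sd])
```

In this model the partially formed HD NAL unit remains in its buffer untouched until it is complete, which is the reason each stream needs room for a unit in progress in addition to the unit being processed.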