Conventional systems store multimedia content as a large volume of data. Such a large volume necessitates high quality content compression. Multimedia compression systems usually employ predictive coding to maximize the compression ratio. Additionally, due to the sheer amount of data that needs to be processed, the content is usually divided into smaller pieces. In particular, a sliding window of digital audio samples is often used for spectral analysis, and non-overlapping 16×16 macroblocks of pixels are often used for video coding. The smaller pieces are then analyzed and compressed separately.
Predictive coding techniques have been implemented that are capable of achieving an improved compression ratio and lower complexity. The introduction of such predictive coding techniques and the division of the input establish long term and highly complicated dependencies between divisions of the input signal. For example, in MPEG and H.26x video coding, motion estimation is performed to find a best match between the current input and a reference known to both the encoder and the decoder. As a result, given the same bitrate budget to be spent on the input, the quality of the coded representation is highly dependent on (i) which reference was used and (ii) how the reference was compressed and reconstructed.
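As an illustration of the motion estimation step mentioned above, the following is a minimal sketch of block matching, not the method of any particular MPEG or H.26x encoder. It assumes frames are plain 2-D lists of luma values, uses an exhaustive search over a small window, and scores candidates by the sum of absolute differences (SAD). The names `sad` and `best_match`, the block size `n`, and the search radius `search` are hypothetical choices made for this example; practical encoders typically use faster search patterns and rate-aware cost functions.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def best_match(cur, ref, cx, cy, n=16, search=4):
    """Exhaustively test every displacement within +/- `search` pixels and
    return ((mvx, mvy), cost) for the lowest-SAD candidate in `ref`.
    Returns (None, None) if no candidate lies inside the frame."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost
```

Calling `best_match(cur, ref, 8, 8)` returns the displacement of the best-matching 16×16 block in `ref` together with its SAD. Because the decoder only has the reconstructed reference, the residual left after this match depends on how well the reference itself was coded, which is the dependency described above.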
Given the overall bitrate budget, the encoder should allocate more bits to portions of the input that are referenced more heavily by subsequently encoded portions, to an extent proportional to the amount of referencing. However, due to complexity concerns, multimedia content is usually encoded in a temporally linear manner, so the coded representation of the referenced portions has to be determined before the encoder establishes the reference dependencies between the referenced and the referencing portions.
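The allocation principle above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `allocate_bits`, the weighting rule `1 + alpha * amount`, and the parameter `alpha` are hypothetical and are not taken from any specific encoder. The sketch also assumes the amount of referencing per portion is already known, which is precisely what a temporally linear encoder lacks at the time the referenced portion is coded.

```python
def allocate_bits(budget, ref_amounts, alpha=1.0):
    """Split an integer bit `budget` across portions so that each share grows
    with how heavily the portion is referenced (hypothetical weighting rule).

    ref_amounts[i] is a non-negative measure of how much later portions
    reference portion i. Every portion gets a base weight of 1 so that
    unreferenced portions still receive bits.
    """
    weights = [1.0 + alpha * r for r in ref_amounts]
    total = sum(weights)
    shares = [budget * w / total for w in weights]
    bits = [int(s) for s in shares]  # round each share down
    # Hand out the bits lost to rounding, largest fractional part first,
    # so the allocation sums exactly to the budget.
    leftover = budget - sum(bits)
    order = sorted(range(len(bits)), key=lambda i: shares[i] - bits[i],
                   reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits
```

For example, `allocate_bits(1000, [3, 1, 0])` gives the most heavily referenced portion the largest share while the three shares still sum to 1000.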
Some advanced encoding systems attempt to alleviate the non-optimal situation by jointly optimizing the encoding of portions of the input signal (e.g., encoding two consecutive frames jointly in the case of video coding). Because of the complicated and long term dependencies between the coded representations of portions, conventional practical encoding systems based on joint optimization have to make a compromise and can consider only a small fraction of the potential dependencies that need to be taken into account. The complexity of such conventional systems tends to grow exponentially with the number of portions and dependencies that are considered.
It would be desirable to implement an iteration based method and/or apparatus for offline high quality encoding of multimedia content.