Coding schemes are well known in the art. They are used to transform information into a more efficient structure for several purposes. For example, compression coding schemes are used to minimize the size of the data carrying the information, whereas transport coding schemes are used to increase the differences in the information. Other important purposes may be to optimize energy, to improve the recognition of the information, or to increase the robustness against external influences.
An information bit stream is assembled out of code sequences. A code sequence is a predetermined arrangement of data bits, the data bit being the smallest interpretable information unit, that mechanically refers to a denotation of a real circumstance. Normally, all code sequences in an information bit stream have a fixed length. Each code sequence represents a data symbol. A data symbol is an interpretable link to the denotation of the real circumstance for a final user or a final using unit. For example, a single letter of the alphabet, as a data symbol, is interpretable information for a user, whereas its representation by a unique 8-bit code sequence is the calculation reference for a computer.
To point out the disadvantages of the prior art, the present patent application now describes exemplary entropy coding schemes. However, this should not be taken in a limiting sense. The problems are also derivable for all other coding schemes.
Due to an unequal occurrence of the code sequences included in the information bit stream, the information bit stream often comprises more data bits than necessary to hold the interpretable information. The entropy of a data bit is a value quantifying the interpretable information represented by one data bit in the information bit stream. Accordingly, entropy coding schemes commonly take advantage of the unequal occurrence of code sequences in the information bit stream and form new code sequences by allocating short code sequences to frequently occurring data symbols and long code sequences to rarely occurring data symbols. With the new code sequences, a coded bit stream is assembled, which increases the entropy, and thus the information content, of the single data bits in the coded bit stream.
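By way of a non-limiting illustration, the allocation of short code sequences to frequent data symbols may be sketched as follows. The sketch assumes a Huffman-style construction, chosen here merely as one well-known statistical method; the function names build_prefix_code and encode are hypothetical.

```python
import heapq
from collections import Counter

def build_prefix_code(symbols):
    """Return {symbol: bit string}; frequent symbols receive shorter codes."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial code})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees and prefix their codes
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the code sequences of all data symbols."""
    return "".join(code[s] for s in symbols)
```

For the twelve data symbols "aaaaaaaabbbc", this sketch yields a coded bit stream of 16 bits, compared with 96 bits when each symbol occupies a fixed 8-bit code sequence.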
Entropy coding schemes are divided into several categories, of which statistical methods and dictionary methods are best known. Statistical methods only analyze the occurrence of single data symbols in the information bit stream, whereas dictionary methods analyze the occurrence of whole data symbol chains. In addition, run length coding schemes are also well known. Their principle is to count consecutively occurring identical data symbols and to indicate only the data symbol itself and the number of its consecutive occurrences. However, this coding scheme requires an appropriate information bit stream, as provided by, e.g., picture data.
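The run length principle described above may be sketched as follows; this is a minimal illustration only, and the function names are hypothetical.

```python
from itertools import groupby

def run_length_encode(symbols):
    """Replace each run of identical symbols by (symbol, run length)."""
    return [(s, len(list(group))) for s, group in groupby(symbols)]

def run_length_decode(pairs):
    """Expand (symbol, run length) pairs back into the symbol sequence."""
    out = []
    for symbol, count in pairs:
        out.extend([symbol] * count)
    return out
```

The sketch only pays off when long runs exist; for a stream without repeated neighbours, every symbol is stored together with a count of one.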
Other coding schemes apply other criteria to assemble a coded bit stream out of a given information bit stream. For example, energy optimizing coding schemes calculate the overall energy of the information bit stream and minimize the overall energy in the new coded bit stream. Likewise, robustness increasing coding schemes determine the information distance between the single code sequences in the information bit stream and increase the information distance between all possible code sequences in the coded bit stream. The basic principle of a coding scheme, namely allocating new code sequences to data symbols of an information bit stream, is the same every time.
A conventional decoder reconstructs the information bit stream out of the coded bit stream. Usually, a header added to the beginning of the coded bit stream provides all information necessary to reconstruct the information bit stream and, respectively, the original code sequences. In the case of a conventional entropy decoding scheme, this information comprises at least a coding table for a coded bit stream coded by a statistical coding method and a dictionary for a coded bit stream coded by a dictionary based method. That is, prior to reconstructing the information bit stream out of a coded bit stream, the conventional entropy decoder reads out the information provided by the header of the coded bit stream.
Several data symbols in the information bit stream together form a data unit. A data unit is an interpretable arrangement of data symbols, e.g., an executable program, a picture or a text. Usually, an information bit stream comprises a plurality of data units. However, the coded bit stream comprises code sequences of unequal length, such that the conventional decoder is not able to determine the positions of the single data units in the coded bit stream, and the coded bit stream initially appears as one single data unit. Therefore, if only one predetermined data unit needs to be decoded out of the coded bit stream, this data unit is usually not selectable until all data units prior to it have been decoded. In the worst case, the conventional decoder needs to reconstruct the whole information bit stream before a designated data unit is determinable. Thus, considering all of the components involved within a coded bit stream, the conventional decoder has to handle a huge amount of data and needs many computation steps to reconstruct only one designated data unit. Moreover, the information bit stream must be stored somewhere, increasing the demand on memory. That is, the conventional decoder is extremely time-consuming and hardware-intensive when decoding only a few data units out of a coded bit stream comprising a plurality of data units.
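The following sketch illustrates why a designated data unit is not directly selectable in a stream of code sequences of unequal length. It assumes, purely for illustration, a prefix code given as a dictionary and data units of a fixed number of data symbols; the name decode_unit is hypothetical.

```python
def decode_unit(bits, code, unit_index, symbols_per_unit):
    """Decode one data unit; returns (symbols of the unit, bits read).

    Because the code sequences differ in length, the start of the unit is
    unknown, so every preceding symbol has to be decoded first.
    """
    inverse = {c: s for s, c in code.items()}
    first = unit_index * symbols_per_unit
    last = first + symbols_per_unit
    decoded, current, bits_read = [], "", 0
    for bit in bits:
        current += bit
        bits_read += 1
        if current in inverse:          # a complete code sequence was read
            decoded.append(inverse[current])
            current = ""
            if len(decoded) == last:    # the designated unit is complete
                break
    return decoded[first:last], bits_read
```

In this sketch, retrieving the third of four two-symbol units from a 12-bit stream requires reading 9 bits, although the unit itself occupies only 3 of them.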
A further problem may occur when a data unit is variable in size. That is, the data symbols coded in the coded bit stream remain constant, but the desired data unit is composed of different data symbols in every new decoding step. One known example is when only a section of an image is to be displayed. The image section is the data unit to be decoded. It depends on a default value, which might be provided by, e.g., a user. In other words, each time an image section is to be reconstructed out of the coded bit stream, the picture pixels, as data symbols, vary. Especially in applications with limited memory resources, it is not suitable to store a complete image in order to select only a section of it.
One solution is to provide a directory in the header of the coded bit stream. Such a directory stores the start points of all data units within the coded bit stream. However, a directory is only reasonable if the data units coded in the coded bit stream are manageable in number and size. In other words, if the coded bit stream comprises many small data units, the directory becomes so large that, in the case of entropy encoding, no overall compression is achieved at all. Even worse, the new stream size might exceed the size of the original uncompressed data. Moreover, such a directory is not suitable for all coding schemes. Especially coding schemes used to transmit as few uncoded data bits as possible cannot take advantage of a directory, because the directory cannot be encoded.
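A directory of start points may be sketched as follows, again assuming a prefix code given as a dictionary; the names encode_with_directory and decode_from are hypothetical and do not refer to any existing format.

```python
def encode_with_directory(units, code):
    """Encode all data units and record the start bit position of each."""
    parts, directory, position = [], [], 0
    for unit in units:
        directory.append(position)
        encoded = "".join(code[s] for s in unit)
        parts.append(encoded)
        position += len(encoded)
    return "".join(parts), directory

def decode_from(bits, code, start, symbol_count):
    """Decode symbol_count symbols beginning at bit position start."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits[start:]:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
            if len(out) == symbol_count:
                break
    return out
```

The directory allows a jump straight to the designated data unit, but it holds one uncoded entry per data unit, which is the overhead discussed above.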
A popular example of a very large number of small coded data units in a coded bit stream is the JPEG format used in the field of image processing. JPEG is a lossy compression method described by the ISO/IEC 10918 standard. JPEG codes an information bit stream with the aid of different coding and compression tools. This method is a very suitable example to point out the disadvantages of conventional decoders.
A digital picture is a two-dimensional area of pixels. Each pixel is a data symbol representing a color within a color table. The digital picture is therefore an information bit stream with a plurality of data symbols according to the picture pixels. The color table is usually based on the RGB color model, being an additive color model. That means that a color value is assembled by weighting and adding basic colors. In the case of the RGB color model, the basic colors are red, green, and blue. The weighting factor of a basic color is named “color level,” hereinafter simply called “level.”
To compress information bit streams in an efficient way, most compression standards, like JPEG, first transform the colors of the digital picture from the RGB color model into the YCbCr color model. YCbCr is also an additive color model, wherein a color value is assembled from a luminance component describing the gray scale of a picture and from a red chrominance difference component as well as a blue chrominance difference component describing the red and blue scale of a picture, wherein the red chrominance and the blue chrominance are color difference signals. That is, the original information bit stream is split up into three information bit streams describing the luminance component, the red chrominance difference component and the blue chrominance difference component of the digital picture.
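The color model transformation may be sketched as follows. The coefficients are those commonly used with JPEG files (full-range ITU-R BT.601, as in JFIF); the text above does not specify them, so they are an assumption of this sketch, as is the function name rgb_to_ycbcr.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(value):
        # Round to an integer level and keep it within the 8-bit range
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)
```

A gray pixel keeps both chrominance difference components at the neutral level 128, which is why a mere gray scale picture carries no information in those two components.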
Secondly, in the simplest case, the red chrominance difference component and the blue chrominance difference component are down-sampled, because the human eye is less sensitive to color contours, such that the information bit streams for the red chrominance and the blue chrominance are transformed accordingly. In the case of a mere gray scale digital picture, this step is omitted.
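One simple form of down-sampling may be sketched as follows. Averaging each 2×2 neighbourhood is an assumption of this sketch; the text above does not prescribe a specific filter. The plane is assumed to have even width and height, and the name downsample_2x2 is hypothetical.

```python
def downsample_2x2(plane):
    """Halve a chrominance plane in both directions by 2x2 averaging."""
    height, width = len(plane), len(plane[0])
    return [
        [
            # Rounded integer mean of the four neighbouring samples
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]
```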
Thirdly, the digital picture is split up into successive groups of 8×8 pixel blocks. The characteristics of the 8×8 pixel blocks are included in at least one minimum coding unit, hereinafter called MCU. Accordingly, each information bit stream describing the digital picture, whether in the luminance component, in the red chrominance difference component or in the blue chrominance difference component, is divided into a plurality of 8×8 blocks. In general, the corresponding 8×8 blocks of the three components can be coded either interleaved in one MCU or separately in three different, non-interleaved MCUs.
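The division of one component into 8×8 blocks may be sketched as follows. The sketch assumes a plane whose width and height are multiples of the block size and a row-by-row block order; the name split_into_blocks is hypothetical.

```python
def split_into_blocks(plane, size=8):
    """Return the size x size blocks of a plane, left to right, top to bottom."""
    height, width = len(plane), len(plane[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks.append([row[left:left + size]
                           for row in plane[top:top + size]])
    return blocks
```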
Fourthly, each 8×8 block of an MCU is transformed by a two-dimensional discrete cosine transformation, hereinafter called DCT. Without going deeply into the mathematical background, the DCT outputs one weighting factor for a constant level basis block, hereinafter called DC-basis block, and 63 weighting factors for 63 varying basis blocks, hereinafter called AC-basis blocks. FIG. 1 shows all possible gray scale 8×8 basis blocks used to reconstruct the gray part of an 8×8 pixel block in a weighted superposition. The weighting factors output by the DCT, hereinafter called DCT-coefficients, allow the DC- and AC-basis blocks to be selectively intensified or weakened, such that each 8×8 block of an MCU can be assembled by a superposition of the DC-basis block and the 63 AC-basis blocks based on the DCT-coefficients. As for the signal processing, the code sequences for each MCU describing the color values in the selected (e.g., YCbCr) color model are replaced by code sequences for each MCU describing the DCT-coefficients. It is important to note that the result of the two-dimensional discrete cosine transformation is not the DC- and AC-basis blocks themselves, but their weighting factors. JPEG stores the DCT-coefficient of a DC-basis block differentially with respect to the DC value of the preceding 8×8 block, and the DCT-coefficients of the AC-basis blocks absolutely, in the information bit stream.
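The transformation and the differential storage of the DC values may be sketched as follows. This is a direct, unoptimized evaluation of the 8×8 DCT-II with the usual normalization; the level shift applied by JPEG before the transformation is omitted for brevity, and the names dct_2d and dc_differences are hypothetical.

```python
import math

N = 8

def _c(k):
    # Normalization factor of the DC row and column
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct_2d(block):
    """Return coeffs[v][u]; coeffs[0][0] weights the DC-basis block."""
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = 0.25 * _c(u) * _c(v) * total
    return coeffs

def dc_differences(dc_values):
    """Store each DC value as the difference to its predecessor (first: to 0)."""
    previous, out = 0, []
    for dc in dc_values:
        out.append(dc - previous)
        previous = dc
    return out
```

For a block in which every sample equals 8, only the DC-coefficient is non-zero (it equals 64), which reflects that a constant block needs no AC-basis blocks at all.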
Fifthly, in a quantization step, all DCT-coefficients needed to describe an 8×8 pixel block of an MCU are sorted, such that the DCT-coefficients belonging to the DC-basis block or to slowly varying AC-basis blocks are stored in a consecutive manner. The sorting is performed along a specific scan path through the coefficient matrix, called a ZigZag scan. DCT-coefficients belonging to the DC-basis block and to slowly varying AC-basis blocks are typically divided by small quantization values, whereas DCT-coefficients belonging to strongly varying AC-basis blocks are typically divided by large quantization values. This division step follows the basic idea that the human eye is more sensitive to slowly varying frequencies than to strongly varying frequencies. The quantized DCT-coefficients are rounded to integer numbers. In summary, a set of quantized DCT-coefficients representing reduced weighting factors for one DC-basis block and 63 AC-basis blocks now defines each 8×8 block of an MCU. That is, the information bit stream originally including color values to describe the digital picture now includes quantized DCT-coefficients. However, until now, only the data symbols themselves were transformed into a suitable structure. From now on, the data symbols need to be coded to form a coded bit stream.
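The ZigZag scan and the division by quantization values may be sketched as follows. The quantization table is supplied by the caller, since no concrete values are given above; the names zigzag_order and quantize_block are hypothetical, and Python's built-in round is used as one possible rounding rule.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n matrix in ZigZag order."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals are walked upwards, odd ones downwards
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def quantize_block(coeffs, table):
    """Divide each coefficient by its quantization value, round, and
    return the 64 integer levels in ZigZag order."""
    return [round(coeffs[r][c] / table[r][c])
            for r, c in zigzag_order(len(coeffs))]
```

Because the scan visits the slowly varying basis blocks first, the levels that quantize to zero tend to cluster at the end of the returned list.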
Therefore, sixthly, the information bit streams describing the digital picture colors in the selected color model (e.g., YCbCr) are now combined into one information bit stream in such a way that the code sequences of the first 8×8 pixel blocks in the upper left position are inserted first and the code sequences of the last 8×8 pixel blocks in the bottom right position are inserted last. This single information bit stream is entropy and run length encoded. The quantization of the DCT-coefficients typically generates many data symbols with a level of zero in succession, hereinafter called successive zero levels. The successive zero levels are counted and run length encoded. The result is an intermediate coded bit stream comprising runs and levels. The runs correspond to the number of successive zero levels replaced at the respective position in the intermediate coded bit stream, and the levels are the non-zero levels at the respective position in the intermediate coded bit stream. Finally, the intermediate coded bit stream is statistically encoded into a coded bit stream to further increase the entropy.
FIG. 2 shows a block diagram describing a conventional decompressor reconstructing an information bit stream of a digital picture compressed and coded by the JPEG compression method, and FIG. 3 shows an image of an ROI indicating its visual position in a digital picture divided into 8×8 pixel blocks.
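The formation of runs and levels may be sketched as follows. This is a simplification: the pair (0, 0) is used here as an end-of-block marker standing for all remaining zero levels, and the additional symbols that JPEG uses for very long runs are left out; the function names are hypothetical.

```python
END_OF_BLOCK = (0, 0)

def run_level_encode(levels):
    """Turn a list of quantized levels into (run of zeros, non-zero level) pairs."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append(END_OF_BLOCK)  # trailing zero levels are implied
    return pairs

def run_level_decode(pairs, length):
    """Rebuild the list of levels; length is the number of levels per block."""
    out = []
    for run, level in pairs:
        if (run, level) == END_OF_BLOCK:
            break
        out.extend([0] * run)
        out.append(level)
    out.extend([0] * (length - len(out)))
    return out
```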
A coded bit stream c comprising the JPEG-compressed digital picture is received at a header parser 10 and a decoder 11. The header parser 10 reads out a header included in the coded bit stream c to provide information, such as the code tables and the quantization values to be applied, to the decoder 11 as well as to the inverse quantizer 12. Next, the decoder 11 decodes the intermediate coded bit stream including the runs and levels generated during the run length coding. As already mentioned, run length coding and statistical encoding are both coding schemes. However, the focus of the present patent application is on decoding methods in general, such that it is more useful to omit one of the coding schemes used during the JPEG compression. From now on, the run length encoding method is therefore not considered within the JPEG compression. This, however, does not limit the facts of the case. It is irrelevant whether one or a plurality of coding schemes is used. The run length encoding is only omitted for conciseness and to make the circumstances easier to understand. Therefore, the decoder 11 directly outputs an information bit stream DCT′ including quantized DCT-coefficients. The information bit stream DCT′ including the quantized DCT-coefficients is forwarded to the inverse quantizer 12, which multiplies the quantized DCT-coefficients by the respective quantization values to create an information bit stream DCT″ including multiplied quantized DCT-coefficients. Finally, an inverse transformation unit 13 reads out all multiplied quantized DCT-coefficients from the provided information bit stream DCT″ to construct first all MCUs based on the multiplied quantized DCT-coefficients and then the 8×8 pixel blocks. That is, the inverse transformation unit 13 creates an information bit stream including the pixels of a digital picture.
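The last two stages of such a decompressor, i.e., the multiplication by the quantization values and the inverse transformation, may be sketched as follows for a single 8×8 block. The sketch is a direct, unoptimized evaluation; the names dequantize and idct_2d are hypothetical and are not taken from FIG. 2, and the quantization table is supplied by the caller.

```python
import math

def dequantize(levels, table):
    """Multiply each quantized DCT-coefficient by its quantization value."""
    return [[levels[v][u] * table[v][u] for u in range(8)] for v in range(8)]

def idct_2d(coeffs):
    """Superpose the 64 weighted basis blocks into one 8x8 block of samples."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    block = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (c(u) * c(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            block[y][x] = 0.25 * total
    return block
```

A block whose only non-zero level is a DC level of 4, with a quantization value of 16, is reconstructed as a constant block of value 8.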
However, many applications aim to display only a section of a digital picture. Such a section is a region-of-interest, hereinafter called ROI. Especially in mobile communication, it is necessary to select ROIs due to the limited hardware resources of the mobile devices. Recently, there has been a development to provide mobile devices with photo cameras. The photo cameras themselves are intended to provide high-resolution photographs. However, the display of a mobile device provides only a lower resolution. Therefore, displaying an ROI is a suitable way to display a photograph with a higher resolution on a mobile device display.
The conventional decompressor needs to completely reconstruct an information bit stream comprising the pixels of a digital picture. Only then is the ROI selectable by an ROI selector 14. If the ROI is to be moved through the digital picture, a large amount of data has to be managed, even if the information bit stream comprising the pixels of the digital picture is stored temporarily in a memory. That is, juddering may occur during the movement of the ROI through the digital picture.
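Which 8×8 pixel blocks an ROI actually touches may be sketched as follows. The sketch assumes that the blocks are numbered row by row starting at the upper left position, and that the ROI is given by its upper left pixel and its size; the name blocks_covering_roi is hypothetical.

```python
def blocks_covering_roi(x, y, width, height, picture_width, block=8):
    """Return the row-major indices of all blocks overlapped by the ROI."""
    blocks_per_row = (picture_width + block - 1) // block
    first_col, last_col = x // block, (x + width - 1) // block
    first_row, last_row = y // block, (y + height - 1) // block
    return [row * blocks_per_row + col
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

For a 20×10 pixel ROI at position (10, 10) in a picture 640 pixels wide, only six blocks are touched, whereas the conventional decompressor reconstructs all of them.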
A directory in the header of the coded bit stream c indicating the start positions of the single MCUs within the coded bit stream c might be an appropriate way to directly decode, out of the information bit stream DCT′, the quantized DCT-coefficients needed to display the ROI. However, this is technically impractical, as shown in the following example. A YCbCr 4:2:0 VGA picture with a resolution of 640×480 pixels is divided into 1200 MCUs, or 9600 8×8 blocks. On the other hand, the resulting coded bit stream c comprises around 500000 bits in the case of a common compression rate of 10. In this case, each position in the header requires at least a 19 bit address, such that the final directory size in the header extends to around 12 Kilobyte if the start position of each 8×8 pixel block is indicated. The JPEG-compressed picture itself has a size of around 600 Kilobyte. That is, much more than 2% of the coded bit stream c is consumed by an appropriate directory. Such a directory must be stored uncompressed. This means that, even if the quality of the JPEG-compressed digital picture is reduced to further minimize the size of the coded bit stream c, the directory size remains constant.
That is, as shown by the example of a JPEG-compressed picture, there is a need to provide an apparatus and a method for decoding data units out of a bit stream coded by an arbitrary coding scheme, which save memory and resources on the one hand and are able to selectively decode at least one predetermined data unit out of the coded bit stream on the other hand.
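The address width and the resulting directory size follow from simple arithmetic, which may be sketched as follows. The number of directory entries is left as a parameter, since it depends on whether MCUs or individual 8×8 blocks are indexed; the names address_bits and directory_bytes are hypothetical.

```python
def address_bits(stream_bits):
    """Smallest number of bits able to address every bit position of the stream."""
    return max(1, (stream_bits - 1).bit_length())

def directory_bytes(entries, stream_bits):
    """Size in bytes of an uncompressed directory with one address per entry."""
    total_bits = entries * address_bits(stream_bits)
    return (total_bits + 7) // 8
```

For a coded bit stream of 500000 bits, address_bits returns 19. Indexing 1200 MCUs then takes 2850 bytes, and the directory grows linearly with the number of entries, independently of how strongly the picture itself is compressed.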